JP2022043974A - Behavior recognition apparatus, learning apparatus, and behavior recognition method

Info

Publication number: JP2022043974A
Application number: JP2021037260A
Authority: JP
Inventors: Atsushi Neo; Yukiko Ogiwara
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2020-09-04
Filing date: 2021-03-09
Publication date: 2022-03-16
Anticipated expiration: 2041-03-09
Also published as: JP7439004B2

Abstract

To recognize multiple kinds of behavior of a recognition target with high accuracy. SOLUTION: A behavior recognition apparatus can access a behavior classification model group in which a model has been learned for each component group, using a behavior for learning and a component group obtained from a shape for learning by component analysis, which generates statistical components through multivariate analysis. The apparatus detects the shape of a recognition target from data for analysis; generates, by the component analysis, one or more components and the contribution rate of each component on the basis of the detected shape; determines an ordinal number indicating the dimension of the one or more components on the basis of a cumulative contribution rate obtained from the contribution rates; selects, from the behavior classification model group, a specific behavior classification model learned with the same component group as a specific component group that includes one or more components of the determined ordinal number; and inputs the specific component group to the specific behavior classification model to output a recognition result indicating the behavior of the recognition target. SELECTED DRAWING: Figure 4

Description

The present invention relates to a behavior recognition device, a learning device, and a behavior recognition method.

As background art in this technical field, Patent Document 1 discloses an intention estimation device that identifies whether or not a person's movement is intended, without relying on a biological signal such as a surface myoelectric potential. This intention estimation device acquires motion information by measuring the position and angle of the moving person, limits the person's motion to a range that the person can actually perform, extracts the person's joint angles and the position of the tip of the moving body part during the motion, and applies a multivariate analysis technique. It then uses a threshold that distinguishes intended from unintended movement to identify whether or not the movement is one the person intended, and thereby makes this identification without relying on biological signals such as surface myoelectric potential.

Patent Document 1: Japanese Unexamined Patent Publication No. 2012-101284

The technique described in Patent Document 1 for estimating the intention behind a person's movement makes only a binary determination of whether or not an action is intended. It therefore cannot classify the intentions of multiple kinds of complex movements, and the accuracy of motion intention estimation may be significantly reduced.

An object of the present invention is to recognize multiple kinds of behavior of a recognition target with high accuracy.

A behavior recognition device according to one aspect of the invention disclosed in the present application has a processor that executes a program and a storage device that stores the program. The device can access a behavior classification model group in which a model has been learned for each component group, using the behavior of a learning target and a component group obtained from the shape of the learning target by component analysis, which generates statistical components through multivariate analysis. The processor executes: a detection process of detecting the shape of a recognition target from analysis target data; a component analysis process of generating, by the component analysis, one or more components and the contribution rate of each component on the basis of the shape of the recognition target detected by the detection process; a determination process of determining, on the basis of a cumulative contribution rate obtained from the contribution rates, an ordinal number indicating the dimension of each of the one or more components; a selection process of selecting, from the behavior classification model group, a specific behavior classification model learned with the same component group as a specific component group that includes one or more components of the ordinal numbers determined by the determination process; and a behavior recognition process of outputting a recognition result indicating the behavior of the recognition target by inputting the specific component group to the specific behavior classification model selected by the selection process.
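As a non-limiting illustration, the determination process and the selection process described above might be sketched as follows. The threshold value of 0.8, the function names, and the dictionary keyed by the number of components are assumptions made for this sketch and are not taken from the specification.

```python
import numpy as np

def determine_dimension(contribution_rates, threshold=0.8):
    """Return the smallest number of leading components whose cumulative
    contribution rate reaches the threshold (threshold is an assumed value)."""
    cumulative = np.cumsum(contribution_rates)
    # searchsorted returns the first index whose cumulative value is >= threshold
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(contribution_rates))

def select_model(model_group, k):
    """Select the model that was learned with the first k components.
    model_group is assumed to be a dict keyed by the number of components."""
    return model_group[k]

# Contribution rates of four components, listed from the first component onward.
rates = np.array([0.6, 0.25, 0.1, 0.05])
k = determine_dimension(rates, threshold=0.8)

# Placeholder models; real entries would be learned behavior classification models.
model_group = {1: "model_1", 2: "model_2", 3: "model_3", 4: "model_4"}
model = select_model(model_group, k)
```

In this sketch the first two components already account for a cumulative contribution rate of 0.85, so the model learned with two components is selected and only those two components would be passed to it.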

A behavior recognition device according to another aspect of the invention disclosed in the present application has a processor that executes a program and a storage device that stores the program. The device can access a behavior classification model group in which a model has been learned for each component group, using the behavior of a learning target and a component group, in ascending order from a first variable, obtained from the shape of the learning target by dimension reduction, which generates statistical components through multivariate analysis. The processor executes: a detection process of detecting the shape of a recognition target from analysis target data; a dimension reduction process of generating, by the dimension reduction, one or more components and the contribution rate of each component on the basis of the shape of the recognition target detected by the detection process; a determination process of determining, on the basis of the contribution rates, an ordinal number indicating the dimension of a component, in ascending order from the first variable, among the one or more components; a selection process of selecting, from the behavior classification model group, a specific behavior classification model learned with the same component group as a specific component group that runs from the first variable to the component of the ordinal number determined by the determination process; and a behavior recognition process of outputting a recognition result indicating the behavior of the recognition target by inputting the specific component group to the specific behavior classification model selected by the selection process.

A learning device according to one aspect of the invention disclosed in the present application has a processor that executes a program and a storage device that stores the program. The processor executes: an acquisition process of acquiring teacher data that includes the shape and behavior of a learning target; a component analysis process of generating one or more components, on the basis of the shape of the learning target acquired by the acquisition process, by component analysis, which generates statistical components through multivariate analysis; a control process of controlling, on the basis of an allowable amount of computation, an ordinal number indicating the dimension of each of the one or more components; and a behavior learning process of learning the behavior of the learning target on the basis of the behavior of the learning target and a component group that includes one or more components of the ordinal numbers controlled by the control process, and generating a behavior classification model that classifies the behavior of the learning target.

A learning device according to another aspect of the invention disclosed in the present application has a processor that executes a program and a storage device that stores the program. The processor executes: an acquisition process of acquiring teacher data that includes the shape and behavior of a learning target; a dimension reduction process of generating one or more components, on the basis of the shape of the learning target acquired by the acquisition process, by dimension reduction, which generates statistical components through multivariate analysis; a control process of controlling, on the basis of an allowable amount of computation, an ordinal number indicating the dimension of a component, in ascending order from a first variable, among the one or more components; and a behavior learning process of learning the behavior of the learning target on the basis of the behavior of the learning target and a component group that runs from the first variable to the component of the ordinal number controlled by the control process, and generating a behavior classification model that classifies the behavior of the learning target.
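As a non-limiting illustration, the control process and the behavior learning process on the learning side might be sketched as follows. The linear cost model (cost proportional to the number of dimensions), the nearest-centroid classifier used as a stand-in for the behavior classification model, and all function names are assumptions made for this sketch; the specification does not prescribe them here.

```python
import numpy as np

def max_dimension(n_components, cost_per_dimension, allowable_cost):
    """Largest number of leading components that fits the allowable computation
    amount, under an assumed linear cost model."""
    k = int(allowable_cost // cost_per_dimension)
    return max(1, min(k, n_components))

def fit_centroids(features, labels):
    """Stand-in classifier: one mean feature vector per behavior label."""
    return {str(label): features[labels == label].mean(axis=0)
            for label in np.unique(labels)}

def predict(centroids, x):
    """Return the behavior label whose centroid is closest to x."""
    return min(centroids, key=lambda label: np.linalg.norm(x - centroids[label]))

def train_model_group(scores, labels, k_max):
    """Learn one model per component group: the first 1, 2, ..., k_max components."""
    return {k: fit_centroids(scores[:, :k], labels) for k in range(1, k_max + 1)}

# Component scores (rows: samples, columns: components from the first onward).
scores = np.array([[2.0, 0.1, 0.0], [1.8, -0.2, 0.1],
                   [-2.1, 0.0, -0.1], [-1.9, 0.3, 0.0]])
labels = np.array(["stand", "stand", "fall", "fall"])

k_max = max_dimension(n_components=3, cost_per_dimension=1.0, allowable_cost=2.5)
model_group = train_model_group(scores, labels, k_max)
result = predict(model_group[2], np.array([1.9, 0.0]))
```

Keeping one model per component group is what allows a recognition device to later pick the model that matches the number of components it decided to use.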

According to a representative embodiment of the present invention, multiple kinds of behavior of a recognition target can be recognized with high accuracy. Problems, configurations, and effects other than those described above will become clear from the following description of the embodiments.

FIG. 1 is an explanatory diagram showing a system configuration example of the behavior recognition system according to the first embodiment.
FIG. 2 is a block diagram showing a hardware configuration example of a computer.
FIG. 3 is an explanatory diagram showing an example of learning data.
FIG. 4 is a block diagram showing a functional configuration example of the behavior recognition system according to the first embodiment.
FIG. 5 is a block diagram showing a detailed functional configuration example of the skeleton information processing unit.
FIG. 6 is an explanatory diagram showing a detailed method of calculating the joint angle, executed by the joint angle calculation unit.
FIG. 7 is an explanatory diagram showing an example of a detailed method of calculating the amount of movement between frames, executed by the movement amount calculation unit.
FIG. 8 is an explanatory diagram showing a detailed method of normalizing the skeleton information, executed by the normalization unit.
FIG. 9 is an explanatory diagram showing a detailed example of the teacher signals held by the teacher signal DB.
FIG. 10 is an explanatory diagram showing an example in which the principal components generated by the principal component analysis unit, with the teacher signals as input data, are plotted in the principal component space.
FIG. 11 is an explanatory diagram showing a detailed method by which the behavior learning unit learns behavior and the behavior recognition unit classifies behavior.
FIG. 12 is a graph showing the transition of the cumulative contribution rate used by the dimension number determination unit when determining the number of dimensions.
FIG. 13 is a flowchart showing a detailed processing procedure example of the learning process by the server (learning device) according to the first embodiment.
FIG. 14 is a flowchart showing a detailed processing procedure example of the skeleton information processing according to the first embodiment.
FIG. 15 is a flowchart showing a processing procedure example of the behavior recognition process by the client (behavior recognition device) according to the first embodiment.
FIG. 16 is a block diagram showing a functional configuration example of the behavior recognition system according to the second embodiment.
FIG. 17 is a flowchart showing a detailed processing procedure example of the learning process by the server (learning device) according to the second embodiment.
FIG. 18 is a flowchart showing a processing procedure example of the behavior recognition process by the client (behavior recognition device) according to the second embodiment.
FIG. 19 is a block diagram showing a functional configuration example of the skeleton information processing unit according to the fourth embodiment.
FIG. 20 is a flowchart showing a detailed processing procedure example of the skeleton information processing unit according to the fourth embodiment.
FIG. 21 is a block diagram showing a functional configuration example of the behavior recognition system according to the fifth embodiment.
FIG. 22 is a block diagram showing a functional configuration example of the behavior recognition system according to the sixth embodiment.
FIG. 23 is an explanatory diagram showing a decision tree, which is the basic method by which the behavior learning unit and the behavior recognition unit classify behavior.
FIG. 24 is an explanatory diagram showing in detail how classification by a decision tree is expanded.
FIG. 25 is an explanatory diagram showing ensemble learning and the method used by the behavior learning unit and the behavior recognition unit to classify behavior.
FIG. 26 is a block diagram showing a functional configuration example of the behavior recognition system according to the seventh embodiment.
FIG. 27 is a flowchart showing a detailed processing procedure example of the learning process by the server (learning device) according to the seventh embodiment.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. In all the drawings for describing the embodiments, the same members are in principle denoted by the same reference numerals, and repeated description thereof is omitted. In the following embodiments, the constituent elements (including element steps and the like) are not necessarily essential, except where explicitly stated or where they are clearly essential in principle. Expressions such as "consisting of A", "made up of A", "having A", and "including A" do not exclude other elements, except where it is explicitly stated that only that element is meant. Similarly, in the following embodiments, references to the shape, positional relationship, and the like of constituent elements include shapes and the like that are substantially approximate or similar, except where explicitly stated or where this is clearly not the case in principle.

Notations such as "first", "second", and "third" in this specification and the like are attached to identify constituent elements and do not necessarily limit their number, order, or content. Numbers for identifying constituent elements are used per context, and a number used in one context does not necessarily indicate the same configuration in another context. Furthermore, a constituent element identified by one number is not precluded from also serving the function of a constituent element identified by another number.

To facilitate understanding of the invention, the position, size, shape, range, and the like of each configuration shown in the drawings and the like may not represent the actual position, size, shape, range, and the like. The present invention is therefore not necessarily limited to the position, size, shape, range, and the like disclosed in the drawings and the like.

<Behavior recognition system>
FIG. 1 is an explanatory diagram showing a system configuration example of the behavior recognition system according to the first embodiment. The behavior recognition system 100 includes a server 101 and one or more clients 102. The server 101 and the clients 102 are communicably connected via a network 105 such as the Internet, a LAN (Local Area Network), or a WAN (Wide Area Network). The server 101 is a computer that manages the clients 102. Each client 102 is a computer that is connected to a sensor 103 and acquires data from the sensor 103.

The sensor 103 detects analysis target data from the analysis environment. The sensor 103 is, for example, a camera that captures still images or moving images. The sensor 103 may also detect sound or odor. The teacher signal DB 104 is a database that holds, as teacher signals, combinations of learning data (a person's skeleton information and joint angles) and behavior information (for example, a person's posture or movement such as "standing" or "falling"). The teacher signal DB 104 may be stored in the server 101, or may be connected to a computer that can communicate with the server 101 or the client 102 via the network 105.

The behavior recognition system 100 has a learning function that uses the teacher signal DB 104 and a behavior recognition function that uses the behavior classification model obtained by the learning function. A behavior classification model is a learning model for classifying the behavior of a recognition target such as a person or an animal. As long as the learning function and the behavior recognition function are implemented somewhere in the behavior recognition system 100, each may be implemented in either the server 101 or the client 102. For example, the server 101 may implement the learning function and the client 102 may implement the behavior recognition function. Alternatively, the server 101 may implement both the learning function and the behavior recognition function, while the client 102 transmits data from the sensor 103 to the server 101 and receives from the server 101 the behavior recognition result produced by the behavior recognition function.

Alternatively, the client 102 may implement both the learning function and the behavior recognition function, and the server 101 may manage the behavior classification models and behavior recognition results from the client 102. A computer that implements the learning function is referred to as a learning device, and a computer that implements at least the behavior recognition function, of the learning function and the behavior recognition function, is referred to as a behavior recognition device. Although FIG. 1 takes a client-server type behavior recognition system 100 as an example, a stand-alone behavior recognition device may also be used. For convenience of explanation, the first embodiment is described using, as an example, a behavior recognition system 100 in which the server 101 implements the learning function (learning device) and the client 102 implements the behavior recognition function (behavior recognition device).

<Computer hardware configuration example>
FIG. 2 is a block diagram showing a hardware configuration example of a computer (server 101, client 102). The computer 200 has a processor 201, a storage device 202, an input device 203, an output device 204, and a communication interface (communication IF) 205. The processor 201, the storage device 202, the input device 203, the output device 204, and the communication IF 205 are connected by a bus 206. The processor 201 controls the computer 200. The storage device 202 serves as a work area for the processor 201. The storage device 202 is also a non-transitory or transitory recording medium that stores various programs and data. Examples of the storage device 202 include a ROM (Read Only Memory), a RAM (Random Access Memory), an HDD (Hard Disk Drive), and a flash memory. The input device 203 inputs data. Examples of the input device 203 include a keyboard, a mouse, a touch panel, a numeric keypad, and a scanner. The output device 204 outputs data. Examples of the output device 204 include a display, a printer, and a speaker. The communication IF 205 connects to the network 105 and transmits and receives data.

<Learning data>
FIG. 3 is an explanatory diagram showing an example of learning data. The learning data 380 is composed, for each subject, of skeleton information 320 and joint angles 370. The skeleton information 320 is detected from the analysis target data acquired from the sensor 103. The joint angles 370 are calculated from the skeleton information 320. The learning data 380 for one subject is composed of, for example, the combinations of skeleton information 320 and joint angles 370 obtained from each of a plurality of time-series frames in which that subject appears.

The skeleton information 320 has, for each of a plurality of (18 in this example) skeleton points 300 to 317, a name 321, an x-coordinate value 322 on the x-axis, and a y-coordinate value 323 on the y-axis orthogonal to the x-axis. The joint angle 370 likewise has a name 371 for each of the plurality of (18 in this example) skeleton points 300 to 317. In the name 371, ∠a-b-c (where a, b, and c are names 321 of skeleton points) denotes the joint angle 370 at skeleton point b formed by line segment ab and line segment bc. The skeleton information 320 may also include, for example, finger joints. Similarly, the joint angle 370 may include joint angles other than those listed.
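As a non-limiting illustration, the joint angle ∠a-b-c at skeleton point b can be obtained from two-dimensional coordinates with the standard dot-product formula. The detailed calculation used by the joint angle calculation unit is the one described with FIG. 6; the function name, the use of degrees, and the handling of coincident points below are assumptions made for this sketch.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at point b between line segments ab and bc.
    a, b, c are (x, y) tuples. Returns None if a or c coincides with b."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0.0 or n2 == 0.0:
        return None  # the angle is undefined for a zero-length segment
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against rounding error before acos
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

right_angle = joint_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
straight = joint_angle((-1.0, 0.0), (0.0, 0.0), (1.0, 0.0))
```

Because the angle depends only on the relative positions of the three points, it is unaffected by where the subject appears in the frame or by the subject's apparent size.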

In FIG. 3, the coordinate values of the skeleton points 300 to 317 are two-dimensional position information (a combination of an x-coordinate value and a y-coordinate value), but three-dimensional position information may be used instead. Specifically, for example, a z-coordinate value on a z-axis orthogonal to the x-axis and the y-axis (for example, the depth direction) may be added.
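As a non-limiting illustration, the learning data described above could be held in a structure such as the following. The class names, field names, skeleton point names ("neck", "right_shoulder"), subject identifier, and numeric values are all hypothetical and are used only to show the layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class SkeletonPoint:
    name: str                  # name 321
    x: float                   # x-coordinate value 322
    y: float                   # y-coordinate value 323
    z: Optional[float] = None  # optional z-coordinate for 3D position information

@dataclass
class Frame:
    skeleton: List[SkeletonPoint]  # skeleton information 320 for one frame
    # joint angles 370, keyed by a name 371 such as "∠a-b-c"
    joint_angles: Dict[str, float] = field(default_factory=dict)

@dataclass
class LearningData:
    subject_id: str
    frames: List[Frame] = field(default_factory=list)  # time-series frames of one subject

frame = Frame(
    skeleton=[SkeletonPoint("neck", 0.50, 0.20),
              SkeletonPoint("right_shoulder", 0.42, 0.25)],
    joint_angles={"∠neck-right_shoulder-right_elbow": 95.0},
)
data = LearningData(subject_id="subject-001", frames=[frame])
```

Leaving `z` as `None` corresponds to the two-dimensional case of FIG. 3, and setting it corresponds to the three-dimensional variation.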

<Functional configuration example of behavior recognition system 100>
FIG. 4 is a block diagram showing a functional configuration example of the behavior recognition system 100 according to the first embodiment. The server 101 has a teacher signal acquisition unit 401, a missing information control unit 402, a skeleton information processing unit 403, a principal component analysis unit 404, a dimension number control unit 405, and a behavior learning unit 406. The client 102 has a skeleton detection unit 451, a missing information determination unit 452, a skeleton information processing unit 453, a principal component analysis unit 454, a dimension number determination unit 455, a behavior classification model selection unit 456, and a behavior recognition unit 457.

Specifically, these units are realized, for example, by causing the processor 201 to execute a program stored in the storage device 202 shown in FIG. 2. First, the functional configuration example on the server 101 side will be described.

The teacher signal acquisition unit 401 selects, from the teacher signals acquired from the teacher signal DB 104, one or more teacher signals to be used for learning, and outputs the selected teacher signals to the missing information control unit 402.

The missing information control unit 402 deletes arbitrary skeleton points from the skeleton information 320 in the teacher signal acquired from the teacher signal acquisition unit 401. The number of skeleton points deleted may be one, more than one, or zero. The missing information control unit 402 updates the skeleton information 320 in the teacher signal with the skeleton information 320 after deletion (including the case where zero skeleton points are deleted). To increase robustness to noise, the missing information control unit 402 may also, when deleting information, add noise that shifts the positions of the skeleton points in the skeleton information 320 and update the skeleton information 320 accordingly.
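As a non-limiting illustration, deleting skeleton points and optionally adding positional noise might be sketched as follows. Representing a deleted point as NaN, using Gaussian noise, and the function and parameter names are assumptions made for this sketch.

```python
import numpy as np

def delete_skeleton_points(points, drop_indices, noise_std=0.0, rng=None):
    """Return (updated points, missing information).

    points: array of shape (n_points, 2) holding x and y coordinates.
    drop_indices: indices of the skeleton points to delete (may be empty).
    noise_std: standard deviation of optional Gaussian position noise.
    """
    if rng is None:
        rng = np.random.default_rng()
    original = np.asarray(points, dtype=float)
    updated = original.copy()
    if noise_std > 0.0:
        # Shift every skeleton point slightly to make learning more noise tolerant
        updated += rng.normal(0.0, noise_std, size=updated.shape)
    # Record the index and original position of each deleted point
    missing = {int(i): (float(original[i, 0]), float(original[i, 1]))
               for i in drop_indices}
    for i in drop_indices:
        updated[i] = np.nan  # NaN marks a deleted skeleton point (an assumption)
    return updated, missing

points = np.array([[0.5, 0.2], [0.4, 0.3], [0.6, 0.3]])
updated, missing = delete_skeleton_points(points, drop_indices=[1])
unchanged, nothing_missing = delete_skeleton_points(points, drop_indices=[])
```

Passing an empty list covers the case in which zero skeleton points are deleted, so the same code path produces both complete and incomplete skeletons for learning.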

The missing information control unit 402 then outputs, to the skeleton information processing unit 403, a teacher signal that includes missing information, namely the name 321 and the position information (x-coordinate value 322 and y-coordinate value 323) of each deleted skeleton point. The missing information control unit 402 also outputs the missing information to the behavior learning unit 406 via the skeleton information processing unit 403, the principal component analysis unit 404, and the dimension number control unit 405.

The skeleton information processing unit 403 processes the updated skeleton information 320. Specifically, for example, the skeleton information processing unit 403 calculates the joint angles 370 and the amount of movement between frames from the skeleton information 320 in the acquired, updated teacher signal. The skeleton information processing unit 403 also removes absolute position information from the skeleton information 320 and executes normalization so that the size of the skeleton information 320 becomes constant. The skeleton information processing unit 403 then outputs the joint angles 370, the amount of movement between frames, and the normalized skeleton information 320 to the principal component analysis unit 404.

図５は、骨格情報処理部４０３，４５３の詳細な機能的構成例を示すブロック図である。骨格情報処理部４０３，４５３は、関節角度算出部５０１と、移動量算出部５０２と、正規化部５０３と、を有する。 FIG. 5 is a block diagram showing a detailed functional configuration example of the skeleton information processing units 403 and 453. The skeleton information processing units 403 and 453 have a joint angle calculation unit 501, a movement amount calculation unit 502, and a normalization unit 503.

関節角度算出部５０１は、取得した教師信号の内、骨格情報３２０から関節角度３７０を算出し、移動量算出部５０２と正規化部５０３を介して主成分分析部４０４に出力する。 The joint angle calculation unit 501 calculates the joint angle 370 from the skeleton information 320 among the acquired teacher signals, and outputs the joint angle 370 to the principal component analysis unit 404 via the movement amount calculation unit 502 and the normalization unit 503.

移動量算出部５０２は、取得した教師信号の内、骨格情報３２０からフレーム間の移動量を算出し、正規化部５０３を介して主成分分析部４０４に出力する。 The movement amount calculation unit 502 calculates the movement amount between frames from the skeleton information 320 in the acquired teacher signal, and outputs the movement amount to the principal component analysis unit 404 via the normalization unit 503.

正規化部５０３は、取得した教師信号の内、骨格情報３２０に対して絶対的な位置情報を除外し、骨格情報３２０の大きさが一定となる正規化を実行して主成分分析部４０４に出力する。 The normalization unit 503 excludes the absolute position information from the skeleton information 320 in the acquired teacher signal, executes normalization that makes the size of the skeleton information 320 constant, and outputs the result to the principal component analysis unit 404.

図４に戻り、主成分分析部４０４は、骨格情報処理部４０３から取得した教師信号の内、正規化した骨格情報３２０と、関節角度３７０と、フレーム間の移動量と、を入力データとして、主成分分析を実行して単数または複数の主成分を生成し、次元数制御部４０５に出力する。なお、骨格情報３２０、関節角度３７０、およびフレーム間の移動量のうち、少なくとも骨格情報３２０が入力データであればよい。 Returning to FIG. 4, the principal component analysis unit 404 uses the normalized skeleton information 320, the joint angle 370, and the movement amount between frames among the teacher signals acquired from the skeleton information processing unit 403 as input data. Principal component analysis is executed to generate one or more principal components and output to the dimension number control unit 405. Of the skeleton information 320, the joint angle 370, and the amount of movement between frames, at least the skeleton information 320 may be input data.

主成分分析では下記式（１）に示す通り、入力データｘ_ｉに係数ｗ_ｉｊを各々乗算し、加算することで主成分ｙ_ｉを生成する。主成分分析の一般式を下記式（２）に示す。係数ｗ_ｉｊは、下記式（３）に示す通り、ｙ_ｉの分散をＶ(ｙ_ｉ)として定義した場合、分散Ｖ（ｙ_ｉ）が最大となるように定める。 In the principal component analysis, as shown in the following equation (1), the principal component y_i is generated by multiplying the input data x_i by the coefficients w_ij and summing the products. The general formula for principal component analysis is shown in the following equation (2). As shown in the following equation (3), with the variance of y_i defined as V(y_i), the coefficients w_ij are determined so that the variance V(y_i) is maximized.

ただし、係数ｗｉｊに制約を持たせない場合、分散Ｖ（ｙ_ｉ）の絶対量は無限に大きく取ることができ、係数ｗ_ｉｊは一意に決定することができないため、下記式（４）の制約を付すことが望ましい。また、情報の重複を無くすため、新たに生成する主成分ｙ_ｋとこれまでに生成した主成分ｙ_ｋの共分散は０となる下記式（５）の制約を付すことが望ましい。 However, if the coefficients w_ij are not constrained, the variance V(y_i) can be made arbitrarily large and the coefficients w_ij cannot be determined uniquely. It is therefore desirable to impose the constraint of the following equation (4). Further, to eliminate duplication of information, it is desirable to impose the constraint of the following equation (5), under which the covariance between a newly generated principal component y_k and each previously generated principal component is 0.

ただし、制約として付す上記式（４）と上記式（５）は、これに限らず別の制約条件を付したり、または制約を外したりして係数ｗ_ｉｊを算出しても問題ない。こうして生成した新たな主成分ｙ_ｊの分散Ｖ（ｙｊ）について下記式（６）に示す通りλ_ｊとして別途定義した場合、下記式（７）に示す通り入力データｘ_ｊの分散Ｖ（ｘ_ｊ）の合計とλ_ｊの合計は等しい。 However, the constraints of the above equations (4) and (5) are not mandatory: the coefficients w_ij may be calculated under different constraints or with the constraints removed. When the variance V(y_j) of a newly generated principal component y_j is defined as λ_j as shown in the following equation (6), the sum of the variances V(x_j) of the input data x_j equals the sum of λ_j, as shown in the following equation (7).

ここでｐは入力データｘ_ｊの数とする。新たに生成した主成分ｙ_ｊの分散Ｖ（ｙ_ｊ）は高い方が元の情報をより多く反映しており、分散値が高い主成分から順に第１、第２、…、第ｍ主成分という。新たに生成した変数ｙ_ｊの分散と元のデータの分散の比を寄与率といい、下記式（８）で示される。また、第１主成分の寄与率から分散値の降順（主成分の序数ｍの昇順）に寄与率を加算した結果を累積寄与率といい、下記式(９)で示される。 Here, p is the number of input data x_j. A higher variance V(y_j) of a newly generated principal component y_j means that it reflects more of the original information, and the principal components are called the first, second, ..., m-th principal components in descending order of variance. The ratio of the variance of the newly generated variable y_j to the variance of the original data is called the contribution rate and is given by the following equation (8). The result of adding up the contribution rates, starting from that of the first principal component, in descending order of variance (ascending order of the ordinal m of the principal component) is called the cumulative contribution rate and is given by the following equation (9).
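
The equations referenced as (1) to (9) are not reproduced in this text. As a reading aid only, the standard principal component analysis relations that match the description are sketched below. The index convention (i for input variables, j for components), the symbols c_j and C_m, and the ordering of the relations are assumptions made here and may differ from the formulas in the original publication.

```latex
\begin{align*}
y_j &= \sum_{i=1}^{p} w_{ij}\,x_i
  && \text{(component as a weighted sum of the inputs)} \\
w_j &= \arg\max_{w}\; V(y_j)
  && \text{(coefficients chosen to maximize the variance)} \\
\sum_{i=1}^{p} w_{ij}^{2} &= 1
  && \text{(normalization constraint on the coefficients)} \\
\operatorname{Cov}(y_j, y_k) &= 0 \quad (j \neq k)
  && \text{(no duplicated information between components)} \\
V(y_j) &= \lambda_j
  && \text{(variance of the $j$-th component)} \\
\sum_{j=1}^{p} V(x_j) &= \sum_{j=1}^{p} \lambda_j
  && \text{(total variance is preserved)} \\
c_j &= \frac{\lambda_j}{\sum_{l=1}^{p} \lambda_l}
  && \text{(contribution rate)} \\
C_m &= \sum_{j=1}^{m} c_j
  && \text{(cumulative contribution rate up to the $m$-th component)}
\end{align*}
```

The preservation of total variance holds under the normalization and zero-covariance constraints; if those constraints are changed or removed, it need not hold.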

寄与率と累積寄与率は、新たに生成した主成分ｙ_ｊや生成した複数の主成分が元のデータの情報量をどの程度表しているかといった尺度となり、主成分と共に生成される。なお、多変量解析で統計的な成分を生成する成分分析の一例として、主成分分析を適用したが、主成分分析の替わりに、同じく成分分析の一例である独立成分分析を実行してもよい。 The contribution rate and the cumulative contribution rate are measures of how much of the information in the original data is represented by a newly generated principal component y_j or by a set of generated principal components, and they are generated together with the principal components. Principal component analysis is used here as an example of component analysis that generates statistical components by multivariate analysis, but independent component analysis, which is another example of component analysis, may be executed instead.

独立成分分析の場合、主成分は独立成分となる。この独立成分が入力データｘｉにどのくらい影響を与えているのかを示す指標として、寄与率を用いてもよい。独立成分分析では、独立成分ごとの独立成分分析における混合係数行列の２乗和が、各独立成分の強度となる。 In the case of independent component analysis, independent components take the place of the principal components. The contribution rate may be used as an index of how strongly each independent component affects the input data x_i. In independent component analysis, the sum of squares of the mixing coefficients belonging to an independent component in the mixing coefficient matrix is the intensity of that independent component.

独立成分の強度は独立成分の入力データｘ_ｉにおける分散を示す。すなわち、独立成分分析によって得られた独立成分はいずれも分散が１に統一されるため、混合係数の２乗和をとれば入力データｘ_ｉの分散になる。そして、独立成分の強度を、全独立成分の強度の総和で割った値を、その独立変数の寄与率とすればよい。 The intensity of the independent component indicates the variance of the independent component in the input data x _i . That is, since the variances of all the independent components obtained by the independent component analysis are unified to 1, the variance of the input data x _i can be obtained by taking the sum of squares of the mixing coefficients. Then, the value obtained by dividing the intensity of the independent component by the sum of the intensities of all the independent components may be used as the contribution rate of the independent variable.
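
As an illustration of the contribution rate described for independent component analysis, the following sketch computes each component's intensity as the sum of squared mixing coefficients and divides it by the total. The function name and the layout of the mixing matrix (rows are input variables, columns are independent components) are assumptions made for this example.

```python
def ica_contribution_rates(mixing):
    """Contribution rate of each independent component.

    mixing[i][j] is the mixing coefficient of independent component j
    in input variable i (assumed layout).
    """
    n_components = len(mixing[0])
    # Intensity of component j: sum of squares of its mixing coefficients.
    strengths = [sum(row[j] ** 2 for row in mixing) for j in range(n_components)]
    total = sum(strengths)
    return [s / total for s in strengths]


A = [[3.0, 1.0],
     [4.0, 0.0]]
rates = ica_contribution_rates(A)  # intensities are 25 and 1
```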

次元数制御部４０５は、１以上の成分の各々の次元を示す序数ｋを制御する。具体的には、たとえば、次元数制御部４０５は、取得した主成分の内、行動学習部４０６で学習に用いる主成分を分散値の高い順に何次元まで使用するかを決定し、第１主成分から、決定した次元ｋ（ｋは１以上の整数）を序数とする第ｋ主成分までの主成分を、分散値の高い順に行動学習部４０６に出力する。 The dimension number control unit 405 controls an ordinal number k indicating each dimension of one or more components. Specifically, for example, the dimension number control unit 405 determines, among the acquired principal components, the number of dimensions of the principal components used for learning by the behavior learning unit 406 in descending order of the variance value, and the first principal component is used. The principal components from the components to the k-th principal component having the determined dimension k (k is an integer of 1 or more) as an order are output to the behavior learning unit 406 in descending order of the variance value.

行動学習部４０６は、次元数制御部４０５から取得した主成分と、教師信号ＤＢ１０４から取得した教師信号内の行動情報とを、関連付けて学習する。具体的には、たとえば、行動学習部４０６は、次元数制御部４０５から取得した第１主成分から第ｋ主成分までの主成分群を入力データとし、教師信号ＤＢ１０４から取得した教師信号内の行動情報を出力データとして、機械学習により、行動分類モデルを生成する。行動学習部４０６は、学習の結果生成した行動分類モデルを、欠損情報制御部４０２から取得した欠損情報と関連付けて、行動分類モデル選択部４５６に出力する。 The behavior learning unit 406 learns by associating the main component acquired from the dimension number control unit 405 with the behavior information in the teacher signal acquired from the teacher signal DB 104. Specifically, for example, the behavior learning unit 406 uses the principal component group from the first principal component to the kth principal component acquired from the dimension number control unit 405 as input data, and is in the teacher signal acquired from the teacher signal DB 104. Using behavior information as output data, a behavior classification model is generated by machine learning. The behavior learning unit 406 associates the behavior classification model generated as a result of learning with the defect information acquired from the defect information control unit 402, and outputs it to the behavior classification model selection unit 456.

つぎに、クライアント１０２側の機能的構成例について説明する。骨格検出部４５１は、センサ１０３から取得した解析対象データに映る人の骨格情報３２０を検出し、欠損情報判断部４５２に出力する。骨格情報３２０の検出には機械学習により生成した人の骨格情報３２０を推定可能なＮＮ（ｎｅｕｒａｌｎｅｔｗｏｒｋ）を用いてもよいし、検出したい人の骨格点にマーカーを付与して、画像に映るマーカー位置から骨格情報３２０を検出してもよく、骨格情報３２０を検出する方法は限定されない。 Next, an example of a functional configuration on the client 102 side will be described. The skeleton detection unit 451 detects the human skeleton information 320 reflected in the analysis target data acquired from the sensor 103, and outputs the information to the defect information determination unit 452. For the detection of the skeleton information 320, an NN (neural network) capable of estimating the human skeleton information 320 generated by machine learning may be used, or a marker is added to the skeleton point of the person to be detected and the marker appears in the image. The skeleton information 320 may be detected from the position, and the method for detecting the skeleton information 320 is not limited.

欠損情報判断部４５２は、骨格検出部４５１で検出した骨格情報３２０の内、オクルージョンなどにより取得できない骨格点があるか否かを判断し、取得できなかった骨格点があれば、その位置情報を欠損情報とし、骨格検出部４５１で検出した骨格情報３２０を骨格情報処理部４５３に出力する。また、欠損情報判断部４５２は、欠損情報を骨格情報処理部４５３と主成分分析部４５４と次元数決定部４５５を介して行動分類モデル選択部４５６に出力する。 The defect information determination unit 452 determines whether or not there is a skeleton point that cannot be acquired due to occlusion or the like among the skeleton information 320 detected by the skeleton detection unit 451. If there is a skeleton point that could not be acquired, the position information is obtained. The skeleton information 320 detected by the skeleton detection unit 451 is output to the skeleton information processing unit 453 as missing information. Further, the defect information determination unit 452 outputs the defect information to the behavior classification model selection unit 456 via the skeleton information processing unit 453, the principal component analysis unit 454, and the dimension number determination unit 455.

骨格情報処理部４５３は、骨格情報処理部４０３と同様の機能を有する。骨格情報処理部４５３は、骨格検出部４５１で検出した骨格情報３２０に対して骨格情報処理部４０３と同様の処理を実行して、関節角度３７０と、フレーム間の移動量と、正規化した骨格情報３２０と、を主成分分析部４５４に出力する。 The skeletal information processing unit 453 has the same function as the skeletal information processing unit 403. The skeleton information processing unit 453 performs the same processing as the skeleton information processing unit 403 on the skeleton information 320 detected by the skeleton detection unit 451 to obtain the joint angle 370, the amount of movement between frames, and the normalized skeleton. Information 320 and are output to the principal component analysis unit 454.

主成分分析部４５４は、主成分分析部４０４と同様の機能を有する。主成分分析部４５４は、骨格情報処理部４５３からの出力データに対して主成分分析部４０４と同様の処理を実行して、単数または複数の主成分を生成する。また、主成分分析部４５４は、主成分と共に生成した寄与率と累積寄与率とを次元数決定部４５５に出力する。 The principal component analysis unit 454 has the same function as the principal component analysis unit 404. The principal component analysis unit 454 executes the same processing as the principal component analysis unit 404 on the output data from the skeleton information processing unit 453 to generate a single or a plurality of principal components. Further, the principal component analysis unit 454 outputs the contribution rate and the cumulative contribution rate generated together with the principal component to the dimension number determination unit 455.

次元数決定部４５５は、各々の寄与率から得られる累積寄与率に基づいて、１以上の成分の各々の次元を示す序数ｋを決定する。具体的には、たとえば、次元数決定部４５５は、取得した寄与率および累積寄与率から、取得した主成分の内、分散の高い順に何次元までの主成分を行動分類モデル選択部４５６に出力するかを示す次元数ｋを決定する。次元数ｋとは、主成分の次元を示す序数ｋである。たとえば、第１主成分であれば、次元数（序数）ｋ＝１であり、第２主成分であれば、次元数（序数）ｋ＝２である。次元数決定部４５５は、分散の高い順に第１主成分から第ｋ主成分までの主成分群を行動分類モデル選択部４５６に出力する。 The dimension number determination unit 455 determines the ordinal number k indicating the dimension of each of the one or more components based on the cumulative contribution rate obtained from the individual contribution rates. Specifically, for example, the dimension number determination unit 455 determines, from the acquired contribution rates and cumulative contribution rate, the number of dimensions k indicating how many of the acquired principal components, taken in descending order of variance, are to be output to the behavior classification model selection unit 456. The number of dimensions k is the ordinal number k indicating the dimension of a principal component. For example, k = 1 for the first principal component and k = 2 for the second principal component. The dimension number determination unit 455 outputs the principal component group from the first principal component to the k-th principal component, in descending order of variance, to the behavior classification model selection unit 456.
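
One common way to derive k from the cumulative contribution rate is to take the smallest k whose cumulative rate reaches a threshold. The text does not state the exact rule, so the threshold criterion, its default value of 0.9, and the function name below are assumptions made for illustration.

```python
def choose_k(variances, threshold=0.9):
    """Smallest k whose cumulative contribution rate reaches `threshold`.

    variances: variance (lambda_j) of each principal component.
    threshold: required cumulative contribution rate (assumed criterion).
    """
    total = sum(variances)
    ordered = sorted(variances, reverse=True)  # first principal component first
    cumulative = 0.0
    for k, lam in enumerate(ordered, start=1):
        cumulative += lam / total  # add the contribution rate of component k
        if cumulative >= threshold:
            return k
    return len(ordered)


k = choose_k([5.0, 3.0, 1.5, 0.5])  # cumulative rates: 0.5, 0.8, 0.95, 1.0
```

With these example variances, the first three principal components are needed to reach a cumulative contribution rate of 0.9.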

行動分類モデル選択部４５６は、欠損情報制御部４０２が生成する欠損情報に関連付けられた行動分類モデルの内、欠損情報判断部４５２から取得した欠損情報と同じ欠損情報が関連付けられ、かつ、次元数決定部４５５が決定した第ｋ次元までの主成分群（第１主成分～第ｋ主成分）で行動学習を行った行動分類モデルを選択する。行動分類モデル選択部４５６は、第１主成分から第ｋ主成分までの主成分群と共に選択した行動分類モデルを行動認識部４５７に出力する。 The behavior classification model selection unit 456 is associated with the same defect information as the defect information acquired from the defect information determination unit 452 among the behavior classification models associated with the defect information generated by the defect information control unit 402, and has the same number of dimensions. A behavior classification model in which behavior learning is performed in the principal component group (first principal component to kth principal component) up to the kth dimension determined by the determination unit 455 is selected. The behavior classification model selection unit 456 outputs the behavior classification model selected together with the principal component group from the first principal component to the kth principal component to the behavior recognition unit 457.

特に２次元画像においては、定義したすべての骨格点をオクルージョンなどにより取得できない可能性があり、取得できなかった一部の骨格点が欠損した骨格情報３２０が骨格検出部４５１で生成される可能性がある。この一部の骨格点が欠損した骨格情報３２０について行動認識を行う場合、クライアント１０２は、骨格検出部４５１で検出された欠損した骨格情報３２０の欠損情報に関連付けられた行動学習モデルを用いて行動認識を行う。これにより、一部の骨格点が欠損した骨格情報３２０についても高精度な行動認識が実現される。 Especially in a two-dimensional image, there is a possibility that all the defined skeleton points cannot be acquired due to occlusion or the like, and the skeleton information 320 in which some of the skeleton points that could not be acquired is missing may be generated by the skeleton detection unit 451. There is. When performing behavior recognition for the skeleton information 320 lacking some of the skeleton points, the client 102 behaves using the behavior learning model associated with the missing information of the missing skeleton information 320 detected by the skeleton detection unit 451. Do recognition. As a result, highly accurate action recognition is realized even for the skeleton information 320 in which some skeleton points are missing.

なお、行動学習部４０６から取得した欠損情報制御部４０２が生成する欠損情報に関連付けられた行動分類モデルの内、欠損情報判断部４５２から取得した欠損情報と同じ欠損情報が関連付けられ、且つ次元数決定部４５５が決定した主成分の次元を示す序数ｋと同一の主成分（第１主成分～第ｋ主成分）で行動学習を行った行動分類モデルが生成されていない場合も想定される。 Of the behavior classification models associated with the defect information generated by the defect information control unit 402 acquired from the behavior learning unit 406, the same defect information as the defect information acquired from the defect information determination unit 452 is associated with the number of dimensions. It is also assumed that a behavior classification model in which behavior learning is performed with the same principal components (first principal component to k-th principal component) as the order number k indicating the dimension of the principal component determined by the determination unit 455 has not been generated.

この場合、行動分類モデル選択部４５６は、この条件に最も近い行動分類モデル（たとえば、欠損した骨格点の位置情報と所定距離以内の欠損情報が関連付けられた行動分類モデル、第１主成分～第（ｋ－１）主成分で行動学習を行った行動分類モデルなど）を選択してもよい。 In this case, the behavior classification model selection unit 456 is a behavior classification model closest to this condition (for example, a behavior classification model in which the position information of the missing skeleton point and the missing information within a predetermined distance are associated, the first principal component to the first. (K-1) A behavior classification model in which behavior learning is performed with the main component) may be selected.
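
The selection with fallback can be sketched as a lookup keyed by the missing skeleton points and the ordinal k. This is a simplified illustration: the dictionary layout and names are assumptions, and only the fallback to a smaller k is shown (the fallback to nearby missing skeleton points within a predetermined distance is omitted).

```python
def select_model(models, missing_points, k):
    """Return (k_used, model) for the closest available model, or None.

    models: dict keyed by (frozenset of missing point names, k).
    Tries the exact k first, then k-1, k-2, ... for the same missing set.
    """
    missing = frozenset(missing_points)
    for candidate_k in range(k, 0, -1):
        model = models.get((missing, candidate_k))
        if model is not None:
            return candidate_k, model
    return None


# Placeholder strings stand in for trained behavior classification models.
models = {
    (frozenset(), 3): "model_full_k3",
    (frozenset({"r_wrist"}), 2): "model_no_r_wrist_k2",
}
exact = select_model(models, [], 3)
fallback = select_model(models, ["r_wrist"], 3)
```

When a smaller k is used, only the first `k_used` principal components would be passed to the selected model.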

行動認識部４５７は、選択した行動分類モデルと第１主成分から第ｋ主成分までの主成分群とに基づいて、センサ１０３から取得した解析対象データに映る人の行動を認識する。具体的には、たとえば、行動認識部４５７は、解析対象データから得らえた主成分群（第１主成分～第ｋ主成分）を、選択した行動分類モデルに入力することにより、解析対象データに映る人の行動を示す予測値を認識結果として出力する。 The behavior recognition unit 457 recognizes the behavior of a person reflected in the analysis target data acquired from the sensor 103 based on the selected behavior classification model and the principal component group from the first principal component to the kth principal component. Specifically, for example, the behavior recognition unit 457 inputs the principal component group (first principal component to kth principal component) obtained from the analysis target data into the selected behavior classification model, thereby analyzing the analysis target data. The predicted value indicating the behavior of the person reflected in is output as the recognition result.

＜関節角度算出の例＞ <Example of joint angle calculation>
図６は、関節角度算出部５０１が実行する関節角度３７０の詳細な算出方法を示す説明図である。関節角度算出部５０１は、連結する３点の骨格点６００～６０２において関節角度θを算出する。骨格点６００～６０２の骨格情報６２０について、原点６３０を基準とする位置ベクトルＯ、Ａ、Ｂのように各々定義する。関節角度算出部５０１は、骨格点６００を原点とする相対ベクトルを下記式（１０），（１１）に示す通り算出し、算出したベクトルから下記式（１２）が成立し、下記式（１３）に示す通り逆余弦を算出することで関節角度θを算出する。 FIG. 6 is an explanatory diagram showing a detailed calculation method of the joint angle 370 executed by the joint angle calculation unit 501. The joint angle calculation unit 501 calculates the joint angle θ at three connected skeleton points 600 to 602. The skeleton information 620 of the skeleton points 600 to 602 is defined as the position vectors O, A, and B with respect to the origin 630, respectively. The joint angle calculation unit 501 calculates relative vectors with the skeleton point 600 as the origin as shown in the following equations (10) and (11). The following equation (12) holds for the calculated vectors, and the joint angle θ is obtained by taking the inverse cosine as shown in the following equation (13).
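
The joint angle calculation can be sketched as follows, assuming two-dimensional positions and the usual dot-product relation between two vectors and the cosine of the angle between them. The function name and the clamping of the cosine value are additions made for this example.

```python
import math


def joint_angle(o, a, b):
    """Angle in radians at point o between the segments o->a and o->b."""
    # Relative vectors with the joint position o as the origin.
    ax, ay = a[0] - o[0], a[1] - o[1]
    bx, by = b[0] - o[0], b[1] - o[1]
    dot = ax * bx + ay * by
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    if norm == 0.0:
        raise ValueError("a or b coincides with o; the angle is undefined")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos_theta)


theta = joint_angle((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))  # a right angle
```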

＜フレーム間の移動量算出の例＞ <Example of calculation of movement amount between frames>
図７は、移動量算出部５０２が実行するフレーム間の移動量の詳細な算出方法の例を示す説明図である。移動量算出部５０２は、フレーム間の移動量の算出において、同一被写体についての第Ｎフレーム目の骨格情報７０１と第Ｎ－Ｍフレーム目の骨格情報７０２とを用いる。Ｎ、Ｍは１以上の整数であり、Ｎ＞Ｍである。Ｍの値は任意に設定可能である。下記式（１４）～（１６）に示す通り、移動量算出部５０２は、各フレーム間で示される同一人物の同一骨格点３００～３１７の距離を各々算出する。１８個の骨格点３００～３１７のフレーム間の移動量が、当該人物についてのフレーム間の移動量となる。 FIG. 7 is an explanatory diagram showing an example of a detailed calculation method of the movement amount between frames executed by the movement amount calculation unit 502. In calculating the movement amount between frames, the movement amount calculation unit 502 uses the skeleton information 701 of the N-th frame and the skeleton information 702 of the (N-M)-th frame for the same subject. N and M are integers of 1 or more, and N > M. The value of M can be set arbitrarily. As shown in the following equations (14) to (16), the movement amount calculation unit 502 calculates, for each of the skeleton points 300 to 317 of the same person, the distance between its positions in the two frames. The movement amounts between frames of the 18 skeleton points 300 to 317 constitute the movement amount between frames for the person.

ただ、移動量算出部５０２が実行するフレーム間の移動量はこれに限定されるものではなく、下記式（１７）に示す通り、移動量算出部５０２は、各フレーム間で示される同一人物の同一骨格点３００～３１７の距離を各々算出し、全１８個の骨格点３００～３１７のフレーム間の移動量を合算した値を、当該人物についてのフレーム間の移動量としてもよい。 However, the calculation of the movement amount between frames by the movement amount calculation unit 502 is not limited to this. As shown in the following equation (17), the movement amount calculation unit 502 may calculate the distance between frames for each of the same skeleton points 300 to 317 of the same person and use the sum of the movement amounts of all 18 skeleton points 300 to 317 as the movement amount between frames for the person.

また、移動量算出部５０２は、第ｎフレームの骨格情報７０１と第ｎ－ｍフレームの骨格情報７０２の内、重心となる重心骨格情報７１１と重心骨格情報７１２を用いてもよい。具体的には、たとえば、移動量算出部５０２は、下記式（１８）～（１９）に示す通り、人物ごとに重心を算出し、下記式（２０）に示す通り、算出した重心に対して、当該人物についてのフレーム間の移動量を算出してもよい。 Further, the movement amount calculation unit 502 may use the center-of-gravity skeleton information 711 and the center-of-gravity skeleton information 712, which are the centers of gravity of the skeleton information 701 of the n-th frame and the skeleton information 702 of the (n-m)-th frame, respectively. Specifically, for example, the movement amount calculation unit 502 may calculate the center of gravity for each person as shown in the following equations (18) and (19), and calculate the movement amount between frames for the person from the calculated centers of gravity as shown in the following equation (20).
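
The three variants of the movement amount described above (per skeleton point, summed over all points, and based on the center of gravity) can be sketched as follows. Euclidean distance in two dimensions, the unweighted mean as the center of gravity, and all function names are assumptions made for this example.

```python
import math


def per_point_movement(cur, prev):
    """Distance moved by each skeleton point present in both frames."""
    return {name: math.dist(cur[name], prev[name]) for name in cur if name in prev}


def total_movement(cur, prev):
    """Sum of the per-point movement amounts."""
    return sum(per_point_movement(cur, prev).values())


def centroid(skeleton):
    """Unweighted mean position of the skeleton points."""
    n = len(skeleton)
    return (sum(p[0] for p in skeleton.values()) / n,
            sum(p[1] for p in skeleton.values()) / n)


def centroid_movement(cur, prev):
    """Distance between the centers of gravity of the two frames."""
    return math.dist(centroid(cur), centroid(prev))


prev_frame = {"a": (0.0, 0.0), "b": (2.0, 0.0)}  # frame N-M
cur_frame = {"a": (3.0, 4.0), "b": (2.0, 0.0)}   # frame N
moves = per_point_movement(cur_frame, prev_frame)
```

`math.dist` requires Python 3.8 or later.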

＜正規化の例＞ <Example of normalization>
図８は、正規化部５０３が実行する骨格情報３２０の正規化の詳細な手法を示す説明図である。まず、正規化部５０３は、（ａ）すべてまたは一部の骨格情報３２０から重心を算出し、（ｂ）重心を原点とする相対座標に変換する。その後、正規化部５０３は、（ｃ）１８個の骨格点３００～３１７を囲う最小の長方形の対角線の長さＬで、（ｄ）骨格情報３２０の各骨格点の位置情報を割る。（ｄ）で得られた骨格情報３２０を教師信号とした場合、割り算後の骨格点３００～３１７の位置情報も組み込まれることとなる。 FIG. 8 is an explanatory diagram showing a detailed method of normalizing the skeleton information 320 executed by the normalization unit 503. First, the normalization unit 503 (a) calculates the center of gravity from all or part of the skeleton information 320 and (b) converts the positions into relative coordinates with the center of gravity as the origin. The normalization unit 503 then (d) divides the position information of each skeleton point of the skeleton information 320 by (c) the length L of the diagonal of the smallest rectangle enclosing the 18 skeleton points 300 to 317. When the skeleton information 320 obtained in (d) is used as a teacher signal, the position information of the skeleton points 300 to 317 after the division is also incorporated.
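
Steps (a) to (d) can be sketched as follows. The sketch assumes that the smallest enclosing rectangle is axis-aligned and that the center of gravity is the unweighted mean of all points; neither detail is stated in the text, and the function name is introduced only for this example.

```python
import math


def normalize_skeleton(skeleton):
    """Remove absolute position and scale from a dict of (x, y) points."""
    n = len(skeleton)
    xs = [p[0] for p in skeleton.values()]
    ys = [p[1] for p in skeleton.values()]
    # (a) center of gravity of the skeleton points
    cx, cy = sum(xs) / n, sum(ys) / n
    # (c) diagonal length L of the enclosing rectangle (assumed axis-aligned)
    diagonal = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
    if diagonal == 0.0:
        raise ValueError("all skeleton points coincide; cannot normalize")
    # (b) shift to centroid-relative coordinates, then (d) divide by L
    return {name: ((x - cx) / diagonal, (y - cy) / diagonal)
            for name, (x, y) in skeleton.items()}


box = {"p0": (0.0, 0.0), "p1": (3.0, 0.0), "p2": (3.0, 4.0), "p3": (0.0, 4.0)}
normalized = normalize_skeleton(box)  # centroid (1.5, 2.0), diagonal 5.0
```

Translating the whole skeleton or scaling it uniformly leaves the normalized result unchanged, which is the property the normalization is intended to provide.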

たとえば、正規化部５０３が実行されないと「１８０ｃｍの人が地点Ａで座る」といった行動について骨格検出および行動分類のための学習が実行されると、「地点Ａ以外では座らない」、「１８０ｃｍ以外の人は座らない」といった判定が下される可能性がある。こうした限定を除外し、行動分類に汎用性を持たせるため、画像内の絶対的な位置情報と、骨格の大きさに関する情報について除去するため、正規化部５０３が骨格情報３２０の正規化を実行する。 For example, if learning for skeleton detection and behavior classification is executed on an action such as "a person 180 cm tall sits at point A" without the normalization unit 503 being executed, determinations such as "a person does not sit anywhere other than point A" or "a person who is not 180 cm tall does not sit" may result. To exclude such restrictions and give the behavior classification generality, the normalization unit 503 normalizes the skeleton information 320, removing the absolute position information in the image and the information on the size of the skeleton.

＜教師信号ＤＢ１０４が保持する教師信号＞ <Teacher signal held by the teacher signal DB 104>
図９は、教師信号ＤＢ１０４が保持する教師信号の詳細な例を示す説明図である。解析対象データとなる（ａ）画像９００に映る人において、（ｂ）骨格情報３２０Ａと、関節角度３７０（不図示）と、骨格情報３２０Ａに関連付けられる（ｃ）行動情報９０１（「立つ」）と、の組み合わせが、教師信号となる。同様に、解析対象データとなる（ａ）画像９１０に映る人において、（ｂ）骨格情報３２０Ｂと、関節角度３７０（不図示）と、骨格情報３２０Ｂに関連付けられる（ｃ）行動情報９１１（「倒れる」）と、の組み合わせが、教師信号となる。 FIG. 9 is an explanatory diagram showing a detailed example of the teacher signal held by the teacher signal DB 104. For the person shown in (a) the image 900, which is analysis target data, the combination of (b) the skeleton information 320A, the joint angle 370 (not shown), and (c) the behavior information 901 ("standing") associated with the skeleton information 320A constitutes a teacher signal. Similarly, for the person shown in (a) the image 910, which is analysis target data, the combination of (b) the skeleton information 320B, the joint angle 370 (not shown), and (c) the behavior information 911 ("falling down") associated with the skeleton information 320B constitutes a teacher signal.

＜次元数制御部４０５による次元数制御と行動学習部４０６による行動学習＞ <Dimension control by dimension control unit 405 and behavior learning by behavior learning unit 406>
図１０は、教師信号を入力データとして主成分分析部４０４が生成した主成分を、主成分空間上にプロットした例を示す説明図である。凡例は教師信号に含まれる行動情報１０００～１００４を示す。 FIG. 10 is an explanatory diagram showing an example in which the principal components generated by the principal component analysis unit 404 using the teacher signal as input data are plotted in the principal component space. The legend shows the behavior information 1000 to 1004 contained in the teacher signal.

図１０において、（ａ）はＸ軸に第１主成分を、Ｙ軸に第２主成分をとり、第２主成分までの情報を２次元平面上にプロットした例を示す。（ｂ）はＸ軸に第１主成分を、Ｙ軸に第２主成分をとり、Ｚ軸に第３主成分をとり、第３主成分までの情報を３次元空間上にプロットした例を示す。 In FIG. 10, (a) shows an example in which the first principal component is taken on the X-axis and the second principal component on the Y-axis, and the information up to the second principal component is plotted on a two-dimensional plane. (b) shows an example in which the first principal component is taken on the X-axis, the second principal component on the Y-axis, and the third principal component on the Z-axis, and the information up to the third principal component is plotted in a three-dimensional space.

（ａ）において、立つ１０００と、座る１００１と、倒れる１００４は、第２主成分までの２次元平面上でも分離可能な様子が伺えるが、歩く１００２と、しゃがむ１００３は第２主成分までの２次元平面上では分離困難な様子が伺える。ここで、（ｂ）において、第３主成分までを含めた３次元空間上で、歩く１００２としゃがむ１００３をプロットした場合、分離の可能性が拡大する場合がある。 In (a), it can be seen that the standing 1000, the sitting 1001, and the falling 1004 can be separated even on the two-dimensional plane up to the second main component, but the walking 1002 and the squatting 1003 are 2 up to the second main component. It can be seen that separation is difficult on the dimensional plane. Here, in (b), when the walking 1002 and the squatting 1003 are plotted on the three-dimensional space including the third principal component, the possibility of separation may be expanded.

このため、主成分分析部４０４が生成した主成分を多く用いれば高精度な行動分類の可能性がある。ただし、主成分の次元を示す序数ｋを多くすると計算量は増加するため、精度と計算量からどこまでの主成分を考慮し、どのくらいの次元の空間で行動を表すかを判断する必要がある。 Therefore, if a large number of principal components generated by the principal component analysis unit 404 are used, there is a possibility of highly accurate behavior classification. However, since the amount of calculation increases when the ordinal number k indicating the dimension of the principal component is increased, it is necessary to consider the amount of the principal component from the accuracy and the amount of calculation and determine how many dimensions of the space the action is represented.

したがって、次元数制御部４０５は、行動学習部４０６で学習に用いる主成分の最大序数を変化させ、第１主成分～最大序数の主成分までの主成分群を行動学習部４０６に出力する。具体的には、たとえば、上述した行動分類の要求精度（たとえば、最低限必要な主成分の次元を示す序数）または／および許容計算量をあらかじめ設定しておき、次元数制御部４０５が、行動学習部４０６で学習に用いる主成分の最大序数を変化させ、要求精度または／および許容計算量を最大限充足する序数を決定する。 Therefore, the dimension number control unit 405 changes the maximum ordinal number of the principal components used for learning in the behavior learning unit 406, and outputs the principal component group from the first principal component to the principal component of the maximum ordinal number to the behavior learning unit 406. Specifically, for example, the required accuracy of the above-mentioned behavior classification (for example, an ordinal number indicating the minimum required dimension of the main component) or / and the allowable calculation amount are set in advance, and the dimension number control unit 405 performs the action. The learning unit 406 changes the maximum ordinal number of the principal components used for learning, and determines the ordinal number that satisfies the required accuracy and / and the allowable calculation amount to the maximum.

たとえば、要求精度が次元を示す序数「３」（第３主成分）という条件の場合、次元数制御部４０５は、最大序数を「３」に決定し、第１主成分～第３主成分までの主成分群を行動学習部４０６に出力する。 For example, in the case of the condition that the required accuracy is the ordinal number "3" (third principal component) indicating the dimension, the dimension number control unit 405 determines the maximum ordinal number to be "3", and the first principal component to the third principal component. The principal component group of is output to the behavior learning unit 406.

また、許容計算量が条件に設定されている場合、次元数制御部４０５は、第１主成分から昇順に計算量を順次取得し、最大序数を、許容計算量をはじめて超えたときの序数（たとえば、「５」）より１つ少ない序数（たとえば、「４」）に決定し、第１主成分から最大序数ｋ＝４の第４主成分までの主成分群を行動学習部４０６に出力する。 Further, when the permissible calculation amount is set as a condition, the dimension number control unit 405 sequentially acquires the calculation amount in ascending order from the first principal component, and the ordinal number when the maximum ordinal number is exceeded for the first time (the permissible calculation amount). For example, it is determined to have an ordinal number one less than "5") (for example, "4"), and the principal component group from the first principal component to the fourth principal component with the maximum ordinal number k = 4 is output to the behavior learning unit 406. ..

また、要求精度が次元を示す序数「３」（第３主成分）以上という条件で、かつ、許容計算量が条件に設定されている場合、第３主成分までの累積計算量が許容計算量以下であれば、次元数制御部４０５は、最大序数を「３」から「４」に変化させる。そして、第４主成分までの累積計算量が許容計算量を超えれば、次元数制御部４０５は、最大序数ｋを「３」に決定し、第１主成分～第３主成分までの主成分群を行動学習部４０６に出力する。 Further, if the required accuracy is the ordinal number "3" (third principal component) or more indicating the dimension and the allowable calculation amount is set as the condition, the cumulative calculation amount up to the third principal component is the allowable calculation amount. If the following, the dimension number control unit 405 changes the maximum ordinal number from "3" to "4". Then, if the cumulative calculation amount up to the fourth main component exceeds the allowable calculation amount, the dimension number control unit 405 determines the maximum ordinal number k to be "3", and the main components from the first main component to the third main component. The group is output to the behavior learning unit 406.

一方、第３主成分までの累積計算量が許容計算量を超えれば、次元数制御部４０５は、最大序数を「３」から「２」に変化させる。そして、第２主成分までの累積計算量が許容計算量以下であれば、次元数制御部４０５は、最大序数ｋを「２」に決定し、第１主成分～第２主成分までの主成分群を行動学習部４０６に出力する。 On the other hand, if the cumulative calculation amount up to the third principal component exceeds the allowable calculation amount, the dimension number control unit 405 changes the maximum ordinal number from "3" to "2". Then, if the cumulative calculation amount up to the second principal component is equal to or less than the allowable calculation amount, the dimension number control unit 405 determines the maximum ordinal number k to be "2", and the main principal component to the second principal component is the main component. The component group is output to the behavior learning unit 406.
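
The budget-based determination of the maximum ordinal can be sketched as follows. The per-component cost list, the function name, and the way the required ordinal is reported are assumptions made for this example; the text does not say how the calculation amount is measured.

```python
def choose_max_ordinal(costs, budget, required=None):
    """Largest k whose cumulative calculation amount stays within `budget`.

    costs[i]: calculation amount added by the (i+1)-th principal component
              (hypothetical values).
    required: minimum ordinal demanded by the required accuracy, if any.
    Returns (k, meets_required).
    """
    cumulative = 0.0
    k = 0
    for cost in costs:
        if cumulative + cost > budget:
            break  # this ordinal would exceed the budget for the first time
        cumulative += cost
        k += 1
    meets_required = required is None or k >= required
    return k, meets_required


result = choose_max_ordinal([1.0, 2.0, 3.0, 4.0, 5.0], budget=10.0, required=3)
```

In this example the budget is first exceeded at the fifth principal component, so the maximum ordinal is 4, one less than that ordinal. If even the required ordinal does not fit the budget, the sketch returns the smaller k together with `meets_required = False`.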

なお、行動学習部４０６に出力する主成分群は、第１主成分から昇順に限定する必要はない。たとえば、次元数制御部４０５は、予め定めた主成分群を特定の数だけ取り出してもよい。また、次元数制御部４０５は、特定の主成分群を除外した上で行動学習部４０６に出力する主成分群を決定してもよい。このように、行動学習部４０６に出力する主成分群は第１主成分から昇順の主成分群に限定されない。 The principal component group output to the behavior learning unit 406 does not need to be limited in ascending order from the first principal component. For example, the dimension number control unit 405 may take out a specific number of predetermined main component groups. Further, the dimension number control unit 405 may determine the main component group to be output to the behavior learning unit 406 after excluding the specific main component group. As described above, the principal component group output to the behavior learning unit 406 is not limited to the principal component group in ascending order from the first principal component.

Also in this case, when an allowable calculation amount is set as a condition, the dimension number control unit 405 sequentially obtains the calculation amount in ascending order of ordinal number for the principal component group that is not limited to ascending order from the first principal component, and outputs to the behavior learning unit 406 the principal components up to the ordinal number immediately before the one at which the allowable calculation amount is first exceeded. For example, suppose that the principal component group consists of the second, third, and fifth principal components. If the allowable calculation amount is not exceeded with the second principal component alone, is not exceeded with the second and third principal components, and is first exceeded with the second, third, and fifth principal components, the dimension number control unit 405 may determine the second and third principal components, that is, the components up to the one immediately before the fifth principal component, as the principal component group to be output to the behavior learning unit 406.

The behavior learning unit 406 performs behavior learning under a plurality of conditions in advance, generates behavior classification models, and outputs them to the behavior classification model selection unit 456. Selecting a behavior classification model suited to the situation from the plurality of behavior classification models generated in this way realizes general-purpose and highly accurate behavior recognition.

FIG. 11 is an explanatory diagram showing a detailed method by which the behavior learning unit 406 learns behaviors and the behavior recognition unit 457 classifies behaviors. For the behaviors in the principal component space, the behavior learning unit 406 classifies each behavior into a region by using (a) a boundary line 1101 or (b) a boundary plane 1102. Any method, such as the k-means method, a support vector machine, a decision tree, or a random forest, may be adopted for learning and classifying behaviors, and the behavior learning method is not limited.

The behavior recognition unit 457 recognizes behavior by using the behavior classification model learned and generated by the behavior learning unit 406. Specifically, for example, the client 102 applies principal component analysis to newly input skeleton information 320, determines which region the newly generated principal components belong to according to the boundary line 1101 or the boundary plane 1102 set by the behavior classification model, and recognizes the behavior according to the determined region.

FIG. 12 is a graph showing the transition of the cumulative contribution rate used by the dimension number determination unit 455 when determining the number of dimensions. The cumulative contribution rate is a measure of how much of the information in the original data is represented by the newly generated principal components. Therefore, if the cumulative contribution rate does not change much even when the number of principal components, and thus the number of dimensions used for behavior classification, is increased, a large improvement in accuracy cannot be expected.

Therefore, the dimension number determination unit 455 determines the number of dimensions by using only as many principal components as are needed to exceed a predetermined threshold of the cumulative contribution rate. For example, when the predetermined threshold of the cumulative contribution rate is "0.8", the condition is satisfied with the components up to the second principal component. The number of dimensions k is therefore set to "2", and the first principal component and the second principal component are output to the behavior classification model selection unit 456.
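The threshold rule above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `dimensions_for_threshold` and the contribution rates in the example are hypothetical, since the values plotted in FIG. 12 are not given in the text.

```python
def dimensions_for_threshold(contribution_rates, threshold):
    """Return the smallest k whose cumulative contribution rate exceeds threshold.

    contribution_rates lists the contribution rate of each principal component
    in descending order of variance (first principal component first).
    """
    cumulative = 0.0
    for k, rate in enumerate(contribution_rates, start=1):
        cumulative += rate
        if cumulative > threshold:
            return k
    # Assumption: if the threshold is never exceeded, use every component.
    return len(contribution_rates)


rates = [0.6, 0.25, 0.1, 0.05]  # illustrative values only
print(dimensions_for_threshold(rates, 0.8))
```

With these illustrative rates, the cumulative contribution rate first exceeds 0.8 at the second principal component, so the number of dimensions k is 2.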

Note that the principal component group output to the behavior classification model selection unit 456 need not be limited to principal components taken in ascending order from the first principal component. For example, the dimension number determination unit 455 may determine a combination of principal component ordinal numbers k that does not exceed the predetermined threshold of the cumulative contribution rate and that maximizes the cumulative contribution rate. The dimension number determination unit 455 may also select such a combination of ordinal numbers k from the principal component groups applied to the behavior classification models. In this way, the principal component group output to the behavior classification model selection unit 456 is not limited to the principal components in ascending order from the first principal component.

<Learning process>
FIG. 13 is a flowchart showing a detailed processing procedure example of the learning process by the server 101 (learning device) according to the first embodiment. The server 101 uses the teacher signal acquisition unit 401 to acquire, from the teacher signals obtained from the teacher signal DB 104, one or more teacher signals to be used for learning (step S1300).

The server 101 uses the missing information control unit 402 to remove information from the skeleton information 320 in the acquired teacher signal, updates the skeleton information 320 in the teacher signal with the result, and takes the name 321 and the position information (x-coordinate value 322, y-coordinate value 323) of the removed skeleton point as missing information (step S1301). A teacher signal processed by the missing information control unit 402 is referred to as an updated teacher signal.

The server 101 uses the skeleton information processing unit 403 to execute skeleton information processing for each updated teacher signal (step S1302). Specifically, for example, the server 101 executes processing by the joint angle calculation unit 501, the movement amount calculation unit 502, and the normalization unit 503.

FIG. 14 is a flowchart showing a detailed processing procedure example of the skeleton information processing according to the first embodiment. The server 101 uses the joint angle calculation unit 501 to calculate, for each updated teacher signal, the joint angle 370 from the skeleton information 320 in the updated teacher signal (step S1401). Next, the server 101 uses the movement amount calculation unit 502 to calculate, for each updated teacher signal, the amount of movement between frames from the skeleton information 320 in the updated teacher signal (step S1402).

Then, the server 101 uses the normalization unit 503 to execute, for each updated teacher signal, normalization that excludes absolute position information from the skeleton information 320 and makes the size of the skeleton information 320 constant (step S1403). As a result, the joint angle 370, the amount of movement between frames, and the normalized skeleton information 320 are obtained for the updated teacher signal. The process then proceeds to step S1303 in FIG. 13.

Returning to FIG. 13, the server 101 uses the principal component analysis unit 404 to execute principal component analysis with the normalized skeleton information 320, the joint angle 370, and the amount of movement between frames as input data, and generates one or more principal components (step S1303).

Next, the server 101 uses the dimension number control unit 405 to determine how many dimensions of the generated principal components are to be used for learning, in descending order of variance, and selects the principal components up to the determined dimension k (first principal component to k-th principal component) in descending order of variance (step S1304).

Then, the server 101 uses the behavior learning unit 406 to perform learning based on the selected principal components and the behavior information in the updated teacher signal, generates a behavior classification model as a result of the learning, and associates the model with the missing information (step S1305).

In the principal component analysis (step S1303), principal components with the same number of dimensions k as the information before the principal component analysis is executed can be generated. Therefore, in step S1306, if there is a number of dimensions that has not yet been determined as the number of dimensions k of the principal components used for learning in step S1304 (step S1306: No), the server 101 returns to step S1304, and the dimension number control unit 405 determines a number of dimensions that has not been determined so far (step S1304).

On the other hand, if every number of dimensions that can be determined for the principal components used for learning has already been determined (step S1306: Yes), the process proceeds to step S1307. However, the decision in step S1306 is not limited to whether every such number of dimensions has been determined. For example, a number of repetitions may be set in advance, and the process may proceed to step S1307 once step S1304 has been repeated that number of times.

In step S1307, if there is skeleton information 320 that has not yet been removed in step S1301 (step S1307: No), the process returns to step S1301, and the server 101 removes a skeleton point that has not been removed so far (step S1301).

On the other hand, if all of the skeleton information 320 has been removed (step S1307: Yes), the process proceeds to step S1308. However, the decision in step S1307 is not limited to this, and the server 101 may decide whether to return to step S1301 or proceed to step S1308 according to a predetermined number of repetitions. Alternatively, the skeleton points to be removed may be determined in advance, and the server 101 may decide whether to return to step S1301 or proceed to step S1308 depending on whether all of the predetermined skeleton points have been removed.

In step S1308, if there is a teacher signal that has not yet been selected in step S1300 (step S1308: No), the server 101 selects a teacher signal that has not been selected so far (step S1300). On the other hand, if all of the teacher signals have been selected (step S1308: Yes), the server 101 ends the behavior learning process. However, the decision in step S1308 is not limited to this, and the server 101 may decide whether to return to step S1300 or end the behavior learning process according to a predetermined number of repetitions.
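The three nested loops of FIG. 13 can be sketched as follows. This is only an outline under stated assumptions: the function `run_learning`, the callable `learn` (which stands in for steps S1302 to S1305), and the dictionary key layout are hypothetical and are not taken from the embodiment.

```python
def run_learning(teacher_signal_sets, missing_patterns, dimension_counts, learn):
    """Nested loops mirroring steps S1300 to S1308.

    learn(signals, pattern, k) is a placeholder for skeleton information
    processing, principal component analysis, and behavior learning.
    """
    model_group = {}
    for set_index, signals in enumerate(teacher_signal_sets):  # S1300 / S1308
        for pattern in missing_patterns:                        # S1301 / S1307
            for k in dimension_counts:                          # S1304 / S1306
                # S1305: keep the model together with its missing information.
                key = (set_index, frozenset(pattern), k)
                model_group[key] = learn(signals, pattern, k)
    return model_group


# Usage with a dummy learner that only records what it was given.
models = run_learning(
    teacher_signal_sets=[["signal_a", "signal_b"], ["signal_c"]],
    missing_patterns=[(), ("left_wrist",)],
    dimension_counts=[2, 3],
    learn=lambda signals, pattern, k: (len(signals), sorted(pattern), k),
)
print(len(models))
```

One model is produced per combination of teacher signal selection, missing skeleton points, and number of dimensions, so the example yields eight entries.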

<Behavior recognition process>
FIG. 15 is a flowchart showing an example of a behavior recognition processing procedure by the client 102 (behavior recognition device) according to the first embodiment. The client 102 uses the skeleton detection unit 451 to detect the skeleton information 320 of a person appearing in the analysis target data acquired from the sensor 103 (step S1500). Next, the client 102 uses the missing information determination unit 452 to determine that, among the detected skeleton information 320, the position information of skeleton points that could not be detected due to occlusion or the like is missing information (step S1501).

Next, the client 102 uses the skeleton information processing unit 453 to execute skeleton information processing on the skeleton information 320 detected in step S1500, in the same manner as in step S1302 (step S1502). Specifically, for example, as shown in FIG. 14, the client 102 executes processing by the joint angle calculation unit 501, the movement amount calculation unit 502, and the normalization unit 503.

Next, the client 102 uses the principal component analysis unit 454 to execute principal component analysis with the skeleton information 320 normalized in step S1502, the joint angle 370, and the amount of movement between frames as input data, generates one or more principal components, and calculates the contribution rate and the cumulative contribution rate together with the principal components (step S1503).

Next, the client 102 uses the dimension number determination unit 455 to determine, from the calculated contribution rates and cumulative contribution rate, how many of the generated principal components are to be used in descending order of variance (step S1504).

Next, the client 102 uses the behavior classification model selection unit 456 to select, from the behavior classification models generated by behavior learning, a behavior classification model that is associated with the same missing information as the missing information detected in step S1501 and that was learned with principal components of the same number of dimensions as the number of dimensions determined in step S1504 (step S1505).
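The selection in step S1505 amounts to a lookup keyed by the missing information and the number of dimensions. The following is a minimal sketch of that idea; the function `select_model`, the dictionary layout, and the skeleton point names are hypothetical and serve only as illustration.

```python
def select_model(model_group, missing_points, k):
    """Return the model learned with the same missing points and dimension count.

    model_group maps (frozenset of missing skeleton point names, k) to a model.
    """
    key = (frozenset(missing_points), k)
    if key not in model_group:
        # Assumption: treat a missing combination as an error in this sketch.
        raise KeyError(f"no model for missing={sorted(missing_points)}, k={k}")
    return model_group[key]


model_group = {
    (frozenset(), 2): "model_no_missing_k2",
    (frozenset({"left_wrist"}), 2): "model_left_wrist_k2",
    (frozenset({"left_wrist"}), 3): "model_left_wrist_k3",
}
print(select_model(model_group, ["left_wrist"], 2))
```

Using a frozenset as part of the key makes the lookup independent of the order in which missing skeleton points are reported.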

Next, the client 102 uses the behavior recognition unit 457 to recognize the behavior of the person appearing in the analysis target data acquired from the sensor 103, based on the behavior classification model selected in step S1505 and the principal components (step S1506). The client 102 may transmit the recognition result to the server 101, or may use the recognition result to control a device connected to the client 102.

For example, when the analysis environment in which the sensor 103 is deployed is a factory, the behavior recognition system 100 can use the recognition result for monitoring the work of workers in the factory, inspecting products for defects, and the like. When the analysis environment is a train, the behavior recognition system 100 can use the recognition result for monitoring passengers in the train, monitoring in-car equipment, detecting disasters such as fires, and the like.

As described above, according to the first embodiment, a plurality of types of behavior of a recognition target can be recognized with high accuracy. In particular, even when some of the skeleton points 300 to 317 are missing due to occlusion or the like, a plurality of types of behavior can be recognized with high accuracy according to the missing skeleton points.

The second embodiment will be described focusing on the differences from the first embodiment. Points in common with the first embodiment are given the same reference numerals, and their description is omitted.

FIG. 16 is a block diagram showing a functional configuration example of the behavior recognition system 100 according to the second embodiment. In the second embodiment, the missing information control unit 402 is removed, and the missing information determination unit 452 is replaced with a missing information interpolation unit 1652. Thus, when part of the skeleton cannot be measured due to occlusion or the like while measuring the position of a moving person, and missing information is therefore included, the missing information interpolation unit 1652 interpolates the missing information from the skeleton information 320 that could be measured.

Specifically, for example, the missing information interpolation unit 1652 treats, as missing information, the position information of skeleton points that could not be acquired due to occlusion or the like among the skeleton information 320 acquired from the skeleton detection unit 451, interpolates the missing information, and outputs the result to the skeleton information processing unit 453. The missing information interpolation unit 1652 may, for example, interpolate the missing information from skeleton points in the acquired skeleton information 320 that are connected to the missing skeleton point or located close to it.

The missing information interpolation unit 1652 may also substitute predetermined position information for the missing information. Further, the missing information interpolation unit 1652 may interpolate the missing information of skeleton information 320 determined to contain missing information by using the skeleton information 320 of another frame acquired earlier. In this way, the method of interpolating the missing information is not limited.
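One of the interpolation options above, interpolating from connected skeleton points, can be sketched as follows. The embodiment does not specify the interpolation formula; averaging the detected neighbors is an assumption made here, and the function `interpolate_missing`, the skeleton point names, and the edge list are hypothetical.

```python
def interpolate_missing(points, edges):
    """Fill missing skeleton points with the mean of their detected neighbors.

    points maps a skeleton point name to (x, y), or to None when not detected.
    edges lists pairs of connected skeleton point names.
    """
    filled = dict(points)
    for name, position in points.items():
        if position is not None:
            continue
        # Collect detected points that are connected to the missing point.
        neighbors = []
        for a, b in edges:
            other = b if a == name else a if b == name else None
            if other is not None and points.get(other) is not None:
                neighbors.append(points[other])
        if neighbors:
            filled[name] = (
                sum(p[0] for p in neighbors) / len(neighbors),
                sum(p[1] for p in neighbors) / len(neighbors),
            )
    return filled


points = {"shoulder": (0.0, 0.0), "elbow": None, "wrist": (2.0, 4.0)}
edges = [("shoulder", "elbow"), ("elbow", "wrist")]
print(interpolate_missing(points, edges)["elbow"])
```

A point with no detected neighbor is left as None in this sketch; substituting a predetermined position or a value from another frame, as the text also allows, would be handled separately.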

<Learning process>
FIG. 17 is a flowchart showing a detailed processing procedure example of the learning process by the server 101 (learning device) according to the second embodiment. In the second embodiment, the missing information control (step S1301) is not executed, and the skeleton information processing (step S1302) is executed on the teacher signal selected in step S1300. That is, in the second embodiment, the behavior learning unit 406 generates one behavior classification model without distinguishing the skeleton information 320 by whether or not skeleton points are missing.

<Behavior recognition process>
FIG. 18 is a flowchart showing an example of a behavior recognition processing procedure by the client 102 (behavior recognition device) according to the second embodiment. In the second embodiment, the missing information determination (step S1501) is replaced with missing information interpolation (step S1801). The client 102 treats, as missing information, the position information of skeleton points that could not be acquired due to occlusion or the like among the skeleton information 320 detected in the skeleton detection (step S1500), interpolates it, and updates the skeleton information 320 to the interpolated skeleton information 320 (step S1801). In the skeleton information processing (step S1502), a teacher signal including the interpolated skeleton information 320 is used.

As described above, according to the second embodiment, interpolating skeleton information 320 that has missing parts due to occlusion or the like eliminates the need to generate a behavior classification model for each piece of missing information. This reduces the processing load of the learning function and speeds up the behavior recognition function.

The third embodiment is a combination of the first embodiment and the second embodiment. Specifically, for example, the behavior recognition system 100 of the third embodiment can be switched by user operation between a first mode, in which the learning process and the behavior recognition process according to the first embodiment are executed, and a second mode, in which the learning process and the behavior recognition process according to the second embodiment are executed.

As described above, according to the third embodiment, selecting the first mode when the missing information should be taken into account yields a highly accurate behavior recognition result, and selecting the second mode when the missing information should be interpolated yields a behavior recognition result efficiently.

The fourth embodiment will be described focusing on the differences from the first to third embodiments. Points in common with the first to third embodiments are given the same reference numerals, and their description is omitted.

FIG. 19 is a block diagram showing a functional configuration example of the skeleton information processing unit according to the fourth embodiment. In the fourth embodiment, the skeleton information processing units 403 and 453 each have a mutual information normalization unit 1904. The mutual information normalization unit 1904 normalizes the value ranges of the skeleton information 320, the joint angle 370, and the amount of movement between frames that are output to the principal component analysis unit 404 so that they fall within a fixed range.

The value ranges of the skeleton information 320 and the amount of movement between frames depend on the resolution of the analysis target data. On the other hand, the value range of the joint angle 370 is 0 to 2π, or 0 degrees to 360 degrees. If the value ranges differ greatly among the data subjected to principal component analysis, the influence of the original data on the principal components may be biased by data type.

To eliminate this bias, the mutual information normalization unit 1904 executes normalization that brings the value ranges of the data subjected to principal component analysis within a fixed range. For example, the mutual information normalization unit 1904 unifies the value range of the original data to 0 to 2π by applying equations (21) and (22) below to the skeleton information 320 and equation (23) below to the amount of movement between frames.

However, the normalization method executed by the mutual information normalization unit 1904 is not limited to this. For example, the mutual information normalization unit 1904 may instead normalize the value range of the joint angle 370 to a fixed range according to the resolution of the data subjected to principal component analysis.
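Equations (21) to (23) are not reproduced in this text, so the following is only a sketch of one plausible way to unify a value range to 0 to 2π, namely linear scaling by known lower and upper bounds. The function `scale_to_0_2pi` and the example resolution of 1920 pixels are assumptions and do not represent the equations of the embodiment.

```python
import math


def scale_to_0_2pi(values, lower, upper):
    """Linearly map values from [lower, upper] onto [0, 2*pi]."""
    span = upper - lower
    if span <= 0:
        raise ValueError("upper must be greater than lower")
    return [(v - lower) / span * 2.0 * math.pi for v in values]


# x coordinates in a hypothetical 1920-pixel-wide image.
print(scale_to_0_2pi([0, 960, 1920], lower=0, upper=1920))
# Joint angles given in degrees are mapped onto the same range.
print(scale_to_0_2pi([0, 180, 360], lower=0, upper=360))
```

After such scaling, coordinates, movement amounts, and joint angles share the same value range, which is the condition the text requires before principal component analysis.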

FIG. 20 is a flowchart showing a detailed processing procedure example of the skeleton information processing unit according to the fourth embodiment. In the fourth embodiment, in the skeleton information processing (steps S1302 and S1502), the client 102 executes mutual information normalization (step S2004) after the normalization (step S1403). In the mutual information normalization (step S2004), the possible value ranges of the skeleton information 320 normalized by the normalization unit, the joint angle 370, and the amount of movement between frames are normalized to a fixed range.

As described above, according to the fourth embodiment, unifying the possible value ranges of the original data subjected to principal component analysis (the skeleton information 320, the joint angle 370, and the amount of movement between frames) eliminates the bias in the influence on the principal components caused by specific data with a wide value range, so that a plurality of types of behavior can be discriminated with high accuracy.

The fifth embodiment will be described focusing on the differences from the first to fourth embodiments. Points in common with the first to fourth embodiments are given the same reference numerals, and their description is omitted.

FIG. 21 is a block diagram showing a functional configuration example of the behavior recognition system 100 according to the fifth embodiment. In the fifth embodiment, the principal component analysis unit 404 and the principal component analysis unit 454 are replaced with a dimension reduction unit 2100 and a dimension reduction unit 2101, respectively. Dimension reduction is processing that reduces the number of original variables or the number of original dimensions while preserving as much of the original information as possible, and is a concept that encompasses component analysis such as the principal component analysis and independent component analysis of the first to fourth embodiments.

The dimension reduction unit 2100 executes dimension reduction with the normalized skeleton information 320, the joint angle 370, and the amount of movement between frames in the teacher signal acquired from the skeleton information processing unit 403 as input data, generates one or more variables, and outputs them to the dimension number control unit 405.

Dimension reduction methods that the dimension reduction unit 2100 may use include SNE (Stochastic Neighbor Embedding), t-SNE (t-Distributed Stochastic Neighbor Embedding), UMAP (Uniform Manifold Approximation and Projection), Isomap, LLE (Locally Linear Embedding), Laplacian eigenmaps, LargeVis, and diffusion maps. The dimension reduction unit 2100 may also reduce the dimension by combining t-SNE or UMAP with principal component analysis or independent component analysis. Each dimension reduction method, and dimension reduction performed by combining the methods, is described below.

SNE processing is described using equations (24) to (28).

The similarity between two x-coordinate values 322 (input data), x_i and x_j, is defined as the conditional probability p_{j|i} that x_j is selected as a neighbor when x_i is given. The conditional probability p_{j|i} is given by equation (24). Here, x_j is assumed to be selected according to a normal distribution centered on x_i. Likewise, the similarity between two y-coordinate values 323 (principal components) after dimension reduction, y_i and y_j, is defined as the conditional probability q_{j|i} given by equation (25), in the same way as the similarity between x_i and x_j before dimension reduction. To simplify the equation, however, the variance of the coordinate values after dimension reduction is fixed at 1/√2.

If y is generated so that the distance relationships are preserved before and after dimension reduction, the dimension can be reduced while retaining as much information as possible. To reduce the dimension while suppressing information loss, the dimension reduction unit 2100 performs processing so that p_{j|i} = q_{j|i}. For this purpose it uses the KL divergence, a measure of how similar two probability distributions are.

Equation (26) is the loss function obtained by applying the KL divergence to the probability distributions before and after dimension reduction. The dimension reduction unit 2100 minimizes the loss function of equation (26) by stochastic gradient descent. The gradient is given by equation (27), which is the derivative of the loss function with respect to y_i, and is used to vary y_i. The update rule for this variation is given by equation (28).

In this way, y_i is varied by repeatedly applying the update of equation (28), and dimension reduction is performed by obtaining the y_i at which equation (27) is minimized, yielding new variables. Unlike principal component analysis, however, the number of dimensions (variables) after reduction by SNE is two or three owing to the nature of the processing. Therefore, when dimension reduction is performed by SNE, a predetermined number of dimensions (variables) is output to the dimension number control unit 405, and the dimension number control unit 405 determines the number of variables to use according to that predetermined number of dimensions.
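The SNE procedure above can be sketched as follows. Because equations (24) to (28) are not reproduced in this text, the sketch assumes the standard SNE formulation from the literature (Gaussian neighbor probabilities, KL-divergence loss, plain gradient descent); the function names, the single shared `sigma`, the learning rate, and the sample data are all illustrative assumptions rather than the values used by the dimension reduction unit 2100.

```python
import math

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def conditional_probs(points, sigma):
    """P[i][j] = p_{j|i}: probability of picking j as a neighbor of i under a
    Gaussian centered on points[i]. One shared sigma is a simplification."""
    n = len(points)
    rows = []
    for i in range(n):
        w = [0.0 if i == j
             else math.exp(-sq_dist(points[i], points[j]) / (2 * sigma ** 2))
             for j in range(n)]
        total = sum(w)
        rows.append([v / total for v in w])
    return rows

def kl_loss(p, q, eps=1e-12):
    """Sum over i of KL(P_i || Q_i); eps guards against log(0)."""
    n = len(p)
    return sum(p[i][j] * math.log((p[i][j] + eps) / (q[i][j] + eps))
               for i in range(n) for j in range(n) if i != j)

def sne_embed(x, y_init, sigma=1.0, lr=0.1, steps=100):
    """Full-batch gradient descent on the SNE loss (no momentum, no noise)."""
    p = conditional_probs(x, sigma)
    y = [list(v) for v in y_init]
    low_sigma = 1 / math.sqrt(2)  # gives the kernel exp(-||y_i - y_j||^2)
    losses = []
    for _ in range(steps):
        q = conditional_probs(y, low_sigma)
        losses.append(kl_loss(p, q))
        grads = []
        for i in range(len(y)):
            g = [0.0] * len(y[i])
            for j in range(len(y)):
                if i == j:
                    continue
                # Standard SNE gradient coefficient for the pair (i, j).
                c = 2 * (p[i][j] - q[i][j] + p[j][i] - q[j][i])
                for d in range(len(g)):
                    g[d] += c * (y[i][d] - y[j][d])
            grads.append(g)
        for i in range(len(y)):
            for d in range(len(y[i])):
                y[i][d] -= lr * grads[i][d]
    losses.append(kl_loss(p, conditional_probs(y, low_sigma)))
    return y, losses

# Hypothetical 3-D inputs forming two tight pairs, embedded into 2-D.
x = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.1), (3.0, 3.0, 3.0), (3.1, 2.9, 3.0)]
y0 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
y, losses = sne_embed(x, y0)
```

The recorded `losses` make it possible to check that the low-dimensional neighbor probabilities move toward the high-dimensional ones as y_i is updated.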

With SNE, however, the loss function is difficult to minimize, and the attempt to preserve distances during dimension reduction causes the skeleton points specified by the x-coordinate values 322 and the y-coordinate values 323 to crowd together. t-SNE is a method that addresses this problem.

t-SNE processing is described using equations (29) to (33).

To make the loss function easier to minimize, the loss function is symmetrized. In the symmetrization, as shown in equation (29), the distance between x_i and x_j is expressed by the joint probability distribution p_{ij}. The conditional probability p_{j|i} is the same as in equation (24) and is given by equation (30). The distance between y_i and y_j after dimension reduction is expressed by the joint probability distribution q_{ij} shown in equation (31).

The distances between points after dimension reduction are assumed to follow a Student's t-distribution. Compared with a normal distribution, the Student's t-distribution assigns a higher probability to values far from the mean, and this property allows long distances between data points after dimension reduction.

In t-SNE, the dimension reduction unit 2100 performs dimension reduction by minimizing the loss function shown in equation (32), using p_{ij} and q_{ij} obtained from equations (29) to (31). As with SNE, the dimension reduction unit 2100 minimizes the loss function by the stochastic gradient descent shown in equation (33).

By obtaining the y_i at which equation (33) is minimized, the dimension reduction unit 2100 performs dimension reduction and obtains new variables. As with SNE, the number of dimensions (variables) after reduction by t-SNE is two or three owing to the nature of the processing. Therefore, when dimension reduction is performed by t-SNE, a predetermined number of dimensions (variables) is output to the dimension number control unit 405, and the dimension number control unit 405 determines the number of variables to use according to that predetermined number of dimensions.
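The two changes that t-SNE makes relative to SNE, symmetrized joint probabilities and a Student's t kernel in the low-dimensional space, can be illustrated with a short sketch. It assumes the standard t-SNE definitions from the literature, since equations (29) to (31) are not reproduced in this text; the function names and the sample values are hypothetical.

```python
import math

def symmetrize(cond):
    """p_ij = (p_{j|i} + p_{i|j}) / (2N), where cond[i][j] = p_{j|i}."""
    n = len(cond)
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)]
            for i in range(n)]

def student_t_joint(y):
    """q_ij from a Student's t kernel with one degree of freedom."""
    n = len(y)
    w = [[0.0 if i == j
          else 1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(y[i], y[j])))
          for j in range(n)] for i in range(n)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]

# Hypothetical conditional probabilities p_{j|i}; each row sums to 1.
cond = [[0.0, 0.9, 0.1],
        [0.8, 0.0, 0.2],
        [0.5, 0.5, 0.0]]
p_joint = symmetrize(cond)
q_joint = student_t_joint([(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)])

# Heavy tail: at distance 3 the t kernel keeps far more weight than the
# Gaussian kernel exp(-d^2) used by SNE in the low-dimensional space.
gaussian_tail = math.exp(-3.0 ** 2)
student_tail = 1.0 / (1.0 + 3.0 ** 2)
```

The larger tail weight is what lets distant points stay distant in the embedding instead of being pulled together.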

t-SNE preserves the local structure of the high-dimensional data before dimension reduction while capturing the global structure as far as possible, so it can reduce dimensions accurately. However, its computation time increases with the number of dimensions before reduction. UMAP is a method that addresses this computation-time problem. UMAP processing is described using equations (34) to (36).

Within the whole set A of possible values, there is a high-dimensional set X (equation (34)). Let μ be the membership function that outputs, in the range 0 to 1, the degree to which an arbitrary datum taken from A belongs to the set X. For the input X shown in equation (1), the Y shown in equation (2) is prepared. Y is a set of m (< p) points in a space of lower dimension than X, and is the set of data after dimension reduction. With ν as the membership function of Y, the dimension reduction unit 2100 performs dimension reduction by determining the Y that minimizes equation (36), and obtains new variables.

When dimension reduction is performed by UMAP, the dimension reduction unit 2100 may output a predetermined number of dimensions (variables) to the dimension number control unit 405, as with SNE and t-SNE. Alternatively, it may output, as the required number of dimensions, the number of dimensions (variables) for which the membership function ν after dimension reduction is at or above a predetermined value range. The dimension number control unit 405 then determines the number of dimensions (number of variables) to use according to the number of dimensions (variables) output by the dimension reduction unit 2100.

Isomap processing is described next. For arbitrary data, the dimension reduction unit 2100 calculates the shortest distances to neighboring data and performs dimension reduction by representing the calculated distances as a geodesic distance matrix using multidimensional scaling (MDS), thereby obtaining new variables. When dimension reduction is performed by Isomap, the dimension reduction unit 2100 outputs a predetermined number of dimensions (variables) to the dimension number control unit 405, and the number of variables to use is determined according to that predetermined number of dimensions.
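The geodesic distance matrix that Isomap relies on can be sketched as shortest paths over a nearest-neighbor graph. The sketch below assumes a k-nearest-neighbor graph and the Floyd-Warshall algorithm, neither of which is specified in this text, and it stops before the MDS step, which needs an eigendecomposition. The function names and the semicircle data are illustrative.

```python
import math

def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph as a dense distance matrix."""
    n = len(points)
    inf = float("inf")
    dist = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: math.dist(points[i], points[j]))[:k]
        for j in nearest:
            d = math.dist(points[i], points[j])
            dist[i][j] = dist[j][i] = d
    return dist

def geodesic_distances(points, k):
    """Shortest path lengths along the graph (Floyd-Warshall)."""
    g = knn_graph(points, k)
    n = len(g)
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    return g

# Seven points on a unit semicircle: the straight-line distance between the
# two ends is 2, but the path along the curve is longer.
arc = [(math.cos(t * math.pi / 6), math.sin(t * math.pi / 6)) for t in range(7)]
geo = geodesic_distances(arc, k=2)
```

Feeding `geo` to MDS instead of the straight-line distances is what allows a curved data manifold to be flattened.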

LLE is described using equations (35) to (41).

The points in the neighborhood of x_i are used to approximate x_i by the linear combination of equation (35). Minimizing equation (37) under the constraint of equation (36) determines the approximation of x_i before dimension reduction. Next, for y_i after dimension reduction, the dimension reduction unit 2100 minimizes equation (38) so that the linear neighborhood relationships of x_i are preserved as far as possible. The solution is obtained, as shown in equation (40), by extracting the eigenvectors of equation (39) from v_i, which corresponds to the second smallest eigenvalue, to v_d, which corresponds to the (d+1)-th smallest eigenvalue. The dimension reduction unit 2100 then acquires y_i after dimension reduction as shown in equation (41).

When dimension reduction is performed by LLE, the dimension reduction unit 2100 outputs a predetermined number of dimensions (variables) to the dimension number control unit 405, and the dimension number control unit 405 determines the number of variables to use according to that predetermined number of dimensions.

Laplacian eigenmap processing is described using equations (42) to (47).

Each edge x_i x_j of the neighborhood graph formed by the data before dimension reduction is assigned a weight according to equation (42) or equation (43). The graph Laplacian of equation (44) is introduced for the assigned weights, and its eigenvectors (equation (45)) are extracted from v_i, which corresponds to the second smallest eigenvalue, to v_d, which corresponds to the (d+1)-th smallest eigenvalue, giving equation (46). The dimension reduction unit 2100 then acquires the value y_i after dimension reduction as shown in equation (47).

When dimension reduction is performed by Laplacian eigenmaps, the dimension reduction unit 2100 outputs a predetermined number of dimensions (variables) to the dimension number control unit 405, and the dimension number control unit 405 determines the number of variables to use according to that predetermined number of dimensions.

LargeVis processing is described next. LargeVis is a method that improves on the computation time of t-SNE. Because t-SNE computes the distances between data points, its computation time grows with the number of data points. In LargeVis, the dimension reduction unit 2100 uses a K-NN graph built from neighboring data to partition the data into regions, and performs dimension reduction for each regional data model by the same method as t-SNE.

When dimension reduction is performed by LargeVis, a predetermined number of dimensions (variables) is output to the dimension number control unit 405, and the dimension number control unit 405 determines the number of variables to use according to that predetermined number of dimensions.

Diffusion maps are described using equations (48) to (53).

A weight W_{ij} is assigned to each edge x_i x_j of the neighborhood graph formed by x_i before dimension reduction and its neighbor x_j, and the weights are normalized to form the N × N transition probability matrix P shown in equation (48). Let p_t(x_i, x_j) denote the probability that a random walk on the graph represented by P starts at x_i and reaches x_j after t steps. By the properties of the transition matrix, p_t(x_i, x_j) converges to the stationary distribution φ_0(x_j) as t → ∞. The diffusion distance between the points x_i and x_j is defined by equation (49). Let the eigenvalues of the transition probability matrix P be given by equation (50) and its eigenvectors by equation (51). Equation (52) then holds. Because the absolute value of λ_i is at most 1, the dimension reduction unit 2100 takes the eigenvectors up to a suitable dimension d(t) smaller than N, performs dimension reduction as shown in equation (53), and obtains new variables.
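The construction of the transition probability matrix P and the convergence of the t-step probabilities can be checked numerically with a small sketch. It assumes a symmetric weight matrix and plain row normalization; the weights are hypothetical, and the diffusion distance of equation (49) and the embedding of equation (53) are not reproduced here.

```python
def transition_matrix(w):
    """Row-normalize edge weights so that each row of P sums to 1."""
    return [[v / sum(row) for v in row] for row in w]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def t_step(p, t):
    """Entry [i][j] is the probability of reaching j from i in t steps."""
    n = len(p)
    out = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        out = mat_mul(out, p)
    return out

# Hypothetical symmetric weights on a three-node neighborhood graph.
w = [[0.0, 1.0, 1.0],
     [1.0, 0.0, 2.0],
     [1.0, 2.0, 0.0]]
p = transition_matrix(w)
p50 = t_step(p, 50)

# For a connected, non-bipartite graph with symmetric weights, the stationary
# distribution is proportional to the node degrees.
degrees = [sum(row) for row in w]
stationary = [d / sum(degrees) for d in degrees]
```

After 50 steps every row of `p50` is close to `stationary`, regardless of the starting node, which is the convergence property the text relies on.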

When dimension reduction is performed by diffusion maps, a predetermined number of dimensions (variables) is output to the dimension number control unit 405, and the dimension number control unit 405 determines the number of variables to use according to that predetermined number of dimensions.

The dimension reduction unit 2100 may combine the methods described so far, such as principal component analysis, independent component analysis, t-SNE, UMAP, Isomap, LLE, Laplacian eigenmaps, LargeVis, and diffusion maps. For example, for high-dimensional data with 36 dimensions (36 variables), the dimension reduction unit 2100 may first reduce the data to 10 dimensions by principal component analysis and then reduce it to 2 dimensions by UMAP. The combination of methods is not limited. Combining methods in this way can be expected to yield a combined benefit in both performance and computation time.

The dimension reduction methods are not limited to those described in Example 5. For example, high-dimensional information may simply be added, subtracted, multiplied, or divided, or convolved according to predetermined coefficients. Any method that, like the methods of Example 5, generates lower-dimensional data or a smaller number of variables from high-dimensional data or many variables may be used.

The dimension reduction unit 2101 has the same function as the dimension reduction unit 2100. It applies the same processing as the dimension reduction unit 2100 to the output data from the skeleton information processing unit 453 and generates one or more new variables, fewer in number than before dimension reduction. The principal component analysis unit 454 outputs the contribution rates and cumulative contribution rate generated together with the principal components to the dimension number determination unit 455. Likewise, using the same method as the dimension reduction unit 2100, the dimension reduction unit 2101 outputs to the dimension number determination unit 455 the new variables together with the information on the number of dimensions (variables) that the dimension number determination unit 455 requires.

Based on the acquired number of dimensions (variables), the dimension number determination unit 455 determines the dimension number k, which indicates how many of the acquired variables are output to the behavior classification model selection unit 456, and outputs that number of newly generated variables to the behavior classification model selection unit 456.

Thus, according to Example 5, changing the dimension reduction method makes it possible to reduce dimensions effectively, or with shorter computation time, in a manner suited to the data acquired from the skeleton information processing unit 403, so that complex behaviors can be discriminated with high accuracy.

Example 6 is described with a focus on its differences from Examples 1 to 5. Elements in common with Examples 1 to 5 are given the same reference numerals, and their description is omitted.

FIG. 22 is a block diagram showing a functional configuration example of the behavior recognition system 100 according to Example 6. In Example 6, the behavior learning unit 406 and the behavior recognition unit 457 are replaced by a behavior learning unit 2200 and a behavior recognition unit 2201. The detailed method by which the behavior learning unit 2200 and the behavior recognition unit 2201 classify behaviors is described with reference to FIGS. 23 to 25.

FIG. 23 is an explanatory diagram showing a decision tree, the method underlying behavior classification by the behavior learning unit 2200 and the behavior recognition unit 2201. The behavior classification method using a decision tree is as follows. In the decision tree, for each behavior in the variable space newly generated by dimension reduction, (a) a boundary line 2310 is generated using variables 2300 to 2303, for which the behavior type has been given in advance.

The method of generating (a) the boundary line 2310 is as follows. The decision tree classifies behaviors step by step so that the impurity of the input variable group 2321 is minimized. In the first step, the behaviors are split on the second variable axis into the variable group 2322 and the variable group 2323. In the second step, the variable groups 2322 and 2323 are split on the first variable axis into the variable groups 2324 to 2327. The discriminants obtained in the course of classifying so as to minimize impurity are used to generate (a) the boundary line 2310. The axis used for classification at each step is not limited, and the number of classifications on each axis is not restricted to a prescribed number such as one.
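One step of this impurity-minimizing split can be sketched as follows. The text does not state which impurity measure is used, so Gini impurity is assumed here; the function names, the midpoint thresholds, and the sample behaviors are illustrative only.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(points, labels):
    """Return (axis, threshold, weighted impurity) of the best single split."""
    best = None
    n = len(points)
    for axis in range(len(points[0])):
        values = sorted({p[axis] for p in points})
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [lab for p, lab in zip(points, labels) if p[axis] <= thr]
            right = [lab for p, lab in zip(points, labels) if p[axis] > thr]
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[2]:
                best = (axis, thr, score)
    return best

# Hypothetical reduced variables (first axis, second axis) with known behaviors.
points = [(1.0, 0.2), (2.0, 0.3), (1.5, 0.8), (2.5, 0.9)]
labels = ["walk", "walk", "run", "run"]
axis, threshold, impurity = best_split(points, labels)
```

For this data the first axis cannot separate the two behaviors, so the split is chosen on the second axis (index 1), mirroring the first step described above. Applying `best_split` recursively to each resulting group would give the later steps.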

FIG. 24 is an explanatory diagram showing in detail how classification by a decision tree is expanded. Decision trees can be grown level-wise 2400, in which the tree is grown one level (depth) at a time, or leaf-wise 2401, in which the tree is grown one leaf (data group after branching) at a time. Training that combines multiple classifiers such as decision trees is called ensemble learning.

FIG. 25 is an explanatory diagram showing ensemble learning and the method used by the behavior learning unit 2200 and the behavior recognition unit 2201 to classify behaviors. Ensemble learning includes bagging 2401, which uses classification trees such as decision trees in parallel, and boosting 2402, which carries over the previous result and successively updates the learning result. The random forest of Example 1 applies bagging 2401 to decision trees, whereas the behavior learning unit 2200 and the behavior recognition unit 2201 of Example 6 use a classification method based on boosting 2402.

When the behavior learning unit 2200 learns behaviors and the behavior recognition unit 2201 classifies them, the input variables may be classified by boosting over multiple decision trees that are each grown level-wise, or by boosting over multiple decision trees that are each grown leaf-wise.

When boosting over level-wise decision trees is adopted as the behavior classification method, it may be implemented using the software library xgboost. When boosting over leaf-wise decision trees is adopted, it may be implemented using the software library LightGBM. The implementation is not limited to these libraries.

Thus, according to Example 6, using boosting as the behavior classification method and combining multiple decision trees makes it possible to discriminate complex behaviors with high accuracy.

Example 7 is described with a focus on its differences from Examples 1 to 6. Elements in common with Examples 1 to 6 are given the same reference numerals, and their description is omitted.

FIG. 26 is a block diagram showing a functional configuration example of the behavior recognition system 100 according to Example 7. In Example 7, the dimension reduction unit 2100, the dimension number control unit 405, the behavior learning unit 406, and the dimension number determination unit 455 are replaced by a dimension reduction unit 2600, a dimension number control unit 2601, a behavior learning unit 2602, and a dimension reduction unit 2603, respectively.

The dimension reduction unit 2600 performs dimension reduction by any of the methods of Examples 1 to 6 according to a predetermined number of dimensions, and outputs the new variables generated by the dimension reduction to the dimension number control unit 2601. The dimension number control unit 2601 outputs the variables after dimension reduction to the behavior learning unit 2602 according to the acquired number of dimensions.

From the acquired variables after dimension reduction and the given behavior types, the behavior learning unit 2602 generates boundary lines for behavior classification by machine learning and thereby generates a behavior classification model. At this time, it calculates the behavior classification accuracy, which indicates how accurately the generated behavior classification model can predict behaviors.

The behavior learning unit 2602 may calculate the behavior classification accuracy using the same variables that were used to generate the behavior classification model. Alternatively, it may withhold some of the variables acquired from the dimension control unit 2600 from model generation and calculate the behavior classification accuracy using those withheld variables. The method of calculating the behavior classification accuracy is not limited to these. If the calculated behavior classification accuracy is higher than a predetermined accuracy, the behavior learning unit 2602 outputs the generated behavior classification model to the behavior classification model selection unit 456. At this time, the behavior learning unit 2602 also outputs to the dimension number control unit 2601 the acquired number of dimensions and the fact that the behavior classification accuracy passed.

If the calculated behavior classification accuracy is lower than the predetermined accuracy, the behavior learning unit 2602 outputs to the dimension number control unit 2601 the fact that the behavior classification accuracy failed. However, if behavior classification models have been generated for every settable number of dimensions (variables) and the accuracy failed for all of them, the behavior learning unit 2602 outputs the behavior classification model with the highest behavior classification accuracy among those generated so far to the behavior classification model selection unit 456, and outputs the number of dimensions (variables) used for that model, together with all-learning-complete information, to the dimension number control unit 2601.

The dimension number control unit 2601 acts on the pass/fail information and the all-learning-complete information acquired from the behavior learning unit 2602. If it receives a pass or the all-learning-complete information, it outputs the acquired dimension number information to the dimension reduction unit 2603. If it receives a fail, it outputs a dimension reduction instruction to the dimension reduction unit 2600 so that the number of dimensions used for dimension reduction is changed and dimension reduction is performed again.

In accordance with the acquired dimension reduction instruction, the dimension reduction unit 2600 sets a number of dimensions that has not been set before, performs dimension reduction again, and outputs the generated variables to the dimension number control unit 2601.

According to the number of dimensions (variables) acquired from the dimension number control unit 2601, the dimension reduction unit 2603 performs dimension reduction on the data acquired from the skeleton information processing unit 452 using the dimension reduction methods of Examples 1 to 6, and outputs the generated variables to the behavior classification model selection unit 456. Alternatively, without setting a behavior classification accuracy for the pass/fail judgment, learning may be performed for every settable number of dimensions, the behavior classification accuracy calculated for each, and the behavior classification model and the number of dimensions determined according to the calculated accuracies.

The behavior classification accuracy calculated by the behavior learning unit 2602 may be treated as the contribution rate described in Example 1. For example, the acquired variables after dimension reduction are associated with the behavior classification accuracy calculated using them, and the calculated behavior classification accuracy is taken as the contribution rate, with respect to the original information, of the variables after dimension reduction used in the calculation. The dimension number control unit 2601 determines which of the variables after dimension reduction to use for control according to the contribution rate treated in this way.

<Learning process>
FIG. 27 is a flowchart showing an example of the detailed procedure of the learning process performed by the server 101 (learning device) according to Example 7. The server 101 determines the number of dimensions using the dimension number control unit 2601. When dimension reduction is performed for the first time, a predetermined number of dimensions is chosen; for the second and subsequent dimension reductions, a number of dimensions that has not yet been chosen is chosen (step S2700).

Next, the server 101 performs dimension reduction with the dimension reduction unit 2601 in accordance with the determined number of dimensions and generates new variables (step S2701). In step S2702, the server 101 makes a pass/fail judgment on the behavior classification accuracy acquired from the behavior learning unit 2602; if the accuracy passes, the process proceeds to step S1307, and if it fails, the process returns to step S2700.

As described above, according to Example 7, changing the number of dimensions to meet the target behavior classification accuracy and repeating dimension reduction makes it possible to discriminate complex behaviors with high accuracy.
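
The loop of steps S2700 to S2702 can be sketched as follows. This is a minimal illustration, not the implementation of Example 7: the function names `search_dimension_count` and `toy_accuracy`, the candidate list, and the rule that "pass" means the accuracy is at least a target value are all assumptions introduced here.

```python
from typing import Callable, Iterable, Optional, Tuple


def search_dimension_count(
    candidates: Iterable[int],
    evaluate_accuracy: Callable[[int], float],
    target_accuracy: float,
) -> Optional[Tuple[int, float]]:
    """Try dimension counts that have not been used yet until one passes."""
    tried = set()
    for n_dims in candidates:
        if n_dims in tried:          # S2700: only choose an unused dimension count
            continue
        tried.add(n_dims)
        # S2701 plus behavior learning: reduce to n_dims, train, and score.
        accuracy = evaluate_accuracy(n_dims)
        if accuracy >= target_accuracy:   # S2702: pass/fail judgment (assumed rule)
            return n_dims, accuracy
    return None                      # no candidate reached the target


def toy_accuracy(n_dims: int) -> float:
    """Hypothetical stand-in for dimension reduction followed by learning."""
    return min(0.95, 0.5 + 0.1 * n_dims)


result = search_dimension_count([2, 3, 4, 5], toy_accuracy, target_accuracy=0.85)
print(result)
```

In a real system `evaluate_accuracy` would wrap the dimension reduction and the behavior learning; the variant that evaluates every settable dimension count would collect all scores instead of stopping at the first pass.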

The behavior recognition device and the learning device of Examples 1 to 7 described above can also be configured as in (1) to (14) below.

(1) A behavior recognition device (client 102) having a processor 201 that executes a program and a storage device 202 that stores the program can access a behavior classification model group in which a model has been learned for each component group, using component groups obtained from the shape of a learning target (skeleton information 320) by component analysis (principal component analysis or independent component analysis) that generates statistical components by multivariate analysis, together with the behavior of the learning target. The processor 201 executes: a detection process of detecting the shape of a recognition target (skeleton information 320) from analysis target data obtained from the sensor 103; a component analysis process of generating, by the component analysis, one or more components and the contribution rate of each component based on the shape of the recognition target detected by the detection process; a determination process of determining an ordinal number k indicating the dimension of each of the one or more components based on the cumulative contribution rate obtained from the contribution rates; a selection process of selecting, from the behavior classification model group, a specific behavior classification model learned with the same component group as a specific component group that includes one or more components of the ordinal number k indicating the dimension determined by the determination process; and a behavior recognition process of outputting a recognition result indicating the behavior of the recognition target by inputting the specific component group into the specific behavior classification model selected by the selection process.

Because behavior classification models corresponding to the shape of the learning target are thus prepared, multiple types of behavior of the recognition target can be recognized with high accuracy.
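
The selection process and the behavior recognition process of (1) can be sketched as follows. This is an illustrative assumption rather than the described implementation: the model group is represented as a dictionary keyed by k, the specific component group is taken to be the first k components, and the two toy classifiers and their labels are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Classifier = Callable[[Sequence[float]], str]


def recognize(
    components: Sequence[float],
    k: int,
    model_group: Dict[int, Classifier],
) -> str:
    """Select the model learned with k components and feed it those components."""
    if k not in model_group:
        raise KeyError(f"no behavior classification model learned with k={k}")
    specific_group: List[float] = list(components[:k])  # assumed: first k components
    return model_group[k](specific_group)


# Hypothetical toy models; real ones would be trained classifiers.
model_group: Dict[int, Classifier] = {
    1: lambda c: "standing" if c[0] < 0.5 else "walking",
    2: lambda c: "sitting" if c[1] < 0 else "walking",
}

print(recognize([0.9, -0.3, 0.1], k=2, model_group=model_group))
```

Keying the group by k keeps the input dimensionality of each model identical to the one it was learned with, which is the point of selecting a model per component group.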

(2) In the behavior recognition device of (1) above, each behavior classification model of the behavior classification model group has been learned for each component group, using component groups obtained from the shape of the learning target and the angles of a plurality of vertices constituting the shape (joint angles 370), together with the behavior of the learning target. The processor 201 executes a calculation process of calculating, based on the shape of the recognition target, the angles of a plurality of vertices constituting the shape of the recognition target (joint angles 370). In the component analysis process, the processor 201 generates the one or more components and the contribution rates based on the shape of the recognition target and the vertex angles of the recognition target calculated by the calculation process.

This makes it possible to recognize multiple types of behavior of the recognition target with high accuracy in accordance with changes in shape caused by the vertex angles.
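
One common way to compute a vertex angle such as a joint angle is from the two segments that meet at the joint. The sketch below assumes each joint is given as a coordinate tuple and that the angle is the one between the vectors from the middle joint to its two neighbors; the function name `joint_angle` is introduced here for illustration only.

```python
import math
from typing import Sequence


def joint_angle(a: Sequence[float], b: Sequence[float], c: Sequence[float]) -> float:
    """Return the angle in degrees at vertex b formed by points a, b, c."""
    v1 = [p - q for p, q in zip(a, b)]   # vector from b to a
    v2 = [p - q for p, q in zip(c, b)]   # vector from b to c
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("degenerate joint: two points coincide")
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    cos_t = max(-1.0, min(1.0, cos_t))   # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_t))


# Example: shoulder, elbow, wrist forming a right angle at the elbow.
print(joint_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0)))
```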

(3) In the behavior recognition device of (1) above, each behavior classification model of the behavior classification model group has been learned for each component group, using component groups obtained from the shape of the learning target and the movement amount of the learning target, together with the behavior of the learning target. The processor 201 executes a calculation process of calculating the movement amount of the recognition target based on a plurality of shapes of the recognition target at different points in time. In the component analysis process, the processor 201 generates the one or more components and the contribution rates based on the shape of the recognition target and the movement amount of the recognition target calculated by the calculation process.

This makes it possible to recognize multiple types of behavior of the recognition target with high accuracy in accordance with changes in shape over time caused by movement.
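
A movement amount can be derived from the shapes at two points in time, for instance as the displacement of each joint between frames. The following sketch assumes that definition (per-joint Euclidean distance); the function name `movement_amount` and the frame layout are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def movement_amount(prev_frame: Sequence[Point], curr_frame: Sequence[Point]) -> List[float]:
    """Per-joint Euclidean displacement between two skeleton frames."""
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must contain the same number of joints")
    return [math.dist(p, c) for p, c in zip(prev_frame, curr_frame)]


# Two joints: the first moves by (3, 4), the second stays in place.
print(movement_amount([(0.0, 0.0), (1.0, 1.0)], [(3.0, 4.0), (1.0, 1.0)]))
```

The resulting values can be appended to the shape features before component analysis, so that motion contributes to the generated components.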

(4) In the behavior recognition device of (1) above, the processor 201 executes a first normalization process of normalizing the size of the shape of the recognition target. In the component analysis process, the processor 201 generates the one or more components and the contribution rates based on the shape of the recognition target after the first normalization by the first normalization process.

This improves the versatility of behavior classification and thereby suppresses misrecognition.

(5) In the behavior recognition device of (2) above, the processor 201 executes a second normalization process of normalizing the value ranges that the shape of the recognition target and the vertex angles can take. In the component analysis process, the processor 201 generates the one or more components and the contribution rates based on the shape of the recognition target and the vertex angles (joint angles 370) after the second normalization by the second normalization process.

This suppresses bias between the value ranges of different data types, namely shape and angle, and improves the accuracy of behavior recognition.
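
Value-range normalization of mixed data types can be illustrated with min-max scaling. This is one possible choice, not necessarily the method of the examples; the ranges used below (pixel coordinates in 0 to 1920, angles in 0 to 180 degrees) and the function name `min_max_normalize` are assumptions.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float], lower: float, upper: float) -> List[float]:
    """Map values from the range [lower, upper] onto [0, 1]."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    span = upper - lower
    return [(v - lower) / span for v in values]


# Coordinates and angles end up on the same scale before component analysis.
x_coords = min_max_normalize([0.0, 960.0, 1920.0], lower=0.0, upper=1920.0)
angles = min_max_normalize([90.0, 180.0], lower=0.0, upper=180.0)
features = x_coords + angles
print(features)
```

Without such scaling, a feature with a wide numeric range (pixel coordinates) would dominate the variance that principal component analysis tries to explain.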

(6) In the behavior recognition device of (1) above, in the determination process, the processor 201 determines the ordinal number k indicating the dimension of the components required for the cumulative contribution rate to exceed a threshold value.

The cumulative contribution rate is a measure of how much of the information in the original data is represented by the newly generated components; referring to it therefore helps suppress an increase in the number of dimensions.
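
The determination of k from the cumulative contribution rate can be sketched as follows, assuming the contribution rates are already sorted from the first component onward and sum to at most 1. The function name `determine_k`, the example rates, and the fallback of using all components when the threshold is never exceeded are assumptions made for illustration.

```python
from typing import Sequence


def determine_k(contribution_rates: Sequence[float], threshold: float) -> int:
    """Return the smallest k whose cumulative contribution rate exceeds threshold."""
    if not contribution_rates:
        raise ValueError("at least one contribution rate is required")
    cumulative = 0.0
    for k, rate in enumerate(contribution_rates, start=1):
        cumulative += rate
        if cumulative > threshold:
            return k
    # Threshold never exceeded: fall back to all components (assumed behavior).
    return len(contribution_rates)


# Cumulative rates are 0.5, 0.8, 0.95, 1.0, so three components are needed for 0.9.
print(determine_k([0.5, 0.3, 0.15, 0.05], threshold=0.9))
```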

(7) In the behavior recognition device of (1) above, each behavior classification model of the behavior classification model group has been learned for each combination of a partially missing shape and a component group, using component groups obtained from a partially missing shape of the learning target, together with the behavior of the learning target. The processor 201 executes a judgment process of determining the partially missing shape of the recognition target. In the component analysis process, the processor 201 generates the one or more components and the contribution rate of each of the one or more components based on the partially missing shape of the recognition target determined by the judgment process. In the selection process, the processor 201 selects, from the behavior classification model group, a specific behavior classification model learned with the combination of the same missing shape as the partially missing shape of the recognition target and the same component group as the specific component group.

Even if the shape of the recognition target is partially missing, highly accurate behavior recognition can be performed using the behavior classification model corresponding to that missing part.

(8) In the behavior recognition device of (1) above, the processor 201 executes an interpolation process of interpolating the shape of the recognition target if part of it is missing. In the component analysis process, the processor 201 generates the one or more components and the contribution rates based on the shape of the recognition target after interpolation by the interpolation process.

This makes it possible to give appropriate input to a behavior classification model generated from learning targets whose shapes have no missing parts, and to suppress a decline in behavior recognition accuracy.
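
The text does not fix a particular interpolation method here. As one possibility, a joint that is missing in some frames can be filled by linear interpolation over time between the nearest frames in which it was detected. The sketch below assumes that approach, represents a missing detection as `None`, and uses the hypothetical function name `interpolate_missing`.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def interpolate_missing(track: Sequence[Optional[Point]]) -> List[Point]:
    """Fill None entries in one joint's position track by linear interpolation."""
    known = [i for i, p in enumerate(track) if p is not None]
    if not known:
        raise ValueError("the joint was never detected; nothing to interpolate from")
    filled: List[Point] = []
    for i, p in enumerate(track):
        if p is not None:
            filled.append(p)
            continue
        prev = max((j for j in known if j < i), default=None)
        nxt = min((j for j in known if j > i), default=None)
        if prev is None:                 # missing at the start: copy the next detection
            filled.append(track[nxt])
        elif nxt is None:                # missing at the end: copy the last detection
            filled.append(track[prev])
        else:                            # in between: interpolate linearly in time
            t = (i - prev) / (nxt - prev)
            filled.append(tuple(a + t * (b - a) for a, b in zip(track[prev], track[nxt])))
    return filled


print(interpolate_missing([(0.0, 0.0), None, (2.0, 4.0)]))
```

Spatial interpolation from neighboring joints in the same frame would be an alternative when a joint is missing for a long stretch of time.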

(9) A behavior recognition device (client 102) having a processor 201 that executes a program and a storage device 202 that stores the program can access a behavior classification model group in which a model has been learned for each component group, using component groups in ascending order from a first variable obtained from the shape of a learning target (skeleton information 320) by dimension reduction that generates statistical components by multivariate analysis (principal component analysis, independent component analysis, SNE (Stochastic Neighbor Embedding), t-SNE (t-Distributed Stochastic Neighbor Embedding), UMAP (Uniform Manifold Approximation and Projection), Isomap, LLE (Locally Linear Embedding), Laplacian eigenmap, LargeVis, or diffusion map), together with the behavior of the learning target. The processor 201 executes: a detection process of detecting the shape of a recognition target (skeleton information 320) from analysis target data obtained from the sensor 103; a dimension reduction process of generating, by the dimension reduction, one or more components and the contribution rate of each component based on the shape of the recognition target detected by the detection process; a determination process of determining, based on the cumulative contribution rate obtained from the contribution rates, an ordinal number k indicating the dimension of a component, in ascending order from the first variable, among the one or more components; a selection process of selecting, from the behavior classification model group, a specific behavior classification model learned with the same component group as a specific component group ranging from the first variable to the component of the ordinal number k indicating the dimension determined by the determination process; and a behavior recognition process of outputting a recognition result indicating the behavior of the recognition target by inputting the specific component group into the specific behavior classification model selected by the selection process.

(10) In the behavior recognition device of (9) above, each behavior classification model of the behavior classification model group has been learned for each component group, using component groups in ascending order from the first variable obtained from the shape of the learning target and the angles of a plurality of vertices constituting the shape (joint angles 370), together with the behavior of the learning target. The processor 201 executes a calculation process of calculating, based on the shape of the recognition target, the angles of a plurality of vertices constituting the shape of the recognition target (joint angles 370). In the dimension reduction process, the processor 201 generates the one or more components and the contribution rates based on the shape of the recognition target and the vertex angles of the recognition target calculated by the calculation process.

(11) In the behavior recognition device of (9) above, each behavior classification model of the behavior classification model group has been learned for each component group, using component groups in ascending order from the first variable obtained from the shape of the learning target and the movement amount of the learning target, together with the behavior of the learning target. The processor 201 executes a calculation process of calculating the movement amount of the recognition target based on a plurality of shapes of the recognition target at different points in time. In the dimension reduction process, the processor 201 generates the one or more components and the contribution rates based on the shape of the recognition target and the movement amount of the recognition target calculated by the calculation process.

(12) In the behavior recognition device of (9) above, the processor 201 executes a first normalization process of normalizing the size of the shape of the recognition target. In the dimension reduction process, the processor 201 generates the one or more components and the contribution rates based on the shape of the recognition target after the first normalization by the first normalization process.

(13) In the behavior recognition device of (10) above, the processor 201 executes a second normalization process of normalizing the value ranges that the shape of the recognition target and the vertex angles can take. In the dimension reduction process, the processor 201 generates the one or more components and the contribution rates based on the shape of the recognition target and the vertex angles (joint angles 370) after the second normalization by the second normalization process.

(14) In the behavior recognition device of (9) above, in the determination process, the processor 201 determines the ordinal number k indicating the dimension of the components, in ascending order from the first variable, required for the cumulative contribution rate from the first variable to exceed a threshold value.

(15) In the behavior recognition device of (9) above, each behavior classification model of the behavior classification model group has been learned for each combination of a partially missing shape and a component group, using component groups in ascending order from the first variable obtained from a partially missing shape of the learning target, together with the behavior of the learning target. The processor 201 executes a judgment process of determining the partially missing shape of the recognition target. In the dimension reduction process, the processor 201 generates the one or more components and the contribution rate of each of the one or more components based on the partially missing shape of the recognition target determined by the judgment process. In the selection process, the processor 201 selects, from the behavior classification model group, a specific behavior classification model learned with the combination of the same missing shape as the partially missing shape of the recognition target and the same component group as the specific component group.

(16) In the behavior recognition device of (9) above, the processor 201 executes an interpolation process of interpolating the shape of the recognition target if part of it is missing. In the dimension reduction process, the processor 201 generates the one or more components and the contribution rates based on the shape of the recognition target after interpolation by the interpolation process.

(17) In a learning device having a processor 201 that executes a program and a storage device 202 that stores the program, the processor 201 executes: an acquisition process of acquiring teacher data including the shape and behavior of a learning target; a component analysis process of generating one or more components, based on the shape of the learning target acquired by the acquisition process, by component analysis (principal component analysis or independent component analysis) that generates statistical components by multivariate analysis; a control process of controlling, based on an allowable amount of computation, an ordinal number indicating the dimension of each of the one or more components; and a behavior learning process of learning the behavior of the learning target based on a component group including one or more components of the ordinal number indicating the dimension controlled by the control process and on the behavior of the learning target, and generating a behavior classification model that classifies the behavior of the learning target.

Because multiple types of behavior classification models corresponding to the shape of the learning target can thus be prepared, multiple types of behavior of the recognition target can be recognized with high accuracy.

(18) In the learning device of (17) above, the processor 201 executes a calculation process of calculating, based on the shape of the learning target, the angles of a plurality of vertices constituting the shape of the learning target (joint angles 370). In the component analysis process, the processor 201 generates the one or more components based on the shape of the learning target and the vertex angles of the learning target calculated by the calculation process.

Because multiple types of behavior classification models can thus be prepared in accordance with changes in shape caused by the vertex angles, multiple types of behavior corresponding to changes in shape caused by the vertex angles of the recognition target can be recognized with high accuracy.

(19) In the learning device of (17) above, the processor 201 executes a calculation process of calculating the movement amount of the learning target based on a plurality of shapes of the learning target at different points in time. In the component analysis process, the processor 201 generates the one or more components based on the shape of the learning target and the movement amount of the learning target calculated by the calculation process.

Because multiple types of behavior classification models can thus be prepared in accordance with changes in shape over time caused by movement, multiple types of behavior corresponding to such changes can be recognized with high accuracy.

(20) In the learning device of (17) above, the processor 201 executes a first normalization process of normalizing the size of the shape of the learning target. In the component analysis process, the processor 201 generates the one or more components based on the shape of the learning target after the first normalization by the first normalization process.

This improves the versatility of behavior classification learning and thereby suppresses erroneous learning.

(21) In the learning device of (18) above, the processor 201 executes a second normalization process of normalizing the value ranges that the shape of the learning target and the vertex angles can take. In the component analysis process, the processor 201 generates the one or more components based on the shape of the learning target and the vertex angles after the second normalization by the second normalization process.

This suppresses bias between the value ranges of different data types, namely shape and angle, and improves the accuracy of behavior classification learning.

(22) In the learning device of (17) above, the processor 201 executes a missing-part control process of making part of the shape of the learning target missing. In the component analysis process, the processor 201 generates the one or more components based on the partially missing shape of the learning target obtained by the missing-part control process. In the behavior learning process, the processor 201 learns the behavior of the learning target based on the component group and the behavior of the learning target, generates the behavior classification model, and associates it with missing-part information about the partially missing shape.

Intentionally generating partially missing shapes increases the number of types of behavior classification models. This enables highly accurate behavior recognition for the various shapes a recognition target may take.

(23) In a learning device having a processor 201 that executes a program and a storage device 202 that stores the program, the processor 201 executes: an acquisition process of acquiring teacher data including the shape and behavior of a learning target; a dimension reduction process of generating one or more components, based on the shape of the learning target acquired by the acquisition process, by dimension reduction that generates statistical components by multivariate analysis (principal component analysis, independent component analysis, SNE (Stochastic Neighbor Embedding), t-SNE (t-Distributed Stochastic Neighbor Embedding), UMAP (Uniform Manifold Approximation and Projection), Isomap, LLE (Locally Linear Embedding), Laplacian eigenmap, LargeVis, or diffusion map); a control process of controlling, based on an allowable amount of computation, an ordinal number indicating the dimension of a component, in ascending order from a first variable, among the one or more components; and a behavior learning process of learning the behavior of the learning target based on the component group ranging from the first variable to the component of the ordinal number indicating the dimension controlled by the control process and on the behavior of the learning target, and generating a behavior classification model that classifies the behavior of the learning target.

(24) In the learning device of (23) above, the processor 201 executes a calculation process of calculating, based on the shape of the learning target, the angles of a plurality of vertices constituting the shape of the learning target (joint angles 370). In the dimension reduction process, the processor 201 generates the one or more components based on the shape of the learning target and the vertex angles of the learning target calculated by the calculation process.

(25) In the learning device of (23) above, the processor 201 executes a calculation process of calculating the movement amount of the learning target based on a plurality of shapes of the learning target at different points in time. In the dimension reduction process, the processor 201 generates the one or more components based on the shape of the learning target and the movement amount of the learning target calculated by the calculation process.

(26) In the learning device of (23) above, the processor 201 executes a first normalization process of normalizing the size of the shape of the learning target. In the dimension reduction process, the processor 201 generates the one or more components based on the shape of the learning target after the first normalization by the first normalization process.

(27) In the learning device of (24) above, the processor 201 executes a second normalization process of normalizing the value ranges that the shape of the learning target and the vertex angles can take. In the dimension reduction process, the processor 201 generates the one or more components based on the shape of the learning target and the vertex angles after the second normalization by the second normalization process.

(28) In the learning device of (23) above, the processor 201 executes a defect control process that removes part of the shape of the learning target. In the dimension reduction process, the processor 201 generates the one or more components based on the partially missing shape of the learning target obtained by the defect control process. In the behavior learning process, the processor 201 learns the behavior of the learning target based on the behavior of the learning target and the component group from the first variable to the component of the ordinal number indicating the dimension, generates the behavior classification model, and associates the behavior classification model with defect information regarding the partially missing shape.
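The calculation and normalization processes in (24) to (26) above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiments: the function names, the use of 2D joint coordinates, the mean-displacement definition of the movement amount, and the bounding-box scaling are all assumptions made here for illustration.

```python
import math

# Hypothetical representation: a shape is a list of 2D joint coordinates.
Point = tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in radians at vertex b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.atan2(cross, dot))


def movement_amount(prev: list[Point], curr: list[Point]) -> float:
    """Mean displacement of corresponding joints between two time points (assumed definition)."""
    return sum(math.dist(p, q) for p, q in zip(prev, curr)) / len(curr)


def normalize_size(shape: list[Point]) -> list[Point]:
    """Shift the shape to its bounding-box origin and scale the longer side to 1 (assumed scheme)."""
    xs = [p[0] for p in shape]
    ys = [p[1] for p in shape]
    scale = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    return [((x - min(xs)) / scale, (y - min(ys)) / scale) for x, y in shape]


# Example: shoulder, elbow, and wrist forming a right angle at the elbow.
elbow_angle = joint_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
moved = movement_amount([(0.0, 0.0), (1.0, 1.0)], [(3.0, 4.0), (1.0, 1.0)])
unit_shape = normalize_size([(2.0, 2.0), (4.0, 6.0)])
```

The resulting angles, movement amount, and normalized coordinates would then be concatenated into the feature vector passed to the dimension reduction process.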

The present invention is not limited to the embodiments described above and includes various modifications and equivalent configurations within the spirit of the appended claims. For example, the embodiments have been described in detail to explain the present invention clearly, and the present invention is not necessarily limited to one having all of the described configurations. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment, and the configuration of another embodiment may be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations may be added, deleted, or substituted.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be realized in hardware, for example by designing them as integrated circuits, or may be realized in software by the processor 201 interpreting and executing programs that implement the respective functions.

Information such as programs, tables, and files that implement each function can be stored in a storage device such as a memory, a hard disk, or an SSD (Solid State Drive), or on a recording medium such as an IC (Integrated Circuit) card, an SD card, or a DVD (Digital Versatile Disc).

The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines required in an implementation are necessarily shown. In practice, almost all of the configurations may be considered to be interconnected.

100 Behavior recognition system
101 Server
102 Client
103 Sensor
104 Teacher signal DB
201 Processor
202 Storage device
320 Skeleton information
401 Teacher signal acquisition unit
402 Missing information control unit
403, 453 Skeleton information processing unit
404, 454 Principal component analysis unit
405, 2601 Dimension number control unit
406, 2200, 2602 Behavior learning unit
451 Skeleton detection unit
452 Missing information determination unit
455 Dimension number determination unit
456 Behavior classification model selection unit
457, 2201 Behavior recognition unit
501 Joint angle calculation unit
502 Movement amount calculation unit
503 Normalization unit
1652 Missing information interpolation unit
1904 Mutual information normalization unit
2100, 2101, 2600, 2603 Dimension reduction unit

Claims

A behavior recognition device comprising a processor that executes a program and a storage device that stores the program,
the behavior recognition device being able to access a behavior classification model group in which a model has been learned for each component group, using the behavior of a learning target and a component group obtained from the shape of the learning target by component analysis that generates statistical components by multivariate analysis,
wherein the processor executes:
a detection process of detecting the shape of a recognition target from analysis target data;
a component analysis process of generating, by the component analysis, one or more components and the contribution rate of each of the components based on the shape of the recognition target detected by the detection process;
a determination process of determining an ordinal number indicating the dimension of each of the one or more components based on the cumulative contribution rate obtained from the contribution rates;
a selection process of selecting, from the behavior classification model group, a specific behavior classification model learned with the same component group as a specific component group containing the one or more components of the ordinal number indicating the dimension determined by the determination process; and
a behavior recognition process of outputting a recognition result indicating the behavior of the recognition target by inputting the specific component group into the specific behavior classification model selected by the selection process.

The behavior recognition device according to claim 1,
wherein each behavior classification model of the behavior classification model group has been learned for each combination of a partially missing shape and a component group, using the behavior of the learning target and the component group obtained from the partially missing shape of the learning target,
the processor executes a judgment process of determining the partially missing shape of the recognition target,
in the component analysis process, the processor generates the one or more components and the contribution rate of each of the one or more components based on the partially missing shape of the recognition target determined by the judgment process, and
in the selection process, the processor selects, from the behavior classification model group, a specific behavior classification model learned with the combination of the same missing shape as the partially missing shape of the recognition target and the same component group as the specific component group.

The behavior recognition device according to claim 1,
wherein the processor executes an interpolation process of interpolating the shape of the recognition target when part of that shape is missing, and
in the component analysis process, the processor generates the one or more components and the contribution rates based on the shape of the recognition target after interpolation by the interpolation process.

A behavior recognition device comprising a processor that executes a program and a storage device that stores the program,
the behavior recognition device being able to access a behavior classification model group in which a model has been learned for each component group, using the behavior of a learning target and a component group in ascending order from a first variable obtained from the shape of the learning target by dimension reduction that generates statistical components by multivariate analysis,
wherein the processor executes:
a detection process of detecting the shape of a recognition target from analysis target data;
a dimension reduction process of generating, by the dimension reduction, one or more components and the contribution rate of each of the components based on the shape of the recognition target detected by the detection process;
a determination process of determining, based on the contribution rates, an ordinal number indicating the dimension of the components in ascending order from the first variable among the one or more components;
a selection process of selecting, from the behavior classification model group, a specific behavior classification model learned with the same component group as a specific component group from the first variable to the component of the ordinal number indicating the dimension determined by the determination process; and
a behavior recognition process of outputting a recognition result indicating the behavior of the recognition target by inputting the specific component group into the specific behavior classification model selected by the selection process.

The behavior recognition device according to claim 4,
wherein each behavior classification model of the behavior classification model group has been learned for each component group, using the behavior of the learning target and a component group in ascending order from the first variable obtained from the shape of the learning target and the angles of a plurality of vertices constituting that shape,
the processor executes a calculation process of calculating, based on the shape of the recognition target, the angles of a plurality of vertices constituting the shape of the recognition target, and
in the dimension reduction process, the processor generates the one or more components and the contribution rates based on the shape of the recognition target and the vertex angles of the recognition target calculated by the calculation process.

The behavior recognition device according to claim 4,
wherein each behavior classification model of the behavior classification model group has been learned for each component group, using the behavior of the learning target and a component group in ascending order from the first variable obtained from the shape of the learning target and the movement amount of the learning target,
the processor executes a calculation process of calculating the movement amount of the recognition target based on a plurality of shapes of the recognition target at different time points, and
in the dimension reduction process, the processor generates the one or more components and the contribution rates based on the shape of the recognition target and the movement amount of the recognition target calculated by the calculation process.

The behavior recognition device according to claim 4,
wherein the processor executes a first normalization process of normalizing the size of the shape of the recognition target, and
in the dimension reduction process, the processor generates the one or more components and the contribution rates based on the shape of the recognition target after the first normalization by the first normalization process.

The behavior recognition device according to claim 5,
wherein the processor executes a second normalization process of normalizing the range of values that the shape and the vertex angles of the recognition target can take, and
in the dimension reduction process, the processor generates the one or more components and the contribution rates based on the shape and the vertex angles of the recognition target after the second normalization by the second normalization process.

The behavior recognition device according to claim 4,
wherein, in the determination process, the processor determines the ordinal number indicating the dimension of the components in ascending order from the first variable that is required for the contribution rate to exceed a threshold.

The behavior recognition device according to claim 4,
wherein each behavior classification model of the behavior classification model group has been learned for each combination of a partially missing shape and a component group, using the behavior of the learning target and the component group in ascending order from the first variable obtained from the partially missing shape of the learning target,
the processor executes a judgment process of determining the partially missing shape of the recognition target,
in the dimension reduction process, the processor generates the one or more components and the contribution rate of each of the one or more components based on the partially missing shape of the recognition target determined by the judgment process, and
in the selection process, the processor selects, from the behavior classification model group, a specific behavior classification model learned with the combination of the same missing shape as the partially missing shape of the recognition target and the same component group as the specific component group.

The behavior recognition device according to claim 4,
wherein the processor executes an interpolation process of interpolating the shape of the recognition target when part of that shape is missing, and
in the dimension reduction process, the processor generates the one or more components and the contribution rates based on the shape of the recognition target after interpolation by the interpolation process.

A learning device comprising a processor that executes a program and a storage device that stores the program,
wherein the processor executes:
an acquisition process of acquiring teacher data including the shape and the behavior of a learning target;
a component analysis process of generating one or more components based on the shape of the learning target acquired by the acquisition process, by component analysis that generates statistical components by multivariate analysis;
a control process of controlling an ordinal number indicating the dimension of each of the one or more components based on an allowable calculation amount; and
a behavior learning process of learning the behavior of the learning target based on the behavior of the learning target and a component group containing the one or more components of the ordinal number indicating the dimension controlled by the control process, and generating a behavior classification model that classifies the behavior of the learning target.

The learning device according to claim 12,
wherein the processor executes a defect control process of removing part of the shape of the learning target,
in the component analysis process, the processor generates the one or more components based on the partially missing shape of the learning target obtained by the defect control process, and
in the behavior learning process, the processor learns the behavior of the learning target based on the component group and the behavior of the learning target, generates the behavior classification model, and associates the behavior classification model with defect information regarding the partially missing shape.

A learning device comprising a processor that executes a program and a storage device that stores the program,
wherein the processor executes:
an acquisition process of acquiring teacher data including the shape and the behavior of a learning target;
a dimension reduction process of generating one or more components based on the shape of the learning target acquired by the acquisition process, by dimension reduction that generates statistical components by multivariate analysis;
a control process of controlling, based on an allowable calculation amount, an ordinal number indicating the dimension of the components in ascending order from a first variable among the one or more components; and
a behavior learning process of learning the behavior of the learning target based on the behavior of the learning target and the component group from the first variable to the component of the ordinal number indicating the dimension controlled by the control process, and generating a behavior classification model that classifies the behavior of the learning target.

The learning device according to claim 14,
wherein the processor executes a calculation process of calculating, based on the shape of the learning target, the angles of a plurality of vertices constituting the shape of the learning target, and
in the dimension reduction process, the processor generates the one or more components based on the shape of the learning target and the vertex angles of the learning target calculated by the calculation process.

The learning device according to claim 14,
wherein the processor executes a calculation process of calculating the movement amount of the learning target based on a plurality of shapes of the learning target at different time points, and
in the dimension reduction process, the processor generates the one or more components based on the shape of the learning target and the movement amount of the learning target calculated by the calculation process.

The learning device according to claim 14,
wherein the processor executes a first normalization process of normalizing the size of the shape of the learning target, and
in the dimension reduction process, the processor generates the one or more components based on the shape of the learning target after the first normalization by the first normalization process.

The learning device according to claim 15,
wherein the processor executes a second normalization process of normalizing the range of values that the shape and the vertex angles of the learning target can take, and
in the dimension reduction process, the processor generates the one or more components based on the shape and the vertex angles of the learning target after the second normalization by the second normalization process.

The learning device according to claim 14,
wherein the processor executes a defect control process of removing part of the shape of the learning target,
in the dimension reduction process, the processor generates the one or more components based on the partially missing shape of the learning target obtained by the defect control process, and
in the behavior learning process, the processor learns the behavior of the learning target based on the behavior of the learning target and the component group from the first variable to the component of the ordinal number indicating the dimension, generates the behavior classification model, and associates the behavior classification model with defect information regarding the partially missing shape.

A behavior recognition method executed by a behavior recognition device comprising a processor that executes a program and a storage device that stores the program,
the behavior recognition device being able to access a behavior classification model group in which a model has been learned for each component group, using the behavior of a learning target and a component group obtained from the shape of the learning target by component analysis that generates statistical components by multivariate analysis,
the behavior recognition method comprising, by the processor:
a detection process of detecting the shape of a recognition target from analysis target data;
a component analysis process of generating, by the component analysis, one or more components and the contribution rate of each of the components based on the shape of the recognition target detected by the detection process;
a determination process of determining an ordinal number indicating the dimension of each of the one or more components based on the cumulative contribution rate obtained from the contribution rates;
a selection process of selecting, from the behavior classification model group, a specific behavior classification model learned with the same component group as a specific component group containing the one or more components of the ordinal number indicating the dimension determined by the determination process; and
a behavior recognition process of outputting a recognition result indicating the behavior of the recognition target by inputting the specific component group into the specific behavior classification model selected by the selection process.
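
For illustration only, the determination, selection, and behavior recognition processes recited above can be sketched in Python as follows. This is a sketch under stated assumptions, not the claimed implementation: the contribution rates are taken as already computed by the component analysis, the threshold value 0.8 is arbitrary, and the names `determine_dimension`, `recognize`, `make_stub_model`, and `model_group` are hypothetical, with stub functions standing in for learned behavior classification models.

```python
from typing import Callable, Dict, Sequence

# Hypothetical model type: takes the first k components and returns a behavior label.
Model = Callable[[Sequence[float]], str]


def determine_dimension(contribution_rates: Sequence[float], threshold: float) -> int:
    """Return the smallest k whose cumulative contribution rate exceeds the threshold."""
    cumulative = 0.0
    for k, rate in enumerate(contribution_rates, start=1):
        cumulative += rate
        if cumulative > threshold:
            return k
    return len(contribution_rates)  # threshold never exceeded: use all components


def recognize(components: Sequence[float],
              contribution_rates: Sequence[float],
              model_group: Dict[int, Model],
              threshold: float = 0.8) -> str:
    """Select the model learned with k components and feed it the first k components."""
    k = determine_dimension(contribution_rates, threshold)
    model = model_group[k]
    return model(components[:k])


def make_stub_model(k: int) -> Model:
    """Stand-in for a behavior classification model learned with k components."""
    def model(components: Sequence[float]) -> str:
        if len(components) != k:
            raise ValueError("component count does not match the model")
        return f"label from {k}-component model"
    return model


# One stub model per component count, keyed by the number of components.
model_group = {k: make_stub_model(k) for k in range(1, 5)}
components = [1.2, -0.4, 0.3, 0.05]
contribution_rates = [0.6, 0.25, 0.1, 0.05]
result = recognize(components, contribution_rates, model_group, threshold=0.8)
```

Keying the model group by the number of components keeps the selection process a simple lookup; a variant that also handles partially missing shapes could use a (missing shape, component count) pair as the key.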