JP2021191356A - Correction content learning device, operation correction device and program

Info

Publication number: JP2021191356A
Application number: JP2020098571A
Authority: JP
Inventors: 英明中辻; Hideaki Nakatsuji; 弘毅前野; Koki Maeno; 典生入子; Norio Iriko
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2020-06-05
Filing date: 2020-06-05
Publication date: 2021-12-16

Abstract

To provide a correction content learning device, an operation correction device, and a program that allow the points to improve in a user's motion to be grasped through correction content. SOLUTION: A correction content learning device 100 comprises: video data input means 10 that inputs video data; feature data acquisition means 20 that acquires a plurality of pieces of feature data from the video data; model person feature data storage means 30 that stores model person feature data; feature data comparison means 40 that compares the acquired feature data with the feature data of a model person; teacher data storage means 50 that stores teacher data indicating correction content for the comparison result; and learning means 60 that has a neural network configuration in which an input layer takes in the comparison result and an output layer has a number of neurons corresponding to the pieces of feature data. The learning means 60 changes coefficients between neurons of the input layer and an intermediate layer according to the comparison result and the teacher data. SELECTED DRAWING: Figure 6

Description

The present invention relates to a technique for efficiently correcting movements in exercise, sports, and the like.

In recent years, people exercise for various purposes, such as health and competition. To achieve these purposes, it is important to perform the movements correctly. Conventionally, people have done so by imitating another person's movements or by seeking guidance from an instructor. Recently, a learning system has also been proposed that uses a computer and the Internet to make it easy to improve proficiency in sports and the like (see Patent Document 1).

Patent Document 1: JP-A-2017-64095

However, the technique described in Patent Document 1 merely displays the captured images, so a user who wants to correct a motion must judge the differences alone and reflect them in that motion. A further problem is that the user cannot learn the correction content in real time.

Therefore, an object of the present invention is to provide a correction content learning device, a motion correction device, and a program that allow the points to improve in a user's motion to be grasped through correction content.

In order to solve the above problems, the present invention provides a correction content learning device comprising:
a video data input means for inputting video data of a human body in motion;
a feature data acquisition means for acquiring, from the input video data, feature data expressing features of the motion as user feature data;
a model person feature data storage means that stores the feature data of a model person as model person feature data;
a feature data comparison means for comparing the user feature data with the model person feature data;
a teacher data storage means that stores, as teacher data, the feature data indicating the content to be corrected with respect to the comparison result of the feature data comparison means; and
a learning means having a neural network configuration with an input layer, a plurality of intermediate layers, and an output layer, wherein the input layer includes means for taking in the comparison result of the feature data comparison means, the output layer includes a number of neurons corresponding to the comparison result, and the learning means changes the coefficients between neurons across the layers so as to change the values of the output-layer neurons corresponding to the teacher data.

In the correction content learning device of the present invention, the feature data may include a part of the human body and a correction amount corresponding to that part.

In the correction content learning device of the present invention, the correction amount may be an angle at the part.

In the correction content learning device of the present invention, the angle at the part may be the angle formed, in a predetermined state, by the portions connected at the part.

In the correction content learning device of the present invention, the angle at the part may be the angle formed by a connecting part, which is a part connected to the part, between its positions at different points in time.

The present invention also provides a motion correction device comprising:
a video data input means for inputting video data of a human body in motion;
a feature data acquisition means for acquiring, from the input video data, feature data expressing features of the motion as user feature data;
a model person feature data storage means that stores the feature data of a model person as model person feature data;
a feature data comparison means for comparing the user feature data with the model person feature data;
a correction amount calculation means having a neural network configuration with an input layer, a plurality of intermediate layers, and an output layer, wherein the input layer includes means for taking in the comparison result of the feature data comparison means, the output layer includes a number of neurons corresponding to the comparison result, and a correction amount is output at the output layer; and
a correction content transmission means for conveying correction content to the user in motion according to the output correction amount.

The present invention also provides a program for causing a computer to function as:
a feature data acquisition means for acquiring, from video data, feature data expressing features of a motion as user feature data;
a model person feature data storage means that stores the feature data of a model person as model person feature data;
a feature data comparison means for comparing the user feature data with the model person feature data;
a teacher data storage means that stores, as teacher data, the feature data indicating the content to be corrected with respect to the comparison result of the feature data comparison means; and
a learning means having a neural network configuration with an input layer, a plurality of intermediate layers, and an output layer, wherein the input layer includes means for taking in the comparison result of the feature data comparison means, the output layer includes a number of neurons corresponding to the comparison result, and the learning means changes the coefficients between neurons across the layers so as to change the values of the output-layer neurons corresponding to the teacher data.

The present invention also provides a program for causing a computer to function as:
a feature data acquisition means for acquiring, from video data, feature data expressing features of a motion as user feature data;
a model person feature data storage means that stores the feature data of a model person as model person feature data;
a feature data comparison means for comparing the user feature data with the model person feature data;
a correction amount calculation means having a neural network configuration with an input layer, a plurality of intermediate layers, and an output layer, wherein the input layer includes means for taking in the comparison result of the feature data comparison means, the output layer includes a number of neurons corresponding to the comparison result, and a correction amount is output at the output layer; and
a correction content transmission means for conveying correction content to the user in motion according to the output correction amount.

According to the present invention, it is possible to provide a correction content learning device, a motion correction device, and a program that allow the correction content for a user's motion to be grasped.

FIG. 1 is a diagram showing the difference in skeleton data due to a difference in physique.
FIG. 2 is a diagram showing the difference in skeleton data depending on each individual's motion state.
FIG. 3 is a diagram showing the angle formed by the swing of an arm.
FIG. 4 is a diagram showing the angle formed by the bending of a knee.
FIG. 5 is a diagram showing skeleton data of two persons who differ in physique.
FIG. 6 is a functional block diagram of the correction content learning device 100 according to an embodiment of the present invention.
FIG. 7 is a hardware configuration diagram of the main parts of the correction content learning device 100 and the motion correction device 200 according to an embodiment of the present invention.
FIG. 8 is a flowchart showing the processing operation of the correction content learning device 100.
FIG. 9 is a functional block diagram of the motion correction device 200 according to an embodiment of the present invention.
FIG. 10 is a flowchart showing the processing operation of the motion correction device 200.
FIG. 11 is a diagram showing an application example of the correction content learning device 100 and the motion correction device 200.
FIG. 12 is a diagram showing the relationship among user feature data, model person feature data, and teacher data.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the drawings.
In the present embodiment, feature data obtained from video data of a user's motion is compared with the feature data of a model person. The comparison result is given to a neural network together with correction data, which consists of a correction part and a correction amount determined by an instructor and serves as teacher data, and the neural network is updated so that it learns to produce the optimum correction content for the motion.

<1. Feature data required for motion correction>
Here, the feature data required for motion correction will be described. In the correction content learning device 100 according to the present embodiment and in the motion correction device 200 described later, the correction content is determined using video data obtained by filming the motion of a human body. The basic approach is to compare video data of the user to be corrected with video data of a model person. For example, skeleton data can be acquired from each set of video data and the two sets of skeleton data can be compared.

However, a method that simply compares skeleton data has the following two problems. The first problem is that a simple comparison of the skeleton data of the model person and the user includes a component caused by the difference in physique between them, so the skeleton data differ greatly because of physique alone. FIG. 1 is a diagram showing the difference in skeleton data due to a difference in physique. FIG. 1(a) shows skeleton data of a person with a small physique, and FIG. 1(b) shows skeleton data of a person with a large physique. In FIG. 1, the right arm and right leg are shown by solid lines, and the left arm and left leg are shown by broken lines. The same applies to FIGS. 2 to 5 described later. A difference in physique such as that shown in FIG. 1 is not something on which advice can be given, so it is necessary to acquire features that are unaffected by the difference in physique between the model person and the user.

The second problem is that the model person and the user move with different timing. FIG. 2 is a diagram showing the difference in skeleton data depending on each individual's motion state. FIGS. 2(a) to 2(c) show skeleton data of the model person, and FIGS. 2(d) to 2(f) show skeleton data of the user. In FIG. 2, each sequence shows skeleton data captured from the video as time advances from left to right.

As shown in FIG. 2, the motion state that is captured depends on the timing at which the skeleton data is acquired. To make a proper comparison, it is necessary to identify what motion state the model person and the user are in, for a motion such as running or walking, and to compare them in the same state. Aligning motion states in this way is very difficult and imposes a very heavy processing load, which makes real-time processing difficult.

To solve these two problems, the present embodiment acquires the characteristic portions of the motion from the video data as feature data and uses this feature data to specify the correction content. The feature data may take any form as long as it can express the features of the motion of the human body. In the present embodiment, when skeleton data is compared, a part that forms a joint between bones is specified as a predetermined part, and a feature amount produced by the motion at that part is used. The feature amount used is an angle formed in relation to the predetermined part. The angle formed in relation to a predetermined part will now be described with reference to FIGS. 3 and 4.

FIG. 3 is a diagram showing the angle formed by the swing of an arm. As shown in FIG. 3, the position of the upper arm differs depending on the timing within the motion. Focusing on one upper arm, for example the right upper arm (shown by a solid line in the figure), FIG. 3(a) shows its rearmost position and FIG. 3(c) shows its frontmost position. The difference between the frontmost and rearmost positions is the swing width. As shown in FIG. 3(d), the angle formed by the upper-arm bone in these two positions represents the swing width. Because this angle is formed about the shoulder, which is the base of the upper arm, the shoulder is the predetermined part of interest. In this case, the angle at the part (the shoulder) is the angle formed by the connecting part (the upper arm), which is a part connected to the shoulder, between its positions at different points in time.

FIG. 4 is a diagram showing the angle formed by the bending of a knee. As shown in FIG. 4, the bending angle of the knee differs depending on the timing within the motion. Focusing on one leg, for example the right leg, the model person has the minimum angle in FIG. 4(a) and the maximum angle in FIG. 4(c). The user, on the other hand, has the minimum angle in FIG. 4(f) and the maximum angle in FIG. 4(e). In this case, the knee is the correction part, and the bending angle of the knee is the feature amount. Here, the angle at the part (the knee) is the angle formed, in a predetermined state, by the portions (of the leg) connected at the knee.
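As a rough illustration of the knee-angle feature described above, the following Python sketch computes the angle at the knee between the thigh and the lower leg in each frame and then takes the minimum and maximum over the frames. The 2D coordinates, the key names `hip`, `knee`, and `ankle`, and the function names are hypothetical; the source does not specify a data layout or a formula.

```python
import math

def joint_angle(a, joint, b):
    """Angle in degrees at `joint` between segments joint->a and joint->b."""
    v1 = (a[0] - joint[0], a[1] - joint[1])
    v2 = (b[0] - joint[0], b[1] - joint[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def knee_angle_range(frames):
    """Return (min, max) knee angle over a sequence of per-frame skeletons."""
    angles = [joint_angle(f["hip"], f["knee"], f["ankle"]) for f in frames]
    return min(angles), max(angles)

# Hypothetical skeleton data for three frames (x, y in arbitrary units).
frames = [
    {"hip": (0.0, 2.0), "knee": (0.0, 1.0), "ankle": (0.0, 0.0)},  # straight leg
    {"hip": (0.0, 2.0), "knee": (0.0, 1.0), "ankle": (1.0, 1.0)},  # right angle
    {"hip": (0.0, 2.0), "knee": (0.0, 1.0), "ankle": (1.0, 0.0)},  # in between
]
lo, hi = knee_angle_range(frames)
```

Because only the extreme values over the frames are kept, the model person and the user do not need to be aligned frame by frame, which is the point made in the text.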

As described above, using the angle at a predetermined part formed by the motion as feature data removes the need to determine the motion state for each part and allows a relatively simple comparison.

Consider the case where the two persons being compared differ in physique. FIG. 5 is a diagram showing skeleton data of two persons who differ in physique. FIG. 5(a) shows skeleton data of a person with a relatively small physique, and FIG. 5(b) shows skeleton data of a person with a relatively large physique. When the two persons being compared differ in physique in this way, comparing the skeleton data directly imposes a large processing load. Even in such a case, comparing, for example, the bending angle of the leg about the knee allows a proper comparison regardless of the difference in physique.
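The following Python sketch illustrates why an angle is a convenient feature here: scaling a skeleton uniformly changes its coordinates but not the angle at a joint. Uniform scaling is only a simplified stand-in for a real difference in physique, and the coordinates, key names, and the factor 1.3 are hypothetical.

```python
import math

def joint_angle(a, joint, b):
    """Angle in degrees at `joint` between segments joint->a and joint->b."""
    v1 = (a[0] - joint[0], a[1] - joint[1])
    v2 = (b[0] - joint[0], b[1] - joint[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def scale(skeleton, factor):
    """Scale every joint position by the same factor (a larger or smaller body)."""
    return {name: (x * factor, y * factor) for name, (x, y) in skeleton.items()}

# Hypothetical skeleton of a smaller person and a uniformly larger copy.
small = {"hip": (0.0, 2.0), "knee": (0.5, 1.0), "ankle": (0.0, 0.0)}
large = scale(small, 1.3)

angle_small = joint_angle(small["hip"], small["knee"], small["ankle"])
angle_large = joint_angle(large["hip"], large["knee"], large["ankle"])

# The joint positions themselves differ, so comparing coordinates directly
# would report a difference that no advice could remove.
hip_gap = math.dist(small["hip"], large["hip"])
```

Here `angle_small` and `angle_large` agree up to floating-point rounding, while `hip_gap` is clearly nonzero.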

<2. Correction content learning device>
First, a correction content learning device using a neural network will be described.
Machine learning by the correction content learning device 100 of FIG. 6 using learning data will be described. When machine learning is performed by the correction content learning device 100, the correction part and correction amount considered most appropriate for each part of the human body, based on a comparison of the feature data of the user's motion with the feature data of a model person for that exercise, are prepared as teacher data. The correction part can be, for example, a shoulder, knee, elbow, ankle, or the spine. The correction amount can be, for example, an angle relative to another part or an angle relative to a predetermined reference such as a horizontal or vertical plane.

FIG. 6 is a functional block diagram of the correction content learning device 100 according to an embodiment of the present invention. In FIG. 6, 10 is a video data input means, 20 is a feature data acquisition means, 30 is a model person feature data storage means, 40 is a feature data comparison means, 50 is a teacher data storage means, and 60 is a learning means. The video data input means 10 is a means for inputting video data of a human body in motion. Specifically, it can be realized by a camera capable of shooting video. The feature data acquisition means 20 is a means for acquiring, from the video data input by the video data input means 10, feature data expressing the features of the motion of the human body. The feature data may be of any kind as long as it expresses the features of the motion of the human body. In the present embodiment, it includes a predetermined part of the human body and a feature amount related to that part, and the feature amount used is an angle obtained in relation to the predetermined part.

The model person feature data storage means 30 is a storage means that stores model person feature data, that is, the feature data of a model person, which is compared with the user feature data, that is, the feature data acquired from the body of the user in motion. The model person feature data is obtained in the same way, by filming the body of the model person in motion using means similar to the video data input means 10 and the feature data acquisition means 20, as feature data expressing the features of the model person's motion. The model person feature data storage means 30 therefore stores model person feature data acquired separately by filming the model person. The feature data comparison means 40 is a means for comparing the feature data acquired by the feature data acquisition means 20 with the feature data stored in the model person feature data storage means 30.
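The source does not state what operation the feature data comparison means 40 performs. As one possible reading, the following Python sketch takes the per-part difference between the user's angle and the model person's angle. The function name `compare_features`, the dictionary layout, and all angle values are assumptions made for illustration.

```python
def compare_features(user, model):
    """Return per-part differences (user minus model person), in degrees.

    Assumption: the comparison result is a simple signed difference per part.
    """
    return {part: user[part] - model[part] for part in model}

# Hypothetical feature data: one angle (degrees) per predetermined part.
user_features = {
    "shoulder": 70.0, "knee": 95.0, "elbow": 80.0, "ankle": 30.0, "spine": 12.0,
}
model_features = {
    "shoulder": 90.0, "knee": 110.0, "elbow": 85.0, "ankle": 35.0, "spine": 5.0,
}

comparison = compare_features(user_features, model_features)
```

With these values, a negative entry means the user's angle is smaller than the model person's at that part, and a positive entry means it is larger.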

The teacher data storage means 50 is a storage means that stores, as teacher data, correction data indicating the content to be corrected for the motion shown in the video data. As the content to be corrected, the correction data includes a correction part, which is the part of the human body to be corrected, and a correction amount, which is the amount by which that part should be corrected. The correction amount can also be described as the feature amount to be corrected. The correction data is therefore feature data expressing the correction content. In the present embodiment, the content of the correction data is determined by, for example, an instructor of the motion who views the video data input by the video data input means 10. The instructor and the model person may be the same person. The determined correction data is then set and stored in the teacher data storage means 50.

The learning means 60 is a means that uses the comparison result from the feature data comparison means 40 to identify candidates for the correction part and correction amount, and that performs learning using the comparison result from the feature data comparison means 40 and the correction data stored in the teacher data storage means 50. The comparison result from the feature data comparison means 40 and the correction data stored in the teacher data storage means 50 form the learning set for the learning means 60. The learning means 60 has a neural network configuration with an input layer, a plurality of intermediate layers, and an output layer. The input layer includes means for taking in each piece of feature data, and the output layer includes a number of neurons corresponding to the pieces of feature data. The learning means 60 changes the coefficients between neurons across the layers so that the values of the output-layer neurons approach the correction data, which is the teacher data.
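A minimal Python sketch of such a learning step is given below, assuming plain gradient descent with backpropagation on a squared error. The source only says that the coefficients are changed so that the outputs approach the teacher data, so the update rule, the tanh activation, the layer sizes, and all numeric values are assumptions.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Weights (n_out x n_in) with small random values and zero biases."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def forward(layers, x):
    """Return the activations of every layer, input first, output last."""
    acts = [x]
    for i, (w, b) in enumerate(layers):
        z = [sum(wi * ai for wi, ai in zip(row, acts[-1])) + bi
             for row, bi in zip(w, b)]
        is_output = i == len(layers) - 1
        acts.append(z if is_output else [math.tanh(v) for v in z])
    return acts

def train_step(layers, x, target, lr=0.05):
    """One gradient-descent update; returns the squared error before the update."""
    acts = forward(layers, x)
    # The output layer is linear, so its delta is simply (output - target).
    delta = [o - t for o, t in zip(acts[-1], target)]
    loss = 0.5 * sum(d * d for d in delta)
    for i in reversed(range(len(layers))):
        w, b = layers[i]
        prev = acts[i]
        if i > 0:
            # Delta for the layer below, computed before these weights change.
            back = [sum(w[k][j] * delta[k] for k in range(len(delta)))
                    * (1.0 - prev[j] ** 2) for j in range(len(prev))]
        for k, dk in enumerate(delta):
            for j, pj in enumerate(prev):
                w[k][j] -= lr * dk * pj
            b[k] -= lr * dk
        if i > 0:
            delta = back
    return loss

rng = random.Random(0)
sizes = [5, 8, 8, 5]  # input, two intermediate layers, output (hypothetical sizes)
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

# Hypothetical learning set: a comparison result and the instructor's correction
# amounts for the five parts, with angles divided by 100 to keep values small.
comparison = [-0.20, -0.15, -0.05, -0.05, 0.07]
teacher = [0.20, 0.10, 0.0, 0.0, -0.05]

losses = [train_step(layers, comparison, teacher) for _ in range(200)]
```

The list `losses` records the error before each update, so comparing its first and last entries shows whether the output-layer values have moved toward the teacher data. A practical system would train on many such pairs rather than a single one.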

FIG. 7 is a hardware configuration diagram of the main part of the correction content learning device according to an embodiment of the present invention. The correction content learning device 100 according to the present embodiment can be realized by a camera constituting the video data input means 10 and a general-purpose computer. As shown in FIG. 7, the portion other than the video data input means 10 includes a CPU (Central Processing Unit) 1, a RAM (Random Access Memory) 2 serving as the main memory of the computer, a large-capacity storage device 3 such as a hard disk or flash memory for storing the programs executed by the CPU 1 and data, an instruction input I/F (interface) 4 such as a keyboard or mouse, a data input/output I/F (interface) 5 for data communication with external devices such as data storage media, and a display unit 6 that is a display device such as a liquid crystal display, all connected to one another via a bus.

The feature data acquisition means 20, the feature data comparison means 40, and the learning means 60 are realized by the CPU 1 executing a program stored in the storage device 3. That is, the computer carries out the function of each means in accordance with a dedicated program. In this specification, a computer means a device that has an arithmetic processing unit such as a CPU and is capable of data processing, and includes not only general-purpose computers such as personal computers but also portable terminals such as tablets and smartphones. The video data input means 10 is realized by connecting the camera that constitutes it to the data input/output I/F (interface) 5. The model person feature data storage means 30 and the teacher data storage means 50 are realized by the storage device 3.

Next, the processing operation of the correction content learning device 100 shown in FIG. 6 will be described. FIG. 8 is a flowchart showing the processing operation of the correction content learning device 100. First, video data is input from the video data input means 10 (step S10). Specifically, the user's motion is filmed with a camera capable of shooting video to obtain video data. The video data consists of a plurality of frames (still images) captured at different times. For example, video data with 30 frames per second can be obtained.

Next, the feature data acquisition means 20 acquires feature data from the captured video data (step S20). In the present embodiment, the feature data is acquired in two stages. First, skeleton data is acquired from the video data. This can be done with a known motion capture technique, and the result can be obtained as skeleton data such as a wire model, wireframe, or skeleton model. Skeleton data is created for each frame (still image). Therefore, for video data with 30 frames per second, 30 sets of skeleton data are obtained per second.

Next, feature data is acquired from the skeleton data. In the present embodiment, a part and the angle produced by the motion related to that part are used as feature data. Each part can be obtained by designating a predetermined position in the skeleton data. Alternatively, a portion that serves as an axis of the motion can be identified by analysis and used as the predetermined part. In the present embodiment, the shoulders, knees, elbows, ankles, and spine are extracted as the predetermined parts, with the shoulders, knees, elbows, and ankles extracted separately for the left and right sides. The feature data acquisition means 20 therefore acquires a plurality of pieces of feature data according to the number of parts.

そして、特定した所定の部位に関連した動作により生じる角度を求める。図３は、所定の部位として肩を特定した場合における、角度の算出の例を示している。図３（ａ）〜（ｃ）は、それぞれ動画データの各フレームから取得した骨格データである。すなわち図３（ａ）〜（ｃ）は、それぞれ異なる瞬間におけるユーザの姿勢を示している。そして、実線で示した右上腕に着目すると、図３（ａ）で最も後方（図面左側）に位置し、図３（ｃ）で最も前方（図面右側）に位置する。骨格データにおいては、各部位の骨格が表現されたものであるため、右上腕は線分として表現される。 Then, the angle produced by the movement associated with the specified part is obtained. FIG. 3 shows an example of calculating the angle when the shoulder is specified as the predetermined part. FIGS. 3(a) to 3(c) each show skeleton data acquired from a frame of the moving image data; that is, they show the user's posture at different moments. Focusing on the right upper arm drawn with a solid line, it is at its rearmost position (left side of the drawing) in FIG. 3(a) and at its frontmost position (right side of the drawing) in FIG. 3(c). Because skeleton data represents the skeleton of each part, the right upper arm is represented as a line segment.

特徴データ取得手段２０は、肩を中心として右上腕が最も後方に位置するフレームの骨格データと、最も前方に位置するフレームの骨格データを抽出し、最も後方に位置する右上腕と最も前方に位置する右上腕がなす角度を算出する。具体的には、図３（ｄ）に示すように、肩を中心として、最も後方に位置する右上腕と最も前方に位置する右上腕がなす角度を算出する。この角度は、所定の部位である右肩を中心とした右上腕の振り幅を表現している。 The feature data acquisition means 20 extracts the skeleton data of the frame in which the right upper arm is rearmost about the shoulder and the skeleton data of the frame in which it is frontmost, and calculates the angle formed by the right upper arm in these two positions. Specifically, as shown in FIG. 3(d), the angle between the right upper arm at its rearmost position and the right upper arm at its frontmost position is calculated with the shoulder as the center. This angle represents the swing width of the right upper arm about the right shoulder, which is the predetermined part.
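
The swing-width calculation described above can be sketched as follows. This is a minimal illustration, not the implementation of the feature data acquisition means 20: it assumes 2D keypoints in image coordinates (x increasing toward the user's front, y increasing downward), and the function names `arm_angle_deg` and `swing_angle_deg` are hypothetical.

```python
import math


def arm_angle_deg(shoulder, elbow):
    """Signed angle of the upper arm measured from straight down.

    Assumption: x grows toward the user's front and y grows downward,
    so positive angles mean the arm is swung forward.
    """
    dx = elbow[0] - shoulder[0]
    dy = elbow[1] - shoulder[1]
    return math.degrees(math.atan2(dx, dy))


def swing_angle_deg(frames):
    """Angle between the rearmost and frontmost upper-arm positions.

    frames: one (shoulder_xy, elbow_xy) pair per video frame.
    """
    angles = [arm_angle_deg(shoulder, elbow) for shoulder, elbow in frames]
    # Rearmost frame has the smallest angle, frontmost the largest.
    return max(angles) - min(angles)
```

Taking the difference between the largest and smallest per-frame angle is equivalent to picking the rearmost and frontmost frames first and then measuring the angle between the two line segments, as long as the arm stays within half a turn of the downward direction.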

このようにして、所定の部位である右肩とその角度が特徴データとして取得される。同様にして、膝・肘・足首・背骨についても角度が算出され、特徴データとして取得される。肩・膝・肘・足首については左右それぞれについて算出することもできる。その動作を行う競技等によっても異なるが、左右対称であることが通常である。そのため、本実施形態では、肩・膝・肘・足首については、どちらか一方を代表させることとし、肩・膝・肘・足首・背骨についての５個の特徴データを得ることにしている。ただし、肩・膝・肘・足首については、左右で異なる場合があるため、それぞれ左右の角度を取得することもできる。そのため、右肩・左肩・右膝・左膝・右肘・左肘・右足首・左足首・背骨の９個の特徴データを得るようにしてもよい。以上のようにして、ステップＳ２０においては、ユーザの動作を撮影した動画から、ユーザの動作の特徴を表現したユーザ特徴データが取得される。 In this way, the right shoulder, which is a predetermined part, and its angle are acquired as feature data. Similarly, the angles of the knees, elbows, ankles, and spine are calculated and acquired as feature data. The shoulders, knees, elbows, and ankles can be calculated for each of the left and right sides. Although it depends on the competition in which the movement is performed, it is usually symmetrical. Therefore, in the present embodiment, one of the shoulder, knee, elbow, and ankle is represented, and five characteristic data of the shoulder, knee, elbow, ankle, and spine are obtained. However, since the shoulders, knees, elbows, and ankles may differ on the left and right, it is also possible to obtain the left and right angles for each. Therefore, nine characteristic data of right shoulder, left shoulder, right knee, left knee, right elbow, left elbow, right ankle, left ankle, and spine may be obtained. As described above, in step S20, the user characteristic data expressing the characteristics of the user's movement is acquired from the moving image of the user's movement.

次に特徴データ比較手段４０は、模範者特徴データとユーザ特徴データとの比較を行う(ステップＳ３０)。ここで、ユーザ特徴データ、模範者特徴データ、教師データの関係について説明しておく。図１２は、ユーザ特徴データ、模範者特徴データ、教師データの関係を示す図である。本実施形態では、比較する部位を、肩、膝、肘、足首、背骨の５個とする。また、本実施形態では、各部位の角度を比較している。A＊は、あるAユーザの各部位の角度、M*は模範者の各部位の角度である。B＊は、あるBユーザの同じく各部位の角度である。AC＊は、Aユーザと模範者との各部位の角度の比較結果である。比較処理は、通常差分処理を行う。比較結果は、各部位と角度の組み合わせとして得られる。すなわち、ユーザ特徴データ、模範者特徴データの差分である比較結果も各部位と角度の組み合わせである特徴データとして得られる。したがって、ステップＳ３０においては、比較結果として、各部位に対応する特徴データが得られることになる。 Next, the feature data comparison means 40 compares the model feature data with the user feature data (step S30). Here, the relationship between the user characteristic data, the model characteristic data, and the teacher data will be described. FIG. 12 is a diagram showing the relationship between the user characteristic data, the model person characteristic data, and the teacher data. In this embodiment, there are five parts to be compared: shoulder, knee, elbow, ankle, and spine. Moreover, in this embodiment, the angles of each part are compared. A * is the angle of each part of a certain A user, and M * is the angle of each part of the model. B * is the angle of each part of a certain B user as well. AC * is the result of comparing the angles of each part between the A user and the model. The comparison process is usually a difference process. The comparison result is obtained as a combination of each part and angle. That is, the comparison result, which is the difference between the user feature data and the model feature data, can also be obtained as the feature data which is a combination of each part and the angle. Therefore, in step S30, feature data corresponding to each site can be obtained as a comparison result.
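
As a rough illustration of the difference processing in step S30, the comparison result can be held as one angle difference per part. The sign convention (user minus model), the dictionary layout, and the name `compare_features` are assumptions made for this sketch; the source only states that a difference is taken.

```python
PARTS = ("shoulder", "knee", "elbow", "ankle", "spine")


def compare_features(user, model):
    """Return the per-part angle difference (user minus model, in degrees).

    user, model: dicts mapping each part name to its angle.
    The result has the same part/angle format as the inputs.
    """
    return {part: user[part] - model[part] for part in PARTS}
```

Because the result keeps the part/angle format, it can be passed on unchanged as the feature data for the next step.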

一方、この比較結果を見て、指導者（トレーナー）は、矯正すべき部位を１つ決定する。また、その部位の差分角度を見て、矯正すべき角度を決定する。例えば、図１２に示す例では、Aユーザに対しては矯正すべき部位を膝であると判断し、矯正すべき角度は比較結果にα度だけ加算している。また、Bユーザに対しては矯正すべき部位を足首であると判断し、矯正すべき角度は比較結果にβ度だけ加算している。このようにして、指導者が決定した矯正すべき部位と矯正すべき角度を矯正データとする。矯正データも部位と角度からなり、特徴データと同じ形式をしている。この指導者が決定した矯正データは教師データとして教師データ記憶手段５０に記憶される。 On the other hand, looking at this comparison result, the instructor (trainer) decides one part to be corrected. In addition, the angle to be corrected is determined by looking at the difference angle of the part. For example, in the example shown in FIG. 12, it is determined that the part to be corrected is the knee for the A user, and the angle to be corrected is added by α degree to the comparison result. In addition, for B user, it is judged that the part to be corrected is the ankle, and the angle to be corrected is added by β degree to the comparison result. In this way, the part to be corrected and the angle to be corrected determined by the instructor are used as correction data. The correction data also consists of the part and angle, and has the same format as the feature data. The correction data determined by the instructor is stored in the teacher data storage means 50 as teacher data.
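
The construction of one teacher data record can be sketched as below: the instructor chooses a part and an adjustment (α for user A, β for user B), and the correction angle is the comparison result for that part plus the adjustment. The `Correction` type and `make_teacher_data` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """Correction data: a part and the angle to correct, like feature data."""
    part: str
    angle: float


def make_teacher_data(comparison, part, adjustment):
    """Build teacher data from the instructor's decision.

    comparison: dict of per-part angle differences (the comparison result).
    part: the single part the instructor chose to correct.
    adjustment: the angle the instructor adds to the comparison result.
    """
    if part not in comparison:
        raise KeyError(f"unknown part: {part}")
    return Correction(part, comparison[part] + adjustment)
```

For user A in FIG. 12 this corresponds to `make_teacher_data(ac, "knee", alpha)`, giving the angle ACn + α.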

次に、学習手段６０が、矯正内容の決定を行う（ステップＳ４０）。上述のように、学習手段６０は、ディープニューラルネットワーク構造を備えており、入力層は、部位数（特徴データ数）と同数（本実施形態では５個）が用意されている。そして、教師データ記憶手段５０から教師データが入力され、特徴データ比較手段４０の比較結果である特徴データも入力される。また、出力層も、部位数（特徴データ数）と同数が用意されており、各部位についての矯正量である角度がそれぞれ出力される。そして、矯正量が最大となる部位を矯正部位とし、その矯正量とともに出力する。 Next, the learning means 60 determines the correction content (step S40). As described above, the learning means 60 has a deep neural network structure, and the input layer has as many neurons as there are parts (pieces of feature data), five in this embodiment. The teacher data is input from the teacher data storage means 50, and the feature data that is the comparison result of the feature data comparison means 40 is also input. The output layer likewise has as many neurons as there are parts, and each outputs an angle that is the correction amount for its part. The part with the largest correction amount is taken as the correction part and is output together with its correction amount.

例えば、出力層において５個のニューロンが用意されており、それぞれ肩、膝、肘、足首、背骨に対応しており、それぞれ矯正すべき角度が出力される。そして、学習手段６０は、出力層から出力された角度のうち最も大きい値を出力したニューロンに対応する部位を矯正部位候補とする。そして、学習手段６０では、矯正部位候補とその角度（矯正量）を、教師データと比較する。矯正部位候補と教師データの部位が同じである場合（例えば両方とも肩である場合）、矯正部位候補に対応するニューロン（例えば肩に対応するニューロン）の角度の値が、教師データの矯正量である角度の値に近付くように、層間におけるニューロン間の係数を変化させる。 For example, five neurons are provided in the output layer, corresponding to the shoulder, knee, elbow, ankle, and spine, and each outputs the angle to be corrected. The learning means 60 takes the part corresponding to the neuron that output the largest angle as the correction part candidate, and compares the candidate and its angle (correction amount) with the teacher data. When the candidate and the part in the teacher data are the same (for example, both are the shoulder), the coefficients between neurons in the layers are changed so that the angle value of the neuron corresponding to the candidate (for example, the shoulder neuron) approaches the angle that is the correction amount in the teacher data.

一方、教師データとの比較の結果、矯正部位候補と教師データの部位が異なる場合（例えば矯正部位候補が肩であり、教師データが膝である場合）、矯正部位候補に対応するニューロン（例えば肩に対応するニューロン）の角度の値が小さくなり、教師データが示す部位に対応するニューロン（例えば膝に対応するニューロン）の角度の値が大きくなり、教師データが示す部位に対応するニューロン（例えば膝に対応するニューロン）の角度の値が、教師データの矯正量である角度の値に近付くように、層間におけるニューロン間の係数を変化させる。このようにして、次回以降は、教師データに近い矯正部位候補とその角度を出力することができるようになる。 On the other hand, if the comparison with the teacher data shows that the correction part candidate and the part in the teacher data differ (for example, the candidate is the shoulder and the teacher data indicates the knee), the coefficients between neurons in the layers are changed so that the angle value of the neuron corresponding to the candidate (for example, the shoulder neuron) becomes smaller, the angle value of the neuron corresponding to the part indicated by the teacher data (for example, the knee neuron) becomes larger, and the latter value approaches the angle that is the correction amount in the teacher data. In this way, from the next time onward, a correction part candidate and angle close to the teacher data can be output.

学習手段６０へは、図１２におけるACs、ACn、ACe、ACa、ACbの比較結果に対して矯正部位を膝、そして、その矯正量ACn+αを教師データとして学習させる。具体的には、このデータセットに基づいて、入力層と中間層のニューロン間の係数を変化させることにより、学習手段６０の出力結果を教師データに近づけ、より的確な矯正部位・矯正量を算出するようにする。Bユーザに対しても同様で、BCs、BCn、BCe、BCa、BCbに対して、指導者が、矯正すべき部位を足首と判断する場合はBCs、BCn、BCe、BCa、BCbの比較結果と、矯正部位を足首、矯正量BCa+βの教師データを１つのデータセットとして学習させる。さらに、学習手段６０は、このような学習を繰り返していくことにより、ニューラルネットワークが学習済みモデルとなった、矯正量算出手段６０ａ（図９参照）が得られる。本実施形態に係る矯正内容学習装置１００は、入力された動画から得られたユーザ特徴データと模範者特徴データの比較結果と、教師データとを学習用セットとしてニューラルネットワークに与え、ニューラルネットワークの層間におけるニューロン間の係数を変化させるようにしたので、矯正内容の学習を効率的に行うことが可能となる。 The learning means 60 is trained on the comparison results ACs, ACn, ACe, ACa, and ACb in FIG. 12, with the teacher data consisting of the knee as the correction part and ACn + α as the correction amount. Specifically, based on this data set, the coefficients between the neurons of the input layer and the intermediate layer are changed so that the output of the learning means 60 approaches the teacher data and a more accurate correction part and correction amount are calculated. The same applies to user B: when the instructor judges from BCs, BCn, BCe, BCa, and BCb that the part to be corrected is the ankle, the comparison results BCs, BCn, BCe, BCa, and BCb and the teacher data (correction part: ankle; correction amount: BCa + β) are learned as one data set. By repeating such learning, the neural network of the learning means 60 becomes a trained model, which yields the correction amount calculation means 60a (see FIG. 9). The correction content learning device 100 according to the present embodiment gives the neural network a learning set consisting of the teacher data and the comparison result between the user feature data obtained from the input moving image and the model feature data, and changes the coefficients between neurons in the layers of the network, so the correction content can be learned efficiently.
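
The update behaviour described above can be sketched as one gradient step on a squared error. This is a deliberately reduced illustration, not the learning means 60 itself: a single linear layer stands in for the deep network with intermediate layers, the target for every part other than the teacher's part is assumed to be zero, and `forward`, `train_step`, and the learning rate are hypothetical.

```python
PARTS = ("shoulder", "knee", "elbow", "ankle", "spine")


def forward(weights, bias, x):
    """One output per part: a weighted sum of the comparison results."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def train_step(weights, bias, x, teacher_part, teacher_angle, lr=0.01):
    """Move the outputs toward the teacher data by one gradient step.

    Assumption: the target is the teacher angle for the teacher's part
    and 0 for every other part, so a wrongly chosen candidate shrinks
    while the teacher's part grows toward the teacher angle.
    """
    target = [teacher_angle if p == teacher_part else 0.0 for p in PARTS]
    outputs = forward(weights, bias, x)
    for i, (y, t) in enumerate(zip(outputs, target)):
        err = y - t  # gradient of 0.5 * (y - t)**2 with respect to y
        for j, xj in enumerate(x):
            weights[i][j] -= lr * err * xj
        bias[i] -= lr * err
    return outputs
```

Repeating `train_step` over many (comparison result, teacher data) pairs corresponds to the repeated learning that turns the network into a trained model; a real implementation would backpropagate through the intermediate layers as well.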

＜３．動作矯正装置＞ <3. Motion correction device>
次に、学習済みモデルを備えたニューラルネットワークを有する動作矯正装置の処理動作について説明する。図９は、本実施形態に係る動作矯正装置２００の機能ブロック図である。図９において、１０は動画データ入力手段、２０は特徴データ取得手段、３０は模範者特徴データ記憶手段、４０は特徴データ比較手段、６０ａは矯正量算出手段、７０は矯正内容伝達手段である。図６に示した矯正内容学習装置１００と同様のものについては、同一符号を付して詳細な説明を省略する。 Next, the processing operation of the motion correction device, which has a neural network provided with the trained model, will be described. FIG. 9 is a functional block diagram of the motion correction device 200 according to the present embodiment. In FIG. 9, 10 is a moving image data input means, 20 is a feature data acquisition means, 30 is a model feature data storage means, 40 is a feature data comparison means, 60a is a correction amount calculation means, and 70 is a correction content transmission means. Components identical to those of the correction content learning device 100 shown in FIG. 6 are given the same reference numerals, and their detailed description is omitted.

矯正量算出手段６０ａは、矯正内容学習装置１００において学習手段６０が機械学習を行うことにより得られた学習済みモデルを備えている。すなわち、矯正量算出手段６０ａは、特徴データ比較手段４０の比較結果と、教師データ記憶手段５０の教師データとを学習用セットとして、ニューロン間の係数を変化させることにより、学習を行ったニューラルネットワークを備えている。これにより、矯正量算出手段６０ａは、特徴データ比較手段４０の比較結果を用いて、入力された各部位について、対応する矯正量を算出する。 The correction amount calculation means 60a includes a learned model obtained by the learning means 60 performing machine learning in the correction content learning device 100. That is, the correction amount calculation means 60a is a neural network in which learning is performed by changing the coefficient between neurons using the comparison result of the feature data comparison means 40 and the teacher data of the teacher data storage means 50 as a learning set. It is equipped with. As a result, the correction amount calculation means 60a calculates the corresponding correction amount for each input portion by using the comparison result of the feature data comparison means 40.

矯正内容伝達手段７０は、矯正量算出手段６０ａより得られた矯正内容をユーザに伝達する手段である。矯正内容伝達手段７０としては、情報を何らかの態様で出力して、伝達（通知）することが可能な機器を用いることができる。例えば、矯正内容を音で伝達するスピーカや、矯正内容を画像・映像で伝達するディスプレイや、矯正内容を画像や文字で印刷する印刷機（プリンタ）や、矯正内容を何らかの力の作用に変換するアクチュエータ等で実現することができる。 The correction content transmitting means 70 is a means for transmitting the correction content obtained from the correction amount calculation means 60a to the user. As the correction content transmitting means 70, a device capable of outputting (notifying) information in some form can be used. For example, a speaker that transmits the correction content by sound, a display that transmits the correction content by image / video, a printing machine (printer) that prints the correction content by image or text, and the conversion of the correction content into the action of some force. It can be realized by an actuator or the like.

本実施形態に係る動作矯正装置２００は、動画データ入力手段１０を構成するカメラと、矯正内容伝達手段７０を構成するスピーカ、ディスプレイ、印刷機、アクチュエータ等と、汎用のコンピュータで実現することができる。動作矯正装置２００の主要部のハードウェア構成は、矯正内容学習装置１００と同様、図７に示したような構成となっている。 The motion correction device 200 according to the present embodiment can be realized with a camera constituting the moving image data input means 10; a speaker, display, printing machine, actuator, or the like constituting the correction content transmission means 70; and a general-purpose computer. The hardware configuration of the main part of the motion correction device 200 is as shown in FIG. 7, the same as that of the correction content learning device 100.

特徴データ取得手段２０、模範者特徴データ記憶手段３０、特徴データ比較手段４０、矯正量算出手段６０ａはＣＰＵ１が、記憶装置３に記憶されているプログラムを実行することにより実現される。すなわち、コンピュータが、専用のプログラムに従って各手段の内容を実行することになる。動作矯正装置２００に用いられるコンピュータとしても、ＣＰＵ等の演算処理部を有し、データ処理が可能な装置を意味し、パーソナルコンピュータなどの汎用コンピュータだけでなく、タブレット端末やスマートフォン等の携帯型端末も用いることができる。動画データ入力手段１０は、動画データ入力手段１０を構成するカメラをデータ入出力Ｉ／Ｆ（インターフェース）５に接続することにより実現される。教師データ記憶手段５０は、記憶装置３により実現される。矯正内容伝達手段７０は、スピーカ、ディスプレイ、印刷機、アクチュエータ等をデータ入出力Ｉ／Ｆ（インターフェース）５に接続することにより実現される。 The feature data acquisition means 20, the model feature data storage means 30, the feature data comparison means 40, and the correction amount calculation means 60a are realized by the CPU 1 executing a program stored in the storage device 3. That is, the computer executes the contents of each means according to a dedicated program. The computer used in the motion correction device 200 also means a device that has an arithmetic processing unit such as a CPU and is capable of data processing, and is not only a general-purpose computer such as a personal computer but also a portable terminal such as a tablet terminal or a smartphone. Can also be used. The moving image data input means 10 is realized by connecting the cameras constituting the moving image data input means 10 to the data input / output I / F (interface) 5. The teacher data storage means 50 is realized by the storage device 3. The correction content transmission means 70 is realized by connecting a speaker, a display, a printing machine, an actuator, or the like to a data input / output I / F (interface) 5.

次に、図９に示した動作矯正装置２００の処理動作について説明する。図１０は、動作矯正装置２００の処理動作を示すフローチャートである。まず、動画データ入力手段１０から動画データを入力する（ステップＳ１１）。具体的には、矯正内容学習装置１００におけるステップＳ１０と同様、動画撮影可能なカメラによりユーザの動作を撮影し、動画データを得る。 Next, the processing operation of the motion correction device 200 shown in FIG. 9 will be described. FIG. 10 is a flowchart showing the processing operation of the motion correction device 200. First, the moving image data is input from the moving image data input means 10 (step S11). Specifically, as in step S10 in the correction content learning device 100, the user's motion is photographed by a camera capable of photographing a moving image, and moving image data is obtained.

次に、特徴データ取得手段２０が、撮影された動画データから特徴データを取得する（ステップＳ２１）。動作矯正装置２００においても、矯正内容学習装置１００におけるステップＳ２０と同様、特徴データの取得は２段階で行う。したがって、まず、動画データから骨格データを取得する。 Next, the feature data acquisition means 20 acquires feature data from the captured moving image data (step S21). Similarly to step S20 in the correction content learning device 100, the motion correction device 200 also acquires the feature data in two steps. Therefore, first, the skeleton data is acquired from the moving image data.

続いて、骨格データから特徴データを取得する。すなわち、矯正内容学習装置１００におけるステップＳ２０と同様、特徴データとして、本実施形態では、部位と、その部位に関連した動作により生じる角度を取得する。肩・膝・肘・足首については、左右で異なる場合があるため、それぞれ左右の角度を取得することもできる。ただし、動作矯正装置２００においても、所定の部位として、肩・膝・肘・足首については、左右どちらか一方を代表させることとし、肩・膝・肘・足首・背骨の５箇所を抽出している。なお、右肩・左肩・右膝・左膝・右肘・左肘・右足首・左足首・背骨の９箇所を抽出するようにしてもよい。 Then, the feature data is acquired from the skeleton data. That is, as in step S20 in the correction content learning device 100, as feature data, in the present embodiment, a portion and an angle generated by an operation related to the portion are acquired. Since the shoulders, knees, elbows, and ankles may differ on the left and right, it is also possible to obtain the left and right angles for each. However, also in the motion correction device 200, the shoulder, knee, elbow, and ankle are represented as either the left or right as predetermined parts, and five parts of the shoulder, knee, elbow, ankle, and spine are extracted. There is. In addition, 9 places of right shoulder, left shoulder, right knee, left knee, right elbow, left elbow, right ankle, left ankle, and spine may be extracted.

次に、具体的には、矯正内容学習装置１００におけるステップＳ３０と同様にして、特徴データ比較手段４０は、模範者特徴データとユーザ特徴データとの比較を行う(ステップS３１)。具体的には、ユーザ特徴データ、模範者特徴データの差分である比較結果を各部位と角度の組み合わせである特徴データとして取得する。したがって、ステップＳ３１においても、比較結果として、各部位に対応する特徴データが得られることになる。 Next, specifically, in the same manner as in step S30 in the correction content learning device 100, the feature data comparison means 40 compares the model feature data with the user feature data (step S31). Specifically, the comparison result, which is the difference between the user feature data and the model feature data, is acquired as the feature data which is a combination of each part and the angle. Therefore, even in step S31, feature data corresponding to each site can be obtained as a comparison result.

次に、矯正量算出手段６０ａが、矯正量の算出を行う（ステップＳ４１）。上述のように、矯正量算出手段６０ａは、ディープニューラルネットワーク構造を備えた学習手段６０が機械学習を行うことにより得られた学習済みモデルを備えており、入力層は、部位数（特徴データ数）と同数（本実施形態では５個）が用意されている。そして、それぞれ特徴量である角度が入力される。また、出力層も、部位数（特徴データ数）と同数が用意されており、各部位についての矯正量である角度がそれぞれ出力される。そして、矯正量が最大となる部位を矯正部位とし、その矯正量とともに出力する。 Next, the correction amount calculation means 60a calculates the correction amount (step S41). As described above, the correction amount calculation means 60a includes the trained model obtained through machine learning by the learning means 60, which has a deep neural network structure, and the input layer has as many neurons as there are parts (pieces of feature data), five in this embodiment. An angle, which is the feature amount, is input to each of them. The output layer likewise has as many neurons as there are parts, and each outputs an angle that is the correction amount for its part. The part with the largest correction amount is taken as the correction part and is output together with its correction amount.
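
The final selection in step S41 amounts to choosing the output neuron with the largest value. A minimal sketch, assuming the trained network's outputs are already available as a list ordered like `PARTS`; `select_correction` is a hypothetical name, and the raw output value (not its magnitude) is compared here because the source does not say otherwise.

```python
PARTS = ("shoulder", "knee", "elbow", "ankle", "spine")


def select_correction(outputs, parts=PARTS):
    """Return (part, correction amount) for the largest network output.

    outputs: one correction angle per part, in the same order as parts.
    """
    if len(outputs) != len(parts):
        raise ValueError("expected one output per part")
    index = max(range(len(outputs)), key=lambda i: outputs[i])
    return parts[index], outputs[index]
```

The returned pair is the correction data that would be handed to the correction content transmission means 70.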

そして、矯正内容伝達手段７０が、矯正データに応じて動作中のユーザに矯正内容を伝達する（ステップＳ７０）。具体的には、矯正内容を表現した矯正データを用いて、矯正内容伝達手段７０を実現する具体的機器に応じた態様で、ユーザに矯正内容を伝達する。 Then, the correction content transmission means 70 transmits the correction content to the operating user according to the correction data (step S70). Specifically, using the correction data expressing the correction content, the correction content is transmitted to the user in an embodiment corresponding to the specific device that realizes the correction content transmission means 70.

以上のようにして、動作矯正装置２００は、ユーザの動作を撮影して得られた動画データから特徴データを取得し、矯正データを作成して矯正内容伝達手段７０に出力することにより、ユーザにリアルタイムで矯正内容を伝達することが可能となる。 As described above, the motion correction device 200 acquires the feature data from the moving image data obtained by photographing the user's motion, creates the correction data, and outputs the correction data to the correction content transmitting means 70 to the user. It is possible to convey the content of correction in real time.

＜４．適用例＞ <4. Application example>
次に、矯正内容学習装置１００、動作矯正装置２００の具体的な適用例について説明する。矯正内容学習装置１００、動作矯正装置２００は共通の部分を有するため、１つのシステムとして実現することができる。図１１は、矯正内容学習装置１００、動作矯正装置２００の適用例を示す図である。ユーザが行う動作としては、日常生活、運動、競技等、様々な状況において生じるが、図１１の例では、ランニングを行うためのトレッドミルに適用した例を示している。 Next, specific application examples of the correction content learning device 100 and the motion correction device 200 will be described. Since the correction content learning device 100 and the motion correction device 200 have parts in common, they can be realized as one system. FIG. 11 is a diagram showing an application example of the correction content learning device 100 and the motion correction device 200. The movements performed by the user occur in various situations such as daily life, exercise, and competition, but the example of FIG. 11 shows an application to a treadmill for running.

図１１に示したシステムを矯正内容学習装置１００として用いる際は、ユーザがトレッドミルでランニングを行うと、動画データ入力手段１０であるカメラがユーザを撮影して動画データを入力する。動画データからは上述のように特徴データが取得された後、矯正内容が決定される。一方、指導者である例えばランニングのコーチは、ユーザのランニングを見て、矯正内容を決定し、それを教師データとして教師データ記憶手段５０に格納する。さらに上述のようにして、矯正内容学習装置１００としての機能により、教師データを取り込んだ学習手段６０が学習を行う。 When the system shown in FIG. 11 is used as the correction content learning device 100, when the user runs on the treadmill, the camera, which is the moving image data input means 10, captures the user and inputs the moving image data. After the feature data is acquired from the moving image data as described above, the correction content is determined. On the other hand, for example, a running coach who is an instructor looks at the running of the user, determines the correction content, and stores it as teacher data in the teacher data storage means 50. Further, as described above, the learning means 60 incorporating the teacher data performs learning by the function as the correction content learning device 100.

図１１に示したシステムを動作矯正装置２００として用いる際は、ユーザがトレッドミルでランニングを行うと、動画データ入力手段１０であるカメラがユーザを撮影して動画データを入力する。動画データからは上述のように特徴データが取得された後、矯正内容が決定される。そして、矯正内容伝達手段７０である、スピーカ、ディスプレイ、印刷機、アクチュエータからユーザに対してリアルタイムで矯正内容を伝達する。 When the system shown in FIG. 11 is used as the motion correction device 200, when the user runs on the treadmill, the camera, which is the moving image data input means 10, captures the user and inputs the moving image data. After the feature data is acquired from the moving image data as described above, the correction content is determined. Then, the correction content is transmitted to the user in real time from the speaker, the display, the printing machine, and the actuator, which are the correction content transmission means 70.

図１１の例では、スピーカからはユーザに向かって「左足のけりを強く」という音声が出力され、ディスプレイには「左足のけりを強く」と表示され、印刷機からは左足のけり方の様子が印刷され、ユーザのふくらはぎに装着されたアクチュエータには振動が伝わる。このようにして、ユーザがトレッドミルでランニングを始めると、複数の態様で矯正内容がリアルタイムで伝達され、その場で矯正を行うことができる。なお、図１１の例では、矯正内容伝達手段７０としてスピーカ、ディスプレイ、印刷機、アクチュエータの４つの機器を用いているが、１つ以上であれば、その数や種類は変更することが可能である。 In the example of FIG. 11, the speaker outputs the voice message "Kick harder with your left foot" to the user, the display shows "Kick harder with your left foot", the printing machine prints an illustration of how to kick with the left foot, and vibration is transmitted to the actuator attached to the user's calf. In this way, when the user starts running on the treadmill, the correction content is transmitted in real time in several forms, and the correction can be made on the spot. In the example of FIG. 11, four devices (a speaker, a display, a printing machine, and an actuator) are used as the correction content transmission means 70, but as long as at least one device is used, their number and types can be changed.

以上、本発明の好適な実施形態について説明したが、本発明は上記実施形態に限定されず、種々の変形が可能である。例えば、上記実施形態では、コンピュータにおいて、ＣＰＵがプログラムを実行することにより、上記各手段を実現するようにしたが、上記各手段を演算回路としてハードウェアに組み込むようにしても良い。この場合、例えば液晶ディスプレイ等の表示装置に組み込むことにより、入力された原画像を高画質化して表示することも可能となる。 Although the preferred embodiment of the present invention has been described above, the present invention is not limited to the above embodiment and can be modified in various ways. For example, in the above embodiment, in the computer, the CPU executes the program to realize each of the above means, but each of the above means may be incorporated into the hardware as an arithmetic circuit. In this case, for example, by incorporating it into a display device such as a liquid crystal display, it is possible to display the input original image with high image quality.

１・・・ＣＰＵ（Central Processing Unit）
２・・・ＲＡＭ（Random Access Memory）
３・・・記憶装置
４・・・指示入力Ｉ／Ｆ
５・・・データ入出力Ｉ／Ｆ
６・・・表示部
１０・・・動画データ入力手段
２０・・・特徴データ取得手段
３０・・・模範者特徴データ記憶手段
４０・・・特徴データ比較手段
５０・・・教師データ記憶手段
６０・・・学習手段
６０ａ・・・矯正量算出手段
７０・・・矯正内容伝達手段
１００・・・矯正内容学習装置
２００・・・動作矯正装置
1 ... CPU (Central Processing Unit)
2 ... RAM (Random Access Memory)
3 ... Storage device
4 ... Instruction input I/F
5 ... Data input/output I/F
6 ... Display unit
10 ... Moving image data input means
20 ... Feature data acquisition means
30 ... Model feature data storage means
40 ... Feature data comparison means
50 ... Teacher data storage means
60 ... Learning means
60a ... Correction amount calculation means
70 ... Correction content transmission means
100 ... Correction content learning device
200 ... Motion correction device

Claims

A correction content learning device comprising:
a video data input means for inputting video data of a human body in motion;
a feature data acquisition means for acquiring, from the input video data, feature data expressing features of the motion as user feature data;
a model feature data storage means for storing the feature data of a model person as model feature data;
a feature data comparison means for comparing the user feature data with the model feature data;
a teacher data storage means for storing, as teacher data, feature data indicating the content to be corrected with respect to the comparison result of the feature data comparison means; and
a learning means that has a neural network configuration having an input layer, a plurality of intermediate layers, and an output layer, in which the input layer includes means for taking in the comparison result of the feature data comparison means and the output layer includes a number of neurons corresponding to the comparison result, and that changes the coefficients between neurons in the layers so as to change the value of the neuron corresponding to the teacher data in the output layer.

The correction content learning device according to claim 1, wherein the feature data includes a part in the human body and a correction amount corresponding to the part.

The correction content learning device according to claim 2, wherein the correction amount is an angle at the site.

The correction content learning device according to claim 3, wherein the angle at the portion is an angle formed by a portion connected to the portion in a predetermined state.

The correction content learning device according to claim 3, wherein the angle at the site is an angle formed at different time points by the connecting sites that are connected to the site.

A motion correction device comprising:
a video data input means for inputting video data of a human body in motion;
a feature data acquisition means for acquiring, from the input video data, feature data expressing features of the motion as user feature data;
a model feature data storage means for storing the feature data of a model person as model feature data;
a feature data comparison means for comparing the user feature data with the model feature data;
a correction amount calculation means that has a neural network configuration having an input layer, a plurality of intermediate layers, and an output layer, in which the input layer includes means for taking in the comparison result of the feature data comparison means and the output layer includes a number of neurons corresponding to the comparison result, and that outputs a correction amount at the output layer; and
a correction content transmission means for transmitting correction content to the user in motion according to the output correction amount.

A program for causing a computer to function as:
a feature data acquisition means for acquiring, from video data, feature data expressing features of a motion as user feature data;
a model feature data storage means for storing the feature data of a model person as model feature data;
a feature data comparison means for comparing the user feature data with the model feature data;
a teacher data storage means for storing, as teacher data, feature data indicating the content to be corrected with respect to the comparison result of the feature data comparison means; and
a learning means that has a neural network configuration having an input layer, a plurality of intermediate layers, and an output layer, in which the input layer includes means for taking in the comparison result of the feature data comparison means and the output layer includes a number of neurons corresponding to the comparison result, and that changes the coefficients between neurons in the layers so as to change the value of the neuron corresponding to the teacher data in the output layer.

A program for causing a computer to function as:
a feature data acquisition means for acquiring, from video data, feature data expressing features of a motion as user feature data;
a model feature data storage means for storing the feature data of a model person as model feature data;
a feature data comparison means for comparing the user feature data with the model feature data;
a correction amount calculation means that has a neural network configuration having an input layer, a plurality of intermediate layers, and an output layer, in which the input layer includes means for taking in the comparison result of the feature data comparison means and the output layer includes a number of neurons corresponding to the comparison result, and that outputs a correction amount at the output layer; and
a correction content transmission means for transmitting correction content to the user in motion according to the output correction amount.