JP2023175557A

JP2023175557A - Information processing program and information processing method

Info

Publication number: JP2023175557A
Application number: JP2022088059A
Authority: JP
Inventors: 卓永山本; Takahisa Yamamoto
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2022-05-30
Filing date: 2022-05-30
Publication date: 2023-12-12

Abstract

To provide a novel technique that estimates changes in a person's facial expression during exercise using AUs and estimates the person's exercise intensity from those changes, and that, unlike conventional exercise intensity estimation methods, is applicable even to exercises involving large body movements.

SOLUTION: An information processing program causes a computer to execute processing of: acquiring a video that captures an exercise including movement of a person's body; estimating the state of movement of the exercise based on skeletal information of the person included in the video; estimating the orientation of the person's face based on position information of the person's facial organs in the video; estimating the facial expression from the video; assigning weights to a first feature quantity for the face orientation and to a second feature quantity for the estimated expression, based on the relationship between the estimated state of movement and the face orientation; and estimating the intensity of the exercise based on the weighted first feature quantity and the weighted second feature quantity.

SELECTED DRAWING: Figure 5

Description

The present invention relates to an information processing program and an information processing method.

Facial expressions play an important role in nonverbal communication. Facial expression estimation is an essential technology for developing computers that understand and support people. To estimate facial expressions, a method for describing them must first be defined. The AU (Action Unit) is known as one such method. An AU represents a facial movement involved in producing an expression, defined on the basis of anatomical knowledge of the facial muscles. Techniques for estimating AUs have also been proposed.

A typical AU estimation engine is based on machine learning with a large amount of training data. The training data consist of image data of facial expressions together with the judgment results for those expressions, namely the Occurrence (presence or absence) and Intensity (strength of occurrence) of each AU.

There is also, for example, a technique that uses AUs to estimate changes in the facial expression of a person during exercise and estimates the person's exercise intensity from those changes. If exercise intensity can be estimated in this way, it becomes possible to determine, for example, from a video of a person doing rehabilitation exercises or exercising at a gym, whether the person is exercising with an appropriate load.

Japanese Patent Application Publication No. 2020-188860
Japanese Patent Application Publication No. 2013-111449
US Patent Application Publication No. 2020/0074155

However, conventional exercise intensity estimation methods can estimate the intensity of exercises in which the body does not move much, such as riding an Aerobike (registered trademark), but may be unable to estimate the intensity of exercises that involve large body movements.

For example, in a squat, which is one example of an exercise with large body movements, exercise intensity shows at the moment the body rises after having lowered. This is the timing at which the exercise intensity should be measured, but the intensity may not be estimable at this timing, and the estimation accuracy may therefore deteriorate.

In one aspect, the object is to further improve the accuracy of estimating exercise intensity.

In one aspect, an information processing program causes a computer to execute processing of: acquiring a video that captures an exercise including movement of a person's body; estimating the state of movement of the exercise based on skeletal information of the person included in the video; estimating the orientation of the person's face based on position information of the person's facial organs in the video; estimating the facial expression from the video; assigning weights to a first feature quantity for the face orientation and to a second feature quantity for the estimated expression, based on the relationship between the estimated state of movement and the face orientation; and estimating the intensity of the exercise based on the weighted first feature quantity and the weighted second feature quantity.

In one aspect, the accuracy of estimating exercise intensity can be further improved.

FIG. 1 is a diagram illustrating an example of the relationship between face orientation and AU value.
FIG. 2 is a diagram illustrating an example of the difference in face orientation depending on the timing of the exercise.
FIG. 3 is a diagram showing a configuration example of the information processing system according to the present embodiment.
FIG. 4 is a block diagram showing a configuration example of the information processing device 10 according to the present embodiment.
FIG. 5 is a diagram showing an example of exercise intensity estimation according to the present embodiment.
FIG. 6 is a flowchart showing an example of the flow of exercise intensity estimation processing according to the present embodiment.
FIG. 7 is a diagram showing an example of the hardware configuration of the information processing device 10 according to the present embodiment.

Examples of the information processing program and the information processing method according to the present embodiment are described in detail below with reference to the drawings. The present embodiment is not limited by these examples. The examples can be combined as appropriate to the extent that no contradiction arises. The examples are described using AUs, but are not limited to them.

FIG. 1 is a diagram illustrating an example of the relationship between face orientation and AU value. It shows nine AU values estimated, using existing technology, from images of the same person captured with a neutral expression while changing the orientation of the face. The center image shows the face turned to the front, and the surrounding images show the face turned in the direction of the respective arrow. The number below each image is the AU value estimated from that image. AU values range from 0 to 5, and each value in FIG. 1 is the average over all 17 types of AU.

FIG. 1 shows that the AU value changes with face orientation even though the expression is equally neutral. If the AU value for the frontal face is taken as correct, the accuracy of the AU estimates for non-frontal orientations is degraded. Therefore, when exercise intensity is estimated from changes in the facial expression of a person during exercise, that is, from AU values, taking the face orientation into account can improve the estimation accuracy.

FIG. 2 is a diagram illustrating an example of the difference in face orientation depending on the timing of the exercise. The example in FIG. 2 shows how face orientation differs with the timing of a squat. As shown on the left side of FIG. 2, the person's face is turned downward while squatting down. As shown on the right side of FIG. 2, the person's face is turned upward while standing up. Thus the face orientation differs with timing even within the same exercise. Therefore, in the present embodiment, the direction in which the person's face is turned is determined at the timing at which exercise intensity shows, for example, in a squat, when the body rises after having lowered, and the exercise intensity is estimated with the face orientation taken into account.

Next, the configuration of an information processing system for estimating exercise intensity with the face orientation taken into account is described. FIG. 3 is a diagram showing a configuration example of the information processing system according to the present embodiment. As shown in FIG. 3, the information processing system 1 is a system in which an information processing device 10 and a camera device 100 are connected via a network 50 so that they can communicate with each other.

Any of various communication networks, wired or wireless, can be used as the network 50, for example an intranet used within a factory. The network 50 need not be a single network and may be configured, for example, of an intranet and the Internet connected via a network device such as a gateway or another device (not shown). If the information processing device 10 and the camera device 100 are directly connected, or if the information processing device 10 has a built-in camera function equivalent to the camera device 100, the information processing system 1 need not include the network 50.

The information processing device 10 may be, for example, a desktop PC (Personal Computer) or a notebook PC, or may be a mobile terminal such as a smartphone or a tablet PC.

The information processing device 10, for example, acquires a video of a person exercising captured by the camera device 100 and estimates the state of movement of the exercise based on the skeletal information of the person included in the video. The information processing device 10 also estimates, for example, the orientation of the person's face based on position information of the person's facial organs in the video, and estimates the facial expression by estimating the person's AUs from the video. The information processing device 10 then weights the feature quantities for the estimated face orientation and for the estimated expression based on, for example, the relationship between the estimated state of movement and the face orientation, and estimates the intensity of the exercise based on the weighted feature quantities.

Although FIG. 3 shows the information processing device 10 as a single computer, it may be, for example, a distributed computing system composed of a plurality of computers. The information processing device 10 may also be a cloud computer device managed by a provider of cloud computing services.

The camera device 100 is, for example, a camera for capturing images of a person. Video captured by the camera device 100 is transmitted to the information processing device 10 as needed or at predetermined timings. As described above, the camera device 100 may be a camera function built into the information processing device 10.

(Functional configuration of information processing device 10)
Next, the functional configuration of the information processing device 10, which is the entity that executes the present embodiment, is described. FIG. 4 is a block diagram showing a configuration example of the information processing device 10 according to the present embodiment. As shown in FIG. 4, the information processing device 10 includes a communication unit 20, a storage unit 30, and a control unit 40.

The communication unit 20 is a processing unit that controls communication with other devices such as the camera device 100, and is, for example, a communication interface such as a network interface card, or a USB interface.

The storage unit 30 has a function of storing various data and the programs executed by the control unit 40, and is realized by, for example, a storage device such as a memory or a hard disk. The storage unit 30 stores, for example, video information 31, a motion analysis model 32, and a facial muscle estimation model 33.

The video information 31 stores, for example, video captured by the camera device 100, that is, a plurality of captured images forming the series of frames of a moving image.

The motion analysis model 32 stores, for example, information about a machine learning model for estimating, from the skeletal information of a person in a video, the type and timing of the exercise the person is performing, as well as the model parameters for constructing that machine learning model. The machine learning model is generated by machine learning using, for example, the person's skeletal information as the feature quantity and the type and timing of the exercise as the correct label. The machine learning model may be generated by the information processing device 10, or may be trained and generated by another information processing device.

The facial muscle estimation model 33 stores, for example, information about a machine learning model for estimating AU occurrence intensity from a video of a person, as well as the model parameters for constructing that machine learning model. The machine learning model is generated by machine learning using, for example, video captured by the camera device 100, that is, captured images, as the feature quantity and the AU occurrence intensity as the correct label. The machine learning model may be generated by the information processing device 10, or may be trained and generated by another information processing device.

The information described above as stored in the storage unit 30 is only an example, and the storage unit 30 can store various other information as well.

The control unit 40 is a processing unit that controls the entire information processing device 10, and is, for example, a processor. The control unit 40 includes a motion analysis unit 41, a face orientation estimation unit 42, a facial muscle estimation unit 43, a facial energy calculation unit 44, and the like. Each processing unit is an example of an electronic circuit included in the processor or of a process executed by the processor.

The motion analysis unit 41 estimates, for example using existing technology, the state of movement of an exercise that includes movement of a person's body, based on the skeletal information of the person included in the video captured by the camera device 100.

The face orientation estimation unit 42 estimates, for example using existing technology, the orientation of the person's face based on the positional relationship of the parts of the person's facial organs in the video captured by the camera device 100.

The facial muscle estimation unit 43 estimates, for example, the facial expression of the person from the video captured by the camera device 100. For example, the facial muscle estimation unit 43 uses existing technology to estimate AUs, which represent facial movements involved in producing expressions and are defined on the basis of anatomical knowledge of the facial muscles, and estimates the expression based on the estimated AUs.

The facial energy calculation unit 44 assigns weights, for example, to a first feature quantity for the face orientation and to a second feature quantity for the estimated expression, based on the relationship between the estimated state of movement and the face orientation. The facial energy calculation unit 44 then estimates the intensity of the exercise based on the weighted first feature quantity and the weighted second feature quantity.

Next, the exercise intensity estimation executed by the information processing device 10 is described more specifically. FIG. 5 is a diagram showing an example of exercise intensity estimation according to the present embodiment.

First, as shown in FIG. 5, video captured by, for example, the camera device 100 is input to the information processing device 10 (step S1). The information processing device 10 may store the input video in the video information 31.

Next, the information processing device 10 uses, for example, an existing object detection algorithm and an existing skeleton estimation algorithm to detect a person in the input video, generate skeletal information for that person, and analyze the exercise the person is performing (step S2). The person detection, skeletal information generation, and exercise analysis may be executed for each frame of the input video.

Here, the existing object detection algorithm may be, for example, YOLO (You Only Look Once), SSD (Single Shot Multibox Detector), or a deep-learning-based algorithm such as Faster R-CNN (Convolutional Neural Network). The existing skeleton estimation algorithm may be, for example, CPN (Cascaded Pyramid Network).

In the analysis of the exercise, the type and timing of the exercise the person is performing are estimated based on, for example, the generated skeletal information. Each of these estimations may be performed using a machine learning model generated by machine learning with skeletal information as the feature quantity and the type or timing of the exercise as the correct label. If the type of exercise is determined in advance, only the timing may be estimated. The timing estimation result may be, for example, a value normalized to the range 0 to 1 that becomes larger toward the latter part of the exercise, because exercise intensity shows more readily in the facial expression as the exercise progresses. More specifically, for a set of 10 squats, the timing estimation result (hereinafter, the "exercise timing value") may be 0.1 for the first squat, 0.2 for the second, and so on up to 1 for the tenth.
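The squat example above can be sketched as follows. This is only an illustration of the example numbers (0.1, 0.2, ..., 1); in the embodiment the timing is estimated by a machine learning model, and the function name `timing_value` and the linear schedule are assumptions made here.

```python
def timing_value(rep_index: int, total_reps: int) -> float:
    """Return a normalized exercise timing value in (0, 1].

    rep_index is 1-based. The linear schedule rep_index / total_reps is an
    assumption that reproduces the 10-squat example (0.1, 0.2, ..., 1).
    """
    if total_reps <= 0:
        raise ValueError("total_reps must be positive")
    if not 1 <= rep_index <= total_reps:
        raise ValueError("rep_index must be between 1 and total_reps")
    return rep_index / total_reps


# Timing values for a set of 10 squats.
values = [timing_value(i, 10) for i in range(1, 11)]
```

Later repetitions receive larger values, so frames from the latter part of the set count more in the intensity estimate.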

Next, the information processing device 10 estimates the orientation of the person's face from the input video using, for example, an existing face orientation estimation algorithm (step S3). The face orientation estimation result (hereinafter, the "face orientation value") may be, for example, a value normalized to the range 0 to 1 that is 0 for a frontal face and becomes larger as the face turns further upward or downward. Alternatively, the face orientation value may be the angle of the face with the frontal orientation taken as 0°. The value is defined this way because people tend to look up or down when exercise becomes strenuous.
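The relationship between the angle form and the normalized form of the face orientation value can be sketched as below. The function name, the use of a pitch angle, and the 90° upper bound are assumptions for illustration; the text does not specify how the normalization is done.

```python
def face_orientation_value(pitch_deg: float, max_angle_deg: float = 90.0) -> float:
    """Map a pitch angle to a face orientation value in [0, 1].

    0 degrees means a frontal face. Upward and downward tilt are treated
    alike by taking the absolute value. max_angle_deg is a hypothetical
    saturation angle, not a value from the source.
    """
    if max_angle_deg <= 0:
        raise ValueError("max_angle_deg must be positive")
    return min(abs(pitch_deg) / max_angle_deg, 1.0)
```

For instance, a face tilted 45° up or 45° down both map to 0.5 under these assumptions.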

Next, the information processing device 10 estimates, for example, the person's facial muscle activity from the input video (step S4). The estimation may be performed using a machine learning model generated by machine learning with, for example, video captured by the camera device 100, that is, captured images, as the feature quantity and AU occurrence intensity as the correct label. As shown in FIG. 5, there are 17 types of AU in total, and the occurrence intensity of each, that is, its AU value, ranges from 0 (lowest) to 5. The person's expression is estimated from the AU values. An individual AU value does not necessarily correspond to an expression; the expression can be estimated from a combination of AUs and the AU values within that combination. The motion analysis, face orientation estimation, and facial muscle estimation of steps S2 to S4 need not be performed in the order shown in FIG. 5; their order may be changed, or they may be processed in parallel.
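As noted for FIG. 1, the 17 AU values can be summarized by their average. A minimal sketch of that reduction follows; the function name and the input validation are assumptions, and the AU estimator that would produce the inputs is not shown.

```python
NUM_AUS = 17               # number of AU types stated in the text
AU_MIN, AU_MAX = 0.0, 5.0  # AU value range stated in the text


def mean_au_value(au_values):
    """Average the per-AU occurrence intensities of one frame."""
    if len(au_values) != NUM_AUS:
        raise ValueError(f"expected {NUM_AUS} AU values, got {len(au_values)}")
    if any(not AU_MIN <= v <= AU_MAX for v in au_values):
        raise ValueError("AU values must lie in the range 0 to 5")
    return sum(au_values) / NUM_AUS
```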

Next, the information processing device 10 estimates the intensity of the exercise the person is performing based on, for example, the results of the motion analysis, face orientation estimation, and facial muscle estimation of steps S2 to S4 (step S5). More specifically, the information processing device 10 calculates the intensity of the exercise using, for example, the following equation (1).

The exercise intensity calculated by equation (1) is the sum of the exercise intensities calculated for the individual frames of the video. In equation (1), α may be, for example, the exercise timing value estimated in step S2, β may be a weighting coefficient for the AU value estimated in step S4, and γ may be a weighting coefficient for the face orientation value calculated in step S3. The AU value used in equation (1) may be, for example, the average of the values of all 17 types of AU. As for β, when the face is frontal, for example, β may be set to 1 so that the AU value is used. As for γ, when the face orientation matches the direction of the movement, for example, γ may be set to 1 so that the face orientation value is used.
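Equation (1) itself is not reproduced in this text. As a hedged reconstruction only, one form consistent with the description of α, β, and γ above (a sum over frames, with the timing value scaling each frame's weighted AU and face orientation terms) would be the following. The symbols I, T, t, AU_t, and F_t are notation assumed here for illustration, and the published equation may differ.

```latex
I = \sum_{t=1}^{T} \alpha_t \left( \beta_t \, \mathrm{AU}_t + \gamma_t \, F_t \right)
```

Here I is the exercise intensity, T is the number of frames, AU_t is the AU value (for example, the 17-AU average) of frame t, and F_t is the face orientation value of frame t.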

(Processing flow)
Next, the flow of the exercise intensity estimation processing executed by the information processing device 10 is described with reference to FIG. 6. FIG. 6 is a flowchart showing an example of the flow of exercise intensity estimation processing according to the present embodiment. The estimation processing shown in FIG. 6 may be executed, for example, at regular intervals or each time video is received from the camera device 100.

First, as shown in FIG. 6, the information processing device 10 acquires from the video information 31, for example, video of a person that was captured by the camera device 100 and input to the information processing device 10 (step S101). In the estimation processing shown in FIG. 6, the video captured by the camera device 100 is processed in near real time, so the video is transmitted from the camera device 100 as needed and stored in the video information 31.

Next, the information processing device 10 analyzes, for example, the exercise performed by the person in the video acquired in step S101 (step S102). More specifically, as described with reference to FIG. 5, an existing object detection algorithm and an existing skeleton estimation algorithm, for example, are used to detect a person in the input video, generate skeletal information for that person, and analyze the exercise the person is performing. The processing from step S102 onward may be executed for each frame of the input video.

If the exercise analysis in step S102 finds that the estimated timing is not a timing of high exercise load (step S103: No), the information processing device 10 sets, for example, the timing value α to 0 (step S104). With α set to 0 for a frame that is not at a high-load timing, α in equation (1) is 0 and the exercise intensity for that frame is 0.
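The effect of α = 0 can be sketched as below. The per-frame form α × (β × AU value + γ × face orientation value) is an assumption chosen to be consistent with the description of equation (1), not the published formula, and the names `FrameEstimate` and `exercise_intensity` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FrameEstimate:
    alpha: float       # exercise timing value (0 when not a high-load timing)
    beta: float        # weight for the AU value
    gamma: float       # weight for the face orientation value
    au_value: float    # e.g. the average of the 17 AU values
    face_value: float  # face orientation value


def exercise_intensity(frames: Iterable[FrameEstimate]) -> float:
    """Sum the per-frame intensities under the assumed form of equation (1)."""
    return sum(
        f.alpha * (f.beta * f.au_value + f.gamma * f.face_value)
        for f in frames
    )


frames = [
    FrameEstimate(alpha=0.0, beta=1.0, gamma=0.0, au_value=3.0, face_value=0.2),
    FrameEstimate(alpha=0.5, beta=1.0, gamma=0.0, au_value=2.0, face_value=0.0),
]
total = exercise_intensity(frames)
```

The first frame has α = 0 and so contributes nothing, whatever its AU value; only the second frame contributes to `total`.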

After step S104, if there is a next frame (step S112: Yes), the processing is repeated from step S102 for that frame.

On the other hand, if the estimated timing is a timing of high exercise load (step S103: Yes), the information processing device 10 sets, for example, the timing value α to a value greater than 0 and at most 1 (step S105). More specifically, the information processing device 10 may set α closer to 1 the higher the exercise load at that timing; for a set of 10 squats, for example, α approaches 1 toward the later repetitions. In this way, when calculating the exercise intensity with equation (1), the information processing device 10 can give more weight to timings of high exercise load than to timings of low exercise load.

Next, the information processing device 10 estimates, for example, the orientation of the face of the person in the video acquired in step S101, based on the positional relationship of the parts of the person's facial organs (step S106). The information processing device 10 then determines, for example, whether the face orientation estimated in step S106 is frontal. The estimation result of step S106 may be, for example, a value that is 0 for a frontal face and approaches 1 as the angle of the face increases, and the information processing device 10 may determine that the face is frontal when the value is at or below a predetermined threshold.
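The threshold test described above can be sketched as follows. The threshold value 0.2 and the names are assumptions; the text only states that a predetermined threshold is used.

```python
FRONTAL_THRESHOLD = 0.2  # hypothetical value; the source gives no number


def is_frontal(face_value: float, threshold: float = FRONTAL_THRESHOLD) -> bool:
    """Treat the face as frontal when its orientation value is at or below the threshold.

    face_value is 0 for a frontal face and approaches 1 as the angle grows.
    """
    if not 0.0 <= face_value <= 1.0:
        raise ValueError("face_value must lie in the range 0 to 1")
    return face_value <= threshold
```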

If the face orientation estimated in step S106 is determined to be frontal (step S107: Yes), the information processing device 10 sets, for example, the weighting coefficient β to 1 and γ to 0 (step S108). At this time, the information processing device 10 also calculates, for example, the AU value. Thus, when the face is frontal, the information processing device 10 can control the calculation of exercise intensity by equation (1) so that the AU value is used and the face orientation value is not. Note that the AU value may be, for example, the average of the values of all 17 types of AU. After step S108 is executed, the process proceeds to step S112.
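As one possible reading of "the average of all 17 types of AU values", the following sketch reduces a list of per-AU intensities to a single AU value. The function name `au_value` and the input format (one float per AU) are assumptions; the text does not specify how the individual AU values are scaled.

```python
from statistics import fmean
from typing import Sequence

NUM_AUS = 17  # number of AU types mentioned in the text


def au_value(au_intensities: Sequence[float]) -> float:
    """Return the mean of the 17 per-AU intensities for one frame."""
    if len(au_intensities) != NUM_AUS:
        raise ValueError(f"expected {NUM_AUS} AU intensities")
    return fmean(au_intensities)
```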

On the other hand, if the face orientation estimated in step S106 is determined not to be frontal (step S107: No), the information processing device 10 determines, for example, whether the face orientation is the same as the movement direction (step S109). Note that the determination of whether the face orientation is the same as the movement direction is not limited to determining whether the values indicating the respective directions match exactly; it may be, for example, a determination of whether the values indicating the respective directions are within a predetermined range.
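A range-based comparison of the two directions might look like the sketch below. Representing each direction as a single angle in degrees and using a 15-degree tolerance are assumptions for illustration; the text leaves both the representation and the range open.

```python
def same_direction(face_deg: float, motion_deg: float,
                   tolerance_deg: float = 15.0) -> bool:
    """Return True when two directions differ by at most tolerance_deg.

    The difference is taken on the circle, so 350 and 5 degrees are 15 apart.
    """
    diff = abs(face_deg - motion_deg) % 360.0
    diff = min(diff, 360.0 - diff)
    return diff <= tolerance_deg
```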

If the face orientation is determined not to be the same as the movement direction (step S109: No), the information processing device 10 sets, for example, the weighting coefficient β to 0 and γ to 0 (step S110). Because a frame in this case is not suitable for calculating exercise intensity, the information processing device 10 can perform control so that neither the AU value nor the face orientation value is used. After step S110 is executed, the process proceeds to step S112.

On the other hand, if the face orientation is determined to be the same as the movement direction (step S109: Yes), the information processing device 10 sets, for example, the weighting coefficient β to 0 and γ to 1 (step S111). Thus, when the face is not frontal but its orientation is the same as the movement direction, the information processing device 10 can control the calculation of exercise intensity by equation (1) so that the face orientation value is used and the AU value is not. After step S111 is executed, the process proceeds to step S112.
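The three branches of steps S107 to S111 can be summarized in a short sketch. The weight values come from the description; the names `Weights` and `select_weights` and the boolean inputs are hypothetical.

```python
from typing import NamedTuple


class Weights(NamedTuple):
    beta: float   # weight applied to the AU (expression) value
    gamma: float  # weight applied to the face orientation value


def select_weights(frontal: bool, face_matches_motion: bool) -> Weights:
    """Choose beta and gamma for one frame following steps S107 to S111."""
    if frontal:
        # S107: Yes -> S108: use the AU value only.
        return Weights(beta=1.0, gamma=0.0)
    if face_matches_motion:
        # S109: Yes -> S111: use the face orientation value only.
        return Weights(beta=0.0, gamma=1.0)
    # S109: No -> S110: the frame contributes nothing.
    return Weights(beta=0.0, gamma=0.0)
```

Note that `face_matches_motion` is consulted only when the face is not frontal, which mirrors the order of the determinations in the flow.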

If there is a next frame (step S112: Yes), the process is repeated from step S102 for the next frame. On the other hand, if there is no next frame (step S112: No), the information processing device 10 calculates the exercise intensity using, for example, equation (1) together with the timing value α and the weighting coefficients β and γ for each frame (step S113). After step S113 is executed, the exercise intensity estimation process shown in FIG. 6 ends.
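The final aggregation in step S113 can be sketched as below. Equation (1) is not reproduced in this part of the description, so the weighted sum used here is an assumed stand-in, not the equation itself; the `Frame` fields and the function name are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Frame:
    alpha: float  # timing value for the frame
    beta: float   # weight for the AU value
    gamma: float  # weight for the face orientation value
    au: float     # AU value of the frame
    face: float   # face orientation value of the frame


def exercise_intensity(frames: Iterable[Frame]) -> float:
    """Assumed form: sum over frames of alpha * (beta * au + gamma * face)."""
    return sum(f.alpha * (f.beta * f.au + f.gamma * f.face) for f in frames)
```

Under this assumed form, a frame with β = γ = 0 or α = 0 drops out of the total, which matches the stated intent that unsuitable frames and low-load timings are not used or are weighted less.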

(Effects)
As described above, the information processing device 10 acquires a video capturing exercise that includes movement of a person's body, estimates the state of the movement of the exercise based on skeletal information of the person included in the video, estimates the orientation of the person's face based on position information of the person's facial organs in the video, estimates the facial expression from the video, assigns a weight to each of a first feature amount for the face orientation and a second feature amount for the estimated expression based on the relationship between the estimated state of the movement and the face orientation, and estimates the intensity of the exercise based on the weighted first feature amount and the weighted second feature amount.

In this way, the information processing device 10 estimates the state of exercise, the face orientation, and the AU of the person in the video, and estimates the exercise intensity by weighting the face orientation and the AU based on the relationship between the state of exercise and the face orientation. This allows the information processing device 10 to further improve the accuracy of exercise intensity estimation.

The information processing device 10 also executes a process of determining whether the estimated face orientation is frontal, and the process of assigning the weights includes, when the estimated face orientation is determined to be frontal, a process of setting the weighting coefficient for the first feature amount to 0.

This allows the information processing device 10 to further improve the accuracy of exercise intensity estimation.

The information processing device 10 also executes a process of determining whether the estimated face orientation is frontal and, when the estimated face orientation is determined not to be frontal, determines whether the face orientation and the direction of the exercise are the same. The process of assigning the weights includes, when the face orientation and the direction of the exercise are determined to be the same, a process of setting the weighting coefficient for the second feature amount to 0.

The information processing device 10 also executes a process of determining the timing of the exercise from the video, and the process of estimating the intensity includes a process of estimating the intensity further based on the timing of the exercise.

(System)
Information including the processing procedures, control procedures, specific names, and various data and parameters shown in the above description and in the drawings may be changed arbitrarily unless otherwise specified. The specific examples, distributions, numerical values, and the like described in the embodiments are merely examples and may also be changed arbitrarily.

The specific form of distribution and integration of the components of each device is not limited to what is illustrated. For example, the motion analysis unit 41 of the information processing device 10 in FIG. 4 may be distributed across multiple processing units, or the motion analysis unit 41 and the face orientation estimation unit 42 of the information processing device 10 may be integrated into one processing unit. In other words, all or some of the components may be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. Furthermore, all or any part of the processing functions of each device may be realized by a CPU and a program that is analyzed and executed by the CPU, or may be realized as hardware using wired logic.

FIG. 7 is a diagram showing an example of the hardware configuration of the information processing device 10 according to the present embodiment. As shown in FIG. 7, the information processing device 10 includes a communication interface 10a, an HDD (Hard Disk Drive) 10b, a memory 10c, and a processor 10d. The units shown in FIG. 7 are interconnected by a bus or the like.

The communication interface 10a is a network interface card or the like and communicates with other servers. The HDD 10b stores the programs and DB (database) that operate the processing units shown in FIG. 4 and elsewhere and the functions of the information processing device 10.

The processor 10d is a CPU (Central Processing Unit), an MPU (Micro Processing Unit), a GPU (Graphics Processing Unit), or the like. The processor 10d may also be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). The processor 10d is a hardware circuit that reads a program that executes the same processing as the processing units shown in FIG. 4 and elsewhere from the HDD 10b or the like, loads it into the memory 10c, and thereby runs a process that realizes the functions described with reference to FIG. 4 and elsewhere.

The information processing device 10 can also realize the same functions as in the above embodiments by reading the program from a recording medium with a medium reading device and executing the read program. Note that the program referred to in these other embodiments is not limited to being executed by the information processing device 10. For example, the above embodiments may be applied in the same way when another computer or server executes the program, or when these cooperate to execute the program.

This program may be distributed via a network such as the Internet. The program may also be recorded on a computer-readable recording medium such as a hard disk, a flexible disk (FD), a CD-ROM, an MO (Magneto-Optical disk), or a DVD (Digital Versatile Disc), and executed by being read from the recording medium by a computer.

Regarding the embodiments including the above examples, the following additional notes are further disclosed.

(Additional note 1) An information processing program that causes a computer to execute a process comprising:
acquiring a video capturing exercise that includes movement of a person's body;
estimating a state of the movement of the exercise based on skeletal information of the person included in the video;
estimating an orientation of the person's face based on position information of facial organs of the person in the video;
estimating an expression of the face from the video;
assigning a weight to each of a first feature amount for the orientation of the face and a second feature amount for the estimated expression, based on a relationship between the estimated state of the movement and the orientation of the face; and
estimating an intensity of the exercise based on the weighted first feature amount and the weighted second feature amount.

(Additional note 2) The information processing program according to additional note 1, wherein
the program further causes the computer to execute a process of determining whether the estimated orientation of the face is frontal, and
the assigning of the weight includes setting a weighting coefficient for the first feature amount to 0 when the estimated orientation of the face is determined to be frontal.

(Additional note 3) The information processing program according to additional note 1, wherein
the program further causes the computer to execute a process of determining whether the estimated orientation of the face is frontal and, when the estimated orientation of the face is determined not to be frontal, determining whether the orientation of the face and a direction of the exercise are the same, and
the assigning of the weight includes setting a weighting coefficient for the second feature amount to 0 when the orientation of the face and the direction of the exercise are determined to be the same.

(Additional note 4) The information processing program according to additional note 1, wherein
the program further causes the computer to execute a process of determining a timing of the exercise from the video, and
the estimating of the intensity includes estimating the intensity further based on the timing of the exercise.

(Additional note 5) An information processing method in which a computer executes a process comprising:
acquiring a video capturing exercise that includes movement of a person's body;
estimating a state of the movement of the exercise based on skeletal information of the person included in the video;
estimating an orientation of the person's face based on position information of facial organs of the person in the video;
estimating an expression of the face from the video;
assigning a weight to each of a first feature amount for the orientation of the face and a second feature amount for the estimated expression, based on a relationship between the estimated state of the movement and the orientation of the face; and
estimating an intensity of the exercise based on the weighted first feature amount and the weighted second feature amount.

(Additional note 6) The information processing method according to additional note 5, wherein
the computer further executes a process of determining whether the estimated orientation of the face is frontal, and
the assigning of the weight includes setting a weighting coefficient for the first feature amount to 0 when the estimated orientation of the face is determined to be frontal.

(Additional note 7) The information processing method according to additional note 5, wherein
the computer further executes a process of determining whether the estimated orientation of the face is frontal and, when the estimated orientation of the face is determined not to be frontal, determining whether the orientation of the face and a direction of the exercise are the same, and
the assigning of the weight includes setting a weighting coefficient for the second feature amount to 0 when the orientation of the face and the direction of the exercise are determined to be the same.

(Additional note 8) The information processing method according to additional note 5, wherein
the computer further executes a process of determining a timing of the exercise from the video, and
the estimating of the intensity includes estimating the intensity further based on the timing of the exercise.

(Additional note 9) An information processing device comprising:
a processor; and
a memory operably connected to the processor,
wherein the processor executes a process comprising:
acquiring a video capturing exercise that includes movement of a person's body;
estimating a state of the movement of the exercise based on skeletal information of the person included in the video;
estimating an orientation of the person's face based on position information of facial organs of the person in the video;
estimating an expression of the face from the video;
assigning a weight to each of a first feature amount for the orientation of the face and a second feature amount for the estimated expression, based on a relationship between the estimated state of the movement and the orientation of the face; and
estimating an intensity of the exercise based on the weighted first feature amount and the weighted second feature amount.

1 Information processing system
10 Information processing device
10a Communication interface
10b HDD
10c Memory
10d Processor
20 Communication unit
30 Storage unit
31 Video information
32 Motion analysis model
33 Facial muscle estimation model
40 Control unit
41 Motion analysis unit
42 Face orientation estimation unit
43 Facial muscle estimation unit
44 Facial energy calculation unit
50 Network
100 Camera device

Claims

1. An information processing program that causes a computer to execute a process comprising:
acquiring a video capturing exercise that includes movement of a person's body;
estimating a state of the movement of the exercise based on skeletal information of the person included in the video;
estimating an orientation of the person's face based on position information of facial organs of the person in the video;
estimating an expression of the face from the video;
assigning a weight to each of a first feature amount for the orientation of the face and a second feature amount for the estimated expression, based on a relationship between the estimated state of the movement and the orientation of the face; and
estimating an intensity of the exercise based on the weighted first feature amount and the weighted second feature amount.

2. The information processing program according to claim 1, wherein
the program further causes the computer to execute a process of determining whether the estimated orientation of the face is frontal, and
the assigning of the weight includes setting a weighting coefficient for the first feature amount to 0 when the estimated orientation of the face is determined to be frontal.

3. The information processing program according to claim 1, wherein
the program further causes the computer to execute a process of determining whether the estimated orientation of the face is frontal and, when the estimated orientation of the face is determined not to be frontal, determining whether the orientation of the face and a direction of the exercise are the same, and
the assigning of the weight includes setting a weighting coefficient for the second feature amount to 0 when the orientation of the face and the direction of the exercise are determined to be the same.

4. The information processing program according to claim 1, wherein
the program further causes the computer to execute a process of determining a timing of the exercise from the video, and
the estimating of the intensity includes estimating the intensity further based on the timing of the exercise.

5. An information processing method in which a computer executes a process comprising:
acquiring a video capturing exercise that includes movement of a person's body;
estimating a state of the movement of the exercise based on skeletal information of the person included in the video;
estimating an orientation of the person's face based on position information of facial organs of the person in the video;
estimating an expression of the face from the video;
assigning a weight to each of a first feature amount for the orientation of the face and a second feature amount for the estimated expression, based on a relationship between the estimated state of the movement and the orientation of the face; and
estimating an intensity of the exercise based on the weighted first feature amount and the weighted second feature amount.