JP7483179B1 - Estimation device, learning device, estimation method, and estimation program

Info

Publication number: JP7483179B1
Application number: JP2024512968A
Authority: JP
Inventors: 敬士西川; 貴耶谷口; 恭平濱田; 優子菅沼; 健二瀧井
Original assignee: Mitsubishi Electric Corp; Mitsubishi Electric Building Solutions Corp
Current assignee: Mitsubishi Electric Corp; Mitsubishi Electric Building Solutions Corp
Priority date: 2023-06-20
Filing date: 2023-06-20
Publication date: 2024-05-14
Anticipated expiration: 2043-06-20

Abstract

A time division unit (11) divides a work engagement time, which is the time during which a worker is engaged in work that includes a plurality of elemental tasks whose attributes are not uniform, into a work performance time period, during which the worker is performing one of the elemental tasks, and a non-work time period, during which the worker is performing none of the elemental tasks. To do so, the time division unit (11) uses at least one of the following: the result of a displacement amount determination, which determines the amount of displacement of the worker's hand during the work engagement time; the result of a tool gripping state determination, which determines the state in which the worker's hand grips a tool during the work engagement time; and the result of an appearance state determination, which determines the appearance state of the worker's hand in video captured during the work engagement time. An estimation unit (12) estimates the elemental task that the worker is performing in the work performance time period, based on a partial video, which is the portion of the video captured during the work performance time period.

Description

The present disclosure relates to a technique for estimating the elemental task being performed by a worker.
An elemental task is an action that can be recognized as a single unit, and is a component of a task. In other words, a task is composed of a combination of multiple elemental tasks. For example, the elemental task "place the cover in the installation position" and the elemental task "fasten the cover with screws" combine to form one task, "attach the cover."

Patent Document 1 discloses a technique for capturing video of a person's behavior and estimating that behavior from the captured video.
In the technology of Patent Document 1, time-series position data of human body parts obtained from the video is classified into multiple sets of position data. Each set of position data is then analyzed, and a motion sequence (movement, change, rest, etc.) is generated from the analysis result. A neural network that handles time-series data, such as an RNN (Recurrent Neural Network), analyzes the motion sequence, and a memory neural network then processes the analysis result.

Patent Document 1: JP 2021-22323 A

A task includes elemental tasks with various attributes. Specifically, a task includes elemental tasks that take a short time to perform, elemental tasks that take a long time to perform, elemental tasks that occur infrequently, and elemental tasks that occur frequently.
When the technology of Patent Document 1 is applied to a task that includes elemental tasks with such varied attributes, the following issues arise.

Issue (1)
In the technology of Patent Document 1, position data, which is time-series data divided into short intervals, is analyzed by a neural network that handles time-series data.
For a task that includes multiple elemental tasks with varying execution times and/or occurrence frequencies, when a neural network is trained on time-series data divided into short intervals and the amount of time-series data for each category is large, the training needed to generate a model that estimates elemental tasks becomes difficult.
A category is a label indicating the type of elemental task that the worker is performing, or a label indicating a time period in which the worker is performing no elemental task.
Issue (2)
Even if a new elemental task category is created to improve the balance in the amount of time-series data between categories, additional data must be collected for the new category.
Issue (3)
Even if the time-series data of an elemental task that takes a long time to perform is divided into multiple short pieces of time-series data, each piece represents only part of the elemental task. The variability of the time-series data therefore increases even within the same category, and the training needed to generate a model that estimates elemental tasks becomes difficult.
Issue (4)
Because of issues (1) to (3) above, when a task includes multiple elemental tasks whose attributes are not uniform, for example elemental tasks with different execution times and/or occurrence frequencies, the elemental task performed by the worker cannot be estimated accurately.

One of the main objectives of the present disclosure is to solve the issues described above. More specifically, the main objective of the present disclosure is to enable accurate estimation of the elemental task performed by a worker even when the task includes multiple elemental tasks whose attributes are not uniform.

The estimation device according to the present disclosure includes:
a time division unit that divides a work engagement time, which is the time during which a worker is engaged in a task that includes a plurality of elemental tasks whose attributes are not uniform, into a work performance time period, during which the worker is performing one of the plurality of elemental tasks, and a non-work time period, during which the worker is performing none of the plurality of elemental tasks, using at least one of a result of a displacement amount determination that determines an amount of displacement of the worker's hand during the work engagement time, a result of a tool gripping state determination that determines a state in which the worker's hand grips a tool during the work engagement time, and a result of an appearance state determination that determines an appearance state of the worker's hand in video captured during the work engagement time; and
an estimation unit that estimates the elemental task that the worker is performing in the work performance time period, based on a partial video that is the portion of the video captured during the work performance time period.

According to the present disclosure, the elemental task performed by a worker can be estimated accurately even when the task includes multiple elemental tasks whose attributes are not uniform.

FIG. 1 is a diagram showing an overview of an example functional configuration of the estimation device according to Embodiment 1.
FIG. 2 is a diagram showing an example of time division processing that uses only the result of the displacement amount determination according to Embodiment 1.
FIG. 3 is a diagram showing an example of time division processing that uses only the result of the tool gripping state determination according to Embodiment 1.
FIG. 4 is a diagram showing an example of time division processing that uses only the result of the appearance state determination according to Embodiment 1.
FIG. 5 is a diagram showing an example of time division processing that uses the three determination results according to Embodiment 1.
FIG. 6 is a diagram showing examples of tool movement in a non-work time period and in a work performance time period according to Embodiment 1.
FIG. 7 is a diagram showing an example of the detailed functional configuration of the estimation device according to Embodiment 1.
FIG. 8 is a diagram showing an example of the internal configuration of the work performance time period detection unit and the elemental task estimation unit according to Embodiment 1.
FIG. 9 is a flowchart showing an example of the operation of the estimation device according to Embodiment 1.
FIG. 10 is a flowchart showing an example of the operation of the work performance time period detection unit according to Embodiment 1.
FIG. 11 is a flowchart showing an example of the operation of the elemental task estimation unit according to Embodiment 1.
FIG. 12 is a diagram showing an example of the functional configuration of the learning device according to Embodiment 2.
FIG. 13 is a diagram showing an example of the internal configuration of the elemental task estimation model generation unit according to Embodiment 2.
FIG. 14 is a flowchart showing an example of the operation of the learning device according to Embodiment 2.
FIG. 15 is a diagram showing an example of the functional configuration of the learning device according to Embodiment 3.
FIG. 16 is a diagram showing an example of the internal configuration of the work performance time period detection model generation unit according to Embodiment 3.
FIG. 17 is a flowchart showing an example of the operation of the learning device according to Embodiment 3.
FIG. 18 is a diagram showing an example of the functional configuration of the estimation device according to Embodiment 4.
FIG. 19 is a diagram showing an example of the functional configuration of the learning device according to Embodiment 5.
FIG. 20 is a diagram showing an overview of the imaging (image conversion) processing according to Embodiment 1.
FIG. 21 is a diagram showing an overview of the work performance time period detection processing, the elemental task estimation processing, and the estimation result processing according to Embodiment 1.
FIG. 22 is a diagram showing an example of the hardware configuration of the estimation device according to Embodiments 1 and 4.
FIG. 23 is a diagram showing an example of the hardware configuration of the learning device according to Embodiments 2, 3, and 5.

The embodiments are described below with reference to the drawings. In the following description of the embodiments and in the drawings, parts given the same reference numerals are the same or corresponding parts.

Embodiment 1.
This disclosure describes a method for accurately estimating the elemental task being performed by a worker.
A "worker" is a person (human worker) or a robot (work robot) engaged in work. A "worker's hand" includes the hand of a person (human worker) and a component of a robot (work robot) that has the same function and/or role as a human hand. A "worker's hand" also includes at least one of the palm, back of the hand, wrist, and fingers of a person (human worker), and a component of a robot (work robot) that corresponds to any of the palm, back of the hand, wrist, and fingers.
The following describes an example in which a person (human worker) is engaged in machine adjustment work.

***Configuration Description***
FIG. 1 shows an overview of an example functional configuration of the estimation device 100 according to the present embodiment. The estimation device 100 operates in the estimation phase.
The estimation device 100 is connected to an imaging device 110.
The imaging device 110 captures video of the worker's hands during the work engagement time. The work engagement time is the time during which the worker is engaged in work.
The imaging device 110 outputs the captured video of the worker's hands to the estimation device 100.
The imaging device 110 is, for example, worn on the worker's head and captures first-person video. In the present embodiment, the imaging device 110 is assumed to be worn on the worker's head. However, the imaging device 110 need not be worn on the worker's head as long as it can capture video of the worker's hands.
The imaging device 110 is not limited to a device that captures ordinary color video, and may include a sensor with another modality, such as a depth sensor.
A depth sensor can capture a wider field of view. When the imaging device 110 includes a depth sensor, the estimation device 100 may treat the hand as having appeared whenever the hand is within the imaging range of the depth sensor, even if the hand is not visible in the color video.

**Description of a Simplified Configuration Example of the Estimation Device 100**
The estimation device 100 includes a time division unit 11 and an estimation unit 12.
As described later, the estimation device 100 has, in detail, the functional configuration shown in FIGS. 7 and 8. FIG. 1 shows a simplified functional configuration of the estimation device 100 for ease of understanding.
Before the detailed configuration of the estimation device 100 shown in FIGS. 7 and 8 is described, the simplified functional configuration of the estimation device 100 in FIG. 1 is described.

**Description of the Time Division Unit 11**
The time division unit 11 divides the work engagement time into work performance time periods and non-work time periods, using at least one of the result of the displacement amount determination, the result of the tool gripping state determination, and the result of the appearance state determination.
The displacement amount determination is a determination process that determines the amount of displacement of the worker's hand.
The tool gripping state determination is a determination process that determines the state in which the worker's hand grips a tool. The tool is a tool used in machine adjustment work.
The appearance state determination is a determination process that determines the appearance state of the worker's hand in the video captured during the work engagement time.
A work performance time period is a time period during which the worker is performing one of the plurality of elemental tasks.
A non-work time period is a time period during which the worker is performing none of the plurality of elemental tasks.
The processing performed by the time division unit 11 corresponds to time division processing.

**Description of the Estimation Unit 12**
The estimation unit 12 estimates the elemental task that the worker is performing in a work performance time period, based on a partial video, which is the portion of the video captured during the work engagement time that falls within that work performance time period.
Specifically, the estimation unit 12 uses a trained model generated by learning to estimate the elemental task that the worker is performing in the work performance time period.

**Description of the Time Division Processing**
FIGS. 2 to 5 show examples of the time division processing performed by the time division unit 11.
FIG. 2 shows an example in which the time division unit 11 divides the work engagement time into work performance time periods and non-work time periods using only the result of the displacement amount determination.
FIG. 3 shows an example in which the time division unit 11 divides the work engagement time into work performance time periods and non-work time periods using only the result of the tool gripping state determination.
FIG. 4 shows an example in which the time division unit 11 divides the work engagement time into work performance time periods and non-work time periods using only the result of the appearance state determination.
FIG. 5 shows an example in which the time division unit 11 divides the work engagement time into work performance time periods and non-work time periods using the result of the displacement amount determination, the result of the tool gripping state determination, and the result of the appearance state determination.

The result of the displacement amount determination in FIG. 2 indicates the amount of displacement of the worker's hand during the work engagement time. FIG. 2 shows the result of the displacement amount determination as a graph.
The time division unit 11 acquires time-series data of the hand joints derived from the video of the worker's hand captured by the imaging device 110. It then determines the amount of hand displacement per unit time from the time-series data of the hand joints.
Although not shown in FIG. 1, the estimation device 100 is assumed to include a mechanism that converts the video of the worker's hand into time-series data of the hand joints.
The time division unit 11 designates a time period in which the amount of hand displacement per unit time is less than a threshold as a work performance time period. Conversely, the time division unit 11 designates a time period in which the amount of hand displacement per unit time is equal to or greater than the threshold as a non-work time period.
As shown in (a) of FIG. 6, in machine adjustment work, when no elemental task is being performed, the hand holding the tool is expected to move toward (or away from) the work target in one quick motion (the same is expected for a hand not holding a tool). A time period in which the hand displacement is large is therefore considered a non-work time period. On the other hand, as shown in (b) of FIG. 6, when an elemental task is being performed, the tool is expected to be moved slowly or within a small range (the same is expected for a hand not holding a tool). A time period in which the hand displacement is small is therefore considered a work performance time period.
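The threshold rule above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the use of a single representative hand point, the Euclidean distance between consecutive samples as the "displacement per unit time", the function names, and the sample values are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]
Segment = Tuple[int, int, str]  # (start step, end step exclusive, label)


def displacement_per_step(positions: List[Point]) -> List[float]:
    """Distance moved between consecutive samples (assumed displacement measure)."""
    return [math.dist(a, b) for a, b in zip(positions, positions[1:])]


def segment_by_displacement(displacement: List[float], threshold: float) -> List[Segment]:
    """Label each step, then merge consecutive steps that share a label."""
    segments: List[Segment] = []
    for i, d in enumerate(displacement):
        # Below the threshold: work performance. At or above it: non-work.
        label = "work" if d < threshold else "non-work"
        if segments and segments[-1][2] == label:
            start, _, _ = segments[-1]
            segments[-1] = (start, i + 1, label)
        else:
            segments.append((i, i + 1, label))
    return segments


# Hypothetical positions of one hand point sampled once per unit time.
positions = [(0.0, 0, 0), (0.1, 0, 0), (0.2, 0, 0), (2.0, 0, 0), (4.0, 0, 0), (4.1, 0, 0)]
segments = segment_by_displacement(displacement_per_step(positions), threshold=1.0)
print(segments)
```

With these sample values, the two large jumps in the middle form one non-work time period, and the slow movements before and after it form two separate work performance time periods.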

The result of the tool gripping state determination in FIG. 3 indicates whether a tool is being gripped.
Although not shown in FIG. 1, the estimation device 100 is assumed to include a mechanism that identifies the tool gripped by the worker's hand.
In the example of FIG. 3, the worker first grips a wrench. Next, the worker grips a screwdriver. After that, the worker grips no tool. Finally, the worker grips a steel ruler.
The time division unit 11 designates a time period in which the worker's hand is gripping a tool as a work performance time period. Conversely, the time division unit 11 designates a time period in which the worker's hand is gripping no tool as a non-work time period.
When the type of tool gripped by the worker's hand changes over time, the time division unit 11 designates the time periods in which the worker's hand grips different types of tools as different work performance time periods. In the example of FIG. 3, the time division unit 11 designates the time period in which the worker's hand grips the wrench and the time period in which the worker's hand grips the screwdriver as different work performance time periods.
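As a rough sketch of this rule, the following Python code groups consecutive time steps by the identified tool, so that a change of tool starts a new work performance time period. The per-step tool labels, the use of `None` for "no tool", and the label format are assumptions for illustration; the patent does not specify how the tool identification result is represented.

```python
from itertools import groupby
from typing import List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start step, end step exclusive, label)


def segment_by_tool(tools: List[Optional[str]]) -> List[Segment]:
    """Split the timeline wherever the gripped tool changes.

    None means that no tool is gripped at that step (assumed encoding).
    """
    segments: List[Segment] = []
    index = 0
    for tool, run in groupby(tools):
        length = len(list(run))
        # No tool: non-work. Each distinct tool gets its own work period.
        label = "non-work" if tool is None else f"work:{tool}"
        segments.append((index, index + length, label))
        index += length
    return segments


# Sequence modeled on FIG. 3: wrench, screwdriver, no tool, steel ruler.
tool_segments = segment_by_tool(["wrench", "wrench", "screwdriver", None, "steel ruler"])
print(tool_segments)
```

Keeping the tool name in the label makes it visible that the wrench period and the screwdriver period are adjacent but distinct work performance time periods.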

The result of the appearance state determination in FIG. 4 indicates whether the worker's hand appears in the video from the imaging device 110.
The time division unit 11 can determine whether the worker's hand appears by analyzing the video from the imaging device 110 or by analyzing the time-series data of the hand joints.
In the example of FIG. 4, the worker's hand first appears in the video. Next, the worker's hand no longer appears in the video. After that, the worker's hand appears in the video again.
The time division unit 11 designates a time period in which the worker's hand appears in the video as a work performance time period. Conversely, the time division unit 11 designates a time period in which the worker's hand does not appear in the video as a non-work time period.

FIG. 5 shows an example in which the time division unit 11 divides the work engagement time into work performance time periods and non-work time periods using the three determination results.
In the example of FIG. 5, if the displacement of the worker's hand is equal to or greater than the threshold, the time division unit 11 designates the time period as a non-work time period, regardless of whether the worker's hand is gripping a tool and regardless of whether the worker's hand appears in the video.
When the displacement of the worker's hand is less than the threshold and the worker's hand appears in the video, the time division unit 11 designates the time period as a work performance time period, regardless of whether the worker's hand is gripping a tool. On the other hand, even when the displacement of the worker's hand is less than the threshold, if the worker's hand does not appear in the video, the time division unit 11 designates the time period as a non-work time period, regardless of whether the worker's hand is gripping a tool.
When two or more of the results of the displacement amount determination, the tool gripping state determination, and the appearance state determination are used in this way, the user of the estimation device 100 defines in advance the application criteria that specify the priority order in which the two or more determination results are applied.
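The priority order used in the FIG. 5 example can be written as a small decision function. This Python sketch encodes only that one example; the function name, argument names, and threshold value are hypothetical, and a user could define different application criteria.

```python
def classify_step(displacement: float, hand_visible: bool,
                  holding_tool: bool, threshold: float) -> str:
    """Apply the FIG. 5 example criteria to one time step.

    holding_tool is accepted for completeness but deliberately ignored,
    because in the FIG. 5 example the tool gripping result never changes
    the outcome.
    """
    # 1) Large displacement always means non-work.
    if displacement >= threshold:
        return "non-work"
    # 2) Small displacement counts as work only if the hand is visible.
    if hand_visible:
        return "work"
    return "non-work"


labels = [
    classify_step(1.5, True, True, threshold=1.0),    # large displacement
    classify_step(0.2, True, False, threshold=1.0),   # small, hand visible
    classify_step(0.2, False, True, threshold=1.0),   # small, hand not visible
]
print(labels)
```

Expressing the criteria as an ordered sequence of checks makes the priority explicit: the displacement result is consulted first, then the appearance result.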

The estimation unit 12 applies the time-series data of the hand joints derived from the partial video captured during a work performance time period to the trained model, and estimates the elemental task that the worker is performing in that work performance time period.
The trained model used by the estimation unit 12 is generated by a learning device 200, which is described later.

In this way, the estimation device 100 according to the present embodiment performs two-stage processing: the time division unit 11 detects the work performance time periods, and the estimation unit 12 estimates the elemental task using the partial video.
The estimation device 100 according to the present embodiment can therefore accurately estimate the elemental task performed by the worker.
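The two-stage flow can be sketched as follows in Python. The sketch assumes that the first stage has already produced labeled segments and that the trained model can be treated as a callable taking the joint time-series slice of one work performance time period; the `toy_model` stand-in and all names are hypothetical and do not represent the model generated by the learning device 200.

```python
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[int, int, str]  # (start step, end step exclusive, label)


def estimate_elemental_tasks(
    joint_series: Sequence[float],
    segments: List[Segment],
    model: Callable[[Sequence[float]], str],
) -> List[Tuple[int, int, str]]:
    """Stage 2: run the model only on work performance time periods."""
    results = []
    for start, end, label in segments:
        if label != "work":
            continue  # non-work periods are never passed to the model
        results.append((start, end, model(joint_series[start:end])))
    return results


def toy_model(window: Sequence[float]) -> str:
    """Stand-in for a trained model (assumption, for illustration only)."""
    return "long task" if len(window) >= 3 else "short task"


series = [0.1, 0.2, 0.1, 0.9, 0.8, 0.2, 0.1]
stage1 = [(0, 3, "work"), (3, 5, "non-work"), (5, 7, "work")]
estimates = estimate_elemental_tasks(series, stage1, toy_model)
print(estimates)
```

Because the model receives each work performance time period as a whole, its input covers a complete elemental task rather than a fixed short window.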

**Description of an Example Hardware Configuration of the Estimation Device 100**
The estimation device 100 according to the present embodiment is a computer having the hardware configuration illustrated in FIG. 22.
As shown in FIG. 22, the estimation device 100 includes, as hardware, a processor 801, a main storage device 802, an auxiliary storage device 803, and a communication device 804.
The functions of the time division unit 11 and the estimation unit 12 shown in FIG. 1, and of the components shown in FIGS. 7 and 8, are realized by, for example, programs.
The auxiliary storage device 803 stores the programs that realize the functions of these components.
These programs are loaded from the auxiliary storage device 803 into the main storage device 802. The processor 801 then executes these programs to perform the operations of these components.

The operating procedure of the estimation device 100 corresponds to an estimation method. The program that realizes the operation of the estimation device 100 corresponds to an estimation program.

**Description of a Detailed Configuration Example of the Estimation Device 100**
FIG. 7 shows an example of the detailed functional configuration of the estimation device 100 shown in FIG. 1.
In FIG. 7, the estimation device 100 includes a joint position time-series data acquisition unit 120, a joint velocity calculation unit 121, a joint time-series data imaging unit 122, a work performance time period detection unit 123, an elemental task estimation unit 124, an estimation result processing unit 125, an elemental task estimation model storage unit 130, a preprocessing statistics storage unit 131, and an estimation result storage unit 132.
The work performance time period detection unit 123 corresponds to the time division unit 11 shown in FIG. 1. The elemental task estimation unit 124 corresponds to the estimation unit 12 shown in FIG. 1.

**Description of the joint position time-series data acquisition unit 120**
The joint position time-series data acquisition unit 120 acquires, from the imaging device 110, a video V obtained by imaging the worker's hand. The video V is captured during the work engagement time in the estimation phase. The video V includes frames in which the worker's hand appears and frames in which it does not.
The joint position time-series data acquisition unit 120 generates joint position time-series data HPT from the video V. The joint position time-series data HPT indicates the spatial position coordinates of each joint of the hand.
The joint position time-series data acquisition unit 120 generates the joint position time-series data HPT by using, for example, an existing hand tracking model that estimates hand joint positions.
The joint position time-series data acquisition unit 120 generates joint position time-series data HPT for the right hand and joint position time-series data HPT for the left hand. Hereinafter, a plain reference to the joint position time-series data HPT covers both the right hand data and the left hand data.
When there is a period in which no hand appears in the video V, the joint position time-series data acquisition unit 120 inserts into the joint position time-series data HPT a value from which the absence of the hand can be determined.
The joint position time-series data acquisition unit 120 then outputs the joint position time-series data HPT to the joint velocity calculation unit 121 and the joint time-series data imaging unit 122. It may also output the joint position time-series data HPT to the work performance time zone detection unit 123.
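The handling of frames without a hand can be illustrated with a minimal sketch. The description only requires "a value that enables determination that no hand appears"; the NaN sentinel, the function names `build_hpt` and `hand_present`, and the nested-list layout are assumptions made for illustration, not details taken from the description.

```python
import math

# Hypothetical sentinel: the description only asks for a value that lets
# later stages tell that the hand is absent. NaN is one possible choice.
MISSING = (math.nan, math.nan, math.nan)


def build_hpt(frames, num_joints):
    """Build joint position time-series data for one hand.

    frames: per-frame tracker output, either a list of (x, y, z) tuples
    (one per joint) or None when no hand was detected in that frame.
    """
    hpt = []
    for joints in frames:
        if joints is None:
            # No hand in this frame: fill every joint with the sentinel.
            hpt.append([MISSING] * num_joints)
        else:
            hpt.append([tuple(p) for p in joints])
    return hpt


def hand_present(frame_joints):
    """Return True when the frame holds real coordinates, not the sentinel."""
    return not any(math.isnan(v) for p in frame_joints for v in p)
```

A real hand tracking model would supply `frames`; here it is simply a list prepared by the caller.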

**Description of the joint velocity calculation unit 121**
The joint velocity calculation unit 121 acquires the joint position time-series data HPT from the joint position time-series data acquisition unit 120.
The joint velocity calculation unit 121 then calculates, for each joint in the joint position time-series data HPT, the difference in the time direction (the velocity). It outputs joint velocity time-series data HVT indicating the calculation result to the joint time-series data imaging unit 122. The joint velocity calculation unit 121 may also output the joint velocity time-series data HVT to the work performance time zone detection unit 123.
The joint velocity calculation unit 121 generates joint velocity time-series data HVT for the right hand and joint velocity time-series data HVT for the left hand. Hereinafter, a plain reference to the joint velocity time-series data HVT covers both the right hand data and the left hand data.
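The time-direction difference can be sketched as a frame-to-frame subtraction. The function name `joint_velocities`, the frame interval parameter `dt`, and the nested-list layout are assumptions for illustration; the description does not specify how the difference is scaled.

```python
def joint_velocities(hpt, dt=1.0):
    """Frame-to-frame difference of each joint coordinate, divided by dt.

    hpt: list over time of lists over joints of (x, y, z) tuples.
    Returns a list with one element fewer than hpt.
    """
    hvt = []
    for prev, curr in zip(hpt, hpt[1:]):
        hvt.append([
            tuple((c - p) / dt for p, c in zip(prev_joint, curr_joint))
            for prev_joint, curr_joint in zip(prev, curr)
        ])
    return hvt
```

If a frame holds a NaN marker for an absent hand, the differences involving that frame are also NaN, so the absence stays detectable in the velocity data.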

**Description of the joint time-series data imaging unit 122**
The joint time-series data imaging unit 122 acquires the joint position time-series data HPT from the joint position time-series data acquisition unit 120. It also acquires the joint velocity time-series data HVT from the joint velocity calculation unit 121.
The joint time-series data imaging unit 122 converts the joint position time-series data HPT into images to generate a left hand joint position image LJPI and a right hand joint position image RJPI. It likewise converts the joint velocity time-series data HVT into images to generate a left hand joint velocity image LJVI and a right hand joint velocity image RJVI.
The joint time-series data imaging unit 122 then outputs the left hand joint position image LJPI, the right hand joint position image RJPI, the left hand joint velocity image LJVI, and the right hand joint velocity image RJVI to the work performance time zone detection unit 123.

The imaging of the joint position time-series data HPT and of the joint velocity time-series data HVT by the joint time-series data imaging unit 122 is described here.
FIG. 20 shows an overview of the imaging process of the joint time-series data imaging unit 122. The imaging process for the joint position time-series data HPT is described here. Imaging means generating time-series data of the three-dimensional joint coordinates in which the x, y, and z values of each coordinate are treated as the R, G, and B values of an image.
In FIG. 20, the joint position time-series data HPT at each time t is assumed to include the spatial position coordinates of J (J≧2) joints. The coordinate axes C are the x-axis, the y-axis, and the z-axis.

For example, the joint time-series data imaging unit 122 generates a tensor whose elements are the coordinate values c on each axis for each joint j of the J joints in the left hand joint position time-series data HPT. It then concatenates the tensors in the time direction to generate the left hand joint position image LJPI.
The joint time-series data imaging unit 122 generates the right hand joint position image RJPI by the same procedure.
Further, the joint time-series data imaging unit 122 generates a left hand joint velocity image LJVI from the left hand joint velocity time-series data HVT in a similar procedure.
Further, the joint time-series data imaging unit 122 generates a right hand joint velocity image RJVI from the right hand joint velocity time-series data HVT in a similar procedure.
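The imaging procedure can be sketched as arranging the per-time tensors side by side along a time axis. The joint-by-time-by-axis layout (rows are joints, columns are times, channels are x, y, z read as R, G, B) and the function name `to_joint_image` are assumptions; the description does not fix the order of the axes.

```python
def to_joint_image(series):
    """Arrange time-series joint data as a J x T x 3 nested list.

    series: list over time t of lists over joints j of (x, y, z) tuples.
    Row index = joint, column index = time, and each cell holds the
    (x, y, z) values treated as the (R, G, B) channels of a pixel.
    """
    if not series:
        return []
    num_joints = len(series[0])
    return [
        [series[t][j] for t in range(len(series))]
        for j in range(num_joints)
    ]
```

The same function applies unchanged to velocity data, since the joint velocity time-series data HVT has the same per-joint, per-axis structure.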

**Description of the gripping tool time-series data generation unit 126**
The gripping tool time-series data generation unit 126 acquires the video V or sensor information RRS.
The video V is the video captured by the imaging device 110. The gripping tool time-series data generation unit 126 may acquire the video V from the joint position time-series data acquisition unit 120, or may acquire it from the imaging device 110 independently of the joint position time-series data acquisition unit 120.
The sensor information RRS indicates whether the worker is gripping a tool. When the worker is gripping a tool, the sensor information RRS also indicates the type of that tool.
For example, when a tag-type wireless communication sensor such as an RFID sensor is attached to the worker's hand, the sensor receives a signal transmitted from the tool gripped by the worker. The sensor then determines the type of the gripped tool based on the received signal. When the sensor receives no signal, it determines that the worker is not gripping a tool. The sensor outputs the determination result to the gripping tool time-series data generation unit 126 as the sensor information RRS.

The gripping tool time-series data generation unit 126 uses the video V or the sensor information RRS to identify whether the worker's hand is gripping a tool. When the hand is gripping a tool, the gripping tool time-series data generation unit 126 also identifies the type of that tool.
When it acquires the video V, the gripping tool time-series data generation unit 126 identifies, from each frame of the video V, whether the worker is gripping a tool and the type of the tool, for example by using a known object detection model.
When it acquires the sensor information RRS, the gripping tool time-series data generation unit 126 analyzes the sensor information RRS to identify whether the worker is gripping a tool and the type of the tool.

The gripping tool time-series data generation unit 126 outputs left hand gripped tool data LTO1 and right hand gripped tool data RTO1, which indicate the identification results, to the work performance time zone detection unit 123.
The left hand gripped tool data LTO1 indicates, in time series, whether the worker's left hand is gripping a tool and the type of the gripped tool. Analyzing the left hand gripped tool data LTO1 therefore makes it possible to identify the time zones in which the left hand is gripping a tool and the time zones in which it is not. It also makes it possible to identify, in time series, the type of tool gripped by the left hand during the time zones in which the left hand is gripping a tool.
Similarly, the right hand gripped tool data RTO1 indicates, in time series, whether the worker's right hand is gripping a tool and the type of the gripped tool. Analyzing the right hand gripped tool data RTO1 therefore makes it possible to identify, in time series, the time zones in which the right hand is gripping a tool and the time zones in which it is not. It also makes it possible to identify the type of tool gripped by the right hand during the time zones in which the right hand is gripping a tool.

**Description of the work performance time zone detection unit 123**
The work performance time zone detection unit 123 acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 122.
The work performance time zone detection unit 123 further acquires the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1 from the gripping tool time-series data generation unit 126.

The work performance time zone detection unit 123 analyzes the left hand joint position image LJPI and/or the left hand joint velocity image LJVI. It then generates a left hand appearance time zone set LS, which is the set of time zones in which the left hand appears in the video.
Similarly, the work performance time zone detection unit 123 analyzes the right hand joint position image RJPI and/or the right hand joint velocity image RJVI. It then generates a right hand appearance time zone set RS, which is the set of time zones in which the right hand appears in the video.
In this manner, the work performance time zone detection unit 123 performs the appearance state determination shown in FIG. 3.
When the work performance time zone detection unit 123 has acquired the joint position time-series data HPT from the joint position time-series data acquisition unit 120, it may perform the appearance state determination using the joint position time-series data HPT. Likewise, when it has acquired the joint velocity time-series data HVT from the joint velocity calculation unit 121, it may perform the appearance state determination using the joint velocity time-series data HVT.
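An appearance time zone set such as LS or RS could be derived by scanning the time-series data for the frames in which the hand is marked as absent. The sketch below assumes that absence is encoded as NaN coordinates and that time zones are expressed as frame-index ranges; both choices, and the function name `appearance_zones`, are illustrative assumptions.

```python
import math


def appearance_zones(hpt):
    """Return [start, end) frame-index ranges in which the hand appears.

    hpt: list over time of lists over joints of (x, y, z) tuples, where
    a frame without the hand is assumed to contain NaN coordinates.
    """
    zones = []
    start = None
    for t, joints in enumerate(hpt):
        present = not any(math.isnan(v) for p in joints for v in p)
        if present and start is None:
            start = t  # the hand has just come into view
        elif not present and start is not None:
            zones.append((start, t))  # the hand has just left the view
            start = None
    if start is not None:
        zones.append((start, len(hpt)))  # hand still visible at the end
    return zones
```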

Based on the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1, the work performance time zone detection unit 123 also extracts, in time series and for each of the left hand and the right hand, the time zones in which a tool is being gripped and the time zones in which no tool is being gripped. When the type of tool gripped by the worker changes, the work performance time zone detection unit 123 extracts a separate time zone for each type of tool.
Hereinafter, the extraction result of the work performance time zone detection unit 123 is referred to as the tool gripping state determination result TS.
The tool gripping state determination result TS indicates, in time series, the time zones in which the worker is gripping a tool and the time zones in which the worker is not gripping a tool. It also associates each time zone in which the worker is gripping a tool with the type of tool gripped in that time zone.
In this manner, the work performance time zone detection unit 123 performs the tool gripping state determination shown in FIG. 4.
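The extraction of gripping and non-gripping time zones, with a new time zone whenever the tool type changes, amounts to a run-length encoding of per-frame labels. In this sketch the per-frame label list, the use of `None` for "no tool", the frame rate parameter `fps`, and the function name `tool_segments` are all assumptions for illustration.

```python
from itertools import groupby


def tool_segments(labels, fps=1.0):
    """Run-length encode per-frame tool labels into time zones.

    labels: per-frame tool type for one hand, or None when no tool is held.
    Returns (start_time, end_time, tool) tuples in chronological order;
    end_time is exclusive.
    """
    segments = []
    index = 0
    for tool, run in groupby(labels):
        length = len(list(run))
        segments.append((index / fps, (index + length) / fps, tool))
        index += length
    return segments
```

Each returned tuple pairs a time zone with the tool type gripped in it, which mirrors the association held in the tool gripping state determination result TS.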

The work performance time zone detection unit 123 also analyzes the left hand joint position image LJPI and/or the left hand joint velocity image LJVI to calculate the amount of displacement of the left hand per unit time.
Similarly, the work performance time zone detection unit 123 analyzes the right hand joint position image RJPI and/or the right hand joint velocity image RJVI to calculate the amount of displacement of the right hand per unit time.
In this manner, the work performance time zone detection unit 123 performs the displacement amount determination shown in FIG. 2.
When the work performance time zone detection unit 123 has acquired the joint position time-series data HPT from the joint position time-series data acquisition unit 120, it may perform the displacement amount determination using the joint position time-series data HPT. Likewise, when it has acquired the joint velocity time-series data HVT from the joint velocity calculation unit 121, it may perform the displacement amount determination using the joint velocity time-series data HVT.
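One possible way to compute a displacement amount per unit time from joint positions is shown below. The description does not say how the displacement of a hand is aggregated over its joints; averaging the Euclidean distance moved by each joint, the frame interval `dt`, and the function name `displacement_per_frame` are assumptions made for this sketch.

```python
import math


def displacement_per_frame(hpt, dt=1.0):
    """Mean Euclidean distance moved by the joints between frames, per dt.

    hpt: list over time of lists over joints of (x, y, z) tuples.
    Returns one value per pair of consecutive frames.
    """
    result = []
    for prev, curr in zip(hpt, hpt[1:]):
        # Distance travelled by each joint between the two frames.
        dists = [math.dist(p, c) for p, c in zip(prev, curr)]
        result.append(sum(dists) / len(dists) / dt)
    return result
```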

The work performance time zone detection unit 123 divides the work engagement time into work performance time zones and non-work time zones based on at least one of the determination results.
Specifically, the work performance time zone detection unit 123 identifies the start time and end time of each work performance time zone and the start time and end time of each non-work time zone, and thereby divides the work engagement time into work performance time zones and non-work time zones.
When the work performance time zone detection unit 123 divides the work engagement time using only the result of the appearance state determination, it need not perform the tool gripping state determination or the displacement amount determination.
Similarly, when it divides the work engagement time using only the result of the tool gripping state determination, it need not perform the appearance state determination or the displacement amount determination.
Similarly, when it divides the work engagement time using only the result of the displacement amount determination, it need not perform the appearance state determination or the tool gripping state determination.

The work performance time zone detection unit 123 outputs a work performance time zone set FS, which is the set of work performance time zones, to the element work estimation unit 124. The work performance time zone set FS indicates the start time and end time of each work performance time zone.
The work performance time zone detection unit 123 also outputs a non-work time zone set NFS, which is the set of non-work time zones, to the estimation result processing unit 125. The non-work time zone set NFS indicates the start time and end time of each non-work time zone.
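The division into the two sets can be sketched as follows, assuming that one of the determinations has already been reduced to a per-frame flag saying whether the worker is performing an element work. How a determination result maps to that flag is not covered here; the flag list, the frame rate parameter `fps`, and the function name `split_engagement_time` are hypothetical.

```python
def split_engagement_time(working, fps=1.0):
    """Split per-frame work flags into work and non-work time zones.

    working: list of booleans, one per frame of the work engagement time.
    Returns (fs, nfs): two lists of (start_time, end_time) tuples, with
    end_time exclusive, in chronological order.
    """
    fs, nfs = [], []
    start = 0
    for t in range(1, len(working) + 1):
        # Close the current zone at the end of the data or on a flag change.
        if t == len(working) or working[t] != working[start]:
            zone = (start / fps, t / fps)
            (fs if working[start] else nfs).append(zone)
            start = t
    return fs, nfs
```

Here `fs` plays the role of the work performance time zone set FS and `nfs` that of the non-work time zone set NFS, each carrying a start time and an end time per zone.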

**Description of the element work estimation unit 124**
The element work estimation unit 124 acquires the work performance time zone set FS from the work performance time zone detection unit 123.
The element work estimation unit 124 also acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 122.
The element work estimation unit 124 further acquires an element work estimation model M from the element work estimation model storage unit 130. The element work estimation model M corresponds to the trained model shown in FIG. 1.
Furthermore, the element work estimation unit 124 acquires preprocessing statistics from the estimation result storage unit 132. Specifically, the element work estimation unit 124 acquires, as the preprocessing statistics, a joint position luminance average value PM, a joint position luminance standard deviation PS, a joint velocity luminance average value VM, and a joint velocity luminance standard deviation VS.

The joint position luminance average value PM is the average luminance of the joint position images for learning, obtained by the learning device 200 described later.
The joint position luminance standard deviation PS is the standard deviation of the luminance of the joint position images for learning, obtained by the learning device 200.
Here, luminance means the coordinate value for each coordinate axis C contained in a joint position image.
The joint position images for learning are the joint position images used in the learning phase, and correspond to the left hand joint position image LJPI and the right hand joint position image RJPI described above.

Strictly speaking, the joint position luminance average value PM includes a left hand joint position luminance average value LPM and a right hand joint position luminance average value RPM. The left hand joint position luminance average value LPM is the average value obtained from the left hand joint position label image sLJPI, which is the left hand joint position image for learning. The right hand joint position luminance average value RPM is the average value obtained from the right hand joint position label image sRJPI, which is the right hand joint position image for learning.
Moreover, the left hand joint position luminance average value LPM is obtained for each coordinate axis C. Therefore, strictly speaking, the left hand joint position luminance average value LPM is expressed as the left hand joint position luminance average value LPM(C).
Similarly, the right hand joint position luminance average value RPM is also found for each coordinate axis C. Therefore, strictly speaking, the right hand joint position luminance average value RPM is expressed as the right hand joint position luminance average value RPM(C).
Unless strict notation is required, the left hand joint position luminance average value LPM and the right hand joint position luminance average value RPM are collectively referred to as the joint position luminance average value PM.
Note that, instead of using the average luminance value for the left hand and the average luminance value for the right hand as described above, the average luminance value for both the left hand and the right hand may be used. In the following, an example in which the average luminance value for the left hand and the average luminance value for the right hand are used will be described, but the following description also applies to the case in which the average luminance value for both the left hand and the right hand is used.

Strictly speaking, the joint position luminance standard deviation PS also includes a left hand joint position luminance standard deviation LPS and a right hand joint position luminance standard deviation RPS. The left hand joint position luminance standard deviation LPS is the standard deviation obtained from the left hand joint position label image sLJPI, which is the left hand joint position image for learning. The right hand joint position luminance standard deviation RPS is the standard deviation obtained from the right hand joint position label image sRJPI, which is the right hand joint position image for learning.
In addition, the left hand joint position luminance standard deviation LPS is also found for each coordinate axis C. Therefore, in a strict sense, the left hand joint position luminance standard deviation LPS is expressed as the left hand joint position luminance standard deviation LPS(C).
Similarly, the right hand joint position luminance standard deviation RPS is also found for each coordinate axis C. Therefore, in a strict sense, the right hand joint position luminance standard deviation RPS is expressed as the right hand joint position luminance standard deviation RPS(C).
Unless a strict notation is required, the left hand joint position luminance standard deviation LPS and the right hand joint position luminance standard deviation RPS will be collectively referred to as the joint position luminance standard deviation PS.
Note that, instead of using the luminance standard deviation for the left hand and the luminance standard deviation for the right hand as described above, the luminance standard deviation over both the left hand and the right hand may be used. In the following, an example in which the luminance standard deviation for the left hand and the luminance standard deviation for the right hand are used will be described, but the following description also applies to the case in which the luminance standard deviation over both the left hand and the right hand is used.
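The per-axis statistics described above can be sketched for one hand as follows. The joint-by-time-by-axis image layout, the use of the population (rather than sample) standard deviation, and the function name `per_axis_stats` are assumptions; the sketch also assumes the images contain no marker values for absent hands.

```python
import math


def per_axis_stats(images):
    """Per-axis mean and population standard deviation over joint images.

    images: iterable of J x T x 3 nested lists (joint, time, axis).
    Returns (means, stds), each a 3-element list ordered (x, y, z).
    """
    sums = [0.0, 0.0, 0.0]
    squares = [0.0, 0.0, 0.0]
    count = 0
    for image in images:
        for row in image:
            for pixel in row:
                for axis, value in enumerate(pixel):
                    sums[axis] += value
                    squares[axis] += value * value
                count += 1
    means = [s / count for s in sums]
    # Clamp at zero to absorb tiny negative values from rounding.
    stds = [
        math.sqrt(max(sq / count - m * m, 0.0))
        for sq, m in zip(squares, means)
    ]
    return means, stds
```

Calling the function on the left hand images alone would give values playing the role of LPM(C) and LPS(C); calling it on left and right hand images together would correspond to the variant in which the statistics span both hands.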

The joint velocity luminance average value VM is the average luminance of the joint velocity images for learning, obtained by the learning device 200.
The joint velocity luminance standard deviation VS is the standard deviation of the luminance of the joint velocity images for learning, obtained by the learning device 200.
The joint velocity images for learning are the joint velocity images used in the learning phase, and correspond to the left hand joint velocity image LJVI and the right hand joint velocity image RJVI described above.

Strictly speaking, the joint velocity luminance average value VM also includes a left hand joint velocity luminance average value LVM and a right hand joint velocity luminance average value RVM. The left hand joint velocity luminance average value LVM is the average value obtained from the left hand joint velocity label image sLJVI, which is the left hand joint velocity image for learning. The right hand joint velocity luminance average value RVM is the average value obtained from the right hand joint velocity label image sRJVI, which is the right hand joint velocity image for learning.
The left hand joint velocity luminance average value LVM is also obtained for each coordinate axis C. Therefore, strictly speaking, it is expressed as the left hand joint velocity luminance average value LVM(C).
Similarly, the right hand joint velocity luminance average value RVM is also obtained for each coordinate axis C. Therefore, strictly speaking, it is expressed as the right hand joint velocity luminance average value RVM(C).
Unless strict notation is required, the left hand joint velocity luminance average value LVM and the right hand joint velocity luminance average value RVM are collectively referred to as the joint velocity luminance average value VM.
Note that, instead of using the average luminance value for the left hand and the average luminance value for the right hand as described above, the average luminance value for both the left hand and the right hand may be used. In the following, an example in which the average luminance value for the left hand and the average luminance value for the right hand are used will be described, but the following description also applies to the case in which the average luminance value for both the left hand and the right hand is used.

Strictly speaking, the joint velocity luminance standard deviation VS also includes a left hand joint velocity luminance standard deviation LVS and a right hand joint velocity luminance standard deviation RVS. The left hand joint velocity luminance standard deviation LVS is the standard deviation obtained from the left hand joint velocity label image sLJVI, which is the left hand joint velocity image for learning. The right hand joint velocity luminance standard deviation RVS is the standard deviation obtained from the right hand joint velocity label image sRJVI, which is the right hand joint velocity image for learning.
The left hand joint velocity luminance standard deviation LVS is also obtained for each coordinate axis C. Therefore, strictly speaking, it is expressed as the left hand joint velocity luminance standard deviation LVS(C).
Similarly, the right hand joint velocity luminance standard deviation RVS is also calculated for each coordinate axis C. Therefore, in a strict sense, the right hand joint velocity luminance standard deviation RVS is expressed as the right hand joint velocity luminance standard deviation RVS(C).
Unless a strict notation is required, the left wrist joint velocity luminance standard deviation LVS and the right wrist joint velocity luminance standard deviation RVS will be collectively referred to as the joint velocity luminance standard deviation VS.
Note that, instead of using the luminance standard deviation for the left hand and the luminance standard deviation for the right hand as described above, the luminance standard deviation over both the left hand and the right hand may be used. In the following, an example in which the luminance standard deviation for the left hand and the luminance standard deviation for the right hand are used will be described, but the following description also applies to the case in which the luminance standard deviation over both the left hand and the right hand is used.

For each work performance time zone included in the work performance time zone set FS, the element work estimation unit 124 extracts, from each of the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI, the partial images that constitute the partial video captured during that work performance time zone. That is, the element work estimation unit 124 extracts, as partial images, the images in the left hand joint position image LJPI that constitute the partial video captured during the work performance time zone. The element work estimation unit 124 extracts the corresponding partial images from the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI in the same way.
In the following, the set of partial images extracted from each of the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI is referred to as the partial image set IS.

Then, for each work time zone, the element work estimation unit 124 resizes each partial image included in the partial image set IS to an image having a fixed width and height.
Furthermore, the element work estimation unit 124 uses the preprocessing statistics to standardize, for each coordinate axis C, the pixel values of each resized partial image. Note that the order of resizing and standardization may be reversed.
Specifically, the element work estimation unit 124 standardizes, for each coordinate axis C, the pixel values of the resized partial image extracted from the left hand joint position image LJPI using the joint position luminance average value PM and the joint position luminance standard deviation PS.
Likewise, the element work estimation unit 124 standardizes, for each coordinate axis C, the pixel values of the resized partial image extracted from the right hand joint position image RJPI using the joint position luminance average value PM and the joint position luminance standard deviation PS.
More specifically, the element work estimation unit 124 standardizes the pixel values of the resized partial image extracted from the left hand joint position image LJPI using the left hand joint position luminance average value LPM(C) and the left hand joint position luminance standard deviation LPS(C).
Likewise, the element work estimation unit 124 standardizes the pixel values of the resized partial image extracted from the right hand joint position image RJPI using the right hand joint position luminance average value RPM(C) and the right hand joint position luminance standard deviation RPS(C).

Furthermore, the element work estimation unit 124 standardizes, for each coordinate axis C, the pixel values of the resized partial image extracted from the left hand joint velocity image LJVI using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS.
Likewise, the element work estimation unit 124 standardizes, for each coordinate axis C, the pixel values of the resized partial image extracted from the right hand joint velocity image RJVI using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS.
More specifically, the element work estimation unit 124 standardizes the pixel values of the resized partial image extracted from the left hand joint velocity image LJVI using the left hand joint velocity luminance average value LVM(C) and the left hand joint velocity luminance standard deviation LVS(C).
Likewise, the element work estimation unit 124 standardizes the pixel values of the resized partial image extracted from the right hand joint velocity image RJVI using the right hand joint velocity luminance average value RVM(C) and the right hand joint velocity luminance standard deviation RVS(C).
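The per-axis standardization described above can be read as a z-score: each pixel value x on coordinate axis C becomes (x - mean(C)) / std(C). The following is a minimal Python sketch of that reading. The function name, the nested-list image layout indexed [row][column][axis], and all numeric values are illustrative assumptions and do not come from the specification.

```python
from typing import List, Sequence

Image = List[List[List[float]]]  # [row][column][axis C]; this layout is an assumption


def standardize_per_axis(image: Image,
                         mean: Sequence[float],
                         std: Sequence[float]) -> Image:
    """Return a copy of `image` with each axis C standardized as (x - mean[C]) / std[C]."""
    if any(s == 0 for s in std):
        raise ValueError("standard deviation must be non-zero for every axis")
    return [
        [
            [(value - mean[c]) / std[c] for c, value in enumerate(pixel)]
            for pixel in row
        ]
        for row in image
    ]


# Hypothetical statistics standing in for LPM(C) and LPS(C), one value per axis.
lpm = [100.0, 50.0, 10.0]
lps = [20.0, 10.0, 5.0]

# A 1 x 2 left hand joint position partial image with made-up pixel values.
partial = [[[120.0, 50.0, 0.0], [80.0, 70.0, 15.0]]]
standardized = standardize_per_axis(partial, lpm, lps)
print(standardized)  # [[[1.0, 0.0, -2.0], [-1.0, 2.0, 1.0]]]
```

The same function would be called with the right hand statistics RPM(C) and RPS(C), or with the velocity statistics LVM(C), LVS(C), RVM(C), and RVS(C), for the other three image types.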

Then, the element work estimation unit 124 inputs each resized and standardized partial image to the element work estimation model M.
The element work estimation unit 124 then acquires an estimation result qs from the element work estimation model M. The estimation result qs associates a work time zone with the elemental task that the worker is estimated to be performing during that work time zone.
The element work estimation unit 124 performs the above procedure for all work performance time zones included in the work performance time zone set FS and obtains an estimation result qs for each work time zone.
Then, the element work estimation unit 124 outputs an estimation result set QS, which is the set of the estimation results qs for the respective work time zones, to the estimation result processing unit 125.

**Explanation of the estimation result processing unit 125**
The estimation result processing unit 125 acquires the estimation result set QS from the element work estimation unit 124.
The estimation result processing unit 125 also acquires the non-work time zone set NFS from the work performance time zone detection unit 123.
The estimation result processing unit 125 sorts the estimation results qs included in the estimation result set QS in ascending order of the work performance time zones to which they correspond.
The estimation result processing unit 125 also assigns a label representing a non-work time zone to each of the non-work time zones included in the non-work time zone set NFS. Furthermore, the estimation result processing unit 125 inserts each labeled non-work time zone into the sorted estimation result set QS at the position corresponding to its time.
The estimation result processing unit 125 outputs the labeled non-work time zones and the sorted estimation result set QS to the estimation result storage unit 132 as the final estimation result AS.
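The sorting and insertion steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Slot` record, the `NON_WORK` label text, the use of seconds as the time unit, and the task names are all hypothetical and do not appear in the specification.

```python
from typing import List, NamedTuple, Tuple


class Slot(NamedTuple):
    start: float  # start time in seconds (the unit is an assumption)
    end: float    # end time in seconds
    label: str    # estimated elemental task, or the non-work label


NON_WORK = "non-work"  # hypothetical label text


def build_final_result(qs_set: List[Slot],
                       nfs: List[Tuple[float, float]]) -> List[Slot]:
    """Sort the estimation results by time and insert labeled non-work time zones."""
    sorted_qs = sorted(qs_set, key=lambda s: s.start)
    labeled_nfs = [Slot(start, end, NON_WORK) for start, end in nfs]
    # Merging by start time places each non-work time zone at its position in time.
    return sorted(sorted_qs + labeled_nfs, key=lambda s: s.start)


# Made-up estimation result set QS and non-work time zone set NFS.
qs_set = [Slot(30.0, 45.0, "tighten bolt"), Slot(0.0, 12.0, "inspect panel")]
nfs = [(12.0, 30.0)]
final_as = build_final_result(qs_set, nfs)
print([s.label for s in final_as])  # ['inspect panel', 'non-work', 'tighten bolt']
```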

FIG. 21 shows an overview of the operation of the work performance time zone detection unit 123, the element work estimation unit 124, and the estimation result processing unit 125 described above.

**Explanation of the element work estimation model storage unit 130**
The element work estimation model storage unit 130 stores the element work estimation model M.

**Explanation of the preprocessing statistics storage unit 131**
The preprocessing statistics storage unit 131 stores the preprocessing statistics (the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS).

**Explanation of the estimation result storage unit 132**
The estimation result storage unit 132 stores the final estimation result AS.

**Explanation of an example internal configuration of the work performance time zone detection unit 123 and of the element work estimation unit 124**
FIG. 8 shows an example of the internal configuration of the work performance time zone detection unit 123 and an example of the internal configuration of the element work estimation unit 124.
Both examples are described below with reference to FIG. 8.
Note that, of the functional components of the estimation device 100, FIG. 8 shows only those needed to explain the example internal configurations of the work performance time zone detection unit 123 and the element work estimation unit 124.

**Explanation of an example internal configuration of the work performance time zone detection unit 123**
The work performance time zone detection unit 123 has, as its internal components, an appearance status determination unit 1231, a tool gripping status determination unit 1232, a displacement amount determination unit 1233, and a work performance time zone determination unit 1234.

**Explanation of the appearance status determination unit 1231**
The appearance status determination unit 1231 performs the appearance status determination.
The appearance status determination unit 1231 acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 122.
The appearance status determination unit 1231 then analyzes the left hand joint position image LJPI and/or the left hand joint velocity image LJVI to generate a left hand appearance time zone set LS. Similarly, it analyzes the right hand joint position image RJPI and/or the right hand joint velocity image RJVI to generate a right hand appearance time zone set RS. As described above, the left hand appearance time zone set LS is the set of time zones in which the left hand appears in the video, and the right hand appearance time zone set RS is the set of time zones in which the right hand appears in the video. The left hand appearance time zone set LS and the right hand appearance time zone set RS are the time-series data shown in FIG. 4 as the determination result of the appearance status determination.
The appearance status determination unit 1231 outputs the left hand appearance time zone set LS and the right hand appearance time zone set RS to the work performance time zone determination unit 1234.
As described above, the appearance status determination unit 1231 may instead perform the appearance status determination using the joint position time-series data HPT and/or the joint velocity time-series data HVT.
FIG. 8 omits the input of the joint position time-series data HPT and/or the joint velocity time-series data HVT to the appearance status determination unit 1231.

**Explanation of the tool gripping status determination unit 1232**
The tool gripping status determination unit 1232 performs the tool gripping status determination.
The tool gripping status determination unit 1232 acquires the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1 from the gripped tool time-series data generation unit 126.
Based on the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1, the tool gripping status determination unit 1232 then extracts, in chronological order and for each of the left hand and the right hand, the time zones in which a tool is gripped and the time zones in which no tool is gripped. When the type of tool gripped by the worker changes, the tool gripping status determination unit 1232 treats the time zones in which a tool is gripped as separate time zones for each type of tool.
The tool gripping status determination unit 1232 outputs a left hand tool gripping status determination result LTS, which indicates the extraction result for the left hand, to the work performance time zone determination unit 1234. It also outputs a right hand tool gripping status determination result RTS, which indicates the extraction result for the right hand, to the work performance time zone determination unit 1234. The left hand tool gripping status determination result LTS and the right hand tool gripping status determination result RTS are the time-series data shown in FIG. 4 as the determination result of the tool gripping status determination.
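One way to picture this extraction is as run-length encoding of a per-frame tool label. The sketch below assumes, purely for illustration, that the gripped tool data holds one entry per video frame, with a tool name while a tool is gripped and `None` otherwise. The specification does not define the data format, and the function name and tool names are hypothetical.

```python
from itertools import groupby
from typing import List, Optional, Tuple

Segment = Tuple[int, int, Optional[str]]  # (first frame, last frame, tool or None)


def grip_segments(tool_per_frame: List[Optional[str]]) -> List[Segment]:
    """Run-length encode per-frame tool labels; None means no tool is gripped."""
    segments = []
    frame = 0
    for tool, run in groupby(tool_per_frame):
        length = len(list(run))
        segments.append((frame, frame + length - 1, tool))
        frame += length
    return segments


# Hypothetical left hand gripped tool data, one entry per frame.
lto1 = [None, None, "driver", "driver", "wrench", "wrench", "wrench", None]
lts = grip_segments(lto1)
print(lts)
# [(0, 1, None), (2, 3, 'driver'), (4, 6, 'wrench'), (7, 7, None)]
```

Because consecutive frames are grouped by tool name, a change from one tool to another starts a new segment, which matches the treatment of each tool type as a separate time zone.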

**Explanation of the displacement amount determination unit 1233**
The displacement amount determination unit 1233 performs the displacement amount determination.
The displacement amount determination unit 1233 acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 122.
The displacement amount determination unit 1233 then analyzes the left hand joint position image LJPI and/or the left hand joint velocity image LJVI to calculate the displacement amount of the left hand per unit time. For example, it calculates the displacement amount per unit time of the wrist joint of the left hand. Similarly, the displacement amount determination unit 1233 analyzes the right hand joint position image RJPI and/or the right hand joint velocity image RJVI to calculate the displacement amount of the right hand per unit time. For example, it calculates the displacement amount per unit time of the wrist joint of the right hand.
The displacement amount determination unit 1233 then outputs left hand displacement amount data LZS, which indicates the displacement amount of the left hand per unit time as a time series, to the work performance time zone determination unit 1234. It also outputs right hand displacement amount data RZS, which indicates the displacement amount of the right hand per unit time as a time series, to the work performance time zone determination unit 1234. The left hand displacement amount data LZS and the right hand displacement amount data RZS are the time-series data shown in FIG. 2 as the determination result of the displacement amount determination.
As described above, the displacement amount determination unit 1233 may instead perform the displacement amount determination using the joint position time-series data HPT and/or the joint velocity time-series data HVT.
FIG. 8 omits the input of the joint position time-series data HPT and/or the joint velocity time-series data HVT to the displacement amount determination unit 1233.
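As an illustration of a displacement amount per unit time, the sketch below computes it directly from wrist positions rather than from the joint images. It assumes three-dimensional positions sampled at a fixed interval and uses the Euclidean distance between consecutive samples. The specification does not fix the distance measure or the sampling interval, so both are assumptions, as are the function name and sample values.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def displacement_per_unit_time(positions: List[Point], dt: float) -> List[float]:
    """Euclidean distance moved between consecutive samples, divided by the interval dt."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    return [math.dist(p, q) / dt for p, q in zip(positions, positions[1:])]


# Hypothetical left wrist positions sampled every 0.5 s.
wrist = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
lzs = displacement_per_unit_time(wrist, dt=0.5)
print(lzs)  # [10.0, 0.0, 24.0]
```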

**Explanation of the work performance time zone determination unit 1234**
The work performance time zone determination unit 1234 acquires the left hand appearance time zone set LS and the right hand appearance time zone set RS from the appearance status determination unit 1231.
It also acquires the left hand tool gripping status determination result LTS and the right hand tool gripping status determination result RTS from the tool gripping status determination unit 1232.
It further acquires the left hand displacement amount data LZS and the right hand displacement amount data RZS from the displacement amount determination unit 1233.
Based on these, the work performance time zone determination unit 1234 divides the work engagement time into work performance time zones and non-work time zones.

For example, if the displacement amount of at least one of the left hand and the right hand is equal to or greater than a threshold, the work performance time zone determination unit 1234 designates the time zone as a non-work time zone, regardless of whether at least one of the hands is gripping a tool and regardless of whether at least one of the hands appears in the video.

As a further example, when the displacement amounts of the left hand and the right hand are both less than the threshold and at least one of the hands appears in the video, the work performance time zone determination unit 1234 designates the time zone as a work performance time zone, regardless of whether at least one of the hands is gripping a tool. On the other hand, even when the displacement amounts of both hands are less than the threshold, if neither hand appears in the video, the time division unit 11 designates the time zone as a non-work time zone, regardless of whether at least one of the hands is gripping a tool.
In this case, the work performance time zone determination unit 1234 may treat the left hand and the right hand separately. Specifically, it treats the following (A1) to (A3) as separate work performance time zones. It also treats (A1) to (A3) as separate work performance time zones when the work engagement time is divided into work performance time zones and non-work time zones using the appearance status determination alone.
(A1) A time zone in which the left hand appears but the right hand does not.
(A2) A time zone in which the left hand does not appear but the right hand does.
(A3) A time zone in which both the left hand and the right hand appear.
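The example rule combining the displacement threshold with hand appearance, including the split into (A1) to (A3), can be sketched as a small decision function. The enum, the function name, and the threshold value are illustrative assumptions. The sketch also assumes that a displacement value is available for a hand that is not visible, for example 0.0.

```python
from enum import Enum


class Zone(Enum):
    NON_WORK = "non-work"
    A1_LEFT_ONLY = "A1"
    A2_RIGHT_ONLY = "A2"
    A3_BOTH = "A3"


def classify_zone(left_disp: float, right_disp: float,
                  left_visible: bool, right_visible: bool,
                  threshold: float) -> Zone:
    """Apply the example rule: large displacement or no visible hand means non-work."""
    if left_disp >= threshold or right_disp >= threshold:
        return Zone.NON_WORK
    if left_visible and right_visible:
        return Zone.A3_BOTH
    if left_visible:
        return Zone.A1_LEFT_ONLY
    if right_visible:
        return Zone.A2_RIGHT_ONLY
    return Zone.NON_WORK


print(classify_zone(0.1, 0.2, True, False, threshold=1.0).value)   # A1
print(classify_zone(0.1, 1.5, True, True, threshold=1.0).value)    # non-work
print(classify_zone(0.0, 0.0, False, False, threshold=1.0).value)  # non-work
```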

As yet another example, when at least one of the left hand and the right hand is gripping a tool, the work performance time zone determination unit 1234 designates the time zone as a work performance time zone, regardless of the displacement amounts of the left hand and the right hand.
In this case as well, the work performance time zone determination unit 1234 may treat the left hand and the right hand separately. Specifically, it treats the following (B1) to (B3) as separate work performance time zones. It also treats (B1) to (B3) as separate work performance time zones when the work engagement time is divided into work performance time zones and non-work time zones using the tool gripping status determination alone.
(B1) A time zone in which the left hand is gripping a tool but the right hand is not.
(B2) A time zone in which the left hand is not gripping a tool but the right hand is.
(B3) A time zone in which both the left hand and the right hand are gripping a tool.
The work performance time zone determination unit 1234 may also subdivide (B3) above and treat the following (B3-1) and (B3-2) as separate work performance time zones.
(B3-1) A time zone in which the left hand and the right hand are gripping the same tool.
(B3-2) A time zone in which the left hand and the right hand are gripping different tools.
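The categories (B1) to (B3-2) depend only on which tool, if any, each hand grips. A minimal Python sketch follows, assuming each hand's state is given as a tool name or `None`. This representation, the function name, and the tool names are hypothetical.

```python
from typing import Optional


def classify_grip(left_tool: Optional[str], right_tool: Optional[str]) -> Optional[str]:
    """Return 'B1', 'B2', 'B3-1' or 'B3-2', or None when neither hand grips a tool."""
    if left_tool is not None and right_tool is not None:
        return "B3-1" if left_tool == right_tool else "B3-2"
    if left_tool is not None:
        return "B1"
    if right_tool is not None:
        return "B2"
    return None


print(classify_grip("driver", None))      # B1
print(classify_grip(None, "wrench"))      # B2
print(classify_grip("pipe", "pipe"))      # B3-1
print(classify_grip("driver", "wrench"))  # B3-2
```

A return value of `None` means only that this particular rule does not apply. Whether such a time zone is a work performance time zone would then be decided by the other criteria.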

As described above, the work performance time zone determination unit 1234 designates a time zone in which the displacement amount is less than the threshold as a work performance time zone.
The work performance time zone determination unit 1234 may compare the displacement amount with the threshold as follows, for example.
The work performance time zone determination unit 1234 regards the left hand appearance time zone set LS as a set of candidate work performance time zones. It then extracts, from the left hand displacement amount data LZS, the partial displacement amount time-series data for the time zone of each candidate. These partial time series are called the left hand partial displacement amount time-series data set sLZS, and the i-th element of sLZS is called sLZS(i).
The work performance time zone determination unit 1234 divides sLZS(i) into segments of a fixed time width. The k-th segment of the divided sLZS(i) is called the short-time partial displacement amount data sLZS(i, k).
The work performance time zone determination unit 1234 calculates a statistic of the displacement amounts contained in the short-time partial displacement amount data sLZS(i, k). The statistic is, for example, the average value.
If the statistic is equal to or greater than the threshold, the work performance time zone determination unit 1234 determines that the time zone corresponding to sLZS(i, k) is a non-work time zone. If the statistic is less than the threshold, it determines that the time zone corresponding to sLZS(i, k) is a work performance time zone.
The work performance time zone determination unit 1234 performs the same processing for the right hand appearance time zone set RS and the right hand tool gripping status determination result RTS.
Furthermore, when the left hand and the right hand appear at the same time and one hand is gripping a tool while the other is not, the work performance time zone determination unit 1234 may determine whether the time zone is a work performance time zone based on the displacement amount of the hand gripping the tool. Alternatively, it may make the determination by comparing the threshold with the displacement statistic obtained for the hand gripping the tool by the processing described in this paragraph.
When both hands are gripping tools, the work performance time zone determination unit 1234 compares the displacement amount of the left hand with that of the right hand, or compares the displacement statistic of the left hand with that of the right hand, both obtained by the processing described in this paragraph. It may then compare the smaller of the two displacement amounts, or the smaller of the two statistics, with the threshold to determine whether the time zone is a work performance time zone.
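The windowed comparison described above can be sketched as follows for one candidate sLZS(i). The sketch assumes the displacement series is a list of per-sample values, expresses the fixed time width as a number of samples, and uses the mean as the statistic. The function name, window length, threshold, and data are illustrative assumptions.

```python
from statistics import mean
from typing import List, Tuple


def judge_windows(slzs_i: List[float], window: int,
                  threshold: float) -> List[Tuple[int, int, bool]]:
    """Split one candidate's displacement series into fixed-width windows sLZS(i, k)
    and mark each window as work (True) when its mean is below the threshold."""
    if window <= 0:
        raise ValueError("window must be positive")
    results = []
    for start in range(0, len(slzs_i), window):
        chunk = slzs_i[start:start + window]
        is_work = mean(chunk) < threshold
        results.append((start, start + len(chunk) - 1, is_work))
    return results


# Made-up displacement values for one candidate time zone.
slzs_i = [0.2, 0.4, 0.3, 2.5, 3.1, 2.8, 0.1, 0.2]
print(judge_windows(slzs_i, window=3, threshold=1.0))
# [(0, 2, True), (3, 5, False), (6, 7, True)]
```

Each tuple gives the first and last sample index of a window and whether that window is judged to be a work performance time zone. The final window may be shorter than the others when the series length is not a multiple of the window length.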

Note that the work performance time zone determination unit 1234 may divide the work engagement time into work performance time zones and non-work time zones based on other criteria.

The work performance time zone determination unit 1234 outputs the work performance time zone set FS, which is the set of work performance time zones, to the element work estimation unit 124. The work performance time zone set FS indicates the start time and end time of each work performance time zone.
The work performance time zone determination unit 1234 also outputs the non-work time zone set NFS, which is the set of non-work time zones, to the estimation result processing unit 125. The non-work time zone set NFS indicates the start time and end time of each non-work time zone.

**Explanation of an example internal configuration of the element work estimation unit 124**
The element work estimation unit 124 has, as its internal components, a joint position image acquisition unit 1241, a joint velocity image acquisition unit 1242, a joint position feature extraction unit 1243, a joint velocity feature extraction unit 1244, and a joint image feature classification unit 1245.

**Explanation of the joint position image acquisition unit 1241**
The joint position image acquisition unit 1241 acquires the work performance time zone set FS from the work performance time zone determination unit 1234.
The joint position image acquisition unit 1241 also acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 122.
Furthermore, the joint position image acquisition unit 1241 acquires the preprocessing statistics from the preprocessing statistics storage unit 131. Specifically, it acquires the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS as the preprocessing statistics.

Then, for each work performance time period included in the work performance time period set FS, the joint position image acquisition unit 1241 extracts, from each of the left hand joint position image LJPI and the right hand joint position image RJPI, the partial image that constitutes the partial video captured during that work performance time period.
Next, for each work performance time period, the joint position image acquisition unit 1241 resizes each partial image to an image having a fixed width and height. Furthermore, the joint position image acquisition unit 1241 standardizes the pixel values of the resized partial image extracted from the left hand joint position image LJPI using the joint position luminance average value PM and the joint position luminance standard deviation PS. Likewise, it standardizes the pixel values of the resized partial image extracted from the right hand joint position image RJPI using the joint position luminance average value PM and the joint position luminance standard deviation PS. As described above, the order of resizing and standardization may be reversed.
The joint position image acquisition unit 1241 outputs the resized and standardized partial images for each of the left hand joint position image LJPI and the right hand joint position image RJPI to the joint position feature extraction unit 1243.
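The extraction, resizing, and standardization described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the joint image is assumed to be a two-dimensional array whose columns index video frames, each work performance time period is assumed to be a half-open frame range, and nearest-neighbour sampling is assumed for resizing. The function names and the sample values are hypothetical and do not come from the specification.

```python
from typing import List, Tuple

Image = List[List[float]]  # rows x columns; columns are ASSUMED to index video frames


def crop_period(image: Image, period: Tuple[int, int]) -> Image:
    """Cut out the columns [start, end) belonging to one work performance time period."""
    start, end = period
    return [row[start:end] for row in image]


def resize_nearest(image: Image, width: int, height: int) -> Image:
    """Resize to a fixed width and height (nearest-neighbour sampling is an assumption)."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def standardize(image: Image, mean: float, std: float) -> Image:
    """Standardize pixel values with stored statistics (PM and PS for position images)."""
    return [[(v - mean) / std for v in row] for row in image]


def preprocess(image: Image, periods, mean: float, std: float, width: int, height: int):
    """Crop, resize, then standardize one partial image per work performance time period."""
    return [
        standardize(resize_nearest(crop_period(image, p), width, height), mean, std)
        for p in periods
    ]


# Hypothetical 2 x 8 left hand joint position image and two time periods.
ljpi = [
    [0, 10, 20, 30, 40, 50, 60, 70],
    [100, 110, 120, 130, 140, 150, 160, 170],
]
fs = [(0, 2), (4, 8)]
parts = preprocess(ljpi, fs, mean=50.0, std=10.0, width=4, height=2)
```

Because the statistics are fixed values, standardizing before resizing gives the same kind of result, in line with the remark that the order may be reversed.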

In addition, the joint position image acquisition unit 1241 outputs the work performance time period set FS, the left hand joint velocity image LJVI, the right hand joint velocity image RJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS to the joint velocity image acquisition unit 1242.
The example described here is one in which the joint position image acquisition unit 1241 acquires the work performance time period set FS, the left hand joint velocity image LJVI, the right hand joint velocity image RJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS, and passes them to the joint velocity image acquisition unit 1242.
Alternatively, the joint velocity image acquisition unit 1242 may itself acquire the work performance time period set FS from the work performance time period determination unit 1234, the left hand joint velocity image LJVI and the right hand joint velocity image RJVI from the joint time-series data imaging unit 122, and the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS from the preprocessing statistics storage unit 131.
In this case, the joint position image acquisition unit 1241 acquires only the left hand joint position image LJPI and the right hand joint position image RJPI from the joint time-series data imaging unit 122, and only the joint position luminance average value PM and the joint position luminance standard deviation PS from the preprocessing statistics storage unit 131.

**Explanation of the joint velocity image acquisition unit 1242**
The joint velocity image acquisition unit 1242 acquires the work performance time period set FS, the left hand joint velocity image LJVI, the right hand joint velocity image RJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS from the joint position image acquisition unit 1241.
Then, for each work performance time period included in the work performance time period set FS, the joint velocity image acquisition unit 1242 extracts, from each of the left hand joint velocity image LJVI and the right hand joint velocity image RJVI, the partial image that constitutes the partial video captured during that work performance time period.
Next, for each work performance time period, the joint velocity image acquisition unit 1242 resizes each partial image to an image having a fixed width and height. Furthermore, the joint velocity image acquisition unit 1242 standardizes the pixel values of the resized partial image extracted from the left hand joint velocity image LJVI using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS. Likewise, it standardizes the pixel values of the resized partial image extracted from the right hand joint velocity image RJVI using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS.
The joint velocity image acquisition unit 1242 outputs the resized and standardized partial images for each of the left hand joint velocity image LJVI and the right hand joint velocity image RJVI to the joint velocity feature extraction unit 1244.

**Explanation of the joint position feature extraction unit 1243**
The joint position feature extraction unit 1243 acquires, from the joint position image acquisition unit 1241, the resized and standardized partial images for each of the left hand joint position image LJPI and the right hand joint position image RJPI.
Then, the joint position feature extraction unit 1243 inputs the acquired partial images to the element work estimation model M. The element work estimation model M used here is, for example, a pre-trained convolutional neural network.
The joint position feature extraction unit 1243 extracts a joint position feature vector, which is a feature vector, from each partial image. Here, the joint position feature vector obtained from the partial image of the left hand joint position image LJPI is called the left hand position feature vector fLP, and the joint position feature vector obtained from the partial image of the right hand joint position image RJPI is called the right hand position feature vector fRP.
The joint position feature extraction unit 1243 outputs the left hand position feature vector fLP and the right hand position feature vector fRP to the joint image feature classification unit 1245.

**Explanation of the joint velocity feature extraction unit 1244**
The joint velocity feature extraction unit 1244 acquires, from the joint velocity image acquisition unit 1242, the resized and standardized partial images for each of the left hand joint velocity image LJVI and the right hand joint velocity image RJVI.
Then, the joint velocity feature extraction unit 1244 inputs the acquired partial images to the element work estimation model M. The element work estimation model M used here is also, for example, a pre-trained convolutional neural network. Note that the convolutional neural network used here may be the same as or different from the convolutional neural network used by the joint position feature extraction unit 1243.
The joint velocity feature extraction unit 1244 extracts a joint velocity feature vector, which is a feature vector, from each partial image. Here, the joint velocity feature vector obtained from the partial image of the left hand joint velocity image LJVI is called the left hand velocity feature vector fLV, and the joint velocity feature vector obtained from the partial image of the right hand joint velocity image RJVI is called the right hand velocity feature vector fRV.
The joint velocity feature extraction unit 1244 outputs the left hand velocity feature vector fLV and the right hand velocity feature vector fRV to the joint image feature classification unit 1245.

**Explanation of the joint image feature classification unit 1245**
The joint image feature classification unit 1245 acquires the left hand position feature vector fLP and the right hand position feature vector fRP from the joint position feature extraction unit 1243. In addition, the joint image feature classification unit 1245 acquires the left hand velocity feature vector fLV and the right hand velocity feature vector fRV from the joint velocity feature extraction unit 1244.

Then, the joint image feature classification unit 1245 concatenates the left hand position feature vector fLP, the right hand position feature vector fRP, the left hand velocity feature vector fLV, and the right hand velocity feature vector fRV.
Furthermore, the joint image feature classification unit 1245 inputs the concatenated feature vector to the element work estimation model M. More specifically, it inputs the concatenated feature vector to the neural network layers of the element work estimation model M that perform the classification processing. The layer configuration of the element work estimation model M can be freely specified by the user of the estimation device 100 according to the amount of data and the task. The element work estimation model M is generally configured using multiple fully connected layers and multiple activation layers.
For each work performance time period, the joint image feature classification unit 1245 takes, among the element works estimated by the element work estimation model M, the one with the highest probability as the element work being performed by the worker during that work performance time period.
Then, the joint image feature classification unit 1245 outputs an estimation result set QS, which is the set of the estimation results qs for the individual work performance time periods, to the estimation result processing unit 125.
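The concatenation and classification described above can be illustrated with a minimal sketch. It assumes a classification head made of fully connected layers with ReLU activations and a softmax output. The specification leaves the layer configuration to the user, so the activation choice, the weights, and the element work labels below are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight rows, bias)


def dense(x: Sequence[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """One fully connected layer: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Sequence[float]) -> List[float]:
    """Activation layer (ReLU is an assumed choice)."""
    return [max(0.0, v) for v in x]


def softmax(x: Sequence[float]) -> List[float]:
    """Convert scores to probabilities, shifting by the maximum for numerical stability."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def classify(f_lp, f_rp, f_lv, f_rv, layers: List[Layer], labels: List[str]):
    """Concatenate the four feature vectors and return the most probable element work."""
    x = list(f_lp) + list(f_rp) + list(f_lv) + list(f_rv)
    for i, (weights, bias) in enumerate(layers):
        x = dense(x, weights, bias)
        if i < len(layers) - 1:  # no activation after the final (output) layer
            x = relu(x)
    probs = softmax(x)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs


# Hypothetical one-dimensional feature vectors, weights, and labels.
layers = [
    ([[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
]
labels = ["tighten bolt", "inspect wiring", "carry parts"]
qs, probs = classify([1.0], [0.0], [2.0], [0.0], layers, labels)
```

Running `classify` once per work performance time period and collecting the results would give a list that plays the role of the estimation result set QS.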

***Description of Operation***
Next, an example of the operation of the estimation device 100 according to the present embodiment will be described.
FIG. 9 is a flowchart showing an example of the operation of the estimation device 100.

First, in step S11, the joint position time-series data acquisition unit 120 acquires the joint position time-series data HPT. That is, the joint position time-series data acquisition unit 120 generates the joint position time-series data HPT from the video V supplied by the imaging device 110.

Next, in step S12, the joint velocity calculation unit 121 generates the joint velocity time-series data HVT from the joint position time-series data HPT.

Next, in step S13, the joint time-series data imaging unit 122 converts the joint position time-series data HPT into images to generate the left hand joint position image LJPI and the right hand joint position image RJPI. The joint time-series data imaging unit 122 also converts the joint velocity time-series data HVT into images to generate the left hand joint velocity image LJVI and the right hand joint velocity image RJVI.

Next, in step S14, the gripped tool time-series data generation unit 126 generates the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1 using the video V or the sensor information RRS.

Next, in step S15, the work performance time period detection unit 123 detects the work performance time periods using the left hand joint position image LJPI, the right hand joint position image RJPI, the left hand joint velocity image LJVI, the right hand joint velocity image RJVI, the left hand gripped tool data LTO1, and the right hand gripped tool data RTO1.

Next, in step S16, the element work estimation unit 124 uses the element work estimation model M to estimate, for each work performance time period, the element work performed by the worker.

Finally, in step S17, the estimation result processing unit 125 outputs the final estimation result AS.
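As one illustration of step S12, joint velocities can be derived from joint positions by a frame-to-frame difference. The passage above does not state the formula, so the forward difference, the two-dimensional coordinates, the frame rate parameter, and the function name below are all assumptions made for this sketch.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Frame = List[Point]  # one (x, y) coordinate per joint; 2-D coordinates are ASSUMED


def joint_velocities(positions: List[Frame], fps: float) -> List[Frame]:
    """Forward-difference velocity per joint, in coordinate units per second."""
    velocities = []
    for prev, curr in zip(positions, positions[1:]):
        velocities.append(
            [((cx - px) * fps, (cy - py) * fps) for (px, py), (cx, cy) in zip(prev, curr)]
        )
    return velocities


# Hypothetical position series for a single joint, sampled at 2 frames per second.
hpt = [[(0.0, 0.0)], [(1.0, 2.0)], [(3.0, 2.0)]]
hvt = joint_velocities(hpt, fps=2.0)
```

With a forward difference the velocity series is one frame shorter than the position series. An implementation that needs equal lengths would have to pad or use a different difference scheme.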

Next, an example of the operation of the work performance time period detection unit 123 will be described in detail.
FIG. 10 is a flowchart showing details of an example of the operation of the work performance time period detection unit 123.

First, in step S111, the appearance status determination unit 1231 determines the appearance status.
The appearance status determination unit 1231 acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 122.
The appearance status determination unit 1231 then analyzes the left hand joint position image LJPI and/or the left hand joint velocity image LJVI to generate the left hand appearance time period set LS. Similarly, it analyzes the right hand joint position image RJPI and/or the right hand joint velocity image RJVI to generate the right hand appearance time period set RS.
The appearance status determination unit 1231 outputs the left hand appearance time period set LS and the right hand appearance time period set RS to the work performance time period determination unit 1234.

Next, in step S112, the tool gripping state determination unit 1232 determines the tool gripping state.
The tool gripping state determination unit 1232 acquires the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1 from the gripped tool time-series data generation unit 126.
Based on the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1, the tool gripping state determination unit 1232 extracts, in chronological order and for each of the left and right hands, the time periods during which a tool is gripped and the time periods during which no tool is gripped. If the type of tool gripped by the worker changes, the tool gripping state determination unit 1232 treats the time periods during which a tool is gripped as separate time periods for each type of tool.
The tool gripping state determination unit 1232 outputs the left hand tool gripping state determination result LTS, which indicates the extraction result for the left hand, and the right hand tool gripping state determination result RTS, which indicates the extraction result for the right hand, to the work performance time period determination unit 1234.
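The extraction in step S112 amounts to splitting a per-frame tool label sequence into consecutive runs. The sketch below assumes that the gripped tool data holds one label per frame, with `None` meaning that no tool is gripped. That data layout, the function name, and the tool names are hypothetical.

```python
from itertools import groupby
from typing import List, Optional, Tuple

Period = Tuple[int, int, Optional[str]]  # (start frame, end frame exclusive, tool or None)


def gripping_periods(tool_per_frame: List[Optional[str]]) -> List[Period]:
    """Split per-frame tool labels into chronological runs.

    A change of tool type starts a new period, so different tools are never merged.
    """
    periods = []
    start = 0
    for tool, run in groupby(tool_per_frame):
        end = start + sum(1 for _ in run)
        periods.append((start, end, tool))
        start = end
    return periods


# Hypothetical left hand gripped tool data, one label per frame.
lto1 = [None, "driver", "driver", "wrench", "wrench", "wrench", None]
lts = gripping_periods(lto1)
```

Here `lts` contains four periods: no tool, driver, wrench, and no tool again, which mirrors the rule that periods with different tool types are treated as different time periods.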

Next, in step S113, the displacement amount determination unit 1233 determines the displacement amount.
The displacement amount determination unit 1233 acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 122.
The displacement amount determination unit 1233 then analyzes the left hand joint position image LJPI and/or the left hand joint velocity image LJVI to calculate the displacement amount of the left hand per unit time. Similarly, it analyzes the right hand joint position image RJPI and/or the right hand joint velocity image RJVI to calculate the displacement amount of the right hand per unit time.
Then, the displacement amount determination unit 1233 outputs the left hand displacement amount data LZS, which indicates the displacement amount of the left hand per unit time in time series, and the right hand displacement amount data RZS, which indicates the displacement amount of the right hand per unit time in time series, to the work performance time period determination unit 1234.
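One possible way to obtain a displacement amount per unit time is sketched below. It assumes that a representative hand position per frame has already been recovered from the joint position image, and it defines the displacement amount as the path length accumulated over non-overlapping windows of a fixed number of frames. Both the definition and the names are assumptions for illustration, not the method stated in the specification.

```python
import math
from typing import List, Tuple


def displacement_per_unit_time(
    points: List[Tuple[float, float]], frames_per_unit: int
) -> List[float]:
    """Sum frame-to-frame Euclidean distances over consecutive windows of frames."""
    steps = [math.dist(a, b) for a, b in zip(points, points[1:])]
    return [
        sum(steps[i:i + frames_per_unit])
        for i in range(0, len(steps), frames_per_unit)
    ]


# Hypothetical left hand positions over five frames, two frame steps per unit time.
left_hand = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0), (6.0, 8.0), (6.0, 8.0)]
lzs = displacement_per_unit_time(left_hand, frames_per_unit=2)
```

If a velocity image is used instead, summing the speed values over the same windows and dividing by the frame rate would be an analogous calculation.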

Finally, in step S114, the work performance time period determination unit 1234 detects the work performance time periods.
The work performance time period determination unit 1234 acquires the left hand appearance time period set LS and the right hand appearance time period set RS from the appearance status determination unit 1231.
In addition, the work performance time period determination unit 1234 acquires the left hand tool gripping state determination result LTS and the right hand tool gripping state determination result RTS from the tool gripping state determination unit 1232.
In addition, the work performance time period determination unit 1234 acquires the left hand displacement amount data LZS and the right hand displacement amount data RZS from the displacement amount determination unit 1233.
Then, based on these inputs, the work performance time period determination unit 1234 divides the work engagement time into work performance time periods and non-work time periods.
Finally, the work performance time period determination unit 1234 outputs the work performance time period set FS, which is the set of work performance time periods, to the joint image feature classification unit 1245.
In addition, the work performance time period determination unit 1234 outputs the non-work time period set NFS, which is the set of non-work time periods, to the estimation result processing unit 125.
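The division in step S114 can be pictured with the following sketch. The decision rule used here (a frame counts as work when the hand appears and either a tool is gripped or the displacement reaches a threshold) is purely a hypothetical stand-in, since this passage does not spell out the rule. The per-frame input layout, the threshold, and the function name are likewise assumptions. Only the final step, splitting the timeline into complementary sets, follows directly from the text.

```python
from itertools import groupby
from typing import List, Tuple

Period = Tuple[int, int]  # (start frame, end frame exclusive)


def split_engagement_time(
    appears: List[bool],
    gripping: List[bool],
    displacement: List[float],
    threshold: float,
) -> Tuple[List[Period], List[Period]]:
    """Return (FS, NFS): work performance periods and non-work periods."""
    # HYPOTHETICAL rule: hand visible AND (tool gripped OR enough movement).
    working = [
        a and (g or d >= threshold)
        for a, g, d in zip(appears, gripping, displacement)
    ]
    fs, nfs, start = [], [], 0
    for flag, run in groupby(working):
        end = start + sum(1 for _ in run)
        (fs if flag else nfs).append((start, end))
        start = end
    return fs, nfs


# Hypothetical per-frame inputs for one hand over six frames.
appears = [False, True, True, True, True, False]
gripping = [False, True, True, False, False, False]
displacement = [0.0, 1.0, 1.0, 0.1, 5.0, 5.0]
fs, nfs = split_engagement_time(appears, gripping, displacement, threshold=1.0)
```

Every frame ends up in exactly one of the two sets, so FS and NFS together cover the whole work engagement time without overlap.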

Next, an example of the operation of the element work estimation unit 124 will be described in detail.
FIG. 11 is a flowchart showing details of an example of the operation of the element work estimation unit 124.

First, in step S121, the joint position image acquisition unit 1241 generates partial images of the joint position images, and resizes and standardizes the partial images.
The joint position image acquisition unit 1241 acquires the work performance time period set FS from the work performance time period determination unit 1234.
In addition, the joint position image acquisition unit 1241 acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 122.
In addition, the joint position image acquisition unit 1241 acquires, as preprocessing statistics, the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS from the preprocessing statistics storage unit 131.

Then, for each work performance time period included in the work performance time period set FS, the joint position image acquisition unit 1241 extracts, from each of the left hand joint position image LJPI and the right hand joint position image RJPI, the partial image that constitutes the partial video captured during that work performance time period.
Next, for each work performance time period, the joint position image acquisition unit 1241 resizes each partial image to an image having a fixed width and height. Furthermore, the joint position image acquisition unit 1241 standardizes the pixel values of the resized partial image extracted from the left hand joint position image LJPI using the joint position luminance average value PM and the joint position luminance standard deviation PS. Likewise, it standardizes the pixel values of the resized partial image extracted from the right hand joint position image RJPI using the joint position luminance average value PM and the joint position luminance standard deviation PS. As described above, the order of resizing and standardization may be reversed.
The joint position image acquisition unit 1241 outputs the resized and standardized partial images for each of the left hand joint position image LJPI and the right hand joint position image RJPI to the joint position feature extraction unit 1243.
In addition, the joint position image acquisition unit 1241 outputs the work performance time period set FS, the left hand joint velocity image LJVI, the right hand joint velocity image RJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS to the joint velocity image acquisition unit 1242.

Next, in step S122, the joint velocity image acquisition unit 1242 generates partial images of the joint velocity images, and resizes and standardizes the partial images.
The joint velocity image acquisition unit 1242 acquires the work performance time period set FS, the left hand joint velocity image LJVI, the right hand joint velocity image RJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS from the joint position image acquisition unit 1241.
Then, for each work performance time period included in the work performance time period set FS, the joint velocity image acquisition unit 1242 extracts, from each of the left hand joint velocity image LJVI and the right hand joint velocity image RJVI, the partial image that constitutes the partial video captured during that work performance time period.
Next, for each work performance time period, the joint velocity image acquisition unit 1242 resizes each partial image to an image having a fixed width and height. Furthermore, the joint velocity image acquisition unit 1242 standardizes the pixel values of the resized partial image extracted from the left hand joint velocity image LJVI using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS. Likewise, it standardizes the pixel values of the resized partial image extracted from the right hand joint velocity image RJVI using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS. As described above, the order of resizing and standardization may be reversed.
The joint velocity image acquisition unit 1242 outputs the resized and standardized partial images for each of the left hand joint velocity image LJVI and the right hand joint velocity image RJVI to the joint velocity feature extraction unit 1244.

Next, in step S123, the joint position feature extraction unit 1243 extracts the joint position feature vectors.
The joint position feature extraction unit 1243 acquires, from the joint position image acquisition unit 1241, the resized and standardized partial images for each of the left hand joint position image LJPI and the right hand joint position image RJPI.
Then, the joint position feature extraction unit 1243 inputs the acquired partial images to the element work estimation model M.
Then, the joint position feature extraction unit 1243 extracts the left hand position feature vector fLP and the right hand position feature vector fRP.
The joint position feature extraction unit 1243 outputs the left hand position feature vector fLP and the right hand position feature vector fRP to the joint image feature classification unit 1245.

Next, in step S124, the joint velocity feature extraction unit 1244 extracts the joint velocity feature vectors.
The joint velocity feature extraction unit 1244 acquires, from the joint velocity image acquisition unit 1242, the resized and standardized partial images for each of the left hand joint velocity image LJVI and the right hand joint velocity image RJVI.
Then, the joint velocity feature extraction unit 1244 inputs the acquired partial images to the element work estimation model M.
Then, the joint velocity feature extraction unit 1244 extracts the left hand velocity feature vector fLV and the right hand velocity feature vector fRV.
The joint velocity feature extraction unit 1244 outputs the left hand velocity feature vector fLV and the right hand velocity feature vector fRV to the joint image feature classification unit 1245.

Finally, in step S125, the joint image feature classification unit 1245 concatenates the joint position feature vectors and the joint velocity feature vectors and estimates the element work.
The joint image feature classification unit 1245 acquires the left hand position feature vector fLP and the right hand position feature vector fRP from the joint position feature extraction unit 1243. In addition, it acquires the left hand velocity feature vector fLV and the right hand velocity feature vector fRV from the joint velocity feature extraction unit 1244.
Then, the joint image feature classification unit 1245 concatenates the left hand position feature vector fLP, the right hand position feature vector fRP, the left hand velocity feature vector fLV, and the right hand velocity feature vector fRV. Furthermore, it inputs the concatenated feature vector to the element work estimation model M.
For each work performance time period, the joint image feature classification unit 1245 takes, among the element works estimated by the element work estimation model M, the one with the highest probability as the element work being performed by the worker during that work performance time period.
Then, the joint image feature classification unit 1245 outputs the estimation result set QS, which is the set of the estimation results qs for the individual work performance time periods, to the estimation result processing unit 125.

＊＊＊実施の形態の効果の説明＊＊＊
本実施の形態では、推定装置１００は、作業実施時間帯の検出と、検出した作業実施時間帯で実施されている要素作業の推定という二段階の処理を行う。
作業実施時間帯の検出では、推定装置１００は、手の出現状況、道具の把持状況及び手の変位量を解析して作業実施時間帯を検出する。このため、推定装置１００は、作業に含まれる要素作業の発生頻度及び／又は実施時間が様々であっても、作業実施時間帯を正確に検出することができる。
このように作業実施時間帯を正確に検出することができるため、要素作業の推定では、推定装置１００は作業実施時間帯に対応する部分画像を用いて、作業実施時間帯で実施されている要素作業を正確に推定することができる。
従って、本実施の形態によれば、各々の属性が一様ではない複数の要素作業が作業に含まれる場合でも、作業従事者が実施する要素作業を正確に推定することができる。
***Description of Effects of the Embodiment***
In this embodiment, the estimation device 100 performs a two-stage process: it detects the work performance time periods, and then estimates the element work being performed in each detected work performance time period.
To detect a work performance time period, the estimation device 100 analyzes the appearance state of the hand, the gripping state of the tool, and the amount of displacement of the hand. The estimation device 100 can therefore detect the work performance time periods accurately even when the pieces of element work included in the work vary in occurrence frequency and/or duration.
Because the work performance time periods are detected accurately, the estimation device 100 can accurately estimate the element work performed in each work performance time period by using the partial image corresponding to that period.
Therefore, according to this embodiment, the element work performed by the worker can be estimated accurately even when the work includes a plurality of pieces of element work whose attributes are not uniform.

また、本実施の形態では、推定装置１００は、画像認識モデルを用いて要素作業を推定する。このため、本実施の形態によれば、要素作業の推定に、既存の学習済みモデルを用いることができる。すなわち、フルスクラッチでモデルを学習する必要がないため、学習に必要なデータ量を抑え、データ収集の負担を軽減することができる。 In addition, in this embodiment, the estimation device 100 estimates the element work using an image recognition model. Therefore, according to this embodiment, an existing trained model can be used to estimate the element work. In other words, since there is no need to train a model from scratch, the amount of data required for training can be reduced, and the burden of data collection can be reduced.

実施の形態２．
本実施の形態では、実施の形態１で説明した要素作業推定モデルＭを生成する学習装置を説明する。
本実施の形態では、主に実施の形態１との差異を説明する。
なお、以下で説明していない事項は、実施の形態１と同様である。
Embodiment 2.
In this embodiment, a learning device that generates the element work estimation model M described in the first embodiment will be described.
In this embodiment, differences from the first embodiment will be mainly described.
It should be noted that matters not explained below are the same as those in the first embodiment.

＊＊構成の説明＊＊
図１２は、本実施の形態に係る学習装置２００の機能構成例を示す。学習装置２００は、学習フェーズで動作する。学習フェーズは実施の形態１に係る推定装置１００が動作する推定フェーズに先立つフェーズである。
学習装置２００は、撮像装置２１０に接続されている。
撮像装置２１０は、実施の形態１で示した撮像装置１１０と同様である。
つまり、撮像装置２１０は、作業従事者の手の映像を学習装置２００に出力する。
撮像装置２１０は、例えば、作業従事者の頭部に装着され、１人称視点の映像を撮像する。本実施の形態では、撮像装置２１０は、作業従事者の頭部に装着されているものとする。しかし、撮像装置２１０は、作業従事者の手の映像が撮像できるのであれば、作業従事者の頭部に装着されていなくてもよい。 **Configuration Description**
FIG. 12 illustrates an example of a functional configuration of the learning device 200 according to this embodiment. The learning device 200 operates in a learning phase. The learning phase precedes the estimation phase in which the estimation device 100 according to the first embodiment operates.
The learning device 200 is connected to an imaging device 210.
The imaging device 210 is similar to the imaging device 110 described in the first embodiment.
That is, the imaging device 210 outputs an image of the worker's hands to the learning device 200.
The imaging device 210 is, for example, worn on the head of the worker and captures a first-person perspective image. In this embodiment, the imaging device 210 is assumed to be worn on the head of the worker. However, the imaging device 210 does not have to be worn on the head of the worker as long as it can capture an image of the worker's hands.

＊＊学習装置２００のハードウェア構成例の説明＊＊
なお、本実施の形態に係る学習装置２００は、図２３に例示するハードウェア構成を有するコンピュータである。
図２３に示すように、学習装置２００は、ハードウェアとして、プロセッサ９０１、主記憶装置９０２、補助記憶装置９０３及び通信装置９０４を備える。
図１２に示す関節位置時系列データ取得部２２０、関節速度計算部２２１等の構成要素の機能は、例えば、プログラムにより実現される。
補助記憶装置９０３には、これらの構成要素の機能を実現するプログラムが記憶されている。
これらプログラムは、補助記憶装置９０３から主記憶装置９０２にロードされる。そして、プロセッサ９０１がこれらプログラムを実行して、これらの構成要素の動作を行う。 **Explanation of an Example of the Hardware Configuration of the Learning Device 200**
The learning device 200 according to this embodiment is a computer having the hardware configuration exemplified in FIG. 23.
As shown in FIG. 23, the learning device 200 includes, as hardware, a processor 901, a main memory device 902, an auxiliary memory device 903, and a communication device 904.
The functions of the components such as the joint position time-series data acquisition unit 220 and the joint velocity calculation unit 221 shown in FIG. 12 are realized by, for example, a program.
The auxiliary storage device 903 stores programs that realize the functions of these components.
These programs are loaded from the auxiliary storage device 903 to the main storage device 902. Then, the processor 901 executes these programs to perform the operations of these components.

＊＊学習装置２００の機能構成例の説明＊＊
図１３は、学習装置２００の機能構成例を示す。
学習装置２００は、関節位置時系列データ取得部２２０、関節速度計算部２２１、関節時系列データ画像化部２２２、学習データ生成部２２３、前処理統計量計算部２２４、要素作業推定モデル生成部２２５、学習データ保存部２３０、前処理統計量保存部２３１及び要素作業推定モデル保存部２３２を備える。 **Explanation of an Example of the Functional Configuration of the Learning Device 200**
FIG. 12 shows an example of the functional configuration of the learning device 200.
The learning device 200 includes a joint position time series data acquisition unit 220, a joint velocity calculation unit 221, a joint time series data imaging unit 222, a learning data generation unit 223, a preprocessing statistics calculation unit 224, an element work estimation model generation unit 225, a learning data storage unit 230, a preprocessing statistics storage unit 231 and an element work estimation model storage unit 232.

＊＊関節位置時系列データ取得部２２０の説明＊＊
関節位置時系列データ取得部２２０は、実施の形態１で説明した関節位置時系列データ取得部１２０と同様の動作を行う。
つまり、関節位置時系列データ取得部２２０は、撮像装置２１０から、作業従事者の手を撮像して得られた映像Ｖを取得する。映像Ｖは、学習フェーズにおいて、作業従事時間に撮像された映像である。映像Ｖには、作業従事者の手が映されているフレームと作業者の手が映されていないフレームが含まれる。
関節位置時系列データ取得部２２０は、映像Ｖから関節位置時系列データＨＰＴを生成する。
そして、関節位置時系列データ取得部２２０は、関節位置時系列データＨＰＴを関節速度計算部２２１及び関節時系列データ画像化部２２２に出力する。
本実施の形態に係る関節位置時系列データＨＰＴは、実施の形態１で説明した関節位置時系列データＨＰＴと同様である。
**Explanation of the joint position time-series data acquisition unit 220**
The joint position time-series data acquisition unit 220 performs the same operation as the joint position time-series data acquisition unit 120 described in the first embodiment.
That is, the joint position time-series data acquisition unit 220 acquires, from the imaging device 210, a video V obtained by imaging the worker's hands. The video V is a video captured during the work engagement time in the learning phase. The video V includes frames in which the worker's hands appear and frames in which they do not.
The joint position time series data acquisition unit 220 generates joint position time series data HPT from the video V.
Then, the joint position time series data acquisition unit 220 outputs the joint position time series data HPT to the joint velocity calculation unit 221 and the joint time series data imaging unit 222.
The joint position time series data HPT according to this embodiment is similar to the joint position time series data HPT described in the first embodiment.

＊＊関節速度計算部２２１の説明＊＊
関節速度計算部２２１は、実施の形態１で説明した関節速度計算部１２１と同様の動作を行う。
関節速度計算部２２１は、関節位置時系列データ取得部２２０から関節位置時系列データＨＰＴを取得する。
そして、関節速度計算部２２１は、関節位置時系列データＨＰＴの各関節における時間方向への差分（速度）を計算する。そして、関節速度計算部２２１は、計算結果を示す関節速度時系列データＨＶＴを関節時系列データ画像化部２２２に出力する。
本実施の形態に係る関節速度時系列データＨＶＴは、実施の形態１で説明した関節速度時系列データＨＶＴと同様である。
**Explanation of the joint velocity calculation unit 221**
The joint velocity calculation unit 221 performs the same operation as the joint velocity calculation unit 121 described in the first embodiment.
The joint velocity calculation unit 221 acquires the joint position time series data HPT from the joint position time series data acquisition unit 220.
The joint velocity calculation unit 221 then calculates, for each joint in the joint position time series data HPT, the difference in the time direction (the velocity), and outputs joint velocity time series data HVT indicating the calculation result to the joint time series data imaging unit 222.
The joint velocity time series data HVT according to this embodiment is similar to the joint velocity time series data HVT described in the first embodiment.
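The time-direction difference can be illustrated as follows. This is a minimal sketch, assuming that the joint position time series data is laid out as `hpt[t][j] = (x, y, z)` for joint `j` at frame `t` and that every frame contains a detected hand; the layout and the omission of missing-hand handling are assumptions made for illustration.

```python
def joint_velocity(hpt):
    """Frame-to-frame difference of joint coordinates.

    hpt[t][j] is an (x, y, z) tuple for joint j at frame t (assumed layout).
    Returns a list with len(hpt) - 1 entries, one per pair of adjacent frames.
    """
    hvt = []
    for prev, curr in zip(hpt, hpt[1:]):
        hvt.append([tuple(c - p for p, c in zip(pj, cj))
                    for pj, cj in zip(prev, curr)])
    return hvt

# Toy data: 3 frames, 2 joints.
hpt = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(0.5, 0.0, 0.0), (1.0, 2.0, 1.0)],
    [(1.5, 0.0, 0.0), (1.0, 2.0, 0.0)],
]
hvt = joint_velocity(hpt)
```

The result is expressed per frame; dividing by the frame interval would convert it to a velocity in physical units if that were needed.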

＊＊関節時系列データ画像化部２２２の説明＊＊
関節時系列データ画像化部２２２は、実施の形態１で説明した関節時系列データ画像化部１２２と同様の動作を行う。
つまり、関節時系列データ画像化部２２２は、関節位置時系列データ取得部２２０から関節位置時系列データＨＰＴを取得する。また、関節時系列データ画像化部２２２は、関節速度計算部２２１から関節速度時系列データＨＶＴを取得する。
関節時系列データ画像化部２２２は、関節位置時系列データＨＰＴを画像化して、左手関節位置画像ＬＪＰＩと右手関節位置画像ＲＪＰＩを生成する。また、関節時系列データ画像化部１２２は、関節速度時系列データＨＶＴを画像化して、左手関節速度画像ＬＪＶＩと右手関節速度画像ＲＪＶＩを生成する。
そして、関節時系列データ画像化部２２２は、左手関節位置画像ＬＪＰＩ及び右手関節位置画像ＲＪＰＩと左手関節速度画像ＬＪＶＩ及び右手関節速度画像ＲＪＶＩを学習データ生成部２２３に出力する。
本実施の形態に係る左手関節位置画像ＬＪＰＩ及び右手関節位置画像ＲＪＰＩと左手関節速度画像ＬＪＶＩ及び右手関節速度画像ＲＪＶＩは、実施の形態１で説明した左手関節位置画像ＬＪＰＩ及び右手関節位置画像ＲＪＰＩと左手関節速度画像ＬＪＶＩ及び右手関節速度画像ＲＪＶＩと同様である。 **Explanation of the joint time-series data imaging unit 222**
The joint time-series data imaging unit 222 operates in the same manner as the joint time-series data imaging unit 122 described in the first embodiment.
That is, the joint time series data imaging unit 222 acquires the joint position time series data HPT from the joint position time series data acquisition unit 220. The joint time series data imaging unit 222 also acquires the joint velocity time series data HVT from the joint velocity calculation unit 221.
The joint time-series data imaging unit 222 converts the joint position time-series data HPT into images to generate a left hand joint position image LJPI and a right hand joint position image RJPI. The joint time-series data imaging unit 222 also converts the joint velocity time-series data HVT into images to generate a left hand joint velocity image LJVI and a right hand joint velocity image RJVI.
Then, the joint time-series data imaging unit 222 outputs the left hand joint position image LJPI, the right hand joint position image RJPI, the left hand joint velocity image LJVI, and the right hand joint velocity image RJVI to the learning data generation unit 223.
The left hand joint position image LJPI and right hand joint position image RJPI and the left hand joint velocity image LJVI and right hand joint velocity image RJVI in this embodiment are similar to the left hand joint position image LJPI and right hand joint position image RJPI and the left hand joint velocity image LJVI and right hand joint velocity image RJVI described in embodiment 1.

＊＊学習データ生成部２２３の説明＊＊
学習データ生成部２２３は、関節時系列データ画像化部２２２から、左手関節位置画像ＬＪＰＩ及び右手関節位置画像ＲＪＰＩと左手関節速度画像ＬＪＶＩ及び右手関節速度画像ＲＪＶＩを取得する。
また、学習データ生成部２２３は、ラベル情報ＬＢＬｉｎｆを、例えば、学習装置２００のユーザから取得する。
ラベル情報ＬＢＬｉｎｆには、複数のラベルＬＢＬが含まれる。各ラベルＬＢＬには、開始時刻ｔｓ、終了時刻ｔｅ及び要素作業種類ｔｙｐが含まれる。ここでは、学習装置２００のユーザがマニュアルで開始時刻ｔｓ及び終了時刻ｔｅと要素作業の種類ｔｙｐとを対応付けることとしている。しかし、他の方法で開始時刻ｔｓ及び終了時刻ｔｅと要素作業の種類ｔｙｐとを対応付けてもよい。
開始時刻ｔｓには、作業実施時間帯の開始時刻が示される。
終了時刻ｔｅには、作業実施時間帯の開始時刻が示される。
要素作業種類ｔｙｐには、作業実施時間帯で実施されている要素作業の種類が示される。 **Explanation of the learning data generation unit 223**
The learning data generation unit 223 acquires the left hand joint position image LJPI, the right hand joint position image RJPI, the left hand joint velocity image LJVI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 222.
In addition, the learning data generation unit 223 acquires label information LBLinf from, for example, the user of the learning device 200.
The label information LBLinf includes a plurality of labels LBL. Each label LBL includes a start time ts, an end time te, and an element work type typ. Here, the user of the learning device 200 manually associates the start time ts and the end time te with the element work type typ. However, the start time ts and the end time te may be associated with the element work type typ by other methods.
The start time ts indicates the start time of the work execution time period.
The end time te indicates the end time of the work performance time period.
The element work type typ indicates the type of element work being performed in the work performance time period.

学習データ生成部２２３は、左手関節位置画像ＬＪＰＩから、ラベル情報ＬＢＬｉｎｆのラベルＬＢＬごとに、開始時刻ｔｓに相当する画像位置から終了時刻ｔｅに相当する画像位置までの画像領域を左手関節位置ラベル画像ｓＬＪＰＩとして抽出する。そして、学習データ生成部２２３は、抽出した左手関節位置ラベル画像ｓＬＪＰＩとラベルＬＢＬとを対応付ける。
また、学習データ生成部２２３は、右手関節位置画像ＲＪＰＩから、ラベル情報ＬＢＬｉｎｆのラベルＬＢＬごとに、開始時刻ｔｓに相当する画像位置から終了時刻ｔｅに相当する画像位置までの画像領域を右手関節位置ラベル画像ｓＲＪＰＩとして抽出する。そして、学習データ生成部２２３は、抽出した右手関節位置ラベル画像ｓＲＪＰＩとラベルＬＢＬとを対応付ける。
また、左手関節速度画像ＬＪＶＩから、ラベル情報ＬＢＬｉｎｆのラベルＬＢＬごとに、開始時刻ｔｓに相当する画像位置から終了時刻ｔｅに相当する画像位置までの画像領域を左手関節速度ラベル画像ｓＬＪＶＩとして抽出する。そして、学習データ生成部２２３は、抽出した左手関節速度ラベル画像ｓＬＪＶＩとラベルＬＢＬとを対応付ける。
また、学習データ生成部２２３は、右手関節速度画像ＲＪＶＩから、ラベル情報ＬＢＬｉｎｆのラベルＬＢＬごとに、開始時刻ｔｓに相当する画像位置から終了時刻ｔｅに相当する画像位置までの画像領域を右手関節速度ラベル画像ｓＲＪＶＩとして抽出する。そして、学習データ生成部２２３は、抽出した右手関節速度ラベル画像ｓＲＪＶＩとラベルＬＢＬとを対応付ける。
更に、学習データ生成部２２３は、左手関節位置ラベル画像ｓＬＪＰＩとラベルＬＢＬとの複数の対、右手関節位置ラベル画像ｓＲＪＰＩとラベルＬＢＬとの複数の対、左手関節速度ラベル画像ｓＬＪＶＩとラベルＬＢＬとの複数の対、右手関節速度ラベル画像ｓＲＪＶＩとラベルＬＢＬとの複数の対を、学習データｓＩｓとして学習データ保存部２３０に格納する。 The learning data generation unit 223 extracts, from the left hand joint position image LJPI, an image region from an image position corresponding to the start time ts to an image position corresponding to the end time te for each label LBL of the label information LBLinf, as a left hand joint position label image sLJPI. Then, the learning data generation unit 223 associates the extracted left hand joint position label image sLJPI with the label LBL.
In addition, the learning data generation unit 223 extracts, from the right hand joint position image RJPI, an image region from an image position corresponding to the start time ts to an image position corresponding to the end time te for each label LBL of the label information LBLinf, as a right hand joint position label image sRJPI. Then, the learning data generation unit 223 associates the extracted right hand joint position label image sRJPI with the label LBL.
In addition, the learning data generation unit 223 extracts, from the left hand joint velocity image LJVI, an image region from an image position corresponding to the start time ts to an image position corresponding to the end time te for each label LBL of the label information LBLinf, as a left hand joint velocity label image sLJVI. Then, the learning data generation unit 223 associates the extracted left hand joint velocity label image sLJVI with the label LBL.
In addition, the learning data generation unit 223 extracts, from the right hand joint velocity image RJVI, an image region from an image position corresponding to the start time ts to an image position corresponding to the end time te for each label LBL of the label information LBLinf, as a right hand joint velocity label image sRJVI. Then, the learning data generation unit 223 associates the extracted right hand joint velocity label image sRJVI with the label LBL.
Furthermore, the learning data generation unit 223 stores multiple pairs of left hand joint position label images sLJPI and labels LBL, multiple pairs of right hand joint position label images sRJPI and labels LBL, multiple pairs of left hand joint velocity label images sLJVI and labels LBL, and multiple pairs of right hand joint velocity label images sRJVI and labels LBL in the learning data storage unit 230 as learning data sIs.
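The per-label extraction can be sketched as below. The sketch assumes that each column of a joint image corresponds to one video frame, so that a time in seconds maps to a column index through a frame rate `fps`; the `Label` class, the `fps` parameter, the column-per-frame layout, and the exclusive end index are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Label:
    ts: float  # start time of the work performance time period [s]
    te: float  # end time of the work performance time period [s]
    typ: str   # element work type

def crop_label_images(joint_image, labels, fps):
    """Cut one sub-image per label out of a joint image.

    joint_image[row][col] is a pixel; columns are assumed to be frames.
    Returns a list of (label image, label) pairs.
    """
    pairs = []
    for lbl in labels:
        start = int(round(lbl.ts * fps))  # image position for ts
        end = int(round(lbl.te * fps))    # image position for te (exclusive)
        sub = [row[start:end] for row in joint_image]
        pairs.append((sub, lbl))
    return pairs

# Toy LJPI: 2 joints x 10 frames, with integer pixels for readability.
LJPI = [[c for c in range(10)], [10 + c for c in range(10)]]
labels = [Label(0.0, 0.3, "a"), Label(0.5, 0.8, "b")]
sLJPI_pairs = crop_label_images(LJPI, labels, fps=10)
```

Applying the same function to RJPI, LJVI, and RJVI with the same labels would yield the remaining three groups of pairs that make up the learning data.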

＊＊前処理統計量計算部２２４の説明＊＊
前処理統計量計算部２２４は、学習データ保存部２３０から学習データｓＩｓを取得する。
具体的には、前処理統計量計算部２２４は、学習データｓＩｓとして、左手関節位置ラベル画像ｓＬＪＰＩとラベルＬＢＬとの複数の対、右手関節位置ラベル画像ｓＲＪＰＩとラベルＬＢＬとの複数の対、左手関節速度ラベル画像ｓＬＪＶＩとラベルＬＢＬとの複数の対、右手関節速度ラベル画像ｓＲＪＶＩとラベルＬＢＬとの複数の対を取得する。 **Explanation of Pre-Processing Statistics Calculation Unit 224**
The pre-processing statistics calculation unit 224 acquires the learning data sIs from the learning data storage unit 230.
Specifically, the preprocessing statistics calculation unit 224 acquires, as the learning data sIs, multiple pairs of left hand joint position label images sLJPI and labels LBL, multiple pairs of right hand joint position label images sRJPI and labels LBL, multiple pairs of left hand joint velocity label images sLJVI and labels LBL, and multiple pairs of right hand joint velocity label images sRJVI and labels LBL.

そして、前処理統計量計算部２２４は、学習データｓＩｓから、前処理統計量として関節位置輝度平均値ＰＭ、関節位置輝度標準偏差ＰＳ、関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを計算する。
具体的には、前処理統計量計算部２２４は、学習データｓＩｓに含まれる複数の左手関節位置ラベル画像ｓＬＪＰＩにおける輝度の平均値を座標軸Ｃごとに計算する。前処理統計量計算部２２４が左手関節位置ラベル画像ｓＬＪＰＩから計算した座標軸Ｃごとの平均値は、左手関節位置輝度平均値ＬＰＭ（Ｃ）である。
また、前処理統計量計算部２２４は、学習データｓＩｓに含まれる複数の右手関節位置ラベル画像ｓＲＪＰＩにおける輝度の平均値を座標軸Ｃごとに計算する。前処理統計量計算部２２４が右手関節位置ラベル画像ｓＲＪＰＩから計算した座標軸Ｃごとの平均値は、右手関節位置輝度平均値ＲＰＭ（Ｃ）である。
更に、前処理統計量計算部２２４は、学習データｓＩｓに含まれる複数の左手関節位置ラベル画像ｓＬＪＰＩにおける輝度の標準偏差を座標軸Ｃごとに計算する。前処理統計量計算部２２４が左手関節位置ラベル画像ｓＬＪＰＩから計算した座標軸Ｃごとの標準偏差は、左手関節位置輝度標準偏差ＬＰＳ（Ｃ）である。
また、前処理統計量計算部２２４は、学習データｓＩｓに含まれる複数の右手関節位置ラベル画像ｓＲＪＰＩにおける輝度の標準偏差を座標軸Ｃごとに計算する。前処理統計量計算部２２４が右手関節位置ラベル画像ｓＲＪＰＩから計算した座標軸Ｃごとの標準偏差は、右手関節位置輝度標準偏差ＲＰＳ（Ｃ）である。
なお、上記のように、左手についての輝度平均値及び輝度標準偏差と右手についての輝度平均値及び輝度標準偏差を用いるのではなく、左手と右手の両者にわたる輝度平均値及び輝度標準偏差を用いるようにしてもよい。以下では、左手についての輝度平均値及び輝度標準偏差と右手についての輝度平均値及び輝度標準偏差を用いる例を説明するが、以下の説明は左手と右手の両者にわたる輝度平均値及び輝度標準偏差を用いる場合にも適用される。 Then, the pre-processing statistics calculation unit 224 calculates the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS as pre-processing statistics from the learning data sIs.
Specifically, the preprocessing statistics calculation unit 224 calculates the average luminance value in the multiple left hand joint position label images sLJPI included in the learning data sIs for each coordinate axis C. The average value for each coordinate axis C calculated by the preprocessing statistics calculation unit 224 from the left hand joint position label image sLJPI is the left hand joint position luminance average value LPM(C).
In addition, the preprocessing statistics calculation unit 224 calculates the average luminance value in the multiple right hand joint position label images sRJPI included in the learning data sIs for each coordinate axis C. The average value for each coordinate axis C calculated by the preprocessing statistics calculation unit 224 from the right hand joint position label image sRJPI is the right hand joint position luminance average value RPM(C).
Furthermore, the preprocessing statistics calculation unit 224 calculates the standard deviation of luminance in the multiple left hand joint position label images sLJPI included in the learning data sIs for each coordinate axis C. The standard deviation for each coordinate axis C calculated by the preprocessing statistics calculation unit 224 from the left hand joint position label image sLJPI is the left hand joint position luminance standard deviation LPS(C).
In addition, the preprocessing statistics calculation unit 224 calculates the standard deviation of luminance in the multiple right hand joint position label images sRJPI included in the learning data sIs for each coordinate axis C. The standard deviation for each coordinate axis C calculated by the preprocessing statistics calculation unit 224 from the right hand joint position label image sRJPI is the right hand joint position luminance standard deviation RPS(C).
Note that, instead of using separate luminance average values and luminance standard deviations for the left hand and for the right hand as described above, a luminance average value and a luminance standard deviation computed across both the left and right hands may be used. The following describes an example that uses separate statistics for the left hand and for the right hand, but the description also applies when statistics computed across both hands are used.

また、前処理統計量計算部２２４は、学習データｓＩｓに含まれる複数の左手関節速度ラベル画像ｓＬＪＶＩにおける輝度の平均値を座標軸Ｃごとに計算する。前処理統計量計算部２２４が左手関節速度ラベル画像ｓＬＪＶＩから計算した座標軸Ｃごとの平均値は、左手関節速度輝度平均値ＬＶＭ（Ｃ）である。
また、前処理統計量計算部２２４は、学習データｓＩｓに含まれる複数の右手関節速度ラベル画像ｓＲＪＶＩにおける輝度の平均値を座標軸Ｃごとに計算する。前処理統計量計算部２２４が右手関節速度ラベル画像ｓＲＪＶＩから計算した座標軸Ｃごとの平均値は、右手関節速度輝度平均値ＲＶＭ（Ｃ）である。
更に、前処理統計量計算部２２４は、学習データｓＩｓに含まれる複数の左手関節速度ラベル画像ｓＬＪＶＩにおける輝度の標準偏差を座標軸Ｃごとに計算する。前処理統計量計算部２２４が左手関節速度ラベル画像ｓＬＪＶＩから計算した座標軸Ｃごとの標準偏差は、左手関節速度輝度標準偏差ＬＶＳ（Ｃ）である。
また、前処理統計量計算部２２４は、学習データｓＩｓに含まれる複数の右手関節速度ラベル画像ｓＲＪＶＩにおける輝度の標準偏差を座標軸Ｃごとに計算する。前処理統計量計算部２２４が右手関節速度ラベル画像ｓＲＪＶＩから計算した座標軸Ｃごとの標準偏差は、右手関節速度輝度標準偏差ＲＶＳ（Ｃ）である。
なお、上記のように、左手についての輝度平均値及び輝度標準偏差と右手についての輝度平均値及び輝度標準偏差を用いるのではなく、左手と右手の両者にわたる輝度平均値及び輝度標準偏差を用いるようにしてもよい。以下では、左手についての輝度平均値及び輝度標準偏差と右手についての輝度平均値及び輝度標準偏差を用いる例を説明するが、以下の説明は左手と右手の両者にわたる輝度平均値及び輝度標準偏差を用いる場合にも適用される。 In addition, the preprocessing statistics calculation unit 224 calculates the average luminance value in the multiple left hand joint velocity label images sLJVI included in the learning data sIs for each coordinate axis C. The average value for each coordinate axis C calculated by the preprocessing statistics calculation unit 224 from the left hand joint velocity label image sLJVI is the left hand joint velocity luminance average value LVM(C).
In addition, the preprocessing statistics calculation unit 224 calculates the average luminance value in the multiple right hand joint velocity label images sRJVI included in the learning data sIs for each coordinate axis C. The average value for each coordinate axis C calculated by the preprocessing statistics calculation unit 224 from the right hand joint velocity label image sRJVI is the right hand joint velocity luminance average value RVM(C).
Furthermore, the preprocessing statistics calculation unit 224 calculates the standard deviation of luminance in the multiple left hand joint velocity label images sLJVI included in the learning data sIs for each coordinate axis C. The standard deviation for each coordinate axis C calculated by the preprocessing statistics calculation unit 224 from the left hand joint velocity label image sLJVI is the left hand joint velocity luminance standard deviation LVS(C).
In addition, the preprocessing statistics calculation unit 224 calculates the standard deviation of luminance in the multiple right hand joint velocity label images sRJVI included in the learning data sIs for each coordinate axis C. The standard deviation for each coordinate axis C calculated by the preprocessing statistics calculation unit 224 from the right hand joint velocity label image sRJVI is the right hand joint velocity luminance standard deviation RVS(C).
Note that, instead of using separate luminance average values and luminance standard deviations for the left hand and for the right hand as described above, a luminance average value and a luminance standard deviation computed across both the left and right hands may be used. The following describes an example that uses separate statistics for the left hand and for the right hand, but the description also applies when statistics computed across both hands are used.

前処理統計量計算部２２４は、関節位置輝度平均値ＰＭ、関節位置輝度標準偏差ＰＳ、関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを前処理統計量保存部２３１に格納する。
The pre-processing statistics calculation unit 224 stores the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS in the pre-processing statistics storage unit 231.
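The per-axis statistics can be illustrated with a short sketch. It assumes that each pixel of a label image is a tuple holding one luminance value per coordinate axis C, and it uses the population standard deviation; the text does not state whether a population or sample standard deviation is intended, so that choice is an assumption.

```python
import math

def channel_stats(images):
    """Per-axis mean and population standard deviation over all pixels.

    images: list of label images; image[row][col] is a tuple with one
    luminance value per coordinate axis C (assumed pixel layout).
    """
    n_axes = len(images[0][0][0])
    sums = [0.0] * n_axes
    sq_sums = [0.0] * n_axes
    count = 0
    for img in images:
        for row in img:
            for px in row:
                count += 1
                for c, v in enumerate(px):
                    sums[c] += v
                    sq_sums[c] += v * v
    means = [s / count for s in sums]
    # max(..., 0.0) guards against tiny negative values from rounding.
    stds = [math.sqrt(max(sq / count - m * m, 0.0))
            for sq, m in zip(sq_sums, means)]
    return means, stds

# Toy data: two 1 x 2 label images with two axes per pixel.
sLJPI_images = [
    [[(0.0, 10.0), (2.0, 10.0)]],
    [[(4.0, 10.0), (6.0, 10.0)]],
]
LPM, LPS = channel_stats(sLJPI_images)
```

Passing the left hand and right hand label images together in one list would give the variant in which the statistics are computed across both hands.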

＊＊要素作業推定モデル生成部２２５の説明＊＊
要素作業推定モデル生成部２２５は、学習データ保存部２３０から、学習データｓＩｓを取得する。具体的には、要素作業推定モデル生成部２２５は、学習データｓＩｓとして、左手関節位置ラベル画像ｓＬＪＰＩとラベルＬＢＬとの複数の対、右手関節位置ラベル画像ｓＲＪＰＩとラベルＬＢＬとの複数の対、左手関節速度ラベル画像ｓＬＪＶＩとラベルＬＢＬとの複数の対、右手関節速度ラベル画像ｓＲＪＶＩとラベルＬＢＬとの複数の対を取得する。
また、要素作業推定モデル生成部２２５は、前処理統計量保存部２３１から、関節位置輝度平均値ＰＭ、関節位置輝度標準偏差ＰＳ、関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを取得する。 **Explanation of the element work estimation model generation unit 225**
The element work estimation model generation unit 225 acquires the learning data sIs from the learning data storage unit 230. Specifically, the element work estimation model generation unit 225 acquires, as the learning data sIs, a plurality of pairs of left hand joint position label images sLJPI and labels LBL, a plurality of pairs of right hand joint position label images sRJPI and labels LBL, a plurality of pairs of left hand joint velocity label images sLJVI and labels LBL, and a plurality of pairs of right hand joint velocity label images sRJVI and labels LBL.
In addition, the element work estimation model generation unit 225 acquires the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS from the preprocessing statistics storage unit 231.

次に、要素作業推定モデル生成部２２５は、学習データｓＩＳに含まれる左手関節位置ラベル画像ｓＬＪＰＩを、一定の幅及び高さをもつ画像にリサイズする。
更に、要素作業推定モデル生成部２２５は、リサイズ後の左手関節位置ラベル画像ｓＬＪＰＩの画素値を座標軸Ｃごとに関節位置輝度平均値ＰＭ及び関節位置輝度標準偏差ＰＳを用いて標準化する。
具体的には、要素作業推定モデル生成部２２５は、リサイズ後の左手関節位置ラベル画像ｓＬＪＰＩの画素値を左手関節位置輝度平均値ＬＰＭ（Ｃ）と左手関節位置輝度標準偏差ＬＰＳ（Ｃ）を用いて標準化する。
Next, the element work estimation model generation unit 225 resizes the left hand joint position label image sLJPI included in the learning data sIs into an image having a fixed width and height.
Furthermore, the element work estimation model generation unit 225 standardizes the pixel values of the resized left hand joint position label image sLJPI for each coordinate axis C using the joint position luminance average value PM and the joint position luminance standard deviation PS.
Specifically, the element work estimation model generation unit 225 standardizes the pixel values of the resized left hand joint position label image sLJPI using the left hand joint position luminance average value LPM(C) and the left hand joint position luminance standard deviation LPS(C).

また、要素作業推定モデル生成部２２５は、学習データｓＩＳに含まれる右手関節位置ラベル画像ｓＲＪＰＩを、一定の幅及び高さをもつ画像にリサイズする。
更に、要素作業推定モデル生成部２２５は、リサイズ後の右手関節位置ラベル画像ｓＲＪＰＩの画素値を座標軸Ｃごとに関節位置輝度平均値ＰＭ及び関節位置輝度標準偏差ＰＳを用いて標準化する。
具体的には、要素作業推定モデル生成部２２５は、リサイズ後の右手関節位置ラベル画像ｓＲＪＰＩの画素値を右手関節位置輝度平均値ＲＰＭ（Ｃ）と右手関節位置輝度標準偏差ＲＰＳ（Ｃ）を用いて標準化する。
In addition, the element work estimation model generation unit 225 resizes the right hand joint position label image sRJPI included in the learning data sIs into an image having a fixed width and height.
Furthermore, the element work estimation model generation unit 225 standardizes the pixel values of the resized right hand joint position label image sRJPI for each coordinate axis C using the joint position luminance average value PM and the joint position luminance standard deviation PS.
Specifically, the element work estimation model generation unit 225 standardizes the pixel values of the resized right hand joint position label image sRJPI using the right hand joint position luminance average value RPM(C) and the right hand joint position luminance standard deviation RPS(C).

また、要素作業推定モデル生成部２２５は、学習データｓＩＳに含まれる左手関節速度ラベル画像ｓＬＪＶＩを、一定の幅及び高さをもつ画像にリサイズする。
更に、要素作業推定モデル生成部２２５は、リサイズ後の左手関節速度ラベル画像ｓＬＪＶＩの画素値を座標軸Ｃごとに関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを用いて標準化する。
具体的には、要素作業推定モデル生成部２２５は、リサイズ後の左手関節速度ラベル画像ｓＬＪＶＩの画素値を左手関節速度輝度平均値ＬＶＭ（Ｃ）と左手関節速度輝度標準偏差ＬＶＳ（Ｃ）を用いて標準化する。
In addition, the element work estimation model generation unit 225 resizes the left hand joint velocity label image sLJVI included in the learning data sIs into an image having a fixed width and height.
Furthermore, the element work estimation model generation unit 225 standardizes the pixel values of the resized left hand joint velocity label image sLJVI for each coordinate axis C using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS.
Specifically, the element work estimation model generation unit 225 standardizes the pixel values of the resized left hand joint velocity label image sLJVI using the left hand joint velocity luminance average value LVM(C) and the left hand joint velocity luminance standard deviation LVS(C).

また、要素作業推定モデル生成部２２５は、学習データｓＩＳに含まれる右手関節速度ラベル画像ｓＲＪＶＩを、一定の幅及び高さをもつ画像にリサイズする。
更に、要素作業推定モデル生成部２２５は、リサイズ後の右手関節速度ラベル画像ｓＲＪＶＩの画素値を座標軸Ｃごとに関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを用いて標準化する。
具体的には、要素作業推定モデル生成部２２５は、リサイズ後の右手関節速度ラベル画像ｓＲＪＶＩの画素値を右手関節速度輝度平均値ＲＶＭ（Ｃ）と右手関節速度輝度標準偏差ＲＶＳ（Ｃ）を用いて標準化する。
In addition, the element work estimation model generation unit 225 resizes the right hand joint velocity label image sRJVI included in the learning data sIs into an image having a fixed width and height.
Furthermore, the element work estimation model generation unit 225 standardizes the pixel values of the resized right hand joint velocity label image sRJVI for each coordinate axis C using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS.
Specifically, the element work estimation model generation unit 225 standardizes the pixel values of the resized right hand joint velocity label image sRJVI using the right hand joint velocity luminance average value RVM(C) and the right hand joint velocity luminance standard deviation RVS(C).
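The resize-then-standardize preprocessing applied to each label image can be sketched as follows. The interpolation method is not specified in the text, so nearest-neighbour resizing is used here purely as an assumption; the `eps` term that guards against a zero standard deviation is likewise an addition for illustration.

```python
def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of img[row][col] to out_h x out_w."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def standardize(img, means, stds, eps=1e-8):
    """Apply (value - mean) / std separately for each coordinate axis C."""
    return [[tuple((v - m) / (s + eps) for v, m, s in zip(px, means, stds))
             for px in row] for row in img]

# Toy 1 x 4 label image with a single axis per pixel.
sLJPI = [[(0.0,), (2.0,), (4.0,), (6.0,)]]
resized = resize_nearest(sLJPI, 2, 2)  # fixed 2 x 2 size
normalized = standardize(resized, means=[3.0], stds=[1.0], eps=0.0)
```

Because work performance time periods differ in length, the label images differ in width; resizing gives every input the same shape before it reaches the neural network.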

次に、要素作業推定モデル生成部２２５は、リサイズ及び標準化の後の各ラベル画像を、事前に定義したニューラルネットワークに入力する。ニューラルネットワークに入力するラベル画像は、リサイズ及び標準化の後の左手関節位置ラベル画像ｓＬＪＰＩ、リサイズ及び標準化の後の右手関節位置ラベル画像ｓＲＪＰＩ、リサイズ及び標準化の後の左手関節速度ラベル画像ｓＬＪＶＩ、及びリサイズ及び標準化の後の右手関節速度ラベル画像ｓＲＪＶＩである。
ニューラルネットワークは事前学習済みの畳み込みニューラルネットワークと全結合層を組み合わせたものとする。また、事前学習済みの畳み込みニューラルネットワークについて、利用する層の数及び層の種類等の組み合わせは、学習装置２００のユーザが任意に指定する。 Next, the element work estimation model generation unit 225 inputs each of the resized and standardized label images to a predefined neural network. The label images to be input to the neural network are the resized and standardized left hand joint position label image sLJPI, the resized and standardized right hand joint position label image sRJPI, the resized and standardized left hand joint velocity label image sLJVI, and the resized and standardized right hand joint velocity label image sRJVI.
The neural network is a combination of a pre-trained convolutional neural network and a fully connected layer. The number of layers and the type of layers to be used in the pre-trained convolutional neural network are arbitrarily specified by the user of the learning device 200.

要素作業推定モデル生成部２２５は、畳み込みニューラルネットワークによって抽出された各ラベル画像の特徴マップを取得する。
リサイズ及び標準化の後の左手関節位置ラベル画像ｓＬＪＰＩの特徴マップを、特徴マップｆＬＰという。
リサイズ及び標準化の後の右手関節位置ラベル画像ｓＲＪＰＩの特徴マップを、特徴マップｆＲＰという。
リサイズ及び標準化の後の左手関節速度ラベル画像ｓＬＪＶＩの特徴マップを、特徴マップｆＬＶという。
リサイズ及び標準化の後の右手関節速度ラベル画像ｓＲＪＶＩの特徴マップを、特徴マップｆＲＶという。
The element work estimation model generation unit 225 acquires a feature map of each label image extracted by the convolutional neural network.
The feature map of the left hand joint position label image sLJPI after resizing and standardization is called the feature map fLP.
The feature map of the right hand joint position label image sRJPI after resizing and standardization is called the feature map fRP.
The feature map of the left hand joint velocity label image sLJVI after resizing and standardization is called the feature map fLV.
The feature map of the right hand joint velocity label image sRJVI after resizing and standardization is called the feature map fRV.

要素作業推定モデル生成部２２５は、これら特徴マップｆＬＰ、特徴マップｆＲＰ、特徴マップｆＬＶ及び特徴マップｆＲＶをベクトル化して結合する。
更に、要素作業推定モデル生成部２２５は、結合後の特徴マップを、ニューラルネットワークに入力し、ニューラルネットワークの重みを学習し、要素作業推定モデルＭを生成する。
そして、要素作業推定モデル生成部２２５は、要素作業推定モデルＭを要素作業推定モデル保存部２３２に格納する。 The element work estimation model generation unit 225 vectorizes and combines these feature maps fLP, fRP, fLV, and fRV.
Furthermore, the element work estimation model generation unit 225 inputs the combined feature map to the neural network, learns the weights of the neural network, and thereby generates the element work estimation model M.
Then, the element work estimation model generation unit 225 stores the element work estimation model M in the element work estimation model storage unit 232.
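The weight-learning step can be illustrated with a deliberately small sketch. It assumes that the pre-trained convolutional part has already produced the vectorized features, and it trains only a single fully connected layer with a softmax cross-entropy loss and plain stochastic gradient descent; the loss, the optimizer, the hyperparameters, and the decision to train only the final layer are assumptions, since the text does not specify them. A practical implementation would more likely use a deep learning framework.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def train_head(features, labels, n_classes, lr=0.5, epochs=200):
    """Fit one fully connected layer with cross-entropy loss and SGD."""
    dim = len(features[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = softmax([sum(w * v for w, v in zip(W[k], x)) + b[k]
                         for k in range(n_classes)])
            for k in range(n_classes):
                g = p[k] - (1.0 if k == y else 0.0)  # gradient w.r.t. score k
                for i in range(dim):
                    W[k][i] -= lr * g * x[i]
                b[k] -= lr * g
    return W, b

def predict(W, b, x):
    """Return the index of the class with the highest score."""
    scores = [sum(w * v for w, v in zip(row, x)) + bk for row, bk in zip(W, b)]
    return max(range(len(scores)), key=scores.__getitem__)

# Each sample stands for the concatenation fLP + fRP + fLV + fRV (toy values).
features = [[1.0, 0.0, 0.2, 0.0], [0.9, 0.1, 0.1, 0.0],
            [0.0, 1.0, 0.0, 0.3], [0.1, 0.8, 0.0, 0.2]]
labels = [0, 0, 1, 1]  # indices of hypothetical element work types
W, b = train_head(features, labels, n_classes=2)
```

Here the learned `W` and `b` play the role of the element work estimation model M; storing them corresponds to saving M in the element work estimation model storage unit 232.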

＊＊学習データ保存部２３０の説明＊＊
学習データ保存部２３０は、学習データｓＩｓを保存する。
**Explanation of the learning data storage unit 230**
The learning data storage unit 230 stores the learning data sIs.

＊＊前処理統計量保存部２３１の説明＊＊
前処理統計量保存部２３１は、前処理統計量（関節位置輝度平均値ＰＭ、関節位置輝度標準偏差ＰＳ、関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳ）を保存する。 **Explanation of Pre-Processing Statistics Storage Unit 231**
The pre-processing statistics storage unit 231 stores the pre-processing statistics (the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS).

＊＊要素作業推定モデル保存部２３２の説明＊＊
要素作業推定モデル保存部２３２は、要素作業推定モデルＭを保存する。 **Explanation of the element work estimation model storage unit 232**
The element work estimation model storage unit 232 stores the element work estimation model M.

＊＊要素作業推定モデル生成部２２５の内部構成例の説明＊＊
図１３は、要素作業推定モデル生成部２２５の内部構成例を示す。
図１３を用いて、要素作業推定モデル生成部２２５の内部構成例を説明する。
なお、図１３では、学習装置２００の機能構成要素のうち、要素作業推定モデル生成部２２５の内部構成例を説明するのに必要な機能構成要素のみが図示されている。 **Explanation of an example of the internal configuration of the element work estimation model generation unit 225**
FIG. 13 shows an example of the internal configuration of the element work estimation model generation unit 225.
An example of the internal configuration of the element work estimation model generation unit 225 will be described with reference to FIG. 13.
Note that, in FIG. 13, of the functional components of the learning device 200, only the functional components necessary for explaining an example of the internal configuration of the element work estimation model generation unit 225 are shown.

要素作業推定モデル生成部２２５は、内部構成として、関節位置画像取得部２２５１、関節速度画像取得部２２５２、関節位置特徴抽出部２２５３、関節速度特徴抽出部２２５４及び関節特徴分類学習部２２５５を有する。
関節位置画像取得部２２５１は、図８に示す関節位置画像取得部１２４１に相当する。
関節速度画像取得部２２５２は、図８に示す関節速度画像取得部１２４２に相当する。
関節位置特徴抽出部２２５３は、図８に示す関節位置特徴抽出部１２４３に相当する。
関節速度特徴抽出部２２５４は、図８に示す関節速度特徴抽出部１２４４に相当する。
関節画像特徴分類学習部２２５５は、図８に示す関節画像特徴分類部１２４５に相当する。 The element work estimation model generation unit 225 has, as its internal components, a joint position image acquisition unit 2251, a joint velocity image acquisition unit 2252, a joint position feature extraction unit 2253, a joint velocity feature extraction unit 2254, and a joint image feature classification learning unit 2255.
The joint position image acquisition unit 2251 corresponds to the joint position image acquisition unit 1241 shown in FIG. 8.
The joint velocity image acquisition unit 2252 corresponds to the joint velocity image acquisition unit 1242 shown in FIG. 8.
The joint position feature extraction unit 2253 corresponds to the joint position feature extraction unit 1243 shown in FIG. 8.
The joint velocity feature extraction unit 2254 corresponds to the joint velocity feature extraction unit 1244 shown in FIG. 8.
The joint image feature classification learning unit 2255 corresponds to the joint image feature classification unit 1245 shown in FIG. 8.

＊＊関節位置画像取得部２２５１の説明＊＊
関節位置画像取得部２２５１は、学習データ保存部２３０から、学習データｓＩｓを取得する。
具体的は、関節位置画像取得部２２５１は、学習データｓＩｓとして、左手関節位置ラベル画像ｓＬＪＰＩとラベルＬＢＬとの複数の対、右手関節位置ラベル画像ｓＲＪＰＩとラベルＬＢＬとの複数の対、左手関節速度ラベル画像ｓＬＪＶＩとラベルＬＢＬとの複数の対、右手関節速度ラベル画像ｓＲＪＶＩとラベルＬＢＬとの複数の対を取得する。
また、関節位置画像取得部２２５１は、前処理統計量保存部２３１から、前処理統計量を取得する。具体的には、関節位置画像取得部２２５１は、前処理統計量として関節位置輝度平均値ＰＭ、関節位置輝度標準偏差ＰＳ、関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを取得する。 **Explanation of the joint position image acquisition unit 2251**
The joint position image acquisition unit 2251 acquires the learning data sIs from the learning data storage unit 230.
Specifically, the joint position image acquisition unit 2251 acquires, as the learning data sIs, multiple pairs of left hand joint position label images sLJPI and labels LBL, multiple pairs of right hand joint position label images sRJPI and labels LBL, multiple pairs of left hand joint velocity label images sLJVI and labels LBL, and multiple pairs of right hand joint velocity label images sRJVI and labels LBL.
Furthermore, the joint position image acquisition unit 2251 acquires the preprocessing statistics from the preprocessing statistics storage unit 231. Specifically, the joint position image acquisition unit 2251 acquires the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS as the preprocessing statistics.

そして、関節位置画像取得部２２５１は、左手関節位置ラベル画像ｓＬＪＰＩを、一定の幅及び高さをもつ画像にリサイズする。更に、関節位置画像取得部２２５１は、リサイズ後の左手関節位置ラベル画像ｓＬＪＰＩの画素値を座標軸Ｃごとに左手関節位置輝度平均値ＬＰＭ（Ｃ）と左手関節位置輝度標準偏差ＬＰＳ（Ｃ）を用いて標準化する。
また、関節位置画像取得部２２５１は、右手関節位置ラベル画像ｓＲＪＰＩを、一定の幅及び高さをもつ画像にリサイズする。更に、関節位置画像取得部２２５１は、リサイズ後の右手関節位置ラベル画像ｓＲＪＰＩの画素値を座標軸Ｃごとに右手関節位置輝度平均値ＲＰＭ（Ｃ）と右手関節位置輝度標準偏差ＲＰＳ（Ｃ）を用いて標準化する。
関節位置画像取得部２２５１は、リサイズ及び標準化の後の左手関節位置ラベル画像ｓＬＪＰＩ、リサイズ及び標準化の後の右手関節位置ラベル画像ｓＲＪＰＩを関節位置特徴抽出部２２５３に出力する。
なお、上記のように、左手についての輝度平均値及び輝度標準偏差と右手についての輝度平均値及び輝度標準偏差を用いるのではなく、左手と右手の両者にわたる輝度平均値及び輝度標準偏差を用いるようにしてもよい。以下では、左手についての輝度平均値及び輝度標準偏差と右手についての輝度平均値及び輝度標準偏差を用いる例を説明するが、以下の説明は左手と右手の両者にわたる輝度平均値及び輝度標準偏差を用いる場合にも適用される。 Then, the joint position image acquisition unit 2251 resizes the left hand joint position label image sLJPI to an image having a certain width and height. Furthermore, the joint position image acquisition unit 2251 standardizes the pixel values of the resized left hand joint position label image sLJPI using the left hand joint position luminance average value LPM(C) and the left hand joint position luminance standard deviation LPS(C) for each coordinate axis C.
The joint position image acquisition unit 2251 also resizes the right hand joint position label image sRJPI to an image having a certain width and height. Furthermore, the joint position image acquisition unit 2251 standardizes the pixel values of the resized right hand joint position label image sRJPI using the right hand joint position luminance average value RPM(C) and the right hand joint position luminance standard deviation RPS(C) for each coordinate axis C.
The joint position image acquisition unit 2251 outputs the resized and standardized left hand joint position label image sLJPI and the resized and standardized right hand joint position label image sRJPI to the joint position feature extraction unit 2253.
Note that, instead of using separate luminance average values and luminance standard deviations for the left hand and for the right hand as described above, a single luminance average value and luminance standard deviation computed over both hands may be used. The following describes the example that uses separate left hand and right hand statistics, but the description applies equally to the case in which statistics computed over both hands are used.
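As a rough illustration of the resize-then-standardize preprocessing described above, the sketch below resizes each coordinate-axis channel to a fixed size and then applies (pixel - mean(C)) / std(C) per channel. The nearest-neighbor interpolation, the channel x height x width layout, the small `eps` guard against division by zero, and the function names are assumptions for this example; the source does not specify them.

```python
from typing import List, Sequence

Channel = List[List[float]]   # height x width
Image = List[Channel]         # assumed layout: one channel per coordinate axis C


def resize_nearest(channel: Channel, out_h: int, out_w: int) -> Channel:
    """Resize one channel to out_h x out_w by nearest-neighbor sampling."""
    in_h, in_w = len(channel), len(channel[0])
    return [[channel[(y * in_h) // out_h][(x * in_w) // out_w]
             for x in range(out_w)]
            for y in range(out_h)]


def standardize(image: Image, means: Sequence[float],
                stds: Sequence[float], eps: float = 1e-8) -> Image:
    """Standardize each channel C with its own mean(C) and std(C)."""
    return [[[(v - means[c]) / (stds[c] + eps) for v in row]
             for row in channel]
            for c, channel in enumerate(image)]


def preprocess(image: Image, out_h: int, out_w: int,
               means: Sequence[float], stds: Sequence[float]) -> Image:
    """Resize every channel to a fixed size, then standardize per channel."""
    resized = [resize_nearest(ch, out_h, out_w) for ch in image]
    return standardize(resized, means, stds)


# Toy one-channel label image; means/stds stand in for LPM(C) and LPS(C).
label_image = [[[0.0, 10.0], [20.0, 30.0]]]
processed = preprocess(label_image, 4, 4, means=[15.0], stds=[10.0])
```

Using the same statistics at training time and at estimation time keeps the inputs to the convolutional neural network on a common scale.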

また、関節位置画像取得部２２５１は、左手関節速度ラベル画像ｓＬＪＶＩ、右手関節速度ラベル画像ｓＲＪＶＩ、関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを、関節速度画像取得部２２５２に出力する。
ここでは、関節位置画像取得部２２５１が、左手関節速度ラベル画像ｓＬＪＶＩ、右手関節速度ラベル画像ｓＲＪＶＩ、関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを取得し、これらを関節速度画像取得部２２５２に出力する例を説明している。
これに代えて、関節速度画像取得部２２５２が、学習データ保存部２３０から左手関節速度ラベル画像ｓＬＪＶＩ及び右手関節速度ラベル画像ｓＲＪＶＩを取得し、前処理統計量保存部２３１から、関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを取得するようにしてもよい。この場合は、関節位置画像取得部２２５１は、学習データ保存部２３０から左手関節位置ラベル画像ｓＬＪＰＩ及び右手関節位置ラベル画像ｓＲＪＰＩのみを取得し、前処理統計量保存部２３１から関節位置輝度平均値ＰＭ及び関節位置輝度標準偏差ＰＳのみを取得する。 In addition, the joint position image acquisition unit 2251 outputs the left hand joint velocity label image sLJVI, the right hand joint velocity label image sRJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS to the joint velocity image acquisition unit 2252.
Here, an example is described in which the joint position image acquisition unit 2251 acquires the left hand joint velocity label image sLJVI, the right hand joint velocity label image sRJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS, and outputs these to the joint velocity image acquisition unit 2252.
Alternatively, the joint velocity image acquisition unit 2252 may acquire the left hand joint velocity label image sLJVI and the right hand joint velocity label image sRJVI from the learning data storage unit 230, and acquire the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS from the preprocessing statistics storage unit 231. In this case, the joint position image acquisition unit 2251 acquires only the left hand joint position label image sLJPI and the right hand joint position label image sRJPI from the learning data storage unit 230, and acquires only the joint position luminance average value PM and the joint position luminance standard deviation PS from the preprocessing statistics storage unit 231.

＊＊関節速度画像取得部２２５２の説明＊＊
関節速度画像取得部２２５２は、関節位置画像取得部２２５１から、左手関節速度ラベル画像ｓＬＪＶＩ、右手関節速度ラベル画像ｓＲＪＶＩ、関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを取得する。
そして、関節速度画像取得部２２５２は、左手関節速度ラベル画像ｓＬＪＶＩを、一定の幅及び高さをもつ画像にリサイズする。更に、関節速度画像取得部２２５２は、リサイズ後の左手関節速度ラベル画像ｓＬＪＶＩの画素値を座標軸Ｃごとに左手関節速度輝度平均値ＬＶＭ（Ｃ）と左手関節速度輝度標準偏差ＬＶＳ（Ｃ）を用いて標準化する。
また、関節速度画像取得部２２５２は、右手関節速度ラベル画像ｓＲＪＰＩを、一定の幅及び高さをもつ画像にリサイズする。更に、関節速度画像取得部２２５２は、リサイズ後の右手関節速度ラベル画像ｓＲＪＶＩの画素値を座標軸Ｃごとに右手関節速度輝度平均値ＲＶＭ（Ｃ）と右手関節速度輝度標準偏差ＲＶＳ（Ｃ）を用いて標準化する。
関節速度画像取得部２２５２は、リサイズ及び標準化の後の左手関節速度ラベル画像ｓＬＪＶＩ、リサイズ及び標準化の後の右手関節速度ラベル画像ｓＲＪＶＩを関節速度特徴抽出部２２５４に出力する。 **Explanation of the joint velocity image acquisition unit 2252**
The joint velocity image acquisition unit 2252 acquires the left hand joint velocity label image sLJVI, the right hand joint velocity label image sRJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS from the joint position image acquisition unit 2251.
Then, the joint velocity image acquisition unit 2252 resizes the left hand joint velocity label image sLJVI to an image having a certain width and height. Furthermore, the joint velocity image acquisition unit 2252 standardizes the pixel values of the resized left hand joint velocity label image sLJVI using the left hand joint velocity luminance average value LVM(C) and the left hand joint velocity luminance standard deviation LVS(C) for each coordinate axis C.
The joint velocity image acquisition unit 2252 also resizes the right hand joint velocity label image sRJVI to an image having a certain width and height. Furthermore, the joint velocity image acquisition unit 2252 standardizes the pixel values of the resized right hand joint velocity label image sRJVI using the right hand joint velocity luminance average value RVM(C) and the right hand joint velocity luminance standard deviation RVS(C) for each coordinate axis C.
The joint velocity image acquisition unit 2252 outputs the resized and standardized left hand joint velocity label image sLJVI and the resized and standardized right hand joint velocity label image sRJVI to the joint velocity feature extraction unit 2254.

＊＊関節位置特徴抽出部２２５３の説明＊＊
関節位置特徴抽出部２２５３は、関節位置画像取得部２２５１から、リサイズ及び標準化の後の左手関節位置ラベル画像ｓＬＪＰＩと、リサイズ及び標準化の後の右手関節位置ラベル画像ｓＲＪＰＩを取得する。
そして、関節位置特徴抽出部２２５３は、リサイズ及び標準化の後の左手関節位置ラベル画像ｓＬＪＰＩを事前学習済みの畳み込みニューラルネットワークに入力し、特徴ベクトルｆＬＰを得る。 **Explanation of the joint position feature extraction unit 2253**
The joint position feature extraction unit 2253 acquires a resized and standardized left hand joint position label image sLJPI and a resized and standardized right hand joint position label image sRJPI from the joint position image acquisition unit 2251.
Then, the joint position feature extraction unit 2253 inputs the resized and standardized left hand joint position label image sLJPI to a pre-trained convolutional neural network to obtain a feature vector fLP.

また、関節位置特徴抽出部２２５３は、後述する関節画像特徴分類部２２５５から伝搬する損失値に基づくネットワークの学習、即ち、ネットワークのもつ重みの計算を行うこともある。例えば、事前学習済みのネットワークの重みを初期値として、すべての層の重みを更新する場合である。
重みの計算については、関節位置特徴抽出部２２５３は、例えば、一般に用いられる誤差逆伝搬法を用いる。
また、関節位置特徴抽出部２２５３は、重みの計算に、その他公知の手法を用いてもよい。 The joint position feature extraction unit 2253 may also train the network, that is, calculate the weights of the network, based on the loss value propagated from the joint image feature classification learning unit 2255 described later. This is the case, for example, when the weights of all layers are updated starting from the weights of the pre-trained network as initial values.
For calculating the weights, the joint position feature extraction unit 2253 uses, for example, the commonly used backpropagation algorithm.
Furthermore, the joint position feature extraction unit 2253 may use other known methods to calculate the weights.

関節位置特徴抽出部２２５３は、学習処理として、特徴マップｆＬＰの取得、損失値Ｌの計算、重みの計算を予め定めた回数繰り返す。
最終的に、関節位置特徴抽出部２２５３は、リサイズ及び標準化の後の左手関節位置ラベル画像ｓＬＪＰＩの特徴マップｆＬＰを取得する。
また、関節位置特徴抽出部２２５３は、リサイズ及び標準化の後の右手関節位置ラベル画像ｓＲＪＰＩについても同様の処理を行い、リサイズ及び標準化の後の右手関節位置ラベル画像ｓＲＪＰＩの特徴マップｆＲＰを取得する。
関節位置特徴抽出部２２５３は、特徴マップｆＬＰと特徴マップｆＲＰを関節画像特徴分類学習部２２５５に出力する。 As a learning process, the joint position feature extraction unit 2253 repeats the acquisition of the feature map fLP, the calculation of the loss value L, and the calculation of the weights a predetermined number of times.
Finally, the joint position feature extraction unit 2253 obtains a feature map fLP of the left hand joint position label image sLJPI after resizing and standardization.
In addition, the joint position feature extraction unit 2253 performs similar processing on the right hand joint position label image sRJPI after resizing and standardization, and obtains a feature map fRP of the right hand joint position label image sRJPI after resizing and standardization.
The joint position feature extraction unit 2253 outputs the feature map fLP and the feature map fRP to the joint image feature classification learning unit 2255.

＊＊関節速度特徴抽出部２２５４の説明＊＊
関節速度特徴抽出部２２５４は、関節速度画像取得部２２５２から、リサイズ及び標準化の後の左手関節速度ラベル画像ｓＬＪＶＩと、リサイズ及び標準化の後の右手関節速度ラベル画像ｓＲＪＶＩを取得する。
そして、関節速度特徴抽出部２２５４は、関節位置特徴抽出部２２５３と同様の処理を行い、リサイズ及び標準化の後の左手関節速度ラベル画像ｓＬＪＶＩの特徴マップｆＬＶを取得する。また、関節速度特徴抽出部２２５４は。リサイズ及び標準化の後の右手関節速度ラベル画像ｓＲＪＶＩの特徴マップｆＲＶを取得する。
そして、関節速度特徴抽出部２２５４は、特徴マップｆＬＶと特徴マップｆＲＶを関節画像特徴分類学習部２２５５に出力する。 **Description of the joint velocity feature extraction unit 2254**
The joint velocity feature extraction unit 2254 acquires a resized and standardized left hand joint velocity label image sLJVI and a resized and standardized right hand joint velocity label image sRJVI from the joint velocity image acquisition unit 2252.
Then, the joint velocity feature extraction unit 2254 performs the same process as the joint position feature extraction unit 2253 to obtain a feature map fLV of the left hand joint velocity label image sLJVI after resizing and standardization. Also, the joint velocity feature extraction unit 2254 obtains a feature map fRV of the right hand joint velocity label image sRJVI after resizing and standardization.
Then, the joint velocity feature extraction unit 2254 outputs the feature map fLV and the feature map fRV to the joint image feature classification learning unit 2255.

＊＊関節画像特徴分類学習部２２５５の説明＊＊
関節画像特徴分類学習部２２５５は、関節位置特徴抽出部２２５３から、特徴マップｆＬＰと特徴マップｆＲＰを取得する。
また、関節画像特徴分類学習部２２５５は、関節速度特徴抽出部２２５４から、特徴マップｆＬＶと特徴マップｆＲＶを取得する。
そして、関節画像特徴分類学習部２２５５は、特徴マップｆＬＰ、特徴マップｆＲＰ、特徴マップｆＬＶ及び特徴マップｆＲＶをベクトル化して結合する。
更に、関節画像特徴分類学習部２２５５は、結合後の特徴マップを、ニューラルネットワークに入力し、ニューラルネットワークの重みを学習し、要素作業推定モデルＭを生成する。
関節画像特徴分類学習部２２５５は、損失値Ｌの計算に、例えば、一般に分類の学習に用いるクロスエントロピーを利用する。
関節位置特徴抽出部２２５３は、ラベルＬＢＬの種類におけるデータ量のばらつきに対応するように、ＦｏｃａｌＬｏｓｓなどのインバランスデータに対応した損失値を用いてもよい。
また、関節画像特徴分類学習部２２５５は、データのばらつきに対応するように、特徴ベクトル同士の角度とラベルＬＢＬの要素作業種類ｔｙｐとの関係に基づくような距離学習を可能とする損失関数を用いてもよい。
そして、関節画像特徴分類学習部２２５５は、要素作業推定モデルＭを要素作業推定モデル保存部２３２に格納する。 **Explanation of the joint image feature classification learning unit 2255**
The joint image feature classification learning unit 2255 acquires the feature maps fLP and fRP from the joint position feature extraction unit 2253.
In addition, the joint image feature classification learning unit 2255 acquires the feature maps fLV and fRV from the joint velocity feature extraction unit 2254.
Then, the joint image feature classification learning unit 2255 vectorizes and combines the feature maps fLP, fRP, fLV, and fRV.
Furthermore, the joint image feature classification learning unit 2255 inputs the combined feature map to a neural network, learns the weights of the neural network, and generates the element work estimation model M.
To calculate the loss value L, the joint image feature classification learning unit 2255 uses, for example, cross-entropy, which is generally used for training classifiers.
The joint position feature extraction unit 2253 may use a loss function designed for imbalanced data, such as Focal Loss, to cope with differences in the amount of data among the types of label LBL.
In addition, to cope with variability in the data, the joint image feature classification learning unit 2255 may use a loss function that enables metric learning, for example one based on the relationship between the angle between feature vectors and the element work type typ of the label LBL.
Then, the joint image feature classification learning unit 2255 stores the element work estimation model M in the element work estimation model storage unit 232.
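The two loss options named above can be compared with a short sketch. It uses the standard definitions, cross-entropy as -log(p_t) and Focal Loss as -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true label. The default values `gamma=2.0` and `alpha=1.0` and the sample probabilities are assumptions for this example; the source does not state which settings are used.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], label: int) -> float:
    """Cross-entropy for one sample: -log of the true-class probability."""
    return -math.log(probs[label])


def focal_loss(probs: Sequence[float], label: int,
               gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal Loss for one sample: down-weights well-classified examples."""
    p_t = probs[label]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction and a poorly classified one (true label 0).
probs_easy = [0.9, 0.05, 0.05]
probs_hard = [0.2, 0.5, 0.3]

ce_easy = cross_entropy(probs_easy, 0)
fl_easy = focal_loss(probs_easy, 0)
ce_hard = cross_entropy(probs_hard, 0)
fl_hard = focal_loss(probs_hard, 0)
```

With `gamma=0` and `alpha=1` Focal Loss reduces to cross-entropy. Larger `gamma` shrinks the contribution of easy samples, which tends to help when some label types have far more data than others.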

＊＊＊動作の説明＊＊＊
次に、本実施の形態に係る学習装置２００の動作例を説明する。
図１４は、学習装置２００の動作例を示すフローチャートである。 *** Operation Description ***
Next, an example of the operation of the learning device 200 according to the present embodiment will be described.
FIG. 14 is a flowchart showing an example of the operation of the learning device 200.

先ず、ステップＳ２１において、関節位置時系列データ取得部２２０が、撮像装置２１０からの映像Ｖから関節位置時系列データＨＰＴを生成する。First, in step S21, the joint position time series data acquisition unit 220 generates joint position time series data HPT from the image V from the imaging device 210.

次に、ステップＳ２２において、関節速度計算部２２１が、関節位置時系列データＨＰＴから関節速度時系列データＨＶＴを生成する。Next, in step S22, the joint velocity calculation unit 221 generates joint velocity time series data HVT from the joint position time series data HPT.

次に、ステップＳ２３において、関節時系列データ画像化部２２２が、関節位置時系列データＨＰＴを画像化して、左手関節位置画像ＬＪＰＩと右手関節位置画像ＲＪＰＩを生成する。また、関節時系列データ画像化部２２２が、関節速度時系列データＨＶＴを画像化して、左手関節速度画像ＬＪＶＩと右手関節速度画像ＲＪＶＩを生成する。Next, in step S23, the joint time series data imaging unit 222 images the joint position time series data HPT to generate a left hand joint position image LJPI and a right hand joint position image RJPI. The joint time series data imaging unit 222 also images the joint velocity time series data HVT to generate a left hand joint velocity image LJVI and a right hand joint velocity image RJVI.

次に、ステップＳ２４において、学習データ生成部２２３が学習データｓＩｓを生成する。Next, in step S24, the learning data generation unit 223 generates learning data sIs.

次に、ステップＳ２５において、前処理統計量計算部２２４が、前処理統計量を計算する。Next, in step S25, the preprocessing statistics calculation unit 224 calculates the preprocessing statistics.

最後に、ステップＳ２６において、要素作業推定モデル生成部２２５が、学習データｓＩｓと前処理統計量とを用いて、要素作業推定モデルＭを生成する。
最後に、要素作業推定モデル生成部２２５は、要素作業推定モデルＭを要素作業推定モデル保存部２３２に格納する。 Finally, in step S26, the element work estimation model generation unit 225 generates an element work estimation model M using the learning data sIs and the pre-processing statistics.
The element work estimation model generation unit 225 then stores the element work estimation model M in the element work estimation model storage unit 232.

＊＊＊実施の形態の効果の説明＊＊＊
本実施の形態によれば、作業実施時間帯に対応する部分画像から作業実施時間帯で実施されている要素作業を推定するための要素作業推定モデルＭを生成することができる。 ***Description of Effects of the Embodiment***
According to this embodiment, it is possible to generate an element work estimation model M that estimates, from the partial image corresponding to a work performance time zone, the element work being performed in that work performance time zone.

実施の形態３．
本実施の形態では、実施の形態２で説明した学習装置２００の変形例として学習装置３００を説明する。より具体的には、本実施の形態では、学習装置３００は、実施の形態１で説明した作業実施時間帯検出部１２３が作業実施時間帯を検出するのに用いる学習済みモデルを生成する。
本実施の形態では、主に実施の形態２との差異を説明する。
なお、以下で説明していない事項は、実施の形態２と同様である。 Embodiment 3.
In this embodiment, a learning device 300 will be described as a modified example of the learning device 200 described in embodiment 2. More specifically, in this embodiment, the learning device 300 generates a trained model that is used by the work performance time zone detection unit 123 described in embodiment 1 to detect a work performance time zone.
In this embodiment, differences from the second embodiment will be mainly described.
It should be noted that matters not explained below are the same as those in the second embodiment.

＊＊構成の説明＊＊
図１５は、本実施の形態に係る学習装置３００の機能構成例を示す。
図１５において、撮像装置３１０は、図１２に示す撮像装置２１０と同様である。
また、関節位置時系列データ取得部３２０は、図１２に示す関節位置時系列データ取得部２２０と同様である。
また、関節速度計算部３２１は、図１２に示す関節速度計算部２２１と同様である。
また、関節時系列データ画像化部３２２は、図１２に示す関節時系列データ画像化部２２２と同様である。
このため、これらについては詳細な説明を省略する。
なお、学習装置３００も、学習装置２００と同様に、図２３に例示するハードウェア構成を有するものとする。 **Configuration Description**
FIG. 15 shows an example of the functional configuration of a learning device 300 according to this embodiment.
In FIG. 15, an imaging device 310 is similar to the imaging device 210 shown in FIG. 12.
Moreover, the joint position time-series data acquisition unit 320 is similar to the joint position time-series data acquisition unit 220 shown in FIG. 12.
Moreover, the joint velocity calculation unit 321 is similar to the joint velocity calculation unit 221 shown in FIG. 12.
The joint time-series data imaging unit 322 is similar to the joint time-series data imaging unit 222 shown in FIG. 12.
For this reason, detailed explanations of these will be omitted.
Similar to the learning device 200, the learning device 300 also has the hardware configuration illustrated in FIG. 23.

＊＊把持道具時系列データ生成部３２５の説明＊＊
把持道具時系列データ生成部３２５は、実施の形態１で示した把持道具時系列データ生成部１２６と同様の処理を行う。 **Explanation of the gripped tool time-series data generation unit 325**
The gripped tool time-series data generation unit 325 performs the same processing as the gripped tool time-series data generation unit 126 described in the first embodiment.

つまり、把持道具時系列データ生成部３２５は、映像Ｖ又はセンサ情報ＲＲＳを取得する。
把持道具時系列データ生成部３２５は、映像Ｖ又はセンサ情報ＲＲＳを用いて、作業従事者の手が道具を把持しているか否かを識別する。また、作業従事者の手が道具を把持している場合は、把持道具時系列データ生成部３２５は、作業従事者の手が把持している道具の種類を識別する。
把持道具時系列データ生成部３２５は、把持道具時系列データ生成部１２６と同様の方法で、作業従事者の手が道具を把持しているか否かを識別する。また、把持道具時系列データ生成部３２５は、把持道具時系列データ生成部１２６と同様の方法で、作業従事者の手が把持している道具の種類を識別する。 That is, the gripped tool time-series data generation unit 325 acquires the video V or the sensor information RRS.
The gripped tool time-series data generation unit 325 uses the video V or the sensor information RRS to identify whether the worker's hand is gripping a tool. If the worker's hand is gripping a tool, the gripped tool time-series data generation unit 325 also identifies the type of tool being gripped.
The gripped tool time-series data generation unit 325 identifies whether the worker's hand is gripping a tool by the same method as the gripped tool time-series data generation unit 126. It also identifies the type of tool gripped by the worker's hand by the same method as the gripped tool time-series data generation unit 126.

把持道具時系列データ生成部３２５は、識別結果を示す左手把持道具データＬＴＯ１と右手把持道具データＲＴＯ１を学習データ生成部３２３に出力する。
また、把持道具時系列データ生成部３２５は、左手把持道具データＬＴＯ１と右手把持道具データＲＴＯ１を把持道具情報保存部３３３に格納する。
把持道具時系列データ生成部３２５が生成する左手把持道具データＬＴＯ１と右手把持道具データＲＴＯ１は、把持道具時系列データ生成部１２６が生成する左手把持道具データＬＴＯ１と右手把持道具データＲＴＯ１と同様である。 The gripped tool time-series data generation unit 325 outputs the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1, which indicate the identification results, to the learning data generation unit 323.
In addition, the gripped tool time-series data generation unit 325 stores the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1 in the gripped tool information storage unit 333.
The left hand gripped tool data LTO1 and the right hand gripped tool data RTO1 generated by the gripped tool time-series data generation unit 325 are the same as those generated by the gripped tool time-series data generation unit 126.

＊＊学習データ生成部３２３の説明＊＊
学習データ生成部３２３は、実施の形態１で示した作業実施時間帯検出部１２３と同様の処理を行う。 **Explanation of the learning data generation unit 323**
The learning data generation unit 323 performs the same processing as the work performance time zone detection unit 123 described in the first embodiment.

つまり、学習データ生成部３２３は、関節時系列データ画像化部３２２から、左手関節位置画像ＬＪＰＩ、左手関節速度画像ＬＪＶＩ、右手関節位置画像ＲＪＰＩ及び右手関節速度画像ＲＪＶＩを取得する。
更に、学習データ生成部３２３は、把持道具時系列データ生成部３２５から左手把持道具データＬＴＯ１と右手把持道具データＲＴＯ１を取得する。 That is, the learning data generation unit 323 acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 322.
Furthermore, the learning data generation unit 323 acquires the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1 from the gripped tool time-series data generation unit 325.

そして、学習データ生成部３２３は、作業実施時間帯検出部１２３と同様に、出現状況判定、道具把持状況判定及び変位量判定を行う。
更に、学習データ生成部３２３は、作業実施時間帯検出部１２３と同様に、出現状況判定の判定結果と道具把持状況判定の判定結果と変位量判定の判定結果の少なくともいずれかに基づき、作業従事時間を作業実施時間帯と非作業時間帯に分割する。
本実施の形態における作業従事時間は、学習フェーズにおける作業従事時間である。つまり、本実施の形態における作業従事時間は、学習フェーズにおいて、作業従事者が作業に従事している時間である。
また、本実施の形態における作業実施時間帯は、学習フェーズにおける作業実施時間帯である。つまり、本実施の形態における作業実施時間帯は、学習フェーズにおいて、作業従事者がいずれかの要素作業を実施している時間帯である。
また、本実施の形態における非作業時間帯は、学習フェーズにおける非作業時間帯である。つまり、本実施の形態における非作業時間帯は、学習フェーズにおいて、作業従事者がいずれの要素作業も実施していない時間帯である。
本実施の形態では、学習データ生成部３２３は、複数の作業実施時間帯を検出するものとする。 Then, like the work performance time zone detection unit 123, the learning data generation unit 323 performs the appearance status determination, the tool gripping status determination, and the displacement amount determination.
Furthermore, like the work performance time zone detection unit 123, the learning data generation unit 323 divides the work engagement time into work performance time zones and non-work time zones based on at least one of the result of the appearance status determination, the result of the tool gripping status determination, and the result of the displacement amount determination.
The work engagement time in this embodiment is the work engagement time in the learning phase, that is, the time during which the worker is engaged in work in the learning phase.
Likewise, the work performance time zone in this embodiment is a work performance time zone in the learning phase, that is, a time zone in the learning phase during which the worker is performing any one of the element works.
The non-work time zone in this embodiment is a non-work time zone in the learning phase, that is, a time zone in the learning phase during which the worker is not performing any of the element works.
In this embodiment, the learning data generation unit 323 is assumed to detect a plurality of work performance time zones.

次に、学習データ生成部３２３は、左手関節位置画像ＬＪＰＩから、作業実施時間帯ごとに、各作業実施時間帯に撮像された部分映像を構成する部分画像を、抽出左手関節位置画像ｅｘｔＬＪＰＩとして抽出する。
また、学習データ生成部３２３は、左手関節速度画像ＬＪＶＩから、作業実施時間帯ごとに、各作業実施時間帯に撮像された部分映像を構成する部分画像を、抽出左手関節速度画像ｅｘｔＬＪＶＩとして抽出する。
また、学習データ生成部３２３は、右手関節位置画像ＲＪＰＩから、作業実施時間帯ごとに、各作業実施時間帯に撮像された部分映像を構成する部分画像を、抽出右手関節位置画像ｅｘｔＲＪＰＩとして抽出する。
また、学習データ生成部３２３は、右手関節速度画像ＲＪＶＩから、作業実施時間帯ごとに、各作業実施時間帯に撮像された部分映像を構成する部分画像を、抽出右手関節速度画像ｅｘｔＲＪＶＩとして抽出する。
そして、学習データ生成部３２３は、複数の抽出左手関節位置画像ｅｘｔＬＪＰＩ、複数の抽出左手関節速度画像ｅｘｔＬＪＶＩ、複数の抽出右手関節位置画像ｅｘｔＲＪＰＩ及び複数の抽出右手関節速度画像ｅｘｔＲＪＶＩを学習データｓＩｔとして学習データ保存部３３０に格納する。
また、学習データ生成部３２３は、複数の作業実施時間帯を学習データｓＩｕとして学習データ保存部３３０に格納する。 Next, for each work performance time zone, the learning data generation unit 323 extracts from the left hand joint position image LJPI the partial image constituting the partial video captured in that work performance time zone, as an extracted left hand joint position image extLJPI.
Likewise, for each work performance time zone, the learning data generation unit 323 extracts from the left hand joint velocity image LJVI the partial image constituting the partial video captured in that work performance time zone, as an extracted left hand joint velocity image extLJVI.
For each work performance time zone, the learning data generation unit 323 also extracts from the right hand joint position image RJPI the partial image constituting the partial video captured in that work performance time zone, as an extracted right hand joint position image extRJPI.
For each work performance time zone, the learning data generation unit 323 also extracts from the right hand joint velocity image RJVI the partial image constituting the partial video captured in that work performance time zone, as an extracted right hand joint velocity image extRJVI.
Then, the learning data generation unit 323 stores the multiple extracted left hand joint position images extLJPI, the multiple extracted left hand joint velocity images extLJVI, the multiple extracted right hand joint position images extRJPI, and the multiple extracted right hand joint velocity images extRJVI in the learning data storage unit 330 as learning data sIt.
Furthermore, the learning data generation unit 323 stores the plurality of work performance time zones in the learning data storage unit 330 as learning data sIu.
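The division into work performance time zones and the per-zone extraction described above might be sketched as follows. The sketch assumes that the three determinations have already been reduced to one boolean flag per frame, and that the time axis of a joint image runs along its columns. Both are assumptions made for this example, as are the function names `find_work_zones` and `extract_partial_images`.

```python
from typing import List, Sequence, Tuple

Zone = Tuple[int, int]  # half-open frame range [start, end)


def find_work_zones(is_working: Sequence[bool]) -> List[Zone]:
    """Return one [start, end) range for each run of consecutive True flags."""
    zones: List[Zone] = []
    start = None
    for i, flag in enumerate(is_working):
        if flag and start is None:
            start = i                      # a work performance time zone begins
        elif not flag and start is not None:
            zones.append((start, i))       # the zone ends at a non-work frame
            start = None
    if start is not None:                  # zone still open at the last frame
        zones.append((start, len(is_working)))
    return zones


def extract_partial_images(joint_image: Sequence[Sequence[float]],
                           zones: Sequence[Zone]) -> List[List[List[float]]]:
    """Cut the columns (frames) of each zone out of a joint image."""
    return [[list(row[s:e]) for row in joint_image] for s, e in zones]


# Per-frame flags: True while the worker performs some element work.
flags = [False, True, True, False, False, True, True, True]
zones = find_work_zones(flags)

# Toy joint image with 2 rows and 8 frames (columns).
ljpi = [[0, 1, 2, 3, 4, 5, 6, 7],
        [10, 11, 12, 13, 14, 15, 16, 17]]
ext_ljpi = extract_partial_images(ljpi, zones)
```

The same zone list can be applied to LJVI, RJPI and RJVI so that the four extracted images of one zone stay aligned in time.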

＊＊前処理統計量計算部３２４の説明＊＊
前処理統計量計算部３２４は、学習データ保存部３３０から学習データｓＩｔを取得する。
そして、前処理統計量計算部３２４は、学習データｓＩｔから、前処理統計量として関節位置輝度平均値ＰＭ、関節位置輝度標準偏差ＰＳ、関節速度輝度平均値ＶＭ及び関節速度輝度標準偏差ＶＳを計算する。
具体的には、前処理統計量計算部３２４は、学習データｓＩｔに含まれる複数の抽出左手関節位置画像ｅｘｔＬＪＰＩにおける輝度の平均値を座標軸Ｃごとに計算する。前処理統計量計算部３２４が複数の抽出左手関節位置画像ｅｘｔＬＪＰＩから計算した座標軸Ｃごとの平均値は、左手関節位置輝度平均値ＬＰＭ（Ｃ）である。
また、前処理統計量計算部３２４は、学習データｓＩｔに含まれる複数の抽出右手関節位置画像ｅｘｔＲＪＰＩにおける輝度の平均値を座標軸Ｃごとに計算する。前処理統計量計算部３２４が複数の抽出右手関節位置画像ｅｘｔＲＪＰＩから計算した座標軸Ｃごとの平均値は、右手関節位置輝度平均値ＲＰＭ（Ｃ）である。
更に、前処理統計量計算部３２４は、学習データｓＩｔに含まれる複数の抽出左手関節位置画像ｅｘｔＬＪＰＩにおける輝度の標準偏差を座標軸Ｃごとに計算する。前処理統計量計算部３２４が抽出左手関節位置画像ｅｘｔＬＪＰＩから計算した座標軸Ｃごとの標準偏差は、左手関節位置輝度標準偏差ＬＰＳ（Ｃ）である。
また、前処理統計量計算部３２４は、学習データｓＩｔに含まれる複数の抽出右手関節位置画像ｅｘｔＲＪＰＩにおける輝度の標準偏差を座標軸Ｃごとに計算する。前処理統計量計算部３２４が抽出右手関節位置画像ｅｘｔＲＪＰＩから計算した座標軸Ｃごとの標準偏差は、右手関節位置輝度標準偏差ＲＰＳ（Ｃ）である。 **Explanation of Pre-Processing Statistics Calculation Unit 324**
The pre-processing statistics calculation unit 324 acquires the learning data sIt from the learning data storage unit 330.
Then, the pre-processing statistics calculation unit 324 calculates the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS as pre-processing statistics from the learning data sIt.
Specifically, the preprocessing statistics calculation unit 324 calculates the average luminance value in the multiple extracted left hand joint position images extLJPI included in the learning data sIt for each coordinate axis C. The average value for each coordinate axis C calculated by the preprocessing statistics calculation unit 324 from the multiple extracted left hand joint position images extLJPI is the left hand joint position luminance average value LPM(C).
In addition, the preprocessing statistics calculation unit 324 calculates the average luminance value in the multiple extracted right hand joint position images extRJPI included in the learning data sIt for each coordinate axis C. The average value for each coordinate axis C calculated by the preprocessing statistics calculation unit 324 from the multiple extracted right hand joint position images extRJPI is the right hand joint position luminance average value RPM(C).
Furthermore, the preprocessing statistics calculation unit 324 calculates the standard deviation of luminance in the multiple extracted left hand joint position images extLJPI included in the learning data sIt for each coordinate axis C. The standard deviation for each coordinate axis C calculated by the preprocessing statistics calculation unit 324 from the extracted left hand joint position images extLJPI is the left hand joint position luminance standard deviation LPS(C).
Furthermore, the preprocessing statistics calculation unit 324 calculates the standard deviation of luminance in the multiple extracted right hand joint position images extRJPI included in the learning data sIt for each coordinate axis C. The standard deviation for each coordinate axis C calculated by the preprocessing statistics calculation unit 324 from the extracted right hand joint position images extRJPI is the right hand joint position luminance standard deviation RPS(C).

The preprocessing statistics calculation unit 324 also calculates, for each coordinate axis C, the average luminance value in the multiple extracted left hand joint velocity images extLJVI included in the learning data sIt. The average value for each coordinate axis C calculated by the preprocessing statistics calculation unit 324 from the extracted left hand joint velocity images extLJVI is the left hand joint velocity luminance average value LVM(C).
Likewise, the preprocessing statistics calculation unit 324 calculates, for each coordinate axis C, the average luminance value in the multiple extracted right hand joint velocity images extRJVI included in the learning data sIt. The average value for each coordinate axis C calculated by the preprocessing statistics calculation unit 324 from the extracted right hand joint velocity images extRJVI is the right hand joint velocity luminance average value RVM(C).
Furthermore, the preprocessing statistics calculation unit 324 calculates the standard deviation of luminance in the multiple extracted left hand joint velocity images extLJVI included in the learning data sIt for each coordinate axis C. The standard deviation for each coordinate axis C calculated by the preprocessing statistics calculation unit 324 from the extracted left hand joint velocity images extLJVI is the left hand joint velocity luminance standard deviation LVS(C).
Furthermore, the preprocessing statistics calculation unit 324 calculates the standard deviation of luminance in the multiple extracted right hand joint velocity images extRJVI included in the learning data sIt for each coordinate axis C. The standard deviation for each coordinate axis C calculated by the preprocessing statistics calculation unit 324 from the extracted right hand joint velocity images extRJVI is the right hand joint velocity luminance standard deviation RVS(C).
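
The per-axis statistics described above can be sketched as follows. This is only an illustration: the function name `channel_stats`, the nested-list layout `image[c][row][col]` (with `c` standing for the coordinate axis C), the use of the population standard deviation, and the sample pixel values are all assumptions that the source does not specify.

```python
from math import sqrt


def channel_stats(images):
    """Return per-axis (means, stds) pooled over a collection of images.

    Each image is assumed to be indexed as image[c][row][col], where c is the
    coordinate axis C (an assumed layout, not one stated in the source).
    """
    n_axes = len(images[0])
    means, stds = [], []
    for c in range(n_axes):
        # Pool every pixel of axis c across all images.
        values = [v for img in images for row in img[c] for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)  # population variance
        means.append(mean)
        stds.append(sqrt(var))
    return means, stds


# Two tiny two-axis images standing in for extLJPI (hypothetical data).
ext_ljpi = [
    [[[0.0, 2.0]], [[10.0, 10.0]]],
    [[[4.0, 6.0]], [[10.0, 10.0]]],
]
lpm, lps = channel_stats(ext_ljpi)  # stand-ins for LPM(C) and LPS(C)
```

The same function would be called once per image set (extLJPI, extRJPI, extLJVI, extRJVI) to obtain the eight statistics, or once over the pooled left and right images if statistics across both hands are preferred.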

Unless a strict distinction is required, the left hand joint position luminance average value LPM(C) and the right hand joint position luminance average value RPM(C) are collectively referred to as the joint position luminance average value PM.
Similarly, the left hand joint position luminance standard deviation LPS(C) and the right hand joint position luminance standard deviation RPS(C) are collectively referred to as the joint position luminance standard deviation PS.
The left hand joint velocity luminance average value LVM(C) and the right hand joint velocity luminance average value RVM(C) are collectively referred to as the joint velocity luminance average value VM.
The left hand joint velocity luminance standard deviation LVS(C) and the right hand joint velocity luminance standard deviation RVS(C) are collectively referred to as the joint velocity luminance standard deviation VS.

Note that, instead of using separate luminance average values and luminance standard deviations for the left hand and for the right hand as described above, a luminance average value and a luminance standard deviation computed across both the left and right hands may be used. The example below uses separate statistics for the left hand and the right hand, but the description also applies when statistics computed across both hands are used.

The preprocessing statistics calculation unit 324 stores the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS in the preprocessing statistics storage unit 331.

**Explanation of the work performance time zone detection model generation unit 326**
The work performance time zone detection model generation unit 326 acquires the learning data sIt and the learning data sIu from the learning data storage unit 330.
Specifically, the work performance time zone detection model generation unit 326 acquires, as the learning data sIt, a plurality of extracted left hand joint position images extLJPI, a plurality of extracted left hand joint velocity images extLJVI, a plurality of extracted right hand joint position images extRJPI, and a plurality of extracted right hand joint velocity images extRJVI.
The work performance time zone detection model generation unit 326 also acquires, as the learning data sIu, the true values of a plurality of work performance time zones.
In addition, the work performance time zone detection model generation unit 326 acquires the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS from the preprocessing statistics storage unit 331.
The work performance time zone detection model generation unit 326 further acquires the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1 from the gripped tool information storage unit 333.

Then, the work performance time zone detection model generation unit 326 generates the trained model that the work performance time zone detection unit 123 uses to detect work performance time zones. This trained model is the work performance time zone detection model DM.

Specifically, the work performance time zone detection model generation unit 326 resizes each extracted left hand joint position image extLJPI included in the learning data sIt into an image having a fixed width and height.
The work performance time zone detection model generation unit 326 then standardizes the pixel values of the resized extracted left hand joint position image extLJPI for each coordinate axis C using the joint position luminance average value PM and the joint position luminance standard deviation PS.
More precisely, it standardizes the pixel values of the resized extracted left hand joint position image extLJPI using the left hand joint position luminance average value LPM(C) and the left hand joint position luminance standard deviation LPS(C).

The work performance time zone detection model generation unit 326 also resizes each extracted right hand joint position image extRJPI included in the learning data sIt into an image having a fixed width and height.
It then standardizes the pixel values of the resized extracted right hand joint position image extRJPI for each coordinate axis C using the joint position luminance average value PM and the joint position luminance standard deviation PS.
More precisely, it standardizes the pixel values of the resized extracted right hand joint position image extRJPI using the right hand joint position luminance average value RPM(C) and the right hand joint position luminance standard deviation RPS(C).

The work performance time zone detection model generation unit 326 also resizes each extracted left hand joint velocity image extLJVI included in the learning data sIt into an image having a fixed width and height.
It then standardizes the pixel values of the resized extracted left hand joint velocity image extLJVI for each coordinate axis C using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS.
More precisely, it standardizes the pixel values of the resized extracted left hand joint velocity image extLJVI using the left hand joint velocity luminance average value LVM(C) and the left hand joint velocity luminance standard deviation LVS(C).

The work performance time zone detection model generation unit 326 also resizes each extracted right hand joint velocity image extRJVI included in the learning data sIt into an image having a fixed width and height.
It then standardizes the pixel values of the resized extracted right hand joint velocity image extRJVI for each coordinate axis C using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS.
More precisely, it standardizes the pixel values of the resized extracted right hand joint velocity image extRJVI using the right hand joint velocity luminance average value RVM(C) and the right hand joint velocity luminance standard deviation RVS(C). As described above, the order of resizing and standardization may be reversed.
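
The resize-then-standardize step described above can be sketched as follows. The source does not name an interpolation method, so nearest-neighbour resizing is used here purely as a simple stand-in. The function names, the `image[c][row][col]` layout, and the small `eps` term that guards against a zero standard deviation are likewise assumptions made for illustration.

```python
def resize_nearest(channel, out_h, out_w):
    """Nearest-neighbour resize of one 2-D channel (a list of rows)."""
    in_h, in_w = len(channel), len(channel[0])
    return [
        [channel[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def preprocess(image, means, stds, out_h, out_w, eps=1e-8):
    """Resize every axis of image[c][row][col], then standardize it per axis."""
    result = []
    for c, channel in enumerate(image):
        resized = resize_nearest(channel, out_h, out_w)
        # (pixel - mean) / std with the stored statistics for axis c;
        # eps (an assumption) avoids division by zero for a constant axis.
        result.append(
            [[(v - means[c]) / (stds[c] + eps) for v in row] for row in resized]
        )
    return result


image = [[[1.0, 3.0]]]  # one axis, 1 x 2 pixels (hypothetical data)
out = preprocess(image, means=[2.0], stds=[1.0], out_h=2, out_w=4)
```

For a left hand joint position image, `means` and `stds` would hold LPM(C) and LPS(C); the other three image types would use their own statistics in the same way.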

Next, the work performance time zone detection model generation unit 326 inputs each of the resized and standardized images to a predefined neural network. The images input to the neural network are the resized and standardized extracted left hand joint position image extLJPI, the resized and standardized extracted right hand joint position image extRJPI, the resized and standardized extracted left hand joint velocity image extLJVI, and the resized and standardized extracted right hand joint velocity image extRJVI.
The neural network used here is assumed to be, for example, a neural network that has both the functions of a convolutional neural network and the functions of a region proposal network such as the one used in Faster R-CNN.

The work performance time zone detection model generation unit 326 also calculates a loss value L. As the loss value L, the work performance time zone detection model generation unit 326 uses a loss function to calculate the difference between the true values of the multiple work performance time zones included in the learning data sIu and the multiple candidate (predicted) work performance time zones described later. As the simplest loss function, the work performance time zone detection model generation unit 326 can use a squared error function.
After calculating the loss value L, the work performance time zone detection model generation unit 326 updates the weights of the neural network using backpropagation.
The work performance time zone detection model generation unit 326 generates the work performance time zone detection model DM by repeating these processes a predetermined number of times.
The work performance time zone detection model generation unit 326 stores the work performance time zone detection model DM in the work performance time zone detection model storage unit 332.
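
The loop of loss calculation, weight update, and repetition for a predetermined number of iterations can be illustrated with a deliberately small sketch. A single linear layer stands in for the convolutional and region proposal network, so backpropagation reduces to one application of the chain rule. The function name, learning rate, iteration count, and data are all hypothetical.

```python
def train(features, targets, n_iters=500, lr=0.05):
    """Fit w, b so that w[k] . x + b[k] approximates target k (squared error)."""
    n_in, n_out = len(features[0]), len(targets[0])
    w = [[0.0] * n_in for _ in range(n_out)]
    b = [0.0] * n_out
    n = len(features)
    losses = []
    for _ in range(n_iters):  # predetermined number of repetitions
        grad_w = [[0.0] * n_in for _ in range(n_out)]
        grad_b = [0.0] * n_out
        loss = 0.0
        for x, y in zip(features, targets):
            for k in range(n_out):
                pred = sum(wi * xi for wi, xi in zip(w[k], x)) + b[k]
                err = pred - y[k]
                loss += err ** 2  # squared error contribution to L
                grad_b[k] += 2 * err  # dL/db[k]
                for i in range(n_in):
                    grad_w[k][i] += 2 * err * x[i]  # dL/dw[k][i] via the chain rule
        # Gradient descent update on the mean loss.
        for k in range(n_out):
            b[k] -= lr * grad_b[k] / n
            for i in range(n_in):
                w[k][i] -= lr * grad_w[k][i] / n
        losses.append(loss / n)
    return w, b, losses


# Hypothetical data: one feature per sample and the true (start, end) of a zone.
features = [[0.0], [1.0], [2.0]]
targets = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
w, b, losses = train(features, targets)
```

In a real implementation an automatic differentiation framework would compute the gradients through all layers; the hand-written gradients here only make the update rule visible.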

**Explanation of the learning data storage unit 330**
The learning data storage unit 330 stores the learning data sIt and the learning data sIu.

**Explanation of the preprocessing statistics storage unit 331**
The preprocessing statistics storage unit 331 stores the preprocessing statistics (the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS).

**Explanation of the work performance time zone detection model storage unit 332**
The work performance time zone detection model storage unit 332 stores the work performance time zone detection model DM.

**Explanation of the gripped tool information storage unit 333**
The gripped tool information storage unit 333 stores the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1.

**Explanation of an example of the internal configuration of the work performance time zone detection model generation unit 326**
FIG. 16 shows an example of the internal configuration of the work performance time zone detection model generation unit 326.
An example of the internal configuration of the work performance time zone detection model generation unit 326 will be described with reference to FIG. 16.
Note that, of the functional components of the learning device 300, FIG. 16 shows only those needed to explain the example of the internal configuration of the work performance time zone detection model generation unit 326.

The work performance time zone detection model generation unit 326 has, as its internal components, a joint position image acquisition unit 3261, a joint velocity image acquisition unit 3262, a joint position feature extraction unit 3263, a joint velocity feature extraction unit 3264, an appearance status determination unit 3265, a tool gripping status determination unit 3266, a work performance time zone determination unit 3268, a proposal learning unit 3269, and a regression learning unit 3270.

**Explanation of the appearance status determination unit 3265**
The appearance status determination unit 3265 performs appearance status determination in the same manner as the appearance status determination unit 1231 shown in FIG. 8.

The appearance status determination unit 3265 acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 322.
Then, the appearance status determination unit 3265 analyzes the left hand joint position image LJPI and/or the left hand joint velocity image LJVI to generate a left hand appearance time zone set LS.
Similarly, the appearance status determination unit 3265 analyzes the right hand joint position image RJPI and/or the right hand joint velocity image RJVI to generate a right hand appearance time zone set RS.
The left hand appearance time zone set LS is the set of time zones during which the left hand appears in the video. The right hand appearance time zone set RS is the set of time zones during which the right hand appears in the video.
The appearance status determination unit 3265 outputs the left hand appearance time zone set LS and the right hand appearance time zone set RS to the work performance time zone determination unit 3268.
Note that the appearance status determination unit 3265 may perform the appearance status determination using the joint position time-series data HPT and/or the joint velocity time-series data HVT.
In FIG. 16, the input of the joint position time-series data HPT and/or the joint velocity time-series data HVT to the appearance status determination unit 3265 is not shown.
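
One way to picture an appearance time zone set such as LS is as a list of (start, end) pairs obtained from a per-frame visibility flag. The sketch below assumes that such a flag has already been derived from the joint images or the time-series data; the source does not state how. The function name, the half-open interval convention, and the `frame_time` parameter are also assumptions.

```python
def appearance_periods(visible, frame_time=1.0):
    """Turn a per-frame visibility flag into a list of (start, end) time zones."""
    periods, start = [], None
    for i, flag in enumerate(visible):
        if flag and start is None:
            start = i  # the hand enters the video
        elif not flag and start is not None:
            periods.append((start * frame_time, i * frame_time))
            start = None  # the hand leaves the video
    if start is not None:  # still visible at the last frame
        periods.append((start * frame_time, len(visible) * frame_time))
    return periods


left_visible = [False, True, True, False, False, True]  # hypothetical flags
ls = appearance_periods(left_visible)
```

Running the same function on the right hand flags would give the corresponding set RS.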

**Explanation of the tool gripping status determination unit 3266**
The tool gripping status determination unit 3266 performs tool gripping status determination in the same manner as the tool gripping status determination unit 1232 shown in FIG. 8.

The tool gripping status determination unit 3266 acquires the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1 from the gripped tool information storage unit 333.
Then, based on the left hand gripped tool data LTO1 and the right hand gripped tool data RTO1, the tool gripping status determination unit 3266 extracts, in chronological order and for each of the left hand and the right hand, the time zones during which a tool is gripped and the time zones during which no tool is gripped.
If the type of tool gripped by the worker changes, the tool gripping status determination unit 3266 treats the time zones during which a tool is gripped as separate time zones for each type of tool.
The tool gripping status determination unit 3266 outputs a left hand tool gripping status determination result LTS, which indicates the extraction result for the left hand, and a right hand tool gripping status determination result RTS, which indicates the extraction result for the right hand, to the work performance time zone determination unit 3268.
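
The extraction of gripping and non-gripping time zones, split by tool type, amounts to run-length encoding a per-frame tool label. The sketch below assumes that the gripped tool data can be read as one label per frame, with `None` meaning that no tool is gripped; this data layout, the function name, and the tool names are hypothetical.

```python
from itertools import groupby


def tool_periods(tool_per_frame, frame_time=1.0):
    """Run-length encode per-frame tool labels into (start, end, tool) time zones.

    None means no tool is gripped; a change of tool type starts a new time zone.
    """
    periods, index = [], 0
    for tool, run in groupby(tool_per_frame):
        length = len(list(run))
        periods.append((index * frame_time, (index + length) * frame_time, tool))
        index += length
    return periods


lto1 = [None, "driver", "driver", "wrench", None, None]  # hypothetical labels
lts = tool_periods(lto1)
```

Because `groupby` starts a new group whenever the label changes, consecutive frames with different tools fall into different time zones, which matches the handling of tool type changes described above.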

**Explanation of the work performance time zone determination unit 3268**
The work performance time zone determination unit 3268 performs the same processing as the work performance time zone determination unit 1234 shown in FIG. 8.

The work performance time zone determination unit 3268 acquires the left hand appearance time zone set LS and the right hand appearance time zone set RS from the appearance status determination unit 3265.
The work performance time zone determination unit 3268 also acquires the left hand tool gripping status determination result LTS and the right hand tool gripping status determination result RTS from the tool gripping status determination unit 3266.
Based on these, the work performance time zone determination unit 3268 divides the work engagement time into work performance time zones and non-work time zones.
The work performance time zone determination unit 3268 outputs a work performance time zone set FS, which is the set of work performance time zones, to the joint position image acquisition unit 3261. The work performance time zone set FS indicates the start time and end time of each work performance time zone.
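
The passage above does not spell out the division rule itself; it defers to the work performance time zone determination unit 1234. The sketch below therefore uses a purely hypothetical rule to show the shape of the inputs and of the set FS: the work engagement time is cut at every boundary found in LS, RS, LTS, and RTS, and a segment counts as a work performance time zone when at least one hand appears throughout it. This rule and all names are assumptions and should not be read as the method of the source.

```python
def work_zones(ls, rs, lts, rts, total_time):
    """Split [0, total_time) at every boundary; keep segments with a visible hand.

    Hypothetical rule for illustration only.
    """
    cuts = {0.0, total_time}
    for start, end in ls + rs:
        cuts.update((start, end))
    for start, end, _tool in lts + rts:
        cuts.update((start, end))
    ordered = sorted(cuts)

    def covered(periods, t0, t1):
        # True when some appearance time zone contains the whole segment.
        return any(s <= t0 and t1 <= e for s, e in periods)

    fs = []
    for t0, t1 in zip(ordered, ordered[1:]):
        if covered(ls, t0, t1) or covered(rs, t0, t1):
            fs.append((t0, t1))  # (start time, end time) of one zone
    return fs


# Hypothetical inputs.
ls = [(1.0, 4.0), (5.0, 6.0)]
rs = []
lts = [(0.0, 1.0, None), (1.0, 3.0, "driver"), (3.0, 4.0, "wrench"), (4.0, 6.0, None)]
rts = [(0.0, 6.0, None)]
fs = work_zones(ls, rs, lts, rts, total_time=6.0)
```

With these inputs the tool change at time 3.0 splits the first appearance of the left hand into two zones, while the interval with no visible hand becomes non-work time.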

**Explanation of the joint position image acquisition unit 3261**
The joint position image acquisition unit 3261 performs the same processing as the joint position image acquisition unit 1241 shown in FIG. 8.

That is, the joint position image acquisition unit 3261 acquires the work performance time zone set FS from the work performance time zone determination unit 3268.
The joint position image acquisition unit 3261 also acquires the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, and the right hand joint velocity image RJVI from the joint time-series data imaging unit 322.
In addition, the joint position image acquisition unit 3261 acquires the preprocessing statistics from the preprocessing statistics storage unit 331. Specifically, the joint position image acquisition unit 3261 acquires the joint position luminance average value PM, the joint position luminance standard deviation PS, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS as the preprocessing statistics.

Then, for each work performance time zone included in the work performance time zone set FS, the joint position image acquisition unit 3261 extracts, from each of the left hand joint position image LJPI and the right hand joint position image RJPI, the partial image that corresponds to the partial video captured during that work performance time zone.
For each work performance time zone, the joint position image acquisition unit 3261 resizes each partial image into an image having a fixed width and height. The joint position image acquisition unit 3261 then standardizes the pixel values of the resized partial image extracted from the left hand joint position image LJPI using the joint position luminance average value PM and the joint position luminance standard deviation PS. It likewise standardizes the pixel values of the resized partial image extracted from the right hand joint position image RJPI using the joint position luminance average value PM and the joint position luminance standard deviation PS. As described above, the order of resizing and standardization may be reversed.
The joint position image acquisition unit 3261 outputs the resized and standardized partial images for each of the left hand joint position image LJPI and the right hand joint position image RJPI to the joint position feature extraction unit 3263.
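
Extracting a partial image per work performance time zone can be pictured as slicing the joint image along its time axis. The sketch below assumes that time runs along the columns of `image[c][row][col]` with one column per frame, and that zone boundaries are converted to column indices through a frame rate `fps`. This layout and all names are assumptions that this passage does not confirm.

```python
def extract_partial(image, zone, fps=1.0):
    """Cut the columns of image[c][row][col] that fall inside zone = (start, end)."""
    first = int(zone[0] * fps)
    last = int(zone[1] * fps)  # exclusive end column
    return [[row[first:last] for row in channel] for channel in image]


# One axis, two joints (rows), six frames (columns); hypothetical data.
ljpi = [[[0, 1, 2, 3, 4, 5], [10, 11, 12, 13, 14, 15]]]
fs = [(1.0, 3.0), (5.0, 6.0)]
partials = [extract_partial(ljpi, zone) for zone in fs]
```

Because the zones differ in length, the partial images differ in width, which is why each one is resized to a fixed width and height before standardization.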

The joint position image acquisition unit 3261 also outputs the work performance time zone set FS, the left hand joint velocity image LJVI, the right hand joint velocity image RJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS to the joint velocity image acquisition unit 3262.
The example described here is one in which the joint position image acquisition unit 3261 acquires the work performance time zone set FS, the left hand joint velocity image LJVI, the right hand joint velocity image RJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS, and outputs them to the joint velocity image acquisition unit 3262.
Alternatively, the joint velocity image acquisition unit 3262 may acquire the work performance time zone set FS from the work performance time zone determination unit 3268, acquire the left hand joint velocity image LJVI and the right hand joint velocity image RJVI from the joint time-series data imaging unit 322, and acquire the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS from the preprocessing statistics storage unit 331.
In this case, the joint position image acquisition unit 3261 acquires only the left hand joint position image LJPI and the right hand joint position image RJPI from the joint time-series data imaging unit 322, and acquires only the joint position luminance average value PM and the joint position luminance standard deviation PS from the preprocessing statistics storage unit 331.

**Explanation of the joint velocity image acquisition unit 3262**
The joint velocity image acquisition unit 3262 performs the same processing as the joint velocity image acquisition unit 1242 shown in FIG. 8.

That is, the joint velocity image acquisition unit 3262 acquires the work performance time zone set FS, the left hand joint velocity image LJVI, the right hand joint velocity image RJVI, the joint velocity luminance average value VM, and the joint velocity luminance standard deviation VS from the joint position image acquisition unit 3261.
Then, for each work performance time zone included in the work performance time zone set FS, the joint velocity image acquisition unit 3262 extracts, from each of the left hand joint velocity image LJVI and the right hand joint velocity image RJVI, the partial image that corresponds to the partial video captured during that work performance time zone.
For each work performance time zone, the joint velocity image acquisition unit 3262 resizes each partial image into an image having a fixed width and height. The joint velocity image acquisition unit 3262 then standardizes the pixel values of the resized partial image extracted from the left hand joint velocity image LJVI using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS. It likewise standardizes the pixel values of the resized partial image extracted from the right hand joint velocity image RJVI using the joint velocity luminance average value VM and the joint velocity luminance standard deviation VS. As described above, the order of resizing and standardization may be reversed.
The joint velocity image acquisition unit 3262 outputs the resized and standardized partial images for each of the left hand joint velocity image LJVI and the right hand joint velocity image RJVI to the joint velocity feature extraction unit 3264.

**Explanation of the joint position feature extraction unit 3263**
The joint position feature extraction unit 3263 performs the same processing as the joint position feature extraction unit 1243 shown in FIG. 8.

That is, the joint position feature extraction unit 3263 acquires, from the joint position image acquisition unit 3261, the resized and standardized partial images for each of the left hand joint position image LJPI and the right hand joint position image RJPI.
Then, the joint position feature extraction unit 3263 inputs the partial images for each of the left hand joint position image LJPI and the right hand joint position image RJPI to a pre-trained convolutional neural network.
The joint position feature extraction unit 3263 extracts a feature vector, called a joint position feature vector, from each partial image.
Here, the feature vector obtained from a partial image of the left hand joint position image LJPI is referred to as the left hand position feature vector fLP, and the feature vector obtained from a partial image of the right hand joint position image RJPI is referred to as the right hand position feature vector fRP.
The joint position feature extraction unit 3263 outputs the left hand position feature vector fLP and the right hand position feature vector fRP to the proposal learning unit 3269.

**Explanation of the joint velocity feature extraction unit 3264**
The joint velocity feature extraction unit 3264 performs the same processing as the joint velocity feature extraction unit 1244 shown in FIG. 8.

That is, the joint velocity feature extraction unit 3264 acquires, from the joint velocity image acquisition unit 3262, the resized and standardized partial images of each of the left hand joint velocity image LJVI and the right hand joint velocity image RJVI.
The joint velocity feature extraction unit 3264 then inputs these partial images into a pre-trained convolutional neural network. The convolutional neural network used here may be the same as, or different from, the one used by the joint position feature extraction unit 3263.
The joint velocity feature extraction unit 3264 extracts a feature vector, called a joint velocity feature vector, from each partial image.
Here, the feature vector obtained from a partial image of the left hand joint velocity image LJVI is referred to as the left hand velocity feature vector fLV, and the feature vector obtained from a partial image of the right hand joint velocity image RJVI is referred to as the right hand velocity feature vector fRV.
The joint velocity feature extraction unit 3264 outputs the left hand velocity feature vector fLV and the right hand velocity feature vector fRV to the proposal learning unit 3269.

**Explanation of the proposal learning unit 3269**
The proposal learning unit 3269 acquires the left hand position feature vector fLP and the right hand position feature vector fRP from the joint position feature extraction unit 3263. It also acquires the left hand velocity feature vector fLV and the right hand velocity feature vector fRV from the joint velocity feature extraction unit 3264.
In addition, the proposal learning unit 3269 acquires the learning data sIt from the learning data storage unit 330.

Then, the proposal learning unit 3269 converts the left hand position feature vector fLP, the right hand position feature vector fRP, the left hand velocity feature vector fLV, and the right hand velocity feature vector fRV into a map format (FW×FH×FC). The map obtained by this conversion is called the feature map FM.
The proposal learning unit 3269 then performs, on the feature map FM, a process of proposing candidates for the work performance time zone.
For this proposal process, the proposal learning unit 3269 can use the region proposal network of Faster R-CNN, which proposes multiple candidate regions in which an object may exist. The output of the proposal process is a set of candidates for the work performance time zone. Each candidate is represented by a tuple of two elements: the start time of the candidate and the duration of the candidate.
The proposal learning unit 3269 outputs the plurality of candidates for the work performance time zone (a set of tuples) to the regression learning unit 3270 as the proposals PROPS.
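
As an illustration of the (start time, duration) tuple representation, the following is a minimal sketch of how candidate time zones could be enumerated along the time axis, by analogy with the anchors of a region proposal network. It is not the embodiment's implementation: the function name `generate_temporal_anchors` and the parameters `total_time`, `stride`, and `durations` are assumptions made for this example, and a real proposal network would additionally score and refine each candidate using the feature map FM.

```python
from typing import List, Tuple

# A candidate is a (start_time, duration) tuple, as described in the text.
Proposal = Tuple[float, float]


def generate_temporal_anchors(
    total_time: float,
    stride: float,
    durations: List[float],
) -> List[Proposal]:
    """Enumerate candidate (start_time, duration) tuples along the time axis.

    One anchor per duration is centered in every `stride`-long step
    (hypothetical parameters). Anchors are clipped to [0, total_time].
    """
    anchors: List[Proposal] = []
    n_positions = int(total_time // stride)
    for i in range(n_positions):
        center = (i + 0.5) * stride
        for d in durations:
            start = max(0.0, center - d / 2.0)
            end = min(total_time, center + d / 2.0)
            anchors.append((start, end - start))
    return anchors


# Example: a 10-second recording, anchors every 2 seconds, two durations.
anchors = generate_temporal_anchors(total_time=10.0, stride=2.0, durations=[2.0, 4.0])
print(len(anchors))
print(anchors[:2])
```

Representing each candidate by start time and duration, rather than by start and end, mirrors the tuple format used for the proposals PROPS and for the learning data sIu.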

**Explanation of the regression learning unit 3270**
The regression learning unit 3270 acquires the proposals PROPS from the proposal learning unit 3269.
Furthermore, the regression learning unit 3270 acquires the learning data sIu from the learning data storage unit 330. The learning data sIu consists of the true values of the multiple work performance time zones obtained by the learning data generation unit 323. Like the proposals PROPS, the learning data sIu is a set of tuples of two elements: the start time and the duration of a (true) work performance time zone.

Next, the regression learning unit 3270 obtains, from the proposals PROPS, the tuple of start time and duration for each candidate for the work performance time zone. Using the information in each tuple, the regression learning unit 3270 processes the partial feature map sFM extracted from the feature map FM with the fully connected layers of a convolutional neural network, and outputs an estimate of the start time and duration of the work performance time zone. The regression learning unit 3270 then calculates, for each output, a classification of whether it is a work performance time zone and a probability representing the degree of confidence that it is one.
The regression learning unit 3270 selects, from among the candidates for the work performance time zone, those whose degree of confidence exceeds a predetermined threshold. It may also select those candidates for which an index of the proportion of overlap with a true work performance time zone (for example, Intersection over Union) exceeds a predetermined threshold. Furthermore, the regression learning unit 3270 may apply non-maximum suppression based on the confidence probability.
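
The selection described above can be sketched as follows for one-dimensional time intervals. This is a minimal example under stated assumptions, not the embodiment's code: the names `temporal_iou` and `select_candidates`, and the two threshold parameters, are hypothetical, and the confidence scores are given as plain numbers rather than produced by a network.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (start_time, duration)


def temporal_iou(a: Interval, b: Interval) -> float:
    """Intersection over Union of two time intervals given as (start, duration)."""
    a_end = a[0] + a[1]
    b_end = b[0] + b[1]
    inter = max(0.0, min(a_end, b_end) - max(a[0], b[0]))
    union = a[1] + b[1] - inter
    return inter / union if union > 0 else 0.0


def select_candidates(
    candidates: Sequence[Interval],
    scores: Sequence[float],
    score_threshold: float,
    nms_iou_threshold: float,
) -> List[Interval]:
    """Keep confident candidates, then apply non-maximum suppression."""
    # 1) Confidence filter: keep candidates scoring above the threshold,
    #    ordered from most to least confident.
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    # 2) Non-maximum suppression: drop a candidate if it overlaps too much
    #    with an already kept, more confident candidate.
    kept: List[int] = []
    for i in order:
        if all(temporal_iou(candidates[i], candidates[j]) <= nms_iou_threshold for j in kept):
            kept.append(i)
    return [candidates[i] for i in kept]


candidates = [(0.0, 4.0), (1.0, 4.0), (10.0, 2.0), (20.0, 3.0)]
scores = [0.9, 0.8, 0.7, 0.2]
selected = select_candidates(candidates, scores, score_threshold=0.5, nms_iou_threshold=0.5)
print(selected)
```

In this example the second candidate overlaps the first with an IoU of 0.6 and is suppressed, and the last candidate is removed by the confidence threshold.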

Next, the regression learning unit 3270 calculates the difference between the work performance time zone of each selected candidate (predicted value) and the true value of the work performance time zone acquired as the learning data sIu.
The obtained difference is used as a loss value L for training the neural network in the proposal learning unit 3269 and the neural network in the regression learning unit 3270.
The regression learning unit 3270 may use an L1 loss, as in Faster R-CNN, to calculate the loss value L. Any method of calculating the loss value may be used as long as it appropriately calculates the difference between the predicted value and the true value of the work performance time zone.
After calculating the loss value L, the regression learning unit 3270 updates the weights of the neural networks using backpropagation.
By repeating these processes a predetermined number of times, the regression learning unit 3270 generates the work performance time zone detection model DM.
The regression learning unit 3270 stores the work performance time zone detection model DM in the work performance time zone detection model storage unit 332.
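
As a small worked example of the loss calculation, the sketch below computes a plain L1 loss between predicted and true (start time, duration) tuples. It assumes that each prediction has already been paired with its true value; the function name `l1_loss` and this pairing are assumptions of the example, and the backpropagation step is not shown.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # (start_time, duration)


def l1_loss(predicted: Sequence[Interval], target: Sequence[Interval]) -> float:
    """Mean L1 difference between paired predicted and true time zones.

    Each pair contributes |start error| + |duration error|.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not predicted:
        return 0.0
    total = 0.0
    for (p_start, p_dur), (t_start, t_dur) in zip(predicted, target):
        total += abs(p_start - t_start) + abs(p_dur - t_dur)
    return total / len(predicted)


predicted = [(1.0, 4.0), (10.5, 2.0)]
target = [(0.0, 5.0), (10.0, 2.0)]
print(l1_loss(predicted, target))  # (1 + 1 + 0.5 + 0) / 2 = 1.25
```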

***Description of Operation***
Next, an example of the operation of the learning device 300 according to this embodiment will be described.
FIG. 17 is a flowchart showing an example of the operation of the learning device 300.

First, in step S31, the joint position time series data acquisition unit 320 generates the joint position time series data HPT from the video V from the imaging device 310.

Next, in step S32, the joint velocity calculation unit 321 generates the joint velocity time series data HVT from the joint position time series data HPT.

Next, in step S33, the joint time series data imaging unit 322 converts the joint position time series data HPT into images to generate the left hand joint position image LJPI and the right hand joint position image RJPI. The joint time series data imaging unit 322 also converts the joint velocity time series data HVT into images to generate the left hand joint velocity image LJVI and the right hand joint velocity image RJVI.

Next, in step S34, the learning data generation unit 323 generates the learning data sIt and the learning data sIu.

Next, in step S35, the preprocessing statistics calculation unit 324 calculates the preprocessing statistics.

Finally, in step S36, the work performance time zone detection model generation unit 326 generates the work performance time zone detection model DM using the learning data sIt, the learning data sIu, and the preprocessing statistics.
The work performance time zone detection model generation unit 326 then stores the work performance time zone detection model DM in the work performance time zone detection model storage unit 332.
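
Step S32 above derives velocity data from position data. As a minimal sketch of one way this could be done, the example below uses a frame-to-frame finite difference. This is an assumption made for illustration only: the function name `joint_velocities`, the 2D `(x, y)` coordinate format, and the frame rate parameter `fps` are hypothetical, and the calculation actually used by the joint velocity calculation unit 321 may differ.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position of one joint in one frame


def joint_velocities(positions: Sequence[Point], fps: float) -> List[Point]:
    """Finite-difference velocity of one joint between consecutive frames.

    Returns one (vx, vy) per pair of consecutive frames, in units per second.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    dt = 1.0 / fps
    return [
        ((x1 - x0) / dt, (y1 - y0) / dt)
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    ]


positions = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0)]
velocities = joint_velocities(positions, fps=2.0)
print(velocities)
```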

***Description of Effects of the Embodiment***
According to this embodiment, the work performance time zone detection model DM can be generated based on the learning data sIt and the learning data sIu. This eliminates the need for rule-based detection of the work performance time zone. In other words, it eliminates the need to determine threshold values for detecting the work performance time zone. As a result, the burden on the user of the estimation device 100 can be reduced.

Embodiment 4.
FIG. 18 shows an example of the functional configuration of an estimation device 400 according to this embodiment.
In FIG. 18, the work performance time zone detection model storage unit 433 stores the work performance time zone detection model DM described in Embodiment 3.
The work performance time zone detection unit 423 acquires the work performance time zone detection model DM from the work performance time zone detection model storage unit 433 and uses it to detect the work performance time zone. That is, the work performance time zone detection unit 423 inputs the left hand joint position image LJPI, the left hand joint velocity image LJVI, the right hand joint position image RJPI, the right hand joint velocity image RJVI, the left hand gripped tool data LTO1, and the right hand gripped tool data RTO1 into the work performance time zone detection model DM to detect the work performance time zone.
The components other than the work performance time zone detection unit 423 and the work performance time zone detection model storage unit 433 are the same as the components of the same names shown in FIG. 7. Their description is therefore omitted.
Like the estimation device 100, the estimation device 400 has the hardware configuration illustrated in FIG. 22.

***Description of Effects of the Embodiment***
As in Embodiment 3, there is no need to determine threshold values for detecting the work performance time zone, so the burden on the user of the estimation device 400 can be reduced.
Furthermore, by using the work performance time zone detection model DM, the work performance time zone can be detected accurately even for work in which elemental tasks are combined in a complex manner.

Embodiment 5.
FIG. 19 shows an example of the functional configuration of a learning device 500 according to this embodiment. The learning device 500 is a combination of the learning device 200 described in Embodiment 2 and the learning device 300 described in Embodiment 3.
The learning data generation unit 523 generates the learning data sIs, the learning data sIt, and the learning data sIu. The methods of generating these learning data are the same as those described in Embodiments 2 and 3.
The preprocessing statistics calculation unit 524 calculates the preprocessing statistics for each of the learning data sIs and the learning data sIt. The method of calculating the preprocessing statistics is the same as that described in Embodiments 2 and 3.
The components other than the learning data generation unit 523 and the preprocessing statistics calculation unit 524 are the same as the components of the same names shown in FIG. 12 and FIG. 15. Their description is therefore omitted.
Like the learning device 200 and the learning device 300, the learning device 500 has the hardware configuration illustrated in FIG. 23.

The learning for generating the work performance time zone detection model DM and the learning for generating the elemental task estimation model M may be performed independently of each other, or may be performed simultaneously as multitask learning.
When the two are performed simultaneously, part of the network structure and/or the weights may be shared between the two learning processes.

Furthermore, for the order of the plurality of elemental tasks obtained as the detection and classification results, a loss value L relating to the order may be calculated by comparing the estimated order with the true value (the true order of the elemental tasks), and this loss value L may be used for training each model.
The loss value L relating to the order may be calculated using, for example, an edit distance or a difference between graphs.
It is also desirable to change the method of calculating the loss value L relating to the order according to the type of work. For example, when calculating the loss value L for assembly work, a large loss value L may be imposed on the appearance of an order of elemental tasks that cannot occur when carrying out the assembly work.
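
The order-related loss described above can be sketched with a Levenshtein edit distance plus an extra penalty for task transitions that cannot occur. This is one possible reading offered as an illustration, not the embodiment's definition: the function names, the task labels, the set `impossible_pairs`, and the value of `penalty` are all assumptions of this example. The score is also not differentiable as written, so using it in gradient-based training would require further design choices that the text does not specify.

```python
from typing import Sequence, Set, Tuple


def edit_distance(pred: Sequence[str], true: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of elemental task labels."""
    prev = list(range(len(true) + 1))
    for i, p in enumerate(pred, start=1):
        curr = [i]
        for j, t in enumerate(true, start=1):
            cost = 0 if p == t else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def order_loss(
    pred: Sequence[str],
    true: Sequence[str],
    impossible_pairs: Set[Tuple[str, str]],
    penalty: float = 10.0,
) -> float:
    """Edit distance plus a large penalty per impossible consecutive pair."""
    violations = sum(1 for pair in zip(pred, pred[1:]) if pair in impossible_pairs)
    return edit_distance(pred, true) + penalty * violations


true_order = ["pick", "place", "screw", "inspect"]
pred_order = ["pick", "screw", "place", "inspect"]
# Hypothetical rule for this example: a part cannot be screwed before it is placed.
impossible = {("pick", "screw")}
print(order_loss(pred_order, true_order, impossible))
```

Here the predicted order differs from the true order by two substitutions and contains one impossible transition, so the loss is 2 + 10 = 12.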

***Description of Effects of the Embodiment***
According to this embodiment, the work performance time zone detection model DM and the elemental task estimation model M can be generated efficiently.

Although Embodiments 1 to 5 have been described above, two or more of these embodiments may be implemented in combination.
Alternatively, one of these embodiments may be partially implemented.
Alternatively, two or more of these embodiments may be partially combined and implemented.
Furthermore, the configurations and procedures described in these embodiments may be modified as necessary.

***Supplementary Explanation of Hardware Configuration***
Finally, a supplementary explanation of the hardware configurations of the estimation device 100, the learning device 200, the learning device 300, the estimation device 400, and the learning device 500 is given.
The explanation below covers the estimation device 100 and the learning device 200 as representatives. The explanation of the estimation device 100 also applies to the estimation device 400. Similarly, the explanation of the learning device 200 also applies to the learning device 300 and the learning device 500.

The processor 801 shown in FIG. 22 is an IC (Integrated Circuit) that performs processing.
The processor 801 is a CPU (Central Processing Unit), a DSP (Digital Signal Processor), or the like.
The main storage device 802 shown in FIG. 22 is a RAM (Random Access Memory).
The auxiliary storage device 803 shown in FIG. 22 is a ROM (Read Only Memory), a flash memory, an HDD (Hard Disk Drive), or the like.
The communication device 804 shown in FIG. 22 is an electronic circuit that executes data communication processing.
The communication device 804 is, for example, a communication chip or an NIC (Network Interface Card).

The auxiliary storage device 803 also stores an OS (Operating System).
At least a part of the OS is executed by the processor 801.
While executing at least a part of the OS, the processor 801 executes the programs that realize the functional components of the estimation device 100.
By the processor 801 executing the OS, task management, memory management, file management, communication control, and the like are performed.
In addition, at least one of information, data, signal values, and variable values indicating the results of processing by each functional component of the estimation device 100 is stored in at least one of the main storage device 802, the auxiliary storage device 803, and a register and a cache memory in the processor 801.
Furthermore, the programs that realize the functional components of the estimation device 100 may be stored on a portable recording medium such as a magnetic disk, a flexible disk, an optical disk, a compact disc, a Blu-ray (registered trademark) disc, or a DVD. The portable recording medium storing these programs may then be distributed.

Furthermore, the "unit" in the name of at least one of the functional components of the estimation device 100 may be read as "circuit", "step", "procedure", "process", or "circuitry".
The estimation device 100 may also be realized by a processing circuit. The processing circuit is, for example, a logic IC (Integrated Circuit), a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array).
In this case, each functional component of the estimation device 100 is realized as part of the processing circuit.

The processor 901 shown in FIG. 23 is an IC that performs processing.
The processor 901 is a CPU, a DSP, or the like.
The main storage device 902 shown in FIG. 23 is a RAM.
The auxiliary storage device 903 shown in FIG. 23 is a ROM, a flash memory, an HDD, or the like.
The communication device 904 shown in FIG. 23 is an electronic circuit that executes data communication processing.
The communication device 904 is, for example, a communication chip or an NIC.

The auxiliary storage device 903 also stores an OS.
At least a part of the OS is executed by the processor 901.
While executing at least a part of the OS, the processor 901 executes the programs that realize the functional components of the learning device 200.
By the processor 901 executing the OS, task management, memory management, file management, communication control, and the like are performed.
In addition, at least one of information, data, signal values, and variable values indicating the results of processing by each functional component of the learning device 200 is stored in at least one of the main storage device 902, the auxiliary storage device 903, and a register and a cache memory in the processor 901.
Furthermore, the programs that realize the functional components of the learning device 200 may be stored on a portable recording medium such as a magnetic disk, a flexible disk, an optical disk, a compact disc, a Blu-ray (registered trademark) disc, or a DVD. The portable recording medium storing these programs may then be distributed.

Furthermore, the "unit" in the name of at least one of the functional components of the learning device 200 may be read as "circuit", "step", "procedure", "process", or "circuitry".
The learning device 200 may also be realized by a processing circuit. The processing circuit is, for example, a logic IC, a GA, an ASIC, or an FPGA.
In this case, each functional component of the learning device 200 is realized as part of the processing circuit.

In this specification, the superordinate concept covering both the processor and the processing circuit is called "processing circuitry".
That is, the processor and the processing circuit are each a specific example of "processing circuitry".

11 time division unit, 12 estimation unit,
100 estimation device, 110 imaging device, 120 joint position time series data acquisition unit, 121 joint velocity calculation unit, 122 joint time series data imaging unit, 123 work performance time zone detection unit, 1231 appearance state determination unit, 1232 tool gripping state determination unit, 1233 displacement amount determination unit, 1234 work performance time zone determination unit, 124 elemental task estimation unit, 1241 joint position image acquisition unit, 1242 joint velocity image acquisition unit, 1243 joint position feature extraction unit, 1244 joint velocity feature extraction unit, 1245 joint image feature classification unit, 125 estimation result processing unit, 126 gripped tool time series data generation unit, 130 elemental task estimation model storage unit, 131 preprocessing statistics storage unit, 132 estimation result storage unit,
200 learning device, 210 imaging device, 220 joint position time series data acquisition unit, 221 joint velocity calculation unit, 222 joint time series data imaging unit, 223 learning data generation unit, 224 preprocessing statistics calculation unit, 225 elemental task estimation model generation unit, 2251 joint position image acquisition unit, 2252 joint velocity image acquisition unit, 2253 joint position feature extraction unit, 2254 joint velocity feature extraction unit, 2255 joint image feature classification learning unit, 230 learning data storage unit, 231 preprocessing statistics storage unit, 232 elemental task estimation model storage unit,
300 learning device, 310 imaging device, 320 joint position time series data acquisition unit, 321 joint velocity calculation unit, 322 joint time series data imaging unit, 323 learning data generation unit, 324 preprocessing statistics calculation unit, 325 gripped tool time series data generation unit, 326 work performance time zone detection model generation unit, 3261 joint position image acquisition unit, 3262 joint velocity image acquisition unit, 3263 joint position feature extraction unit, 3264 joint velocity feature extraction unit, 3265 appearance state determination unit, 3266 tool gripping state determination unit, 3268 work performance time zone determination unit, 3269 proposal learning unit, 3270 regression learning unit, 330 learning data storage unit, 331 preprocessing statistics storage unit, 332 work performance time zone detection model storage unit, 333 gripped tool information storage unit,
400 estimation device, 410 imaging device, 420 joint position time series data acquisition unit, 421 joint velocity calculation unit, 422 joint time series data imaging unit, 423 work performance time zone detection unit, 424 elemental task estimation unit, 425 estimation result processing unit, 426 gripped tool time series data generation unit, 430 elemental task estimation model storage unit, 431 estimation result storage unit, 432 preprocessing statistics storage unit, 433 work performance time zone detection model storage unit,
500 learning device, 510 imaging device, 520 joint position time series data acquisition unit, 521 joint velocity calculation unit, 522 joint time series data imaging unit, 523 learning data generation unit, 524 preprocessing statistics calculation unit, 525 work performance time zone detection model generation unit, 526 gripped tool time series data generation unit, 527 elemental task estimation model generation unit, 530 learning data storage unit, 531 preprocessing statistics storage unit, 532 work performance time zone detection model storage unit, 533 gripped tool information storage unit, 534 elemental task estimation model storage unit,
801 processor, 802 main storage device, 803 auxiliary storage device, 804 communication device, 901 processor, 902 main storage device, 903 auxiliary storage device, 904 communication device.

Claims

An estimation device comprising:
a time division unit that divides a work engagement time, which is a time during which a worker is engaged in work including a plurality of elemental tasks whose attributes are not uniform, into a work performance time period, during which the worker is performing any one of the plurality of elemental tasks, and a non-work time period, during which the worker is performing none of the plurality of elemental tasks, using at least one of a result of a displacement amount determination that determines an amount of displacement of the worker's hand during the work engagement time, a result of a tool gripping state determination that determines a state in which the worker's hand grips a tool during the work engagement time, and a result of an appearance state determination that determines an appearance state of the worker's hand in video captured during the work engagement time; and
an estimation unit that acquires a plurality of partial images constituting a partial video, which is the portion of the video captured during the work performance time period, standardizes the plurality of partial images using an average value and a standard deviation of the luminance of learning images of the hand used in learning when generating a trained model, and resizes the plurality of partial images to a specified size,
inputs a plurality of processed partial images, which are the plurality of partial images after standardization and resizing, into the trained model to extract feature vectors of the plurality of processed partial images, and
inputs the feature vectors of the plurality of processed partial images into the trained model to estimate the elemental task being performed by the worker during the work performance time period.
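The standardization and resizing recited in this claim can be illustrated with a minimal sketch. It assumes each partial image is a 2-D list of luminance values and uses nearest-neighbour sampling for the resize; the claim does not specify an interpolation method, and the function names `standardize`, `resize_nearest`, and `preprocess` are hypothetical.

```python
def standardize(image, mean, std):
    """Standardize luminance with statistics taken from the learning images."""
    return [[(px - mean) / std for px in row] for row in image]


def resize_nearest(image, out_h, out_w):
    """Resize by nearest-neighbour sampling (the interpolation method is an assumption)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def preprocess(image, mean, std, size):
    """Standardize first, then resize to the specified (height, width)."""
    return resize_nearest(standardize(image, mean, std), size[0], size[1])


# Illustrative values only: a 2x2 image, learning-image mean 15 and std 5.
processed = preprocess([[0, 10], [20, 30]], mean=15, std=5, size=(4, 4))
```

The key point is that `mean` and `std` come from the learning images, not from the image being processed, so inference-time inputs are scaled the same way as the training data.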

The estimation device according to claim 1, wherein the estimation unit
acquires, as the plurality of partial images, a hand position partial image, which is a partial image regarding a position of the worker's hand, and a hand speed partial image, which is a partial image regarding a speed of the worker's hand,
standardizes the hand position partial image using the average value and the standard deviation of the luminance of the learning images for the hand position used in learning when generating the trained model, and resizes the hand position partial image to the specified size,
standardizes the hand speed partial image using the average value and the standard deviation of the luminance of the learning images for the hand speed used in learning when generating the trained model, and resizes the hand speed partial image to the specified size,
inputs a processed hand position partial image, which is the hand position partial image after standardization and resizing, into the trained model to extract a feature vector of the processed hand position partial image,
inputs a processed hand speed partial image, which is the hand speed partial image after standardization and resizing, into the trained model to extract a feature vector of the processed hand speed partial image, and
combines the feature vector of the processed hand position partial image and the feature vector of the processed hand speed partial image, and inputs the feature vector obtained by the combination into the trained model to estimate the elemental task being performed by the worker during the work performance time period.
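The combination of the two feature vectors can be sketched as follows. This is an illustration under stated assumptions: concatenation is assumed as the combining operation, and the linear scoring step, the weight values, and the task labels are hypothetical stand-ins for the trained model, which the claim does not detail.

```python
def combine_features(position_features, speed_features):
    """Combine the two feature vectors by concatenation (assumed combining operation)."""
    return list(position_features) + list(speed_features)


def classify(features, weights, labels):
    """Score each elemental task with a linear layer and return the best label.

    Hypothetical stand-in for the final stage of the trained model.
    """
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return labels[scores.index(max(scores))]


# Illustrative values only.
combined = combine_features([1.0, 2.0], [3.0])
task = classify(combined, [[1, 0, 0], [0, 0, 1]], ["screw tightening", "wiring"])
```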

The estimation device according to claim 1, wherein, when using the result of the displacement amount determination, the time division unit designates a time period during which the displacement amount of the worker's hand is less than a threshold value as the work performance time period.
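The threshold rule in this claim can be sketched as below. The sketch assumes the displacement amount is the Euclidean distance the hand moves between consecutive frames, which the claim does not state; `displacement_per_frame` and `work_periods_by_displacement` are hypothetical names.

```python
import math


def displacement_per_frame(hand_positions):
    """Distance moved by the hand between each pair of consecutive frames."""
    return [math.dist(a, b) for a, b in zip(hand_positions, hand_positions[1:])]


def work_periods_by_displacement(hand_positions, threshold):
    """Return (start, end) index pairs where displacement < threshold.

    Index i refers to the step from frame i to frame i + 1; end is exclusive.
    """
    periods, start = [], None
    moves = displacement_per_frame(hand_positions)
    for i, d in enumerate(moves):
        if d < threshold and start is None:
            start = i                      # a low-displacement run begins
        elif d >= threshold and start is not None:
            periods.append((start, i))     # the run ends at a large movement
            start = None
    if start is not None:
        periods.append((start, len(moves)))
    return periods


# Illustrative values only: one large jump splits the sequence into two periods.
periods = work_periods_by_displacement(
    [(0, 0), (0, 1), (0, 2), (10, 2), (10, 3)], threshold=5
)
```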

The estimation device according to claim 1, wherein, when using the result of the tool gripping state determination, the time division unit designates a time period during which the worker's hand is gripping the tool as the work performance time period.

The estimation device according to claim 4, wherein, when the type of tool gripped by the worker's hand changes over time, the time division unit designates each time period during which the worker's hand grips a different type of tool as a separate work performance time period.
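The tool-based division in claims 4 and 5 can be illustrated with a short sketch. It assumes the tool gripping state determination yields one tool label per frame, with `None` meaning no tool is gripped; that representation, the function name `work_periods_by_tool`, and the tool names are hypothetical.

```python
from itertools import groupby


def work_periods_by_tool(tool_per_frame):
    """Return (tool, start, end) runs of consecutive frames with the same tool.

    Frames labelled None (no tool gripped) are treated as non-work time.
    The end index is exclusive.
    """
    periods, index = [], 0
    for tool, run in groupby(tool_per_frame):
        length = len(list(run))
        if tool is not None:
            periods.append((tool, index, index + length))
        index += length
    return periods


# Illustrative values only.
frames = ["driver", "driver", None, "wrench", "wrench", "wrench", "driver"]
tool_periods = work_periods_by_tool(frames)
```

A change of tool type starts a new work performance time period even when there is no non-gripping frame in between, as with the final `"driver"` frame here.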

The estimation device according to claim 1, wherein, when using the result of the appearance state determination, the time division unit designates a time period during which the worker's hand appears in the video as the work performance time period.

The estimation device according to claim 1, wherein the time division unit applies at least one of the result of the displacement amount determination, the result of the tool gripping state determination, and the result of the appearance state determination to the trained model to divide the work engagement time into the work performance time period and the non-work time period.

A learning device that generates candidate values of the work performance time period, and generates the trained model of claim 1 by learning the difference between a true value of the work performance time period, obtained by performing at least one of a process corresponding to the displacement amount determination, a process corresponding to the tool gripping state determination, and a process corresponding to the appearance state determination, and the candidate value of the work performance time period.
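One way to quantify the difference between a candidate and the true work performance time period is sketched below. This is an assumption for illustration only: the claim does not name a loss function, and the absolute-difference measure and the name `interval_difference` are hypothetical.

```python
def interval_difference(candidate, truth):
    """Sum of absolute start and end differences between two (start, end) periods.

    Hypothetical difference measure; the claim does not specify one.
    """
    return abs(candidate[0] - truth[0]) + abs(candidate[1] - truth[1])


# Illustrative values only, in frames.
loss = interval_difference(candidate=(10, 50), truth=(12, 45))
```

A learning procedure would adjust the model so that this difference shrinks across the learning data.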

A learning device that calculates the average value and the standard deviation of the luminance of the learning images, and generates the trained model of claim 1 by learning using the calculated average value and standard deviation.
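The statistics calculation in this claim can be sketched as follows. The sketch assumes each learning image is a 2-D list of luminance values and uses the population standard deviation; whether the population or sample form is intended is not stated in the claim, and `luminance_statistics` is a hypothetical name.

```python
import statistics


def luminance_statistics(learning_images):
    """Return (mean, std) of luminance over every pixel of every learning image."""
    values = [px for image in learning_images for row in image for px in row]
    # Population standard deviation is an assumption; the claim does not specify.
    return statistics.fmean(values), statistics.pstdev(values)


# Illustrative values only: a single 2x2 learning image.
mean, std = luminance_statistics([[[0, 10], [20, 30]]])
```

The resulting pair would be stored and reused at estimation time to standardize the partial images.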

An estimation method in which a computer
divides a work engagement time, which is a time during which a worker is engaged in work including a plurality of elemental tasks whose attributes are not uniform, into a work performance time period, during which the worker is performing any one of the plurality of elemental tasks, and a non-work time period, during which the worker is performing none of the plurality of elemental tasks, using at least one of a result of a displacement amount determination that determines an amount of displacement of the worker's hand during the work engagement time, a result of a tool gripping state determination that determines a state in which the worker's hand grips a tool during the work engagement time, and a result of an appearance state determination that determines an appearance state of the worker's hand in video captured during the work engagement time,
acquires a plurality of partial images constituting a partial video, which is the portion of the video captured during the work performance time period, standardizes the plurality of partial images using an average value and a standard deviation of the luminance of learning images of the hand used in learning when generating a trained model, and resizes the plurality of partial images to a specified size,
inputs a plurality of processed partial images, which are the plurality of partial images after standardization and resizing, into the trained model to extract feature vectors of the plurality of processed partial images, and
inputs the feature vectors of the plurality of processed partial images into the trained model to estimate the elemental task being performed by the worker during the work performance time period.

An estimation program that causes a computer to execute:
a time division process of dividing a work engagement time, which is a time during which a worker is engaged in work including a plurality of elemental tasks whose attributes are not uniform, into a work performance time period, during which the worker is performing any one of the plurality of elemental tasks, and a non-work time period, during which the worker is performing none of the plurality of elemental tasks, using at least one of a result of a displacement amount determination that determines an amount of displacement of the worker's hand during the work engagement time, a result of a tool gripping state determination that determines a state in which the worker's hand grips a tool during the work engagement time, and a result of an appearance state determination that determines an appearance state of the worker's hand in video captured during the work engagement time; and
an estimation process of acquiring a plurality of partial images constituting a partial video, which is the portion of the video captured during the work performance time period, standardizing the plurality of partial images using an average value and a standard deviation of the luminance of learning images of the hand used in learning when generating a trained model, resizing the plurality of partial images to a specified size,
inputting a plurality of processed partial images, which are the plurality of partial images after standardization and resizing, into the trained model to extract feature vectors of the plurality of processed partial images, and
inputting the feature vectors of the plurality of processed partial images into the trained model to estimate the elemental task being performed by the worker during the work performance time period.