JP2019053647A

JP2019053647A - Behavior estimation apparatus and behavior estimation program

Info

Publication number: JP2019053647A
Application number: JP2017178664A
Authority: JP
Inventors: Toru Fuse (布施 透)
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2017-09-19
Filing date: 2017-09-19
Publication date: 2019-04-04
Anticipated expiration: 2037-09-19
Also published as: JP7110568B2

Abstract

To provide a behavior estimation apparatus and a behavior estimation program that can estimate the behavior of a subject in a predetermined scene. SOLUTION: A behavior estimation apparatus includes an acquisition unit that acquires depth information from an infrared camera (100), and a behavior state estimation unit that analyzes the depth information to detect a pose and an amount of displacement (108) and, based on these, generates state tables for the upper body and the lower body to estimate a state (110). A behavior classification unit estimates the behavior of the subject from the estimated state, based on the state tables, by classifying behaviors defined in advance on two axes, the upper-body state and the lower-body state of the subject (112). SELECTED DRAWING: Figure 11

Description

The present invention relates to a behavior estimation apparatus and a behavior estimation program.

Patent Document 1 proposes an information providing apparatus that derives feature information representing the behavioral features of each of a plurality of subjects from behavior information that appears as behavior in each subject, obtains element information representing the emotional elements of a designated subject from the feature information, and obtains and presents psychological information representing the psychological state of the designated subject from the element information.

Patent Document 1: JP 2011-201121 A

Existing techniques can present psychological information using behavior information that appears as behavior, and can detect human motions, but they cannot estimate the behavior of a subject in a predetermined scene such as a meeting, class, or lecture. An object of the present invention is therefore to provide a behavior estimation apparatus and a behavior estimation program capable of estimating the behavior of a subject in a predetermined scene.

The behavior estimation apparatus according to claim 1 includes an acquisition unit that acquires motion information of one or more subjects, and an estimation unit that estimates the behavior of a subject in a predetermined scene using the motion information acquired by the acquisition unit.

In the invention according to claim 2, in the invention according to claim 1, the estimation unit estimates the state of the subject using the motion information acquired by the acquisition unit, and estimates the behavior of the subject in the predetermined scene from the estimated state.

In the invention according to claim 3, in the invention according to claim 2, the estimation unit classifies in advance predetermined behaviors corresponding to states of the subject, and estimates the behavior of the subject from the estimated state of the subject.

In the invention according to claim 4, in the invention according to claim 3, the estimation unit classifies behaviors in advance in a two-axis region defined by the state of the subject's upper body and the state of the subject's lower body, and estimates the behavior of the subject by estimating each of the upper-body state and the lower-body state from the motion information.

In the invention according to claim 5, in the invention according to claim 4, the regions into which the behaviors are classified have two or more overlaps, and the presence or absence of a predetermined specific element is used as the classification condition within an overlap.

In the invention according to claim 6, in the invention according to any one of claims 1 to 5, the estimation unit obtains, from the motion information, subject-specific motion information that is at least one of the frequency of the subject's motion and the time for which the subject's motion is maintained in each predetermined time interval, and further uses the subject-specific motion information to estimate the behavior of the subject in the predetermined scene.

In the invention according to claim 7, in the invention according to any one of claims 1 to 6, the estimation unit estimates the behavior of the subject for each predetermined time interval, and the apparatus further includes a storage unit that stores the behavior of the subject estimated by the estimation unit for each time interval.

In the invention according to claim 8, in the invention according to claim 7, the storage unit further stores, for each time interval, detection information in the predetermined scene other than the motion information.

In the invention according to claim 9, in the invention according to claim 8, the detection information includes at least one of audio information for each time interval and captured-image information for each time interval.

In the invention according to claim 10, in the invention according to any one of claims 6 to 9, the time interval is changed to a predetermined length according to the motion speed of the subject.

The behavior estimation program according to claim 11 causes a computer to function as each unit of the behavior estimation apparatus according to any one of claims 1 to 10.

According to the behavior estimation apparatus of claim 1, a behavior estimation apparatus capable of estimating the behavior of a subject in a predetermined scene can be provided.

According to the invention of claim 2, the behavior of a subject in a predetermined scene can be estimated.

According to the invention of claim 3, behavior can be estimated from the state of the subject.

According to the invention of claim 4, behavior can be estimated more easily than when it is estimated from the state without separating the state into the two axes of upper-body state and lower-body state.

According to the invention of claim 5, behavior can be estimated from the state of the subject even when the regions into which behaviors are classified overlap.

According to the invention of claim 6, behavior can be estimated more accurately than when subject-specific motion is not taken into account.

According to the invention of claim 7, each time interval can be searched using a behavior as a keyword.

According to the invention of claim 8, not only behaviors but also information related to the behaviors can be searched for each time interval.

According to the invention of claim 9, at least one of the audio information and the captured-image information in the time interval corresponding to a behavior can also be searched using the behavior as a keyword.

According to the invention of claim 10, behavior can be estimated more accurately than when the time interval is fixed.

According to the invention of claim 11, a behavior estimation program capable of estimating the behavior of a subject in a predetermined scene can be provided.

FIG. 1 is a block diagram showing the schematic configuration of the behavior estimation apparatus according to the present embodiment.
FIG. 2 shows an example of a conference room in which the behavior estimation apparatus according to the present embodiment is installed.
FIG. 3 is a functional block diagram showing the functions of the control device of the behavior estimation apparatus according to the present embodiment.
FIG. 4(A) shows an example of 25 joint points of the human body, and FIG. 4(B) shows an example of the position and displacement of a joint.
FIG. 5 shows an example of the upper-body and lower-body movements of each person when person A talks to person B while gesturing and person C takes notes.
FIG. 6(A) shows an example of an upper-body state table, and FIG. 6(B) shows an example of a lower-body state table.
FIG. 7 shows an example in which predetermined behaviors are classified in advance in a two-axis region defined by the subject's upper-body state and lower-body state.
FIG. 8 shows an example of the specific elements for each behavior.
FIG. 9 shows an example of the relationship between the specific elements and the overlapping portions.
FIG. 10 illustrates an example of motion analysis using information from the infrared camera, the microphone, and the camera when a subject stands up, moves to the whiteboard device, and then explains while writing on the board.
FIG. 11 is a flowchart showing an example of the flow of the behavior estimation processing performed by the control device of the behavior estimation apparatus according to the present embodiment.
FIG. 12 is a flowchart showing an example of the flow of the depth information analysis processing.
FIG. 13 is a flowchart showing an example of the flow of the overlap determination processing.
FIG. 14 is a flowchart showing the flow of the overlap determination processing for overlap region R1.
FIG. 15 is a flowchart showing the flow of the overlap determination processing for overlap region R2.
FIG. 16 is a flowchart showing the flow of the overlap determination processing for overlap region R3.
FIG. 17 is a flowchart showing the flow of the overlap determination processing for overlap region R4.
FIG. 18 is a flowchart showing the flow of the overlap determination processing for overlap region R5.
FIG. 19 is a flowchart showing the flow of the overlap determination processing for overlap region R6.
FIG. 20 is a flowchart showing the flow of the overlap determination processing for overlap region R7.

An example of an embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1 is a block diagram showing the schematic configuration of a behavior estimation apparatus 10 according to the present embodiment. FIG. 2 shows an example of a conference room in which the behavior estimation apparatus 10 according to the present embodiment is installed.

The behavior estimation apparatus 10 according to the present embodiment estimates the behavior of subjects participating in a predetermined scene, such as a conference, meeting, lecture, or class, held in a predetermined space such as the conference room shown in FIG. 2, a lecture hall, or a classroom. Here, the behavior of a subject does not mean a simple human movement (gaze movement, head movement, hand or arm movement, leg movement, and so on) but a behavior in a predetermined scene that comprises a combination of multiple movements. As an example, the present embodiment estimates the behavior of people during a meeting. Specifically, the behavior is classified into six types, "explanation", "question/opinion", "presentation", "deliberation/contemplation", "note-taking/side work", and "appeal", and the apparatus estimates which of these applies.

The behavior estimation apparatus 10 includes a control device 20, serving as the estimation unit, which acquires various kinds of information and estimates the behavior of subjects.

The control device 20 is a microcomputer in which a CPU (Central Processing Unit) 20A, a ROM (Read Only Memory) 20B, a RAM (Random Access Memory) 20C, and an input/output port 20D are each connected to a bus 20E.

The ROM 20B stores various programs, including a behavior estimation program for estimating the behavior of subjects participating in a meeting. The behavior of a subject is estimated when the CPU 20A loads a program stored in the ROM 20B into the RAM 20C and executes it. Alternatively, a program stored in advance in an HDD (Hard Disc Drive) 20F may be loaded into the RAM 20C and executed by the CPU 20A to estimate the behavior of the subject.

Connected to the input/output port 20D are an infrared camera 12, a camera 14, a microphone 16, a database DB 18 serving as the storage unit, a display device 22, a whiteboard device 24, and a hand-operated device 26.

The infrared camera 12, the camera 14, and the microphone 16 are installed in the conference room, for example as shown in FIG. 2.

The infrared camera 12 is installed, for example, on the ceiling or a wall of the conference room. It captures light in the infrared range and outputs to the control device 20 the depth information obtained by imaging the meeting.

Like the infrared camera 12, the camera 14 is installed, for example, on the ceiling or a wall of the conference room, and outputs to the control device 20 the captured-image information obtained by imaging the meeting.

The microphone 16 is installed, for example, on a desk or wall of the conference room, acquires the sound in the room during the meeting as audio information, and outputs it to the control device 20.

The database DB 18 stores various kinds of information, such as the depth information from the infrared camera 12, the captured-image information from the camera 14, and the audio information from the microphone 16, each together with time information.

The display device 22 is, for example, a projector or similar display device. It displays materials during the meeting and outputs its display state, together with time information, to the control device 20 as material presentation information.

The whiteboard device 24 detects writing on the board during the meeting and outputs the detection result, together with time information, to the control device 20 as board-writing information.

The hand-operated device 26 is, for example, an electronic pen or a user-registered computer or tablet. It detects its operation state and outputs the detection result, together with time information, to the control device 20 as hand-operation information.

Based on the acquired information, the control device 20 estimates the behavior of the subjects participating in the meeting and stores the estimated behavior in the database DB 18 together with time information, so that captured images and other data from the meeting can be searched later.

FIG. 3 is a functional block diagram showing the functions of the control device 20 of the behavior estimation apparatus 10 according to the present embodiment.

As shown in FIG. 3, the behavior estimation apparatus 10 has, as functions, an acquisition unit 30 and a behavior estimation unit 40 serving as the estimation unit.

The acquisition unit 30 acquires the depth information obtained from the infrared camera 12 as motion information of one or more subjects, and also acquires the captured-image information obtained from the camera 14 and the audio information obtained from the microphone 16. The acquisition unit 30 also acquires information from the display device 22, the whiteboard device 24, the hand-operated device 26, and the like.

The acquisition unit 30 also includes a material presentation detection unit 32, a board-writing detection unit 34, an utterance detection unit 36, and a hand-operation detection unit 38. The material presentation detection unit 32 detects the presentation of materials during the meeting by acquiring display information from the display device 22. The board-writing detection unit 34 detects board writing during the meeting from information from the whiteboard device 24, images captured by the camera 14, and the like. The utterance detection unit 36 detects utterances (speech) during the meeting from the audio information from the microphone 16. The hand-operation detection unit 38 detects hand operations during the meeting based on information from the hand-operated device 26. The acquisition unit 30 stores the detection results of these units in a database DBa 18A.

The behavior estimation unit 40 includes the functions of a media recording unit 42, a motion state estimation unit 44, a behavior classification unit 46, and an overlap determination unit 48. It estimates the behavior of the subjects participating in the meeting and stores the estimated behavior information in a database DBb 18B.

The media recording unit 42 stores the depth information, captured-image information, and audio information acquired by the acquisition unit 30 in the database DBb 18B, each together with time information.

The motion state estimation unit 44 detects the motions of one or more subjects using the depth information acquired by the acquisition unit 30, and estimates the state of each subject from the detected motions. The motions are detected using a well-known technique, such as OpenCV (Open Source Computer Vision), that measures the movement of a person (motion, posture, locomotion, and so on) from the depth information obtained from the infrared camera 12. For example, the Microsoft product Kinect (trademark) for Windows (registered trademark) identifies 25 joint points of the human body as shown in FIG. 4(A) and measures their positions and displacements (for example, θ in FIG. 4(B)) as shown in FIG. 4(B), and these measurements are used to detect the movement of a person. More specifically, a large number of infrared patterns are projected, and the patterns projected onto the scene are observed with the infrared camera. The depth to the scene is measured from the projected positions of the patterns, and the person region is detected. The observed person region is then compared with a large set of human posture patterns learned in advance, and the three-dimensional position of each body part is measured accurately.

A person in a meeting moves both the upper body and the lower body. For example, when person A talks to person B while gesturing and person C takes notes, there are upper-body movements (in FIG. 5: face direction a, utterance b, and arm and hand movement c) and lower-body movements (in FIG. 5: posture change d and walking e), as shown in FIG. 5. The motion state estimation unit 44 therefore estimates the upper-body state and the lower-body state from the movements detected on the basis of the depth information. As an example, the upper-body states estimated are still, moving the hands, moving the head, moving the arms, walking, and so on, as shown in FIG. 6(A), and the lower-body states estimated are still, posture change, half-crouching, walking, and so on, as shown in FIG. 6(B).

Specifically, the state is estimated by generating the upper-body state table shown in FIG. 6(A) and the lower-body state table shown in FIG. 6(B) from the poses and displacement amounts recognized from the movement of each joint obtained from the depth information. Values within a predetermined range are entered in VuX and VlX of the state tables shown in FIG. 6. In the present embodiment, multiple simultaneous states, such as nodding while moving the hands, are not estimated. A single state, such as moving the hands or nodding, is estimated, and the description assumes that a movement (numerical value) is entered for only one state in each state table of FIG. 6.
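The single-state rule for the state tables can be sketched as follows. This is a minimal illustration, not the patented procedure: the function name `build_state_table`, the threshold, the clipping to the range 0 to 1, and the rule of keeping only the largest displacement are all assumptions made here, because the text only says that values within a predetermined range are entered and that only one state receives a value.

```python
# State names follow the examples given for FIG. 6(A) and FIG. 6(B).
UPPER_STATES = ["still", "moving hands", "moving head", "moving arms", "walking"]
LOWER_STATES = ["still", "posture change", "half-crouching", "walking"]


def build_state_table(states, displacement, threshold=0.1):
    """Return {state: value} with a value entered for exactly one state.

    `displacement` maps a state name to a measured displacement amount.
    The threshold and the [0, 1] value range are hypothetical.
    """
    table = {state: 0.0 for state in states}
    # Keep only known, non-still states whose displacement is large enough.
    moving = {
        state: value
        for state, value in displacement.items()
        if state in table and state != "still" and value >= threshold
    }
    if moving:
        # Assumption: the state with the largest displacement wins.
        dominant = max(moving, key=moving.get)
        table[dominant] = min(moving[dominant], 1.0)
    else:
        # No significant movement detected: mark the body part as still.
        table["still"] = 1.0
    return table
```

Calling `build_state_table(UPPER_STATES, {"moving hands": 0.4, "moving head": 0.2})` enters a value only for "moving hands", which mirrors the restriction to one state per table.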

The behavior classification unit 46 classifies the behavior of the subject using the estimation result of the motion state estimation unit 44. For example, as shown in FIG. 7, predetermined behaviors are classified in advance in a two-axis region defined by the subject's upper-body state and lower-body state, and the behavior of the subject is estimated from the estimated state. In the example of FIG. 7, the upper-body state is plotted on the vertical axis and the lower-body state on the horizontal axis. The region is divided among the six behaviors "explanation", "question/opinion", "presentation", "deliberation/contemplation", "note-taking/side work", and "appeal", and the behavior is estimated from the estimated state of the subject.
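A two-axis lookup of this kind might be sketched as below. The rectangular regions and every numeric bound are hypothetical placeholders, since the actual shapes in FIG. 7 are not reproduced in the text; only the six behavior names and the choice of axes come from the description.

```python
from typing import NamedTuple


class Region(NamedTuple):
    behavior: str
    lower_min: float  # horizontal axis: lower-body state value
    lower_max: float
    upper_min: float  # vertical axis: upper-body state value
    upper_max: float


# Hypothetical bounds chosen only so that some regions overlap.
REGIONS = [
    Region("deliberation/contemplation", 0.0, 0.3, 0.0, 0.3),
    Region("note-taking/side work", 0.0, 0.3, 0.2, 0.6),
    Region("question/opinion", 0.0, 0.5, 0.5, 1.0),
    Region("explanation", 0.4, 1.0, 0.4, 1.0),
    Region("presentation", 0.6, 1.0, 0.6, 1.0),
    Region("appeal", 0.0, 0.3, 0.8, 1.0),
]


def candidate_behaviors(lower, upper, regions=REGIONS):
    """Return every behavior whose region contains the point (lower, upper)."""
    return [
        r.behavior
        for r in regions
        if r.lower_min <= lower <= r.lower_max
        and r.upper_min <= upper <= r.upper_max
    ]
```

A point that falls in exactly one region yields a single behavior. A point inside an overlap yields several candidates, which is the case a separate overlap determination has to settle.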

When the behavior classification unit 46 estimates a behavior, the overlap determination unit 48 determines the behavior for the portions where the behavior regions overlap (R1 to R7 in FIG. 7). In the present embodiment, the behavior in an overlapping portion is determined on the condition of whether predetermined specific elements are present. The specific elements used in the present embodiment are the presence or absence of material presentation, board writing, speech, and hand operation, detected from the information acquired by the acquisition unit 30. For example, as shown in FIG. 8, the specific elements of the "explanation" behavior are board writing and speech. The specific element of the "question/opinion" behavior is speech. The specific elements of the "presentation" behavior are speech, material presentation, and board writing. The "deliberation/contemplation" behavior has no specific element. The specific element of the "note-taking/side work" behavior is hand operation. The specific element of the "appeal" behavior is speech. Based on the relationship between the specific elements and the overlapping portions shown in FIG. 9, the behavior is determined for the overlapping portions R1 to R7 on the condition of the presence or absence of each element (C1 to C4).
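The element table described for FIG. 8 can be turned into a small disambiguation sketch. The table contents come from the description, but the resolution rule used here (keep candidates whose elements are all detected, then prefer the candidate with the most elements) is an assumption made for illustration. The per-region flowcharts for R1 to R7 are not reproduced in the text and may decide differently.

```python
# Specific elements per behavior, as listed for FIG. 8.
REQUIRED_ELEMENTS = {
    "explanation": {"board writing", "speech"},
    "question/opinion": {"speech"},
    "presentation": {"speech", "material presentation", "board writing"},
    "deliberation/contemplation": set(),
    "note-taking/side work": {"hand operation"},
    "appeal": {"speech"},
}


def resolve_overlap(candidates, detected):
    """Pick among overlapping candidate behaviors using detected elements.

    `detected` is the set of elements observed in the time interval.
    Hypothetical rule: a candidate qualifies if all of its elements were
    detected, and the most specific qualifying candidates are returned.
    """
    satisfied = [b for b in candidates if REQUIRED_ELEMENTS[b] <= detected]
    if not satisfied:
        return []
    most_specific = max(len(REQUIRED_ELEMENTS[b]) for b in satisfied)
    return [b for b in satisfied if len(REQUIRED_ELEMENTS[b]) == most_specific]
```

Because "question/opinion" and "appeal" share the single element speech, this rule returns both when they are the only candidates, so elements alone cannot separate every pair.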

The behavior estimation unit 40 stores the estimated behavior, together with time information, in the database DBb 18B so that captured images and the like can be searched on the basis of behavior during the meeting. Because the infrared camera 12, the microphone 16, and the camera 14 each detect information on the state of the subject, the behavior estimation unit 40 may use this information and have the behavior classification unit 46 and the overlap determination unit 48 perform motion analysis for each predetermined time interval to estimate the behavior of the subject. For example, when a subject stands up (standing motion), walks to the whiteboard device 24 (walking movement), and then explains while writing on the board (board writing), information from the infrared camera 12, the microphone 16, and the camera 14 is obtained for each time interval as shown in FIG. 10, and motion analysis is performed for each time interval using each of them to estimate the behavior of the subject. The estimated behavior is then stored in the database DBb 18B together with time information. In the example of FIG. 10, Δt1, Δt3, and Δt4 are each estimated as noise, Δt2 as "question/opinion", and Δt5 and Δt6 each as "explanation".
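Storing a label per time interval and searching by behavior could look like the sketch below. It uses an in-memory SQLite database from the Python standard library purely for illustration. The table name, column names, and helper functions are hypothetical and say nothing about how the database DBb 18B is actually organized.

```python
import sqlite3


def open_store():
    """Create an in-memory store of per-interval behavior labels."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE behavior ("
        "subject TEXT, t_start REAL, t_end REAL, label TEXT)"
    )
    return conn


def record(conn, subject, t_start, t_end, label):
    """Store the behavior estimated for one time interval."""
    conn.execute(
        "INSERT INTO behavior VALUES (?, ?, ?, ?)",
        (subject, t_start, t_end, label),
    )


def find_intervals(conn, label):
    """Return (subject, t_start, t_end) rows for a behavior keyword."""
    rows = conn.execute(
        "SELECT subject, t_start, t_end FROM behavior "
        "WHERE label = ? ORDER BY t_start",
        (label,),
    )
    return rows.fetchall()
```

The returned start and end times could then be used to pull the matching audio or captured images for the same intervals.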

The behavior estimation unit 40 may also obtain, from the motion information, subject-specific motion that is at least one of the frequency of the subject's motion and the time for which the subject's motion is maintained in each predetermined time interval, and further use the subject-specific motion to estimate the behavior of the subject in the predetermined scene. For example, to take a subject's habits into account, the frequency and duration of the subject's motions may be obtained as subject-specific motion and used for behavior estimation. Specifically, for a person who nods frequently, the nodding may be of a different kind from the nodding that accompanies the "question/opinion" or "note-taking/side work" behavior, so the subject-specific motion may be used in the state estimation that precedes behavior estimation, for example by not judging such a person to be in the nodding state.
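The frequency and duration of a motion within a time interval can be computed as in the following sketch. The event representation, the function names, and the rate threshold used to call a motion habitual are assumptions introduced here. The text does not say how the subject-specific motion is quantified or compared.

```python
def motion_stats(events, interval_start, interval_end):
    """Count and total duration of one motion type inside an interval.

    `events` is a list of (start, end) times for a single motion type,
    for example nodding. Events are clipped to the interval boundaries.
    """
    count = 0
    total = 0.0
    for start, end in events:
        overlap = min(end, interval_end) - max(start, interval_start)
        if overlap > 0:
            count += 1
            total += overlap
    return count, total


def is_habitual(count, interval_length, max_rate_per_s=0.2):
    """Hypothetical rule: a motion is a habit if it occurs too often."""
    return count / interval_length > max_rate_per_s
```

Under this rule, a subject whose nodding rate exceeds the threshold would have nodding ignored during state estimation, in the spirit of the frequent-nodder example.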

The predetermined time interval over which behavior is estimated may be a fixed interval, or it may be an interval that varies according to a predetermined condition. For example, when a variable interval is used, its length may be changed according to the motion speed of the subject. Specifically, the slower the subject moves, the longer it takes to estimate the behavior, so the time interval is made longer for slower motion and shorter for faster motion.
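One way to realize a speed-dependent interval is a lookup over predetermined lengths, sketched below. The speed bounds, the interval lengths in seconds, and the names `INTERVALS` and `interval_length` are hypothetical. Only the direction of the relationship (slower motion, longer interval) comes from the description.

```python
# (speed upper bound, interval length in seconds), slowest first.
# All values are illustrative placeholders.
INTERVALS = [(0.2, 8.0), (0.5, 4.0), (1.0, 2.0)]
DEFAULT_INTERVAL = 1.0  # used for speeds at or above the last bound


def interval_length(speed):
    """Return the predetermined analysis interval for a motion speed."""
    for bound, seconds in INTERVALS:
        if speed < bound:
            return seconds
    return DEFAULT_INTERVAL
```

A table of predetermined lengths keeps the interval to a small set of known values, which matches the wording that the interval is changed to a predetermined time rather than computed continuously.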

Next, the specific processing performed by the control device 20 of the behavior estimation device 10 according to the present embodiment, configured as described above, will be described. FIG. 11 is a flowchart showing an example of the flow of the behavior estimation processing performed by the control device 20 of the behavior estimation device 10 according to the present embodiment.

First, in step 100, the acquisition unit 30 acquires various kinds of information, and the process proceeds to step 102. Specifically, it acquires the depth information obtained from the infrared camera 12, the image captured by the camera 14, the audio information from the microphone 16, the display state of the display device 22, the board-writing information of the whiteboard device 24, and the operation state of the hand-operated device 26.

In step 102, the acquisition unit 30 determines whether the acquired information is depth information from the infrared camera 12. If the determination is negative, the process proceeds to step 104; if it is affirmative, the process proceeds to step 108.

In step 104, the board-writing detection unit 34 detects board-writing information from the image captured by the camera 14, the board-writing information of the whiteboard device 24, and the like, and the utterance detection unit 36 detects utterance information from the audio information in the image captured by the camera 14 and the audio information from the microphone 16. The process then proceeds to step 106.

In step 106, the acquisition unit 30 records the detected information, together with the corresponding time information, in the database DBa 18A. The process then returns to step 100 and repeats the processing described above.
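The branch in steps 102 to 106 can be sketched as a small routing function: depth information continues to the analysis path, and everything else is recorded with a timestamp. The item format, the source names, and the use of a plain list in place of database DBa 18A are assumptions for illustration only, and the detection in step 104 is reduced to a stand-in.

```python
import time

def route_acquired(item, dba, now=time.time):
    """Mimic steps 102-106: return "analyze" for depth data, else record it.

    `item` is a hypothetical dict with "source" and "payload" keys;
    `dba` is a list standing in for database DBa 18A.
    """
    if item["source"] == "infrared_camera":   # step 102: depth information?
        return "analyze"                       # continue to step 108
    # Step 104 stand-in: real detection of board writing or utterances goes here.
    detection = {"kind": item["source"], "value": item["payload"]}
    dba.append({"time": now(), **detection})  # step 106: record with time info
    return "recorded"

dba = []
print(route_acquired({"source": "infrared_camera", "payload": b""}, dba))
print(route_acquired({"source": "microphone", "payload": "speech"}, dba))
print(len(dba))
```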

In step 108, on the other hand, the motion state estimation unit 44 performs the depth information analysis processing, and the process proceeds to step 110. The depth information analysis processing is described here. FIG. 12 is a flowchart showing an example of the flow of the depth information analysis processing.

When the depth information analysis processing starts, in step 200 the motion state estimation unit 44 extracts the group of joint coordinates that show displacement, and the process proceeds to step 202.

In step 202, the motion state estimation unit 44 treats the coordinates of any two adjacent joints as vectors and calculates their scalar values and inner product, and the process proceeds to step 204.

In step 204, the motion state estimation unit 44 calculates the angle θ as the displacement amount of the two joints that show displacement, and the process proceeds to step 206.
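The source does not give the formula used in steps 202 and 204. A natural reading, assumed here, is that the "scalars" are the vector magnitudes and that the angle follows from the standard inner-product relation θ = arccos(a·b / (|a||b|)). The function name and the way the two vectors are supplied are hypothetical.

```python
import math

def joint_angle(v1, v2):
    """Angle in radians between two 3D vectors, via the inner product.

    Assumes theta = arccos(v1.v2 / (|v1| |v2|)); the source does not
    state the formula explicitly.
    """
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("zero-length vector has no defined angle")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.acos(cos_theta)

# Perpendicular vectors give a right angle.
print(math.degrees(joint_angle((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))))
```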

In step 206, the motion state estimation unit 44 updates a counter and a timer, and the process proceeds to step 208.

In step 208, the motion state estimation unit 44 determines whether a predetermined count has been reached or a predetermined time has elapsed. If the determination is negative, the process returns to step 200 and repeats the processing described above; if it is affirmative, the process proceeds to step 210. The predetermined time interval described above is used as the predetermined time. That is, the predetermined time may be a fixed time or a time that varies according to a predetermined condition. Likewise, the predetermined count may be a fixed count or a count that varies according to a predetermined condition. For example, the count may be increased the slower the subject moves and decreased the faster the subject moves.
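The loop formed by steps 200 to 208 can be sketched as collecting angle samples until either a count limit or a time limit is reached. The sample format, the function name, and the default limits are assumptions for illustration; the source leaves the concrete values open.

```python
def run_window(samples, max_count=30, max_seconds=5.0):
    """Collect angle samples until a count or time limit is hit (steps 200-208).

    `samples` yields hypothetical (timestamp_seconds, theta) pairs.
    """
    collected = []
    start = None
    for t, theta in samples:
        if start is None:
            start = t
        collected.append(theta)                     # stand-in for steps 200-204
        count, elapsed = len(collected), t - start  # step 206: counter and timer
        if count >= max_count or elapsed >= max_seconds:  # step 208
            break
    return collected

# Samples every 0.5 s: the 5-second limit is reached before 30 samples.
samples = [(i * 0.5, 0.1 * i) for i in range(100)]
print(len(run_window(samples)))
```

Making `max_count` or `max_seconds` depend on motion speed would give the variable limits the text mentions.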

In step 210, the motion state estimation unit 44 recognizes the person's pose (state) from the displacement of the calculated θ, and the process proceeds to step 212.

In step 212, the motion state estimation unit 44 outputs the recognized pose and the calculated displacement amount θ, and the process proceeds to step 110 of FIG. 11.

Next, in step 110, the motion state estimation unit 44 creates state tables for the upper body and the lower body based on the pose and displacement amount obtained by the depth information analysis processing, estimates the state, and the process proceeds to step 112. That is, the state tables shown in FIGS. 6(A) and 6(B) are created from the movement of the person detected from the depth information. In the present embodiment, a state table is created for the case where only the part corresponding to a single state shows movement. If movement is found for multiple states, the state table is created again.

In step 112, the behavior classification unit 46 estimates the subject's behavior, and the process proceeds to step 114. Specifically, predetermined behaviors are classified in advance along two axes, the state of the subject's upper body and the state of the subject's lower body, as shown in FIG. 7, and the subject's behavior is estimated from the state of the subject estimated with the created state tables.

In step 114, the overlap determination unit 48 determines whether the behavior falls in an overlap region, that is, whether the upper-body and lower-body states correspond to one of the overlap regions (R1 to R7) in FIG. 7. If the determination is negative, the process proceeds to step 118; if it is affirmative, the process proceeds to step 116.
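The two-axis classification and the overlap check can be pictured as a lookup keyed by the pair of upper-body and lower-body states. The actual assignment is defined by FIG. 7, which is not reproduced here, so the state labels and table entries below are purely hypothetical placeholders.

```python
# Hypothetical table: (upper-body state, lower-body state) -> candidate behaviors.
# The real layout is defined by FIG. 7 and is NOT reproduced here.
BEHAVIOR_TABLE = {
    ("arm_raised", "standing"): ["explanation", "question/opinion"],  # overlap
    ("hand_on_desk", "seated"): ["note-taking/side work"],
}

def classify(upper, lower):
    """Return (candidates, is_overlap) for the estimated states (steps 112-114)."""
    candidates = BEHAVIOR_TABLE.get((upper, lower), [])
    return candidates, len(candidates) > 1

print(classify("arm_raised", "standing"))
print(classify("hand_on_desk", "seated"))
```

When `is_overlap` is true, the candidates would be narrowed down by the overlap determination of step 116.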

In step 116, the overlap determination unit 48 performs the overlap determination processing, and the process proceeds to step 118. The overlap determination processing is described in detail here. FIG. 13 is a flowchart showing an example of the flow of the overlap determination processing.

In step 300, the overlap determination unit 48 acquires from the database DBa 18A the detection information for the time at which the overlap occurred. That is, it acquires the detection information (the information that the acquisition unit 30 recorded in the DBa 18A in step 106) for the time at which the upper-body and lower-body states of the overlap region were detected. The detection information acquired covers material presentation, board writing, utterances, and hand operation during the meeting.

In step 302, the overlap determination unit 48 classifies which of the overlap regions R1 to R7 applies, and the process proceeds to step 304.

In step 304, the overlap determination unit 48 performs the overlap determination processing for the individual overlap region, and the process proceeds to step 118 of FIG. 11. The per-region overlap determination processing is described in detail here. Different processing is performed depending on the overlap region, and the behavior in the overlap region is determined on the condition of the presence or absence of the predetermined specific elements shown in FIGS. 8 and 9. FIGS. 14 to 20 are flowcharts showing the flow of the overlap determination processing for the overlap regions R1 to R7, respectively.

First, when the overlap region is R1, in step 310 the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific elements are board writing C2 and voice C3. If the determination is affirmative, the process proceeds to step 312; if it is negative, the process proceeds to step 314.

In step 312, the overlap determination unit 48 determines that the behavior in the overlap region is "explanation".

In step 314, on the other hand, the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific element is voice C3 only. If the determination is affirmative, the process proceeds to step 316; if it is negative, the process proceeds to step 318.

In step 316, the overlap determination unit 48 determines that the behavior in the overlap region is "question/opinion".

In step 318, the overlap determination unit 48 determines that no behavior is identified for the overlap region.

Next, when the overlap region is R2, in step 320 the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific elements are material presentation C1, board writing C2, and voice C3. If the determination is affirmative, the process proceeds to step 322; if it is negative, the process proceeds to step 324.

In step 322, the overlap determination unit 48 determines that the behavior in the overlap region is "presentation".

In step 324, on the other hand, the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific element is voice C3 only. If the determination is affirmative, the process proceeds to step 326; if it is negative, the process proceeds to step 328.

In step 326, the overlap determination unit 48 determines that the behavior in the overlap region is "question/opinion".

In step 328, the overlap determination unit 48 determines that no behavior is identified for the overlap region.

Next, when the overlap region is R3, in step 330 the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific elements are board writing C2 and voice C3. If the determination is affirmative, the process proceeds to step 332; if it is negative, the process proceeds to step 334.

In step 332, the overlap determination unit 48 determines that the behavior in the overlap region is "explanation".

In step 334, on the other hand, the overlap determination unit 48 determines, based on the information acquired in step 300, whether there is no detection information for any specific element. If the determination is affirmative, the process proceeds to step 336; if it is negative, the process proceeds to step 338.

In step 336, the overlap determination unit 48 determines that the behavior in the overlap region is "deliberation/contemplation".

In step 338, the overlap determination unit 48 determines that no behavior is identified for the overlap region.

Next, when the overlap region is R4, in step 340 the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific element is voice C3 only. If the determination is affirmative, the process proceeds to step 342; if it is negative, the process proceeds to step 344.

In step 342, the overlap determination unit 48 determines that the behavior in the overlap region is "question/opinion".

In step 344, on the other hand, the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific element is hand operation C4 only. If the determination is affirmative, the process proceeds to step 346; if it is negative, the process proceeds to step 348.

In step 346, the overlap determination unit 48 determines that the behavior in the overlap region is "note-taking/side work".

In step 348, the overlap determination unit 48 determines that no behavior is identified for the overlap region.

Next, when the overlap region is R5, in step 350 the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific elements are material presentation C1, board writing C2, and voice C3. If the determination is affirmative, the process proceeds to step 352; if it is negative, the process proceeds to step 354.

In step 352, the overlap determination unit 48 determines that the behavior in the overlap region is "presentation".

In step 354, on the other hand, the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific element is voice C3 only. If the determination is affirmative, the process proceeds to step 356; if it is negative, the process proceeds to step 358.

In step 356, the overlap determination unit 48 determines that the behavior in the overlap region is "appeal".

In step 358, the overlap determination unit 48 determines that no behavior is identified for the overlap region.

Next, when the overlap region is R6, in step 360 the overlap determination unit 48 determines, based on the information acquired in step 300, whether there is no detection information for any specific element. If the determination is affirmative, the process proceeds to step 362; if it is negative, the process proceeds to step 364.

In step 362, the overlap determination unit 48 determines that the behavior in the overlap region is "deliberation/contemplation".

In step 364, on the other hand, the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific element is hand operation C4 only. If the determination is affirmative, the process proceeds to step 366; if it is negative, the process proceeds to step 368.

In step 366, the overlap determination unit 48 determines that the behavior in the overlap region is "note-taking/side work".

In step 368, the overlap determination unit 48 determines that no behavior is identified for the overlap region.

Next, when the overlap region is R7, in step 370 the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific element is hand operation C4 only. If the determination is affirmative, the process proceeds to step 372; if it is negative, the process proceeds to step 374.

In step 372, the overlap determination unit 48 determines that the behavior in the overlap region is "note-taking/side work".

In step 374, on the other hand, the overlap determination unit 48 determines, based on the information acquired in step 300, whether the specific element is voice C3 only. If the determination is affirmative, the process proceeds to step 376; if it is negative, the process proceeds to step 378.

In step 376, the overlap determination unit 48 determines that the behavior in the overlap region is "appeal".

In step 378, the overlap determination unit 48 determines that no behavior is identified for the overlap region.

When the overlap determination processing for the relevant overlap region has been performed by any of the above, the processing of FIG. 13 returns and the process proceeds to step 118 of FIG. 11.
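The per-region decisions in steps 310 to 378 amount to a small ordered rule table. The sketch below encodes them under one assumption that the source does not settle: every condition is treated as an exact match on the set of detected elements, including the conditions that are not explicitly marked "only". The element keys, the names `RULES` and `resolve_overlap`, and the English behavior labels are illustrative.

```python
# Element keys for C1-C4; the string values are illustrative.
C1, C2, C3, C4 = "material", "board", "voice", "hand"

# Ordered (required elements, behavior) rules per overlap region,
# following steps 310-378. An empty set means "no detection information".
RULES = {
    "R1": [({C2, C3}, "explanation"), ({C3}, "question/opinion")],
    "R2": [({C1, C2, C3}, "presentation"), ({C3}, "question/opinion")],
    "R3": [({C2, C3}, "explanation"), (set(), "deliberation/contemplation")],
    "R4": [({C3}, "question/opinion"), ({C4}, "note-taking/side work")],
    "R5": [({C1, C2, C3}, "presentation"), ({C3}, "appeal")],
    "R6": [(set(), "deliberation/contemplation"), ({C4}, "note-taking/side work")],
    "R7": [({C4}, "note-taking/side work"), ({C3}, "appeal")],
}

def resolve_overlap(region, detected):
    """Return the behavior for `region`, or None when no behavior is identified.

    Assumption: each rule requires an exact match on the detected elements.
    """
    for required, behavior in RULES[region]:
        if set(detected) == required:
            return behavior
    return None

print(resolve_overlap("R1", {C2, C3}))
print(resolve_overlap("R5", {C3}))
print(resolve_overlap("R4", {C1}))
```

If the first condition of a region were meant as "at least these elements", the equality test for that rule would become a subset test.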

In step 118, the behavior estimation unit 40 records the estimated behavior, together with time information, in the database DBb 18B. The process then returns to step 100 and repeats the processing described above. As a result, the database DB 18 holds the estimated behavior together with time information, as well as information about the meeting such as captured image information and audio information.

The behavior of subjects during a meeting estimated in this way may be used to analyze the participants' behavior in real time while the meeting is in progress, detect stagnation of the discussion (for example, by detecting a relatively long noise interval), and, when the discussion has stagnated, present the analysis results up to that point to prompt the next action.
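The source does not define a "noise interval". The sketch below assumes it is a run of consecutive time intervals in which no behavior was identified, represented here as `None`, and flags stagnation when the longest such run reaches a threshold. The function names and the default threshold of 5 intervals are hypothetical.

```python
def longest_noise_run(behaviors):
    """Length of the longest run of consecutive intervals with no behavior.

    Assumption: a "noise interval" is an interval whose behavior is None.
    """
    longest = current = 0
    for b in behaviors:
        current = current + 1 if b is None else 0
        longest = max(longest, current)
    return longest

def is_stagnant(behaviors, min_run=5):
    """Flag stagnation when the longest noise run reaches `min_run` intervals."""
    return longest_noise_run(behaviors) >= min_run

timeline = ["explanation", None, None, "question/opinion", None, None, None]
print(longest_noise_run(timeline))
print(is_stagnant(timeline, min_run=3))
```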

The participants' behavior may also be analyzed after the fact over the entire recorded duration of the meeting and used to visualize and present the communication structure through removal of noise intervals and behavior classification.

The participants' behavior may also be analyzed after the fact over the entire recorded duration of the meeting and used to search the meeting information recorded in the database DB 18 with a keyword representing a behavior, such as "board writing". For example, it can be used to find the board writing from a particular meeting and then search the record of the discussion that led to that board writing.
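A keyword search of this kind presupposes behavior records stored with their time information. The sketch below uses an in-memory SQLite table as a stand-in for the database DBb 18B; the schema, column names, and helper functions are assumptions, since the source does not describe the storage format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for database DBb 18B
conn.execute(
    "CREATE TABLE behavior (t_start REAL, t_end REAL, subject TEXT, label TEXT)"
)

def record_behavior(t_start, t_end, subject, label):
    """Store one estimated behavior with its time interval (hypothetical schema)."""
    conn.execute(
        "INSERT INTO behavior VALUES (?, ?, ?, ?)", (t_start, t_end, subject, label)
    )

def find_intervals(label):
    """Return (t_start, t_end, subject) rows whose behavior equals the keyword."""
    cur = conn.execute(
        "SELECT t_start, t_end, subject FROM behavior "
        "WHERE label = ? ORDER BY t_start",
        (label,),
    )
    return cur.fetchall()

record_behavior(0.0, 10.0, "A", "explanation")
record_behavior(10.0, 20.0, "A", "board writing")
record_behavior(20.0, 30.0, "B", "question/opinion")
print(find_intervals("board writing"))
```

The returned time range could then be used to pull the matching captured images and audio from the meeting record.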

In the above embodiment, the state table of FIG. 6 was described as having movement (a numerical value) entered for only one state, but this is not a limitation. For example, when the state table shown in FIG. 6 is generated from the motion information, movement (numerical values) may be entered for each of multiple states. In that case, the state may be estimated by, for example, adopting the state with the largest movement.
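Adopting the state with the largest movement is a simple maximum over the table entries. The sketch below assumes the state table can be represented as a mapping from state name to a non-negative movement value; the state names are hypothetical and the actual layout of FIG. 6 is not reproduced.

```python
def dominant_state(state_table):
    """Return the state with the largest movement, or None if nothing moved.

    `state_table` maps a hypothetical state name to a movement value.
    """
    moving = {state: value for state, value in state_table.items() if value > 0}
    if not moving:
        return None
    return max(moving, key=moving.get)

print(dominant_state({"nod": 0.2, "arm_raised": 0.7, "lean": 0.0}))
```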

In the above embodiment, as shown in FIG. 7, an example was described in which behaviors are classified so that two behaviors overlap in the two-axis region, but they may be classified so that three or more behaviors overlap.

The processing performed by the control device 20 of the behavior estimation device 10 according to the above embodiment (FIGS. 11 to 20) may be performed by software, by hardware, or by a combination of both. The processing performed by each unit of the control device 20 may also be stored as a program on a storage medium and distributed.

The present invention is not limited to the above and can of course be implemented with various modifications without departing from its gist.

Description of symbols
10 Behavior estimation device
12 Infrared camera
14 Camera
16 Microphone
18 Database DB
18A Database DBa
18B Database DBb
20 Control device
30 Acquisition unit
32 Material presentation detection unit
34 Board-writing detection unit
36 Utterance detection unit
38 Hand operation detection unit
40 Behavior estimation unit
42 Media recording unit
44 Motion state estimation unit
46 Behavior classification unit
48 Overlap determination unit

Claims

A behavior estimation device comprising:
an acquisition unit that acquires motion information of one or more subjects; and
an estimation unit that estimates the behavior of a subject in a predetermined scene using the motion information acquired by the acquisition unit.

The behavior estimation device according to claim 1, wherein the estimation unit estimates the state of the subject using the motion information acquired by the acquisition unit and estimates the behavior of the subject in the predetermined scene from the estimated state.

The behavior estimation device according to claim 2, wherein the estimation unit classifies in advance predetermined behaviors corresponding to states of the subject and estimates the behavior of the subject from the estimated state of the subject.

The behavior estimation device according to claim 3, wherein the estimation unit classifies behaviors in advance in a two-axis region defined by the state of the subject's upper body and the state of the subject's lower body, and estimates the behavior of the subject by estimating each of the upper-body state and the lower-body state from the motion information.

The behavior estimation device according to claim 4, wherein the regions into which the behaviors are classified have an overlap of two or more, and the presence or absence of a predetermined specific element is the condition for classification within the overlap.

The behavior estimation device according to any one of claims 1 to 5, wherein the estimation unit obtains, from the motion information, subject-specific motion information comprising at least one of the frequency of the subject's motion and the time for which the subject's motion is maintained in each predetermined time interval, and additionally uses the subject-specific motion information to estimate the behavior of the subject in the predetermined scene.

The behavior estimation device according to any one of the above claims, wherein the estimation unit estimates the behavior of the subject for each predetermined time interval, the device further comprising a storage unit that stores the behavior of the subject for each time interval estimated by the estimation unit.

The behavior estimation device according to claim 7, wherein the storage unit further stores, for each time interval, detection information in the predetermined scene other than the motion information.

The behavior estimation device according to claim 8, wherein the detection information includes at least one of audio information for each time interval and captured-image information for each time interval.

The behavior estimation device according to any one of claims 6 to 9, wherein the time interval is changed to a predetermined length of time according to the motion speed of the subject.

A behavior estimation program for causing a computer to function as each unit of the behavior estimation device according to any one of claims 1 to 10.