JP2019148849A

JP2019148849A - System for determining degree of understanding and program for determining degree of understanding

Info

Publication number: JP2019148849A
Application number: JP2018031540A
Authority: JP
Inventors: 秀典庄司; Hidenori Shoji; 彩香池嶋; Ayaka Ikejima
Original assignee: Kyocera Document Solutions Inc
Current assignee: Kyocera Document Solutions Inc
Priority date: 2018-02-26
Filing date: 2018-02-26
Publication date: 2019-09-05

Abstract

To provide a system for determining the degree of understanding and a program for determining the degree of understanding that can accurately determine a listener's degree of understanding on the basis of the facial expression of the listener. SOLUTION: A system for determining the degree of understanding comprises: means for analyzing feeling that analyzes a listener's feeling in S51 on the basis of the listener's facial expression photographed by a camera; and means for analyzing the degree of understanding that analyzes the listener's degree of understanding in S53 to S56 on the basis of the feeling analyzed by the means for analyzing feeling. The means for analyzing the degree of understanding determines the listener's degree of understanding on the basis of the feeling of "disgust" among the feelings analyzed by the means for analyzing feeling. SELECTED DRAWING: Figure 9

Description

本発明は、聴者の理解度を判定する理解度判定システムおよび理解度判定プログラムに関する。 The present invention relates to an understanding level determination system and an understanding level determination program for determining an understanding level of a listener.

従来の理解度判定システムとして、カメラによって撮影された聴者の顔の表情などの動き情報から聴者の理解度を判定するシステムが知られている（例えば、特許文献１参照。）。 As a conventional understanding level determination system, there is known a system for determining a listener's level of understanding from movement information such as a facial expression of a listener photographed by a camera (see, for example, Patent Document 1).

特開２０１６−１７７４８３号公報 Japanese Patent Laid-Open No. 2016-177483

しかしながら、特許文献１に記載された理解度判定システムにおいては、聴者の顔の表情に基づいて聴者の理解度を判定する具体的な方法が不明であるので、聴者の理解度を高精度に判定することができるのか否かが不明であるという問題がある。 However, in the understanding level determination system described in Patent Document 1, the specific method for determining the listener's level of understanding from the listener's facial expression is unknown, so it is unclear whether the listener's level of understanding can be determined with high accuracy.

そこで、本発明は、聴者の顔の表情に基づいて聴者の理解度を高精度に判定することができる理解度判定システムおよび理解度判定プログラムを提供することを目的とする。 Accordingly, an object of the present invention is to provide an understanding level determination system and an understanding level determination program that can determine the level of understanding of a listener with high accuracy based on the facial expression of the listener.

本発明の理解度判定システムは、カメラによって撮影された聴者の顔の表情に基づいて前記聴者の感情を解析する感情解析手段と、前記感情解析手段によって解析された感情に基づいて前記聴者の理解度を解析する理解度解析手段とを備え、前記理解度解析手段は、前記感情解析手段によって解析された感情のうち「嫌悪」の感情に基づいて前記聴者の理解度を判定することを特徴とする。 An understanding level determination system according to the present invention includes emotion analysis means that analyzes an emotion of a listener based on a facial expression of the listener photographed by a camera, and understanding level analysis means that analyzes the listener's level of understanding based on the emotion analyzed by the emotion analysis means, wherein the understanding level analysis means determines the listener's level of understanding based on the emotion of “disgust” among the emotions analyzed by the emotion analysis means.

この構成により、本発明の理解度判定システムは、聴者の顔の表情に基づいて解析した感情のうち、聴者の理解度と相関する「嫌悪」の感情に基づいて聴者の理解度を判定するので、聴者の顔の表情に基づいて聴者の理解度を高精度に判定することができる。 With this configuration, the understanding level determination system of the present invention determines the level of understanding of the listener based on the emotion of “disgust” that correlates with the level of understanding of the listener among the emotions analyzed based on the facial expression of the listener. Therefore, it is possible to determine the degree of understanding of the listener with high accuracy based on the facial expression of the listener.

本発明の理解度判定システムにおいて、前記理解度解析手段は、判定した理解度をモニターに表示し、前記理解度判定システムは、前記カメラと、前記モニターとが離れた場所に設置されるリモート会議システムであっても良い。 In the understanding level determination system of the present invention, the understanding level analysis means may display the determined level of understanding on a monitor, and the understanding level determination system may be a remote conference system in which the camera and the monitor are installed at locations remote from each other.

リモート会議、すなわち、互いに離れた場所にいる人間同士の会議は、対面での会議、すなわち、同一の場所にいる人間同士の会議と比較して、視覚によって得られる情報が少ないので、聴者の顔の表情や仕草などの非言語のコミュニケーションが話者に伝わり難く、話者が聴者の理解度を高精度に判定することが困難である。しかしながら、本発明の理解度判定システムは、カメラによって撮影された聴者の顔の表情に基づいて聴者の理解度を判定するので、リモート会議システムであっても聴者の理解度を高精度に判定することができる。 Remote meetings, that is, meetings between people at locations remote from each other, provide less visual information than face-to-face meetings, that is, meetings between people in the same place. Nonverbal communication such as the listener's facial expressions and gestures is therefore hard to convey to the speaker, and it is difficult for the speaker to judge the listener's level of understanding with high accuracy. However, because the understanding level determination system of the present invention determines the listener's level of understanding based on the listener's facial expression photographed by the camera, it can determine the listener's level of understanding with high accuracy even when it is a remote conference system.

本発明の理解度判定プログラムは、カメラによって撮影された聴者の顔の表情に基づいて前記聴者の感情を解析する感情解析手段と、前記感情解析手段によって解析された感情に基づいて前記聴者の理解度を解析する理解度解析手段とをコンピューターに実現させ、前記理解度解析手段は、前記感情解析手段によって解析された感情のうち「嫌悪」の感情に基づいて前記聴者の理解度を判定することを特徴とする。 The understanding level determination program of the present invention causes a computer to implement emotion analysis means that analyzes an emotion of a listener based on a facial expression of the listener photographed by a camera, and understanding level analysis means that analyzes the listener's level of understanding based on the emotion analyzed by the emotion analysis means, wherein the understanding level analysis means determines the listener's level of understanding based on the emotion of “disgust” among the emotions analyzed by the emotion analysis means.

この構成により、本発明の理解度判定プログラムを実行するコンピューターは、聴者の顔の表情に基づいて解析した感情のうち、聴者の理解度と相関する「嫌悪」の感情に基づいて聴者の理解度を判定するので、聴者の顔の表情に基づいて聴者の理解度を高精度に判定することができる。 With this configuration, a computer that executes the understanding level determination program of the present invention determines the listener's level of understanding based on the emotion of “disgust”, which correlates with the listener's level of understanding, among the emotions analyzed from the listener's facial expression. It can therefore determine the listener's level of understanding with high accuracy based on the listener's facial expression.

本発明の理解度判定システムおよび理解度判定プログラムは、聴者の顔の表情に基づいて聴者の理解度を高精度に判定することができる。 The understanding level determination system and the understanding level determination program of the present invention can determine the listener's understanding level with high accuracy based on the facial expression of the listener.

FIG. 1: 本発明の一実施の形態に係る理解度判定システムを開発するために行った開発用実験の環境を示す上面図である。 A top view showing the environment of the development experiment conducted to develop an understanding level determination system according to an embodiment of the present invention.
FIG. 2: （ａ）図１に示す環境で行われた開発用実験の結果のうち「嫌悪」の感情の強度を時間毎に示したグラフである。（ｂ）図１に示す環境で行われた開発用実験の結果のうち「怒り」の感情の強度を時間毎に示したグラフである。（ｃ）図１に示す環境で行われた開発用実験の結果のうち「悲しみ」の感情の強度を時間毎に示したグラフである。（ｄ）図１に示す環境で行われた開発用実験の結果のうち「軽蔑」の感情の強度を時間毎に示したグラフである。 (a) A graph showing, over time, the intensity of the emotion of “disgust” among the results of the development experiment performed in the environment shown in FIG. 1. (b) The corresponding graph for the emotion of “anger”. (c) The corresponding graph for the emotion of “sadness”. (d) The corresponding graph for the emotion of “contempt”.
FIG. 3: （ａ）図１に示す環境で行われた開発用実験の結果のうち「喜び」の感情の強度を時間毎に示したグラフである。（ｂ）図１に示す環境で行われた開発用実験の結果のうち「恐れ」の感情の強度を時間毎に示したグラフである。（ｃ）図１に示す環境で行われた開発用実験の結果のうち「驚き」の感情の強度を時間毎に示したグラフである。 (a) A graph showing, over time, the intensity of the emotion of “joy” among the results of the development experiment performed in the environment shown in FIG. 1. (b) The corresponding graph for the emotion of “fear”. (c) The corresponding graph for the emotion of “surprise”.
FIG. 4: （ａ）図２（ａ）に示すグラフからシーン１およびシーン５のデータを除いたグラフである。（ｂ）図４（ａ）に示すデータの対象の被験者とは異なる被験者に対する開発用実験の結果のうち「嫌悪」の感情の強度を時間毎に示したグラフである。 (a) The graph of FIG. 2(a) with the data of scene 1 and scene 5 removed. (b) A graph showing, over time, the intensity of the emotion of “disgust” among the results of the development experiment for a subject different from the subject of the data shown in FIG. 4(a).
FIG. 5: （ａ）図４（ａ）に示すデータの対象の被験者に対する開発用実験の結果のうち「嫌悪」の感情の強度をクラスタリングした図である。（ｂ）図４（ｂ）に示すデータの対象の被験者に対する開発用実験の結果のうち「嫌悪」の感情の強度をクラスタリングした図である。 (a) A diagram in which the intensity of the emotion of “disgust” is clustered, among the results of the development experiment for the subject of the data shown in FIG. 4(a). (b) The corresponding diagram for the subject of the data shown in FIG. 4(b).
FIG. 6: 図１に示す環境で行われた開発用実験において被験者が回答した理解度毎に、クラスター４に含まれるデータを含むシーンと、クラスター４に含まれるデータを含まないシーンとの割合を示す図である。 A diagram showing, for each level of understanding reported by the subjects in the development experiment performed in the environment shown in FIG. 1, the ratio of scenes that contain data belonging to cluster 4 to scenes that do not.
FIG. 7: 本発明の一実施の形態に係る理解度判定システムのブロック図である。 A block diagram of an understanding level determination system according to an embodiment of the present invention.
FIG. 8: 図７に示すコンピューターのブロック図である。 A block diagram of the computer shown in FIG. 7.
FIG. 9: カメラによって撮影される聴者の理解度をモニターに表示する場合の図８に示すコンピューターの動作のフローチャートである。 A flowchart of the operation of the computer shown in FIG. 8 when the level of understanding of the listener photographed by the camera is displayed on the monitor.
FIG. 10: 図７に示す理解度判定システムとは異なる構成の理解度判定システムのブロック図である。 A block diagram of an understanding level determination system whose configuration differs from that of the understanding level determination system shown in FIG. 7.

以下、本発明の一実施の形態について、図面を用いて説明する。 Hereinafter, an embodiment of the present invention will be described with reference to the drawings.

まず、本実施の形態に係る理解度判定システムを開発するために行った実験（以下「開発用実験」という。）について説明する。 First, an experiment (hereinafter referred to as “development experiment”) performed to develop the understanding level determination system according to the present embodiment will be described.

本願の発明者は、聴者の顔の表情と、聴者の理解度との関連性を確認するために以下に説明する開発用実験を行った。 The inventor of the present application conducted a development experiment described below in order to confirm the relationship between the facial expression of the listener and the level of understanding of the listener.

１．開発用実験の内容 1. Contents of the Development Experiment
図１は、開発用実験の環境を示す上面図である。 FIG. 1 is a top view showing the environment of the development experiment.

図１に示すように、開発用実験は、被験者１１と、特定の説明文を被験者１１に対して読み上げる説明員１２との２名で行われる。開発用実験において、被験者１１と、説明員１２とは、テーブル１３を挟んで向かい合わせに、約２ｍの距離をあけて着席する。そして、被験者１１から約１ｍの距離をあけてテーブル１３上に配置されたカメラ１４によって、説明員１２が説明文を読み上げている最中の被験者１１の顔の表情が記録される。 As shown in FIG. 1, the development experiment is performed by two people: a subject 11 and an instructor 12 who reads a specific explanatory text aloud to the subject 11. In the development experiment, the subject 11 and the instructor 12 sit facing each other across a table 13, about 2 m apart. A camera 14 placed on the table 13 about 1 m from the subject 11 records the facial expression of the subject 11 while the instructor 12 reads the explanatory text aloud.

開発用実験に用いられた説明文としては、複数の被験者の間で知識差が出難い哲学分野の課題が採用された。開発用実験に用いられた説明文は、シーン１からシーン６までの６個のシーンに分けられており、シーン２およびシーン３のそれぞれの難易度がシーン１、シーン４、シーン５およびシーン６のそれぞれの難易度より高くなっている。 For the explanatory text used in the development experiment, a topic from the field of philosophy was adopted, because differences in prior knowledge among subjects are unlikely to arise in that field. The explanatory text is divided into six scenes, scene 1 to scene 6, and scene 2 and scene 3 are each more difficult than any of scene 1, scene 4, scene 5, and scene 6.

開発用実験の手順は、次の通りである。
（１）説明員が被験者に実験の内容と注意事項の説明を行う。
（２）説明員がシーン１を使って実験の流れの説明をする。
（３）実験に関しての質疑応答
（４）説明員がシーン２の説明を行う。
（５）被験者がシーン２の理解度を１から５までの５段階（数値が大きいほど理解度が高い。）で回答する。
（６）上記（４）、（５）をシーン３、シーン４、シーン５、シーン６の順に繰り返す。
The procedure for the development experiment is as follows.
(1) The instructor explains the contents of the experiment and the precautions to the subject.
(2) The instructor explains the flow of the experiment using scene 1.
(3) Questions and answers about the experiment.
(4) The instructor explains scene 2.
(5) The subject rates his or her understanding of scene 2 on a five-point scale from 1 to 5 (the larger the number, the higher the level of understanding).
(6) Steps (4) and (5) are repeated for scene 3, scene 4, scene 5, and scene 6, in that order.

開発用実験は、以上の条件で、１０名の被験者のそれぞれに対して行われた。 The development experiment was conducted on each of 10 subjects under the above conditions.

２．開発用実験の結果の分析 2. Analysis of the Results of the Development Experiment
開発用実験においてカメラ１４によって取得された動画のデータを、表情解析ソフトウェアである、ａｆｆｅｃｔｉｖａ社製のａｆｆｄｅｘを用いて解析することによって、説明員が説明文を読み上げている最中の被験者の感情を取得した。ａｆｆｄｅｘは、被験者の顔の表情から被験者の感情の強度を、「嫌悪」、「怒り」、「悲しみ」、「軽蔑」、「喜び」、「恐れ」および「驚き」という７個の感情毎に０．００〜１００．００の値で１／３０秒毎に取得することができる。 The video data acquired by the camera 14 in the development experiment was analyzed with Affdex, facial expression analysis software made by Affectiva, to obtain the emotions of the subject while the instructor was reading the explanatory text aloud. From the subject's facial expression, Affdex can obtain the intensity of each of seven emotions, “disgust”, “anger”, “sadness”, “contempt”, “joy”, “fear”, and “surprise”, as a value from 0.00 to 100.00 every 1/30 second.

図２（ａ）、図２（ｂ）、図２（ｃ）、図２（ｄ）、図３（ａ）、図３（ｂ）、図３（ｃ）は、開発用実験の結果のうち、それぞれ、「嫌悪」、「怒り」、「悲しみ」、「軽蔑」、「喜び」、「恐れ」、「驚き」の感情の強度を時間毎に示したグラフである。 FIGS. 2(a), 2(b), 2(c), 2(d), 3(a), 3(b), and 3(c) are graphs showing, over time, the intensity of the emotions of “disgust”, “anger”, “sadness”, “contempt”, “joy”, “fear”, and “surprise”, respectively, among the results of the development experiment.

図２および図３に示すグラフは、特定の１人の被験者の７個の感情のそれぞれの強度を時間毎に示したグラフである。各グラフにおいて、太線で示すデータは、対象の被験者自身が回答した理解度を参考として表しており、感情の強度を示す縦軸の数値とは直接の関係はない。なお、図２および図３に示すグラフの対象の被験者は、シーン１の理解度を３と回答し、シーン２、３の理解度を２と回答し、シーン４、５の理解度を４と回答し、シーン６の理解度を５と回答している。 The graphs in FIGS. 2 and 3 show, over time, the intensity of each of the seven emotions for one particular subject. In each graph, the data drawn with a thick line shows, for reference, the level of understanding reported by that subject, and has no direct relationship to the values on the vertical axis, which indicates emotion intensity. The subject of the graphs in FIGS. 2 and 3 reported an understanding level of 3 for scene 1, 2 for scenes 2 and 3, 4 for scenes 4 and 5, and 5 for scene 6.

図２および図３に示すように、「嫌悪」、「怒り」、「悲しみ」、「軽蔑」、「喜び」、「恐れ」および「驚き」という７個の感情の強度のうち、「嫌悪」の感情の強度と、被験者が回答した理解度との間に関連性が現れた。すなわち、図２（ａ）に示すグラフでは、被験者が回答した理解度が低いシーン２およびシーン３において「嫌悪」の感情の強度が高くなっている。図２および図３に示すデータは１人の被験者のデータであるが、被験者が回答した理解度が低いシーンにおいて「嫌悪」の感情の強度が高くなる傾向は、全ての被験者に対して現れた。 As shown in FIGS. 2 and 3, among the intensities of the seven emotions of “disgust”, “anger”, “sadness”, “contempt”, “joy”, “fear”, and “surprise”, a relationship appeared between the intensity of “disgust” and the level of understanding reported by the subject. Specifically, in the graph of FIG. 2(a), the intensity of “disgust” is high in scene 2 and scene 3, for which the subject reported a low level of understanding. Although FIGS. 2 and 3 show the data of a single subject, the tendency for the intensity of “disgust” to rise in scenes for which the subject reported a low level of understanding appeared for all subjects.

図４（ａ）は、図２（ａ）に示すグラフからシーン１およびシーン５のデータを除いたグラフである。図４（ｂ）は、図４（ａ）に示すデータの対象の被験者とは異なる被験者に対する開発用実験の結果のうち「嫌悪」の感情の強度を時間毎に示したグラフである。 FIG. 4(a) is the graph of FIG. 2(a) with the data of scene 1 and scene 5 removed. FIG. 4(b) is a graph showing, over time, the intensity of the emotion of “disgust” among the results of the development experiment for a subject different from the subject of the data shown in FIG. 4(a).

図４（ｂ）に示すグラフは、図４（ａ）に示すグラフと同様に、シーン１およびシーン５のデータを除いている。 Like the graph in FIG. 4(a), the graph in FIG. 4(b) excludes the data of scene 1 and scene 5.

図４に示すように、感情の強度には、個人差が存在する。そのため、理解度が低いと判定するための感情の強度の閾値が被験者に依存しない固定の数値である場合には、理解度を適切に判定することができない。したがって、閾値は、被験者毎に取得される必要がある。閾値を被験者毎に取得するための方法として、クラスタリングが存在する。 As shown in FIG. 4, there are individual differences in the intensity of emotion. Therefore, when the threshold value of the emotion strength for determining that the understanding level is low is a fixed numerical value that does not depend on the subject, the understanding level cannot be determined appropriately. Therefore, the threshold needs to be acquired for each subject. Clustering exists as a method for acquiring a threshold for each subject.

図５（ａ）、図５（ｂ）は、それぞれ、図４（ａ）、図４（ｂ）に示すデータの対象の被験者に対する開発用実験の結果のうち「嫌悪」の感情の強度をクラスタリングした図である。 FIGS. 5(a) and 5(b) show the clustered intensity of the emotion of “disgust” among the results of the development experiment for the subjects of the data shown in FIGS. 4(a) and 4(b), respectively.

図５に示すように、被験者毎に「嫌悪」の感情の強度をクラスタリングすることによって、感情の強度の個人差の影響を抑えることができる。具体的には、クラスタリングの方法としてｋ平均法を用いた。図５において、丸印の１つずつが１／３０秒毎の感情の強度を示している。 As shown in FIG. 5, clustering the intensity of the emotion of “disgust” separately for each subject suppresses the influence of individual differences in emotion intensity. Specifically, the k-means method was used for clustering. In FIG. 5, each circle represents the emotion intensity at one 1/30-second sample.
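The per-subject clustering described above can be sketched as a one-dimensional k-means in plain Python. This is only an illustrative sketch: the function name `kmeans_1d`, the quantile-based initialisation, and the sample values are assumptions, not details taken from the experiment.

```python
def kmeans_1d(values, k, iterations=100):
    """Cluster scalar intensities into k groups with Lloyd's k-means algorithm."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    ordered = sorted(values)
    # Deterministic initialisation spread over the sorted data
    # (an illustrative choice; the source does not describe initialisation).
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]

    def assign(cs):
        # Assignment step: each value joins its nearest centroid.
        return [min(range(k), key=lambda c: abs(v - cs[c])) for v in values]

    for _ in range(iterations):
        labels = assign(centroids)
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assign(centroids), centroids


# Hypothetical "disgust" intensities for one subject (not experimental data).
subject = [0.1, 0.2, 0.3, 10.0, 10.5, 50.0, 52.0, 90.0, 95.0]
labels, centroids = kmeans_1d(subject, k=4)
```

Because the clustering is run on each subject's own data, the boundary of the highest cluster adapts to that subject's range of intensities instead of relying on one fixed threshold.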

なお、クラスタリングを行うに際して、被験者が発話中である場合や、被験者が被験者自身の顔を触る動作中である場合など、被験者の顔の表情から被験者の感情の強度を表情解析ソフトウェアによって正常に取得することができていない場合のデータは、適切なクラスタリングの妨げになるので、除去した。また、被験者が未だ実験に慣れてない場合には被験者の顔の表情が自然ではない可能性がある。したがって、クラスタリングを行うに際して、被験者が未だ実験に慣れてない時期のシーンであるシーン１のデータは、適切なクラスタリングの妨げになるので、除去した。また、シーン５は、他のシーンと比較してデータの数が少ない。したがって、クラスタリングを行うに際して、シーン５のデータは、適切なクラスタリングの妨げになるので、除去した。 When performing the clustering, data for which the facial expression analysis software could not properly obtain the subject's emotion intensity from the facial expression, for example while the subject was speaking or touching his or her own face, was removed because it would interfere with proper clustering. In addition, a subject who is not yet used to the experiment may not show natural facial expressions. The data of scene 1, which falls in the period when the subject is not yet used to the experiment, was therefore also removed because it would interfere with proper clustering. Finally, scene 5 has fewer data points than the other scenes, so its data was likewise removed because it would interfere with proper clustering.

実験において、被験者は、１から５までの５段階の理解度のうち、実際には２から５までのいずれかの理解度のみを回答した。したがって、クラスタリングを行うに際して、クラスターの数は、４個とした。クラスターに含まれる数値の平均値が最も低いクラスターをクラスター１とし、クラスターに含まれる数値の平均値が２番目に低いクラスターをクラスター２とし、クラスターに含まれる数値の平均値が３番目に低いクラスターをクラスター３とし、クラスターに含まれる数値の平均値が最も高いクラスターをクラスター４とする。 In the experiment, the subjects actually answered only levels 2 to 5 out of the five levels of understanding from 1 to 5. The number of clusters was therefore set to four. The cluster whose values have the lowest mean is cluster 1, the cluster with the second-lowest mean is cluster 2, the cluster with the third-lowest mean is cluster 3, and the cluster with the highest mean is cluster 4.
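The numbering of clusters by their mean value can be illustrated with a small helper. The function name `rank_clusters_by_mean` and the sample values are hypothetical; the sketch assumes that some clustering step has already assigned an arbitrary label to each sample.

```python
from collections import defaultdict


def rank_clusters_by_mean(values, labels):
    """Map each raw cluster label to a rank: 1 = lowest mean, k = highest mean."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    means = {label: sum(vs) / len(vs) for label, vs in groups.items()}
    ordered = sorted(means, key=means.get)  # raw labels in ascending order of mean
    return {label: rank for rank, label in enumerate(ordered, start=1)}


# Hypothetical samples with arbitrary raw labels from a clustering step.
values = [5.0, 80.0, 20.0, 40.0, 6.0, 78.0]
raw_labels = ["a", "b", "c", "d", "a", "b"]
ranks = rank_clusters_by_mean(values, raw_labels)
```

With four clusters, the label that receives rank 4 corresponds to what the text calls cluster 4.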

図６は、開発用実験において被験者が回答した理解度毎に、クラスター４に含まれるデータを含むシーンと、クラスター４に含まれるデータを含まないシーンとの割合を示す図である。 FIG. 6 is a diagram illustrating a ratio of a scene including data included in the cluster 4 and a scene not including data included in the cluster 4 for each understanding level answered by the subject in the development experiment.

図６に示すように、被験者が理解度として２を回答したシーンのうち、クラスター４に含まれるデータを含むシーンは、６７％であり、クラスター４に含まれるデータを含まないシーンは、３３％である。被験者が理解度として３を回答したシーンのうち、クラスター４に含まれるデータを含むシーンは、３３％であり、クラスター４に含まれるデータを含まないシーンは、６７％である。被験者が理解度として４を回答したシーンのうち、クラスター４に含まれるデータを含むシーンは、１４％であり、クラスター４に含まれるデータを含まないシーンは、８６％である。被験者が理解度として５を回答したシーンのうち、クラスター４に含まれるデータを含むシーンは、１７％であり、クラスター４に含まれるデータを含まないシーンは、８３％である。 As shown in FIG. 6, among the scenes for which the test subject answered 2 as the degree of understanding, 67% of the scenes included data included in cluster 4, and 33% of the scenes that did not include data included in cluster 4 It is. Of the scenes in which the test subject has answered 3 as an understanding level, 33% of the scenes include data included in the cluster 4, and 67% of the scenes do not include data included in the cluster 4. Of the scenes in which the test subject has answered 4 as the degree of understanding, 14% of the scenes include data included in cluster 4, and 86% do not include data included in cluster 4. Of the scenes in which the subject answered 5 as the degree of understanding, 17% of the scenes include data included in the cluster 4, and 83% do not include data included in the cluster 4.

ここで、被験者が回答した理解度が１または２であるシーンを「理解していない」と判定し、被験者が回答した理解度が３、４または５であるシーンを「理解している」と判定する。そして、クラスター４に含まれるデータを含むシーンを「理解していない」と判定し、クラスター４に含まれるデータを含まないシーンを「理解している」と判定することとする。被験者が回答した理解度に基づいて「理解していない」と判定されたシーンのうち、「嫌悪」の感情の強度に基づいて「理解していない」と判定されたシーン、すなわち、クラスター４に含まれるデータを含むシーンの割合は、６７％であった。被験者が回答した理解度に基づいて「理解している」と判定されたシーンのうち、「嫌悪」の感情の強度に基づいて「理解している」と判定されたシーン、すなわち、クラスター４に含まれるデータを含まないシーンの割合は、７９％であった。そして、被験者が回答した理解度に基づいた「理解していない」または「理解している」の判定と、「嫌悪」の感情の強度に基づいた「理解していない」または「理解している」の判定との一致率は、７５％であった。 Here, a scene for which the subject reported an understanding level of 1 or 2 is judged as “not understood”, and a scene for which the subject reported an understanding level of 3, 4, or 5 is judged as “understood”. Separately, a scene that contains data belonging to cluster 4 is judged as “not understood”, and a scene that contains no data belonging to cluster 4 is judged as “understood”. Of the scenes judged as “not understood” from the reported understanding level, 67% were also judged as “not understood” from the intensity of “disgust”, that is, contained data belonging to cluster 4. Of the scenes judged as “understood” from the reported understanding level, 79% were also judged as “understood” from the intensity of “disgust”, that is, contained no data belonging to cluster 4. Overall, the judgment based on the reported understanding level and the judgment based on the intensity of “disgust” agreed in 75% of the scenes.
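The three rates above follow from a two-by-two comparison of the self-reported judgment with the cluster-based judgment. The sketch below shows that arithmetic. The source reports only percentages, so the scene counts in `ILLUSTRATIVE_COUNTS` are assumptions chosen to be consistent with the percentages in FIG. 6; they are not the experimental counts, and the function name is hypothetical.

```python
def agreement_rates(scene_counts, not_understood_levels=(1, 2)):
    """scene_counts: reported level -> (scenes with cluster-4 data, scenes without)."""
    tp = fn = tn = fp = 0
    for level, (with_top, without_top) in scene_counts.items():
        if level in not_understood_levels:
            tp += with_top       # reported "not understood", cluster 4 present
            fn += without_top    # reported "not understood", cluster 4 absent
        else:
            fp += with_top       # reported "understood", cluster 4 present
            tn += without_top    # reported "understood", cluster 4 absent
    return {
        "not_understood": tp / (tp + fn),
        "understood": tn / (tn + fp),
        "overall": (tp + tn) / (tp + fn + tn + fp),
    }


# ASSUMED counts (not from the source), chosen to match the reported percentages.
ILLUSTRATIVE_COUNTS = {2: (8, 4), 3: (3, 6), 4: (1, 6), 5: (2, 10)}
rates = agreement_rates(ILLUSTRATIVE_COUNTS)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```

With these assumed counts the sketch yields 67%, 79%, and 75%, matching the figures in the text.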

次に、本実施の形態に係る理解度判定システムの構成について説明する。 Next, the configuration of the understanding level determination system according to the present embodiment will be described.

図７は、本実施の形態に係る理解度判定システム２０のブロック図である。 FIG. 7 is a block diagram of the understanding level determination system 20 according to the present embodiment.

図７に示すように、理解度判定システム２０は、コンピューター３０と、コンピューター３０に接続されたカメラ３５、マイク３６、モニター３７およびスピーカー３８と、コンピューター３０とは離れて配置されているコンピューター４０と、コンピューター４０に接続されたカメラ４５、マイク４６、モニター４７およびスピーカー４８とを備えている。コンピューター３０と、コンピューター４０とは、インターネットなどのネットワーク２１経由で互いに通信可能である。 As shown in FIG. 7, the understanding level determination system 20 includes a computer 30, a camera 35, a microphone 36, a monitor 37, and a speaker 38 connected to the computer 30, and a computer 40 that is arranged away from the computer 30. A camera 45, a microphone 46, a monitor 47, and a speaker 48 connected to the computer 40 are provided. The computer 30 and the computer 40 can communicate with each other via a network 21 such as the Internet.

理解度判定システム２０において、コンピューター３０は、カメラ３５によって取得した映像と、マイク３６によって取得した音声とをコンピューター４０に送信する。そして、コンピューター４０は、コンピューター３０から送信されてきた映像をモニター４７に表示するとともに、コンピューター３０から送信されてきた音声をスピーカー４８から出力する。同様に、コンピューター４０は、カメラ４５によって取得した映像と、マイク４６によって取得した音声とをコンピューター３０に送信する。そして、コンピューター３０は、コンピューター４０から送信されてきた映像をモニター３７に表示するとともに、コンピューター４０から送信されてきた音声をスピーカー３８から出力する。すなわち、理解度判定システム２０は、互いに離れた場所にいる人間同士の会議を可能にするリモート会議システムである。 In the understanding level determination system 20, the computer 30 transmits the video acquired by the camera 35 and the audio acquired by the microphone 36 to the computer 40. Then, the computer 40 displays the video transmitted from the computer 30 on the monitor 47 and outputs the audio transmitted from the computer 30 from the speaker 48. Similarly, the computer 40 transmits the video acquired by the camera 45 and the audio acquired by the microphone 46 to the computer 30. The computer 30 displays the video transmitted from the computer 40 on the monitor 37 and outputs the audio transmitted from the computer 40 from the speaker 38. That is, the comprehension level determination system 20 is a remote conference system that enables a conference between people who are separated from each other.

図８は、コンピューター３０のブロック図である。 FIG. 8 is a block diagram of the computer 30.

図８に示すように、コンピューター３０は、種々の操作が入力される例えばキーボード、マウスなどの入力デバイスである操作部３１と、ＬＡＮ（ＬｏｃａｌＡｒｅａＮｅｔｗｏｒｋ）、インターネットなどのネットワーク経由で、または、ネットワークを介さずに有線または無線によって直接に、外部の装置と通信を行う通信デバイスである通信部３２と、各種の情報を記憶する例えば半導体メモリー、ＨＤＤ（ＨａｒｄＤｉｓｋＤｒｉｖｅ）などの不揮発性の記憶デバイスである記憶部３３と、コンピューター３０全体を制御する制御部３４とを備えている。 As shown in FIG. 8, the computer 30 includes an operation unit 31 that is an input device such as a keyboard and a mouse through which various operations are input, and a network such as a LAN (Local Area Network) and the Internet, or a network. A communication unit 32 that is a communication device that directly communicates with an external device via a wired or wireless connection without using a network, and a nonvolatile storage device such as a semiconductor memory or an HDD (Hard Disk Drive) that stores various types of information And a control unit 34 that controls the entire computer 30.

記憶部３３は、聴者の理解度を判定するための理解度判定プログラム３３ａを記憶している。理解度判定プログラム３３ａは、例えば、コンピューター３０のコンピューター３０にインストールされていても良いし、ＵＳＢ（ＵｎｉｖｅｒｓａｌＳｅｒｉａｌＢｕｓ）メモリー、ＣＤ（ＣｏｍｐａｃｔＤｉｓｋ）、ＤＶＤ（ＤｉｇｉｔａｌＶｅｒｓａｔｉｌｅＤｉｓｋ）などの外部の記憶媒体からコンピューター３０に追加でインストールされても良いし、ネットワーク上からコンピューター３０に追加でインストールされても良い。 The storage unit 33 stores an understanding level determination program 33a for determining the level of understanding of the listener. The understanding level determination program 33a may be installed in the computer 30 of the computer 30, for example, or from an external storage medium such as a USB (Universal Serial Bus) memory, a CD (Compact Disk), a DVD (Digital Versatile Disk), etc. It may be additionally installed on the computer 30 or may be additionally installed on the computer 30 from the network.

制御部３４は、例えば、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）と、プログラムおよび各種のデータを記憶しているＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）と、ＣＰＵの作業領域として用いられるＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）とを備えている。ＣＰＵは、ＲＯＭまたは記憶部３３に記憶されているプログラムを実行する。 The control unit 34 includes, for example, a CPU (Central Processing Unit), a ROM (Read Only Memory) storing programs and various data, and a RAM (Random Access Memory) used as a work area of the CPU. Yes. The CPU executes a program stored in the ROM or the storage unit 33.

制御部３４は、理解度判定プログラム３３ａを実行することによって、聴者の顔の表情に基づいて聴者の感情を解析する感情解析手段３４ａと、感情解析手段３４ａによって解析された感情に基づいて聴者の理解度を解析する理解度解析手段３４ｂとを実現する。 The control unit 34 executes the understanding level determination program 33a to analyze the emotion of the listener based on the facial expression of the listener, and the emotion analysis unit 34a based on the emotion analyzed by the emotion analysis unit 34a. An understanding level analysis means 34b for analyzing the understanding level is realized.

コンピューター４０の構成は、コンピューター３０の構成と同様である。 The configuration of the computer 40 is the same as the configuration of the computer 30.

次に、カメラ４５によって撮影される聴者の理解度をモニター３７に表示する場合のコンピューター３０の動作について説明する。 Next, the operation of the computer 30 when the understanding level of the listener photographed by the camera 45 is displayed on the monitor 37 will be described.

図９は、カメラ４５によって撮影される聴者の理解度をモニター３７に表示する場合のコンピューター３０の動作のフローチャートである。 FIG. 9 is a flowchart of the operation of the computer 30 when the understanding level of the listener photographed by the camera 45 is displayed on the monitor 37.

コンピューター３０は、コンピューター４０から送信されてきた映像の動画ファイル（以下「対象動画ファイル」という。）を受信すると、図９に示す動作を実行する。 When the computer 30 receives the moving image file of the video transmitted from the computer 40 (hereinafter referred to as “target moving image file”), the computer 30 executes the operation shown in FIG.

図９に示すように、感情解析手段３４ａは、対象動画ファイルをＦＡＣＳ（ＦａｃｉａｌＡｃｔｉｏｎＣｏｄｉｎｇＳｙｓｔｅｍ）などの表情解析ソフトウェアを用いて解析することによって、聴者の感情を取得する（Ｓ５１）。ここで、表情解析ソフトウェアは、顔の特徴点を用いて感情を解析するソフトウェアであれば良く、例えばａｆｆｅｃｔｉｖａ社製のａｆｆｄｅｘが採用されても良い。 As shown in FIG. 9, the emotion analysis means 34a acquires the emotion of the listener by analyzing the target moving image file using facial expression analysis software such as FACS (Facial Action Coding System) (S51). Here, the facial expression analysis software may be software that analyzes emotions using facial feature points. For example, affdex manufactured by Affectiva may be employed.

次いで、感情解析手段３４ａは、Ｓ５１において取得した感情のうち、「嫌悪」の感情の強度を時間毎に示す時系列データを生成する（Ｓ５２）。すなわち、感情解析手段３４ａは、図４に示す時系列データと同様な時系列データを生成する。 Next, the emotion analysis means 34a generates time-series data showing the intensity of the emotion of “disgust” over time from the emotions acquired in S51 (S52). That is, the emotion analysis means 34a generates time-series data similar to the time-series data shown in FIG. 4.
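Step S52 can be pictured as projecting per-frame emotion scores onto a single “disgust” series. In this sketch the per-frame dictionary format, the key name `"disgust"`, and the function name are assumptions made for illustration; they do not describe the output format of Affdex or any other real library.

```python
def disgust_time_series(frames, fps=30.0):
    """Return (time in seconds, disgust intensity) pairs, one per video frame.

    frames: list of dicts mapping an emotion name to an intensity from 0 to 100
    (a hypothetical format, not a real SDK's output).
    """
    return [(index / fps, frame["disgust"]) for index, frame in enumerate(frames)]


# Two hypothetical frames sampled at 30 frames per second.
frames = [
    {"disgust": 1.0, "joy": 50.0},
    {"disgust": 2.5, "joy": 0.0},
]
series = disgust_time_series(frames)
```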

次いで、理解度解析手段３４ｂは、Ｓ５２において生成された時系列データを、時間における特定の区切りによって分ける（Ｓ５３）。ここで、特定の区切りは、例えば、話者が話者自身の発表資料において区切りを指定している場合には、発表資料において話者によって指定されている区切りが採用されても良いし、３０秒毎などの特定の時間が採用されても良い。 Next, the understanding level analysis means 34b divides the time-series data generated in S52 at specific breaks in time (S53). For example, when the speaker has designated breaks in his or her own presentation material, those breaks may be used as the specific breaks; alternatively, a specific time interval, such as every 30 seconds, may be used.
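The division in S53 can be sketched as follows, covering both speaker-designated breaks and a fixed interval. The function names and the `(time, intensity)` sample format are assumptions for illustration.

```python
from bisect import bisect_right


def fixed_boundaries(duration_sec, step_sec=30.0):
    """Break times at every step_sec, strictly inside the recording."""
    count = int(duration_sec // step_sec)
    return [step_sec * i for i in range(1, count + 1) if step_sec * i < duration_sec]


def split_by_boundaries(samples, boundaries):
    """Split (time_sec, intensity) samples into len(boundaries) + 1 sections.

    boundaries must be sorted; a sample exactly on a break starts the next section.
    """
    sections = [[] for _ in range(len(boundaries) + 1)]
    for time_sec, intensity in samples:
        sections[bisect_right(boundaries, time_sec)].append((time_sec, intensity))
    return sections


# Hypothetical samples from a 90-second recording, split every 30 seconds.
samples = [(0.0, 1.0), (29.9, 2.0), (30.0, 3.0), (75.0, 4.0)]
sections = split_by_boundaries(samples, fixed_boundaries(90.0))
```

Breaks designated in presentation material would simply be passed as the `boundaries` list in place of the output of `fixed_boundaries`.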

理解度解析手段３４ｂは、Ｓ５３の処理の後、聴者が発話中である場合や、聴者が聴者自身の顔を触る動作中である場合など、聴者の顔の表情から聴者の感情の強度を表情解析ソフトウェアによって正常に取得することができていない場合のデータを、Ｓ５３において特定の区切りによって分けた時系列データから除去する（Ｓ５４）。ここで、理解度解析手段３４ｂは、聴者が発話中であることや、聴者が聴者自身の顔を触る動作中であることを、対象動画ファイルに対する画像処理によって認識することが可能である。 After the processing of S53, the understanding level analysis means 34b removes, from the time-series data divided at the specific breaks in S53, data for which the facial expression analysis software could not properly obtain the listener's emotion intensity from the facial expression, for example while the listener is speaking or touching his or her own face (S54). The understanding level analysis means 34b can recognize that the listener is speaking or touching his or her own face by image processing of the target video file.
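The removal in S54 amounts to filtering each section by per-frame reliability flags. In the sketch below, the `Frame` record and its `speaking` and `touching_face` flags are hypothetical; the text says only that these states can be recognized by image processing and does not specify how they are represented.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    time_sec: float
    disgust: float
    speaking: bool = False        # assumed flag produced by an image-processing step
    touching_face: bool = False   # assumed flag produced by an image-processing step


def drop_unreliable(sections):
    """Remove frames captured while the listener was speaking or touching the face."""
    return [
        [frame for frame in section if not (frame.speaking or frame.touching_face)]
        for section in sections
    ]


# Hypothetical sections: one unreliable frame in each.
sections = [
    [Frame(0.0, 5.0), Frame(0.033, 80.0, speaking=True)],
    [Frame(30.0, 7.0, touching_face=True)],
]
cleaned = drop_unreliable(sections)
```

The section structure is preserved, so a section can end up empty when all of its frames are removed.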

After the processing in S54, the understanding level analysis means 34b clusters the time-series data produced in S54 in the same manner as the example shown in FIG. 5 (S55). The understanding level analysis means 34b performs the clustering using, for example, the k-means method. The number of clusters may be determined using, for example, the elbow method, or may be a specific number, such as a number specified by the user.
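The clustering in S55 could look like the sketch below, which runs Lloyd's k-means algorithm on scalar intensities and computes the within-cluster sum of squares that the elbow method inspects. The function names, the deterministic initialization from the sorted values, and the sample data are assumptions for illustration; the source names only the k-means method and the elbow method, not an implementation.

```python
from typing import List, Tuple


def kmeans_1d(values: List[float], k: int, iterations: int = 100) -> Tuple[List[float], List[int]]:
    """Lloyd's algorithm on scalars. Returns (centroids, cluster label of each value)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")

    def assign(centroids: List[float]) -> List[int]:
        # Label each value with the index of its nearest centroid.
        return [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]

    ordered = sorted(values)
    # Assumption: deterministic start, centroids spread evenly over the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        labels = assign(centroids)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(centroids)


def inertia(values: List[float], centroids: List[float], labels: List[int]) -> float:
    """Within-cluster sum of squared distances, the quantity plotted for the elbow method."""
    return sum((v - centroids[lab]) ** 2 for v, lab in zip(values, labels))


values = [1.0, 2.0, 1.5, 40.0, 42.0, 2.5, 41.0, 1.0]  # made-up disgust intensities
centroids, labels = kmeans_1d(values, k=2)
curve = [inertia(values, *kmeans_1d(values, k)) for k in (1, 2, 3)]
print(labels, curve)
```

With the elbow method, the number of clusters is chosen where the inertia curve stops dropping sharply; here the drop from one cluster to two is much larger than the drop from two to three.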

After the processing in S55, the understanding level analysis means 34b determines, for each of the sections divided in S53, whether the listener "does not understand" or "understands" (S56). Specifically, among the clusters obtained in S55, the understanding level analysis means 34b identifies the cluster whose values have the highest average. It determines each section from S53 that contains data belonging to that cluster to be a "section not understood", and each section that contains no data belonging to that cluster to be a "section understood".
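The decision rule of S56 can be sketched as follows, assuming each section is held as a list of intensities with a parallel list of cluster labels. The function name `classify_sections`, the data layout, and the output strings are illustrative assumptions. Under a literal reading of the rule, a section left with no data is reported as "understood" because it contains no data from the highest-average cluster.

```python
from statistics import mean
from typing import Dict, List


def classify_sections(sections: List[List[float]], labels: List[List[int]]) -> List[str]:
    """Mark a section 'not understood' if it holds any sample of the highest-mean cluster."""
    # Gather the intensities belonging to each cluster across all sections.
    by_cluster: Dict[int, List[float]] = {}
    for values, labs in zip(sections, labels):
        for value, lab in zip(values, labs):
            by_cluster.setdefault(lab, []).append(value)
    if not by_cluster:
        # No usable data at all: no section can contain the top cluster.
        return ["understood"] * len(sections)
    top = max(by_cluster, key=lambda c: mean(by_cluster[c]))
    return ["not understood" if top in labs else "understood" for labs in labels]


sections = [[1.0, 2.0], [1.5, 40.0, 42.0], [2.5, 1.0]]  # made-up intensities per section
labels = [[0, 0], [0, 1, 1], [0, 0]]                     # cluster label of each sample
result = classify_sections(sections, labels)
print(result)
```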

After the processing in S56, the understanding level analysis means 34b displays the determination results of S56 on the monitor 37 (S57). For example, for a section determined in S56 to be "not understood", it indicates the lack of understanding with an emoji or similar symbol expressing sadness, and for a section determined in S56 to be "understood", it indicates understanding with an emoji or similar symbol expressing calm.

After the processing in S57, the understanding level analysis means 34b ends the operation shown in FIG. 9.

The above describes the operation of the computer 30 when the understanding level of the listener photographed by the camera 45 is displayed on the monitor 37. The same applies to the operation of the computer 40 when the understanding level of the listener photographed by the camera 35 is displayed on the monitor 47.

As described above, the understanding level determination system 20 determines the listener's level of understanding (S56) based on the "disgust" emotion, which correlates with the listener's level of understanding, among the emotions analyzed in S51 from the listener's facial expression. The system can therefore determine the listener's level of understanding with high accuracy based on the listener's facial expression.

In a remote conference, that is, a meeting between people at locations remote from one another, less information is available visually than in a face-to-face meeting, that is, a meeting between people at the same location. Nonverbal communication such as the listener's facial expressions and gestures is therefore hard to convey to the speaker, and it is difficult for the speaker to judge the listener's level of understanding accurately. The understanding level determination system 20, however, determines the listener's level of understanding based on the listener's facial expression photographed by the camera 45, so it can determine the listener's level of understanding with high accuracy even though it is a remote conference system. Consequently, when the listener does not understand, the understanding level determination system 20 allows the speaker to lower the level of the content or explain it again, which improves the accuracy of information transmission.

The above description concerns a remote conference system. However, the understanding level determination system need not be a remote conference system. For example, it may be the system shown in FIG. 10.

FIG. 10 is a block diagram of an understanding level determination system 60 whose configuration differs from that of the understanding level determination system 20 shown in FIG. 7.

As shown in FIG. 10, the understanding level determination system 60 includes a computer 30, and a camera 35 and a monitor 37 connected to the computer 30. The camera 35 is placed where it can photograph the listener. The monitor 37 is placed where its screen is visible to the speaker.

20 Understanding level determination system (remote conference system)
30 Computer
33a Understanding level determination program
34a Emotion analysis means
34b Understanding level analysis means
35 Camera
37 Monitor
40 Computer
45 Camera
47 Monitor
60 Understanding level determination system

Claims

An understanding level determination system comprising:
emotion analysis means for analyzing the emotion of a listener based on the facial expression of the listener photographed by a camera; and
understanding level analysis means for analyzing the level of understanding of the listener based on the emotion analyzed by the emotion analysis means,
wherein the understanding level analysis means determines the level of understanding of the listener based on the emotion of "disgust" among the emotions analyzed by the emotion analysis means.

The understanding level determination system according to claim 1, wherein the understanding level analysis means displays the determined level of understanding on a monitor, and
the understanding level determination system is a remote conference system in which the camera and the monitor are installed at locations separated from each other.

An understanding level determination program causing a computer to implement:
emotion analysis means for analyzing the emotion of a listener based on the facial expression of the listener photographed by a camera; and
understanding level analysis means for analyzing the level of understanding of the listener based on the emotion analyzed by the emotion analysis means,
wherein the understanding level analysis means determines the level of understanding of the listener based on the emotion of "disgust" among the emotions analyzed by the emotion analysis means.