JP2017201479A

JP2017201479A - Communication supporting system

Info

Publication number: JP2017201479A
Application number: JP2016093239A
Authority: JP
Inventors: Shigeo Yamada (山田 繁夫); Kazuhiro Nakahara (中原 和洋); Sadaharu Toki (戸木 貞晴); Kazuyuki Fujita (藤田 和之); Takeshi Shiratori (白鳥 毅); Masayuki Nishikawa (西川 政行); Yutaka Ogasawara (小笠原 豊); Kazuhiro Ohashi (大橋 一広)
Original assignee: Itoki Corp; Nihon Unisys Ltd
Current assignee: Itoki Corp; Nihon Unisys Ltd
Priority date: 2016-05-06
Filing date: 2016-05-06
Publication date: 2017-11-09
Anticipated expiration: 2036-05-06
Also published as: JP6730843B2

Abstract

PROBLEM TO BE SOLVED: To provide a communication support system that can appropriately present information for supporting a communication situation, such as a meeting, or select an instruction for changing the communication environment, without rules having to be set in advance. SOLUTION: The speech and attitudes of participants, acquired by sensors at prescribed time intervals, are analyzed so that the situation of the communication space is tracked objectively as codes or numerical values of prescribed indicators, which allows changes in the situation to be quantified as a time series. The system uses an algorithm based on reinforcement learning in which the participants' actual reactions after a communication support instruction selected by the system are reflected again in the indicator values and fed back as a reward to the immediately preceding communication support instruction. The accuracy with which the system selects communication support instructions can therefore be improved without preparing learning rules in advance, which would require many man-hours. SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for improving the output of communication by having artificial intelligence act as if it were one of the participants in a communication setting, providing various kinds of information and issuing instructions that change the communication space. In particular, it relates to a technique that encodes or quantifies the situation during communication with a plurality of indicators by analyzing speech, images, and the like, and improves the accuracy of the provided information by reflecting those indicators in the learning process of the artificial intelligence.

In business, holding a meeting is the most typical way for a plurality of participants to gather and communicate, and meetings are held routinely every day. However, it is also commonplace for a meeting to fail to reach the intended conclusion by its scheduled end time. Although this depends in part on the complexity of the subject, it is largely because discussion among multiple participants sometimes diverges, sometimes converges, and sometimes falls silent.

In addition, not all participants have the same level of knowledge about the subject of the meeting, and some join partway through. When participants speak without fully understanding the background of the meeting, the purpose of the discussion, and the matters decided so far, their remarks may not fit the direction of the discussion, or settled issues may be reopened. The conclusion that was originally expected can then no longer be reached on schedule, so information must be provided before and during the meeting so that participants share the same knowledge.

Idea support systems that support the ideas of multiple participants during a discussion and stimulate the discussion have therefore been proposed (for example, Patent Document 1 below). Such a system presents participants with keywords that appeared during the meeting and with words that support idea generation. There is also a system that learns under what circumstances a change in the meeting state was recognized or a state-changing action was executed (for example, Patent Document 2 below).

Patent Document 1: Japanese Patent No. 5123591
Patent Document 2: JP 2014-85916 A

It is known that devices and systems such as those described above, which support participants' ideas during a meeting, fall short in various ways when used in actual meetings.
For example, in Patent Document 1, speech uttered during a meeting is analyzed, and the rate at which words appear in proximity to one another is calculated as their degree of relevance. Sets of related words (context-forming word sets) are formed on that basis, and keywords and idea support words are extracted. Meeting participants who see these keywords and idea support words are said to be able to understand and share the content of the meeting, to broaden their thinking beyond what the participants alone could achieve, and thereby to stimulate the meeting.

However, it has not been verified whether the idea support words provided by the idea support system actually broadened the participants' thinking, or whether the meeting achieved its intended outcome. Known idea support systems such as that of Patent Document 1 merely provide participants, one-sidedly, with information that the system has analyzed and judged likely to be useful, and claim that this should stimulate the meeting. To determine whether using an idea support system is truly beneficial, it is important to quantify and evaluate how the remarks and attitudes of participants changed in the meeting after they received the information, and these actual changes must be reflected in a timely manner in the process by which the system decides what information to provide.

The system described in Patent Document 2 accepts from participants a meeting state (for example, lively or deadlocked) and an instruction intended to improve the meeting state (for example, displaying a message suggesting a break, or displaying Internet search results for frequently occurring keywords). Participants thus indicate in what situations they want the meeting state to change, and the system accumulates the association between those situations and the actual actions as learning rules. The document states that, when a given meeting state is identified, appropriate information can be presented on the basis of those learning rules.

However, in the system of Patent Document 2, the meeting state is something that participants perceive and enter into the system, so the input does not necessarily represent the actual meeting state. For example, a participant may judge the meeting to be lively because remarks continue without interruption and enter that into the system, when in reality the discussion is going around in circles and is deadlocked; the input then does not reflect the true state of the meeting. Moreover, participants will not always enter the meeting state in a timely manner, and being conscious of the need to do so may distract them from the meeting. Furthermore, the information that the system accumulates as learning rules is simplistic, such as merely displaying a message suggesting a break or displaying frequently occurring words when the meeting is deadlocked, and falls short of information that would usefully guide the meeting toward the expected result.

Accordingly, to solve the above problems, an object of the present invention is to provide a communication support system that can accurately present information for supporting a communication situation such as a meeting, and accurately select instructions for changing the communication environment, without rules having to be constructed in advance.

To achieve the above object, a system according to the present invention for supporting communication among a plurality of participants comprises: at least one situation data acquisition means that captures the situation of a communication space by means of at least one of audio data and image data; and a server that controls any output medium in the communication space using the sensed data obtained by the situation data acquisition means. The server is configured to repeat the following at predetermined time intervals: quantify or encode the sensed data to generate the situation of the communication space as a state variable at time t; select, to support communication in the situation of the communication space at time t, at least one support item from among a plurality of communication support items for that state variable on the basis of the expected value of each support item; output to the output medium a command to execute the selected support item; and then update the expected value of the support item selected at time t using the expected values of the communication support items for the state variable at time t+1, which is generated from sensed data acquired by the situation data acquisition means.

The communication support system according to the present invention is further characterized in that the expected values of the support items are updated on the basis of a reinforcement learning algorithm, and in that the state variable is either a set of indicators, each corresponding to sensed data obtained by the at least one situation data acquisition means, or a numerical value calculated from those indicators.

Further, the communication support system according to the present invention is characterized in that the plurality of indicators includes at least one of: diversity, which indicates whether the content of the communication covers multiple viewpoints; uniformity, which indicates the degree to which each participant is involved in the communication; activity, which indicates whether the communication alternates among multiple participants; agreement, which indicates the degree of agreement with other participants' utterances or attitudes; and ideation, which indicates whether the communication uses words from different categories. The server selects the output medium according to the context of the communication and outputs, to the selected output medium, a command to execute the selected support item. The output medium includes at least one of shared objects, including one or more wall surfaces and a table in the communication space, and personal information terminals, including each participant's mobile terminal and eyeglass-type display device.
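As an illustration only, two of the indicators listed above can be computed from a sequence of speaker turns. The sketch below assumes that uniformity is the normalized entropy of speaking turns across participants and that activity is the fraction of consecutive turns in which the speaker changes. The text gives no formulas, so both definitions and the function names `uniformity` and `activity` are assumptions.

```python
import math
from collections import Counter


def uniformity(turns, participants):
    """Assumed definition: normalized entropy of speaking turns.

    Returns 1.0 when every participant speaks equally often and 0.0 when
    a single participant does all the talking.
    """
    total = len(turns)
    if total == 0 or len(participants) < 2:
        return 0.0
    counts = Counter(turns)
    # Share of turns for each participant who spoke at least once.
    shares = [counts[p] / total for p in participants if counts[p] > 0]
    entropy = sum(-p * math.log(p) for p in shares)
    return entropy / math.log(len(participants))


def activity(turns):
    """Assumed definition: fraction of consecutive turns with a speaker change."""
    if len(turns) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(turns, turns[1:]) if prev != cur)
    return changes / (len(turns) - 1)
```

With the turns A, B, A, B between two participants, both functions return 1.0; a monologue returns 0.0 for both.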

Further, to achieve the above object, a table according to the present invention is a table used to support communication among a plurality of participants. A server, to which sensed data relating to audio or images acquired using at least one situation data acquisition means is transmitted, repeats the following at predetermined time intervals: quantify or encode the sensed data to generate the situation of the communication space as a state variable at time t; select, to support communication in the situation of the communication space at time t, at least one support item from among a plurality of communication support items for that state variable on the basis of the expected value of each support item; and, after the selected support item has been executed, update the expected value of the support item selected at time t using the expected values of the communication support items for the state variable at time t+1, which is generated from sensed data acquired by the situation data acquisition means. In the course of this repetition, the table displays information that the server has extracted according to the selected support item.

According to the present invention, the actual utterances and attitudes of those taking part in a communication setting such as a meeting are analyzed at predetermined time intervals, and the situation of the communication space is expressed as codes or numerical values of predetermined indicators, so changes in the situation can be quantified as a time series. In addition, the participants' actual reactions after a communication support item selected by the system has been issued are reflected again in the indicator values, and a reward based on the updated indicators is fed back to the immediately preceding communication support instruction to update the selection probabilities of the communication support items. This is equivalent to learning being carried out automatically on the basis of an evaluation of whether the system selected communication support appropriate to the situation of the communication space. The accuracy with which the system selects communication support instructions can therefore be improved without implementing learning rules in the system in advance.

In particular, the selection of communication support items such as information provision in the present invention is determined by computation based on a reinforcement learning model, and the appropriateness of selecting each communication support item is updated through a recurrence relation each time a selection is made. The system can therefore learn and decide its subsequent selection strategy by itself, without instructions from participants. In other words, selection in the present invention is not based on a learning method that follows rules built into the system in advance (so-called supervised learning). Compared with model-based methods that cannot select communication support items such as information provision unless learning rules are constructed beforehand, the considerable man-hours required to construct learning rules can be eliminated.

Furthermore, the output destination of a selected communication support item is chosen from among a plurality of output media according to the situation of the communication space, the content of the information to be provided, or the attributes of the participants, so that information to be shared by multiple participants is distinguished from information needed by an individual participant. Truly useful information, and commands that change the communication space, are delivered to the output medium appropriate for each participant without impeding the progress of communication, and participants who notice them can use them to stimulate communication.

A schematic diagram showing the spatial environment of a meeting support system according to one embodiment of the present invention.
A flowchart showing the overall procedure of the processing executed by the meeting support system according to one embodiment of the present invention.
A process transition diagram showing the flow of the main processes.
A flowchart showing an example of the procedure for acquiring meeting situation data.
A detailed flowchart of the meeting situation determination processing.
A detailed flowchart of the reward amount determination processing.
A detailed flowchart of the recommendation decision processing.
A diagram showing an example of the value (expected gain) of each action for a tuple of state variables.
A diagram showing an example of the value (expected gain) of each action for a tuple of state variables.
A diagram showing an example of the value (expected gain) of each action for a tuple of state variables.
A diagram showing an example of the value (expected gain) of each action for a tuple of state variables.
A flowchart showing an example of making a personal recommendation based on the specificity of a word.
A detailed flowchart of the recommendation presentation processing.

An embodiment of the communication support system according to the present invention is described below with reference to the drawings. This embodiment is a meeting support system in which the communication support system of the present invention is applied in a meeting room of a company or the like. The devices used in the meeting support system are described first.

As shown in FIG. 1, a table 13 is placed in the meeting room, and the participants sit on chairs 5 facing one another. Each participant is assigned a mobile terminal 2, which is connected to the server 1 via a wireless or wired network (not shown). The mobile terminal 2 can send various instructions to the server 1 over the network in response to operation input from the participant. On the basis of audio and image information obtained from the meeting room, the server 1 extracts or outputs information useful to the participants, that is, information for stimulating the discussion and commands that change the environment of the meeting room. The details of this processing are described later.
The mobile terminal 2 is not particularly limited; a general-purpose computer such as a desktop or laptop personal computer or a workstation, a tablet terminal, a smartphone, a PDA (Personal Digital Assistant) terminal, or the like can be used. Each participant can view the information displayed on the screen of the mobile terminal 2.

Each participant uses one or more microphones 4 built into, or connected to, the mobile terminal 2. The microphone 4 picks up the speech of each participant and sends it to the server 1. In this embodiment, a microphone 4 is assigned to each participant so that the speaker can be identified easily.

Each participant may also wear glasses 6 with an information display function, a type of wearable terminal. For communication with the server 1, the glasses 6 have a glasses identifier that corresponds to the client identifier identifying the mobile terminal 2 used by that participant. By sending information, as needed, to the glasses 6 worn by a particular participant, the server 1 allows that participant to see the information displayed on the glasses 6 while operating or viewing his or her own mobile terminal 2. Other wearable terminals that can take the place of the glasses include smartwatches, wristbands, and rings, and these may also be used in addition to the glasses. Furthermore, a projector 7 or similar device communicably connected to the server 1 may be installed in the meeting room and used, for example, as a common input/output screen that all participants view together.

A camera 8 for capturing the participants' facial expressions and movements is installed at any suitable location in the meeting room (for example, on the ceiling). From the images captured by the camera 8, the system detects seated posture (for example, leaning forward or backward, resting on an elbow, or crossing the legs) and nodding, and infers emotions such as joy, anger, sorrow, and pleasure by analyzing facial expressions. Any image analysis technique can be used for this detection and inference. Alternatively, each participant may be imaged individually by a camera 3 provided in the mobile terminal assigned to that participant.

The server 1, which is connected to each device over the network, is responsible for overall control of the meeting support system. The server 1 sends information to various output media that the participants can perceive. Typical examples of output media are the screen of the mobile terminal 2 assigned to each participant and a display device or projector 7 that the participants view in common in the meeting room. The meeting support system of the present invention is configured to use not only these existing forms of output but also the table 13 and the wall surfaces 9 and 10 in the meeting room as displays. Moreover, as described in detail later, the information displayed on the table 13 and on the wall surfaces 9 and 10 is given a distinct role in each case. For example, participants look at the information displayed on the table 13 to check keywords and phrases that are important to the discussion, and look at the information displayed on the wall surface 9 to broaden their thinking. The displayed information is not limited to text; it may be, for example, a still or moving image, with or without sound.

FIG. 2 is a flowchart outlining the overall processing executed by the meeting support system. FIG. 3 shows the same processing as a process transition diagram.
After the meeting starts, the situation in the meeting room is captured by a plurality of situation data acquisition means (that is, sensors) such as the microphones 4 and the cameras 3 and 8 described above (step S201 in FIG. 2). In the process transition diagram of FIG. 3, this corresponds to the "sensing process 31".

The "sensing process 31" is followed by a transition to the "communication state perception process 32". In this process, the audio data and image data obtained from the microphones 4 and the cameras 3 and 8 are analyzed so that the current state of the meeting is made visible as numerical values (step S202 in FIG. 2). Because sensed data representing the communication state in the meeting room is acquired through the various sensors, feature quantities of the communication space can be extracted. A single indicator, for example a high activity level because participants are actively debating, is not sufficient as a feature quantity representing the actual state of a meeting. The meeting state is therefore judged using a plurality of indicators. The content of these indicators, and how the communication state is identified from them, are described later.
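One way to picture a state built from several indicators is as a tuple of discrete levels, which the figure descriptions also refer to as a tuple of state variables. The following is a minimal sketch under assumptions: each indicator is taken to be already scaled to the range 0 to 1, three levels are used, and the dictionary keys and the function name `encode_state` are hypothetical.

```python
# Hypothetical key names for the five indicators mentioned in the text.
INDICATOR_ORDER = ("diversity", "uniformity", "activity", "agreement", "ideation")


def encode_state(indicators, levels=3):
    """Map each indicator in [0, 1] to a discrete level and return a tuple.

    The number of levels and the equal-width binning are assumptions made
    for illustration; the text does not specify how values are encoded.
    """
    state = []
    for name in INDICATOR_ORDER:
        value = min(max(indicators[name], 0.0), 1.0)  # clamp to [0, 1]
        state.append(min(int(value * levels), levels - 1))
    return tuple(state)
```

A tuple is hashable, so it can serve directly as a key in a table of action values.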

In the first pass after the meeting starts, the reward processing at step S203 in FIG. 2 is skipped (this corresponds to the case where reward processing is "No"). The reward is given for the result of an action by the server 1, such as presenting a keyword, that caused a change in the communication state; in the first pass, the server 1 has not yet taken any action. The process therefore transitions from the "communication state perception process 32" to the "action selection process 34". When the "action selection process 34" is executed, the value (expected gain) Q of each action a, calculated in the "expected gain learning process 35", is used as a parameter. The "expected gain learning process 35" also uses the reward amount to calculate the value Q, if a reward amount has been obtained. The "communication support items" recited in the claims correspond to actions.

The "action selection process 34" corresponds to the recommendation decision processing shown in step S205 of FIG. 2. In the "action selection process 34", the server 1 determines, on the basis of a predetermined learning algorithm, which action is appropriate to take; that is, it selects from a plurality of options an action that can trigger a change in the latest meeting state as expressed by the plurality of indicators. Specifically, depending on the meeting state, the server extracts as recommendation information something useful for making the discussion diverge or, conversely, converge, or outputs a command for changing the environment of the meeting room. This is why the processing is called recommendation decision processing.
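The selection rule itself is not specified here. As one possible illustration, the sketch below assumes softmax (Boltzmann) selection, in which an action is drawn with probability proportional to the exponential of its value Q. The function name `select_action`, the `temperature` parameter, and the example action names are all assumptions.

```python
import math
import random


def select_action(q_values, temperature=1.0, rng=random):
    """Draw one action with probability proportional to exp(Q / temperature).

    q_values maps each candidate action to its current value Q for the
    present state. A low temperature approaches greedy selection; a high
    temperature approaches uniform random selection.
    """
    actions = list(q_values)
    top = max(q_values.values())
    # Subtracting the maximum keeps the exponentials numerically stable.
    weights = [math.exp((q_values[a] - top) / temperature) for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


# Hypothetical action names, used only for this example.
choice = select_action({"show_keyword": 0.8, "suggest_break": 0.2})
```

Sampling rather than always taking the highest-valued action leaves room for the system to try actions whose value is still uncertain early in a meeting.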

The selection of an action by the server 1, that is, the decision on a recommendation, is based on a reinforcement learning algorithm known in the field of artificial intelligence. Learning in this embodiment means updating the value of each action a on the basis of the value of each action a available to the server 1 in the current meeting state (state s), the value of each action in the meeting state that follows execution of the selected action a, the reward for the change in the meeting state, and so on, thereby improving how well the actions selected by the server 1 and presented to the participants fit the situation. Recommendation information is broadly divided into information directed at individual participants and information directed at all participants.
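The text names reinforcement learning but not a specific update rule. One common rule that fits the description, shown here purely as an assumption, is the Q-learning update, which moves the value of the chosen action toward the reward plus the discounted best value available in the next state. The learning rate `alpha`, the discount factor `gamma`, and the function name `update_value` are hypothetical.

```python
from collections import defaultdict


def update_value(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning step (an assumed choice of algorithm).

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    where s is the state at time t and s' is the state at time t+1.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Value table keyed by (state, action); unseen entries start at 0.0.
q_table = defaultdict(float)
```

Starting every value at zero means no rules need to be supplied in advance; the table is filled in only from what happens during meetings.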

Further, the server 1 selects an output medium suitable for presenting the determined recommendation information and outputs an instruction to display it there as appropriate (step S206 in FIG. 2). Media for outputting information within the conference room include ordinary hardware such as a display screen, but the walls, floor, ceiling, and furniture of the conference room can also serve as output media in the same way as a display screen.

The processing described above is repeated until a conference end instruction is input. That is, the process returns to step S201 and acquires audio data and image data from the microphone 4 and the cameras 3 and 8. When the loop is already under way (that is, when recommendation information has already been presented), the reward process of step S204 is executed. The reward is returned to the server 1 for its action of selecting specific recommendation information as useful: a positive reward if the communication state improved as a result, and a negative reward if the communication state did not change or became worse. Consequently, recommendations provided shortly after the conference starts may not be useful, because rewards have not yet been sufficiently reflected. Through iterative learning, in which the result of each selection by the server 1 influences the next action choice in the form of a reward, the action selection becomes more accurate (that is, judged more appropriate) as time passes.

Thereafter, when a conference participant (for example, the moderator) inputs an instruction to end the conference and the server 1 receives it, the utterances made during the conference, the presented recommendation information, and the like are stored in a form that allows them to be traced to the relevant client identifiers and output media (step S209). Needless to say, the decisions reached in the discussion may be stored as well.

After the conference ends, the server 1 may also accept evaluations from the conference participants of the actions a it selected and use them as additional rewards in the recommendation determination process of step S205. Questions such as whether the information recommended by the server 1 during the conference, or the action commands for changing the environment of the conference room serving as the communication space, were useful are presented on any output medium, and the participants enter their answers by touch operation or the like (step S208).
In the present embodiment, step S208 is executed after the conference ends, but it may instead be executed immediately after the recommendation is presented in step S206. In that case, the participants' evaluation is fed back every time the server 1 makes a recommendation, which speeds up the convergence of reinforcement learning by the server 1. Incorporating participant evaluations into the reinforcement learning process of the server 1 thus helps to further shorten the time needed to reach appropriate action selection.

The steps of the outline flow of the present invention shown in FIG. 2 are described in detail below.
FIG. 4 is a detailed flowchart of the conference status data acquisition process shown in FIG. 2. As preparation before sensing with the microphone 4 and the cameras 3 and 8 actually begins, a conference participant (for example, the facilitator) inputs a conference ID that identifies which conference the status data belongs to (step S401). The mobile terminal 2, microphone 4, glasses 6 with an information display function, and other devices used by each participant are associated with that participant's client identifier (step S402). Furthermore, because the server 1 presents recommendation information on various output media, participants would be confused unless they knew in advance which information is presented on which output medium. The association between each output medium and the content presented on it is therefore announced to the participants and also set in advance on the server 1 (steps S403 and S404).

When the conference starts, data acquired by the microphone 4 and the cameras 3 and 8 in the conference room is transmitted to the server 1 over a wireless or wired network at predetermined time intervals (step S406). The server 1 records the transmitted data in memory as sensing data (step S407). Although recording is described as being performed at predetermined intervals, the microphone 4 and the cameras 3 and 8 may in practice acquire data continuously, with the data acquired during the interval (t-1, t] being recorded as the sensing data for time t.

In the present embodiment, a microphone 4 is assigned to each participant, so the speaker is identified by the audio input device identifier or by the client identifier of the mobile terminal 2 connected to the microphone 4, and each utterance is stored in association with its speaker. In another embodiment, a participant may notify the server 1 through his or her mobile terminal 2 that he or she is about to speak, or one of the participants (for example, the moderator) may make an input that signals to the server 1 who is currently speaking. In yet another embodiment, a single microphone (for example, a tabletop microphone) may collect the voices of several people. In that case, the speaker is identified using any speaker recognition technique that can pick out speakers from mixed audio (for example, one based on voice quality or voiceprint recognition).

FIG. 5 is a detailed flowchart of the conference status determination process shown in step S202 of FIG. 2.
First, when audio data acquired by the microphone 4 is available, the server 1 analyzes it in order to identify the participants actually present in the conference room and to grasp which participant is saying what, which topics are drawing attention, and so on. Specifically, it calculates the "speech amount" and the "speech interval" for the interval (t-1, t] (step S501). The "speech amount" may be based on the total time from each participant's utterance start time to utterance end time, or it may be the number of characters in the words extracted by the morphological analysis described later. The "speech amount" is one of the most easily understood indicators of how actively each participant is taking part in the conference, and the "speech interval" is a useful indicator of silence.

Analysis of the audio data can also identify when the floor passes from participant A to participant B. Such changes show that several participants were engaged in a back-and-forth exchange during the conference, and their count is called the turn-taking amount. A larger turn-taking amount means that more speakers are shaping the conference, so the "activity" index is higher. "Activity" can also be calculated from the average speaking time per participant in the interval (t-1, t]. The ratio of participant A's speech amount to the sum of all participants' speech amounts gives an index called "uniformity", which shows whether all participants are speaking more or less evenly. A low uniformity value suggests that particular participants may be dominating the discussion. There is also an index called "ideation", based on the frequency, per participant, of words from different fields or categories appearing in the interval (t-1, t]. When many words from different fields or categories are spoken, "ideation" is high.
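As a non-limiting illustration, the speech amount, the turn-taking amount, and each participant's share of the total speech amount could be computed from time-stamped utterance segments as sketched below. The tuple layout `(speaker, start, end)` and the function name `speech_indices` are assumptions made for this example and do not appear in the embodiment.

```python
from collections import defaultdict


def speech_indices(utterances):
    """Compute simple indices for one interval (t-1, t].

    utterances: list of (speaker, start_sec, end_sec), sorted by start time.
    This input format is a hypothetical choice for the sketch.
    """
    speech_amount = defaultdict(float)  # total speaking time per participant
    turn_taking = 0                     # number of speaker changes
    previous_speaker = None
    for speaker, start, end in utterances:
        speech_amount[speaker] += end - start
        if previous_speaker is not None and speaker != previous_speaker:
            turn_taking += 1
        previous_speaker = speaker
    total = sum(speech_amount.values())
    # Share of each participant in the total speech amount; a share far
    # above the others suggests that one participant is dominating.
    share = {p: (amount / total if total else 0.0)
             for p, amount in speech_amount.items()}
    return dict(speech_amount), turn_taking, share


utterances = [("A", 0.0, 10.0), ("B", 10.0, 15.0),
              ("A", 15.0, 20.0), ("C", 20.0, 25.0)]
amount, turns, share = speech_indices(utterances)
```

In this example participant A accounts for 15 of 25 seconds of speech, so A's share is well above that of B and C, which would correspond to a comparatively low "uniformity".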

A further index is "diversity". Put simply, it is an index based on the number of distinct words: whether the same words were used repeatedly in the conference or many different words were used. Speech recognition converts the utterances to text, and morphological analysis then splits the text into words (steps S503 and S504). By removing words and phrases of low importance for understanding the discussion, such as particles and interjections, the keywords can be extracted. A conference in which the same words were used many times is thus shown to lack "diversity". Rather than judging only by strict identity, "diversity" may also be determined after broadening the comparison through synonym classification or cluster classification so that similar keywords are treated as the same. Morphological analysis and the related techniques of step S503 are existing technology that a person skilled in the art will understand well, so their description is omitted here.
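One possible way to turn the keyword list into a "diversity" value is sketched below. The embodiment only states that the index reflects the number of distinct words; normalizing by the total keyword count, the function name `diversity`, and the stop-word set are assumptions of this sketch. The input is assumed to be already tokenized, since the morphological analysis itself is outside the scope of the description.

```python
def diversity(words, stop_words=frozenset()):
    """Ratio of distinct keywords to all keywords (an assumed definition)."""
    # Drop low-importance words such as particles and interjections.
    keywords = [w for w in words if w not in stop_words]
    if not keywords:
        return 0.0
    return len(set(keywords)) / len(keywords)


repetitive = ["cost", "cost", "cost", "price"]
varied = ["cost", "design", "schedule", "market"]
low = diversity(repetitive)
high = diversity(varied)
```

Synonym or cluster classification, as mentioned above, could be layered on top by mapping each keyword to a cluster label before counting.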

There is also an index called "degree of interaction". When a participant touches information presented on a table or wall, the touch is recognized, and the number of touches in the interval (t-1, t] is expressed as the "degree of interaction".

Each of the above indices is quantified or encoded; as a simple example of quantification, each index is expressed on a five-level scale (1 to 5). Quantifying or encoding the conference status with a plurality of indices in this way makes it possible to grasp quantitatively which participant is saying what, how actively each participant is taking part in the conference through his or her remarks, which topics are drawing attention, and so on (step S506).
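The five-level quantification could, for instance, be realized with fixed cut points, as in the following sketch. The cut-point values and the function name `to_level` are hypothetical; the embodiment does not specify how the levels are derived.

```python
from bisect import bisect_right


def to_level(value, thresholds):
    """Map a raw index value to a level from 1 to 5.

    thresholds: four ascending cut points (assumed values for this sketch).
    A value equal to a cut point falls into the higher level.
    """
    if len(thresholds) != 4:
        raise ValueError("four cut points are needed for five levels")
    return bisect_right(thresholds, value) + 1


# Hypothetical cut points for speaking time in seconds per interval.
cut_points = [10, 20, 30, 40]
levels = [to_level(v, cut_points) for v in (5, 10, 25, 45)]
```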

The method for estimating the topic from the utterances is not limited to any particular one; for example, LDA (Latent Dirichlet Allocation) may be used. LDA is a language model that probabilistically estimates the topics of the words in a document, on the assumption that words do not occur independently but have latent topics and that words sharing a topic tend to appear in the same document. Because each word is assumed to be generated from a hidden topic, the topic drawing attention can be estimated from the set of words obtained by analyzing the utterances. A topic estimation method such as LDA is said to be able to distinguish, for example, apple the fruit from the computer-related Apple (R).

Next, when image data acquired by the cameras is available, the faces and figures of people are recognized from it (step S507). For example, a participant who nods or leans forward is presumed to agree with another participant's remark, whereas a participant who tilts his or her head, shakes his or her head from side to side, or leans back is presumed to be expressing disagreement; these observations yield an index called "agreement". Agreement and disagreement can also be detected from other movements and from facial expressions. Thus, in addition to extracting explicit words such as "agree" or "disagree" from the audio data, the present embodiment also calculates the "agreement" index from analysis of the image data.

The state of the conference can be set using the plurality of indices calculated by the analysis of audio and image data described above (step S508). For example, when the "speech amount", "activity", "uniformity", and "diversity" values for the interval (t-1, t] are high, the conference in that interval is judged to be lively and likely tending toward a divergence mode; in the opposite case, it is judged likely to be tending toward a convergence mode. When the "speech interval" value is large, and large for many participants, the conference has fallen silent and is likely to have stalled, which can be regarded as a form of convergence mode. Whether the discussion is converging or diverging can thus be estimated automatically, so the goal of the conference can be set automatically: to diverge if the conference is in convergence mode, and to converge if it is in divergence mode. However, the indices may not always map accurately onto divergence or convergence mode, so the participants may supplement the estimate by setting the conference mode manually; the embodiment described later shows an example in which the purpose of the conference is set manually. The mode may also be determined comprehensively from a combination of automatic estimation and manual setting.
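A minimal sketch of the automatic mode estimation described above is given below, assuming that every index has already been quantified on the five-level scale. The thresholds `high` and `low`, the "undetermined" fallback, and the names `estimate_mode` and `goal_for` are assumptions of the sketch; the embodiment leaves the exact decision rule open.

```python
def estimate_mode(levels, speech_interval_level, high=4, low=2):
    """Estimate the conference mode from five-level index values.

    levels: dict with keys speech_amount, activity, uniformity, diversity.
    high / low: hypothetical thresholds for "large" and "small" values.
    """
    keys = ("speech_amount", "activity", "uniformity", "diversity")
    if speech_interval_level >= high:
        # Long silences: treated as a stalled form of convergence mode.
        return "convergence"
    values = [levels[k] for k in keys]
    if all(v >= high for v in values):
        return "divergence"
    if all(v <= low for v in values):
        return "convergence"
    return "undetermined"  # leave the decision to manual setting


def goal_for(mode):
    """Aim for the opposite of the estimated mode; None if undetermined."""
    return {"convergence": "diverge", "divergence": "converge"}.get(mode)


lively = {"speech_amount": 5, "activity": 4, "uniformity": 4, "diversity": 5}
mode = estimate_mode(lively, speech_interval_level=1)
goal = goal_for(mode)
```

Returning "undetermined" for mixed index values is one way to hand the decision over to the manual setting mentioned in the text.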

FIG. 6 is a detailed flowchart of the reward amount determination process shown in step S204 of FIG. 2.
When the server receives the values of the feature quantities that represent the actual conference status, such as "speech amount", "activity", "uniformity", and "diversity" (step S601), it then determines the reward amount (step S602). That is, it uses the change in the feature quantities to calculate the reward for the action it selected most recently. With the current time denoted t, it uses feature quantities such as activity and ideation amount acquired in the latest interval (t-1, t] and in the preceding interval (t-2, t-1].
An example of a reward calculation algorithm is shown below.
reward := 0;
if (utterance amount in (t-1, t] - utterance amount in (t-2, t-1]) > 0 then add 1 to reward;
if (ideation amount in (t-1, t] - ideation amount in (t-2, t-1]) > 0 then add 1 to reward;
if (turn-taking amount in (t-1, t] - turn-taking amount in (t-2, t-1]) > 0 then add 1 to reward;
if (interaction amount in (t-1, t] - interaction amount in (t-2, t-1]) > 0 then add 1 to reward;
and so on. The example above shows only positive rewards, but when the utterance amount or another quantity decreases, -1 is added to the reward as a penalty.
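The reward calculation above, including the penalty for decreasing quantities, can be sketched in Python as follows. The dictionary keys and the function name `reward_from_change` are illustrative assumptions; an unchanged quantity is assumed to contribute nothing.

```python
def reward_from_change(previous, current):
    """Reward for the most recently selected action.

    previous: feature quantities for the interval (t-2, t-1]
    current:  feature quantities for the interval (t-1, t]
    Each increase adds 1, each decrease adds -1, no change adds 0.
    """
    reward = 0
    for name, value in current.items():
        delta = value - previous[name]
        if delta > 0:
            reward += 1
        elif delta < 0:
            reward -= 1
    return reward


previous = {"utterance": 10, "ideation": 3, "turn_taking": 5, "interaction": 2}
current = {"utterance": 12, "ideation": 3, "turn_taking": 4, "interaction": 6}
reward = reward_from_change(previous, current)
```

Here the utterance and interaction amounts rose, the turn-taking amount fell, and the ideation amount was unchanged, so the reward is 1 + 0 - 1 + 1 = 1.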

The reward is explained again here.
The reward gives the server 1 a learning cue so that the action commands it issues when acting on the participants and the conference room, for example which information the conference support system presents to the participants, improve with each repetition. The reward processing in the present embodiment is based on reinforcement learning, which has been studied in the field of artificial intelligence.

Reinforcement learning is based on learning control that adapts to the external environment through trial and error, and it differs from so-called supervised learning. In supervised learning, there are rules that explicitly derive the appropriate output (an event or action that changes the situation) for a state input representing the conference room situation, so, on the assumption that the rules are correct, a complete correct answer is provided from the outset. In reinforcement learning, by contrast, the reward takes the place of the teacher (that is, the rules) as the cue, and the learner improves autonomously and actively over repeated trial and error, moving toward the optimal output. In other words, the server 1, as the acting agent, selects an action based on the status data detected by the various sensors in the conference room, and the situation in the conference room changes as a result of that action. Some reward is given to the server 1 as the situation changes, and in response the server 1 gradually learns to select better actions (make better decisions).

The server 1 of the conference support system and the conference space interact according to the following procedure. In what follows, the interval (t-1, t] is referred to as time t, and the interval (t, t+1] as time t+1.
(1) At time t, the conference support system outputs an action a_t according to the state s_t of the conference space.
(2) As a result of the action a_t selected by the conference support system, the conference space transitions to state s_{t+1} and gives the conference support system a reward r_t corresponding to that transition.
(3) Time t advances to time t+1, and the procedure returns to (1).
In learning, the conference support system successively determines the action a for each state s with the aim of maximizing the reward r.
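Steps (1) to (3) can be pictured as the loop sketched below. The toy policy and toy conference space are purely hypothetical stand-ins so that the loop can run; in the embodiment the state would come from the sensed indices and the reward from the reward determination process.

```python
def run_interaction(initial_state, choose_action, step, n_steps):
    """Run the loop of steps (1) to (3) for n_steps time steps."""
    state = initial_state
    history = []
    for _ in range(n_steps):
        action = choose_action(state)             # (1) output a_t for s_t
        next_state, reward = step(state, action)  # (2) transition and reward
        history.append((state, action, reward))
        state = next_state                        # (3) advance to time t+1
    return history


# Hypothetical policy: present information until the state reaches 3.
def toy_policy(state):
    return "present" if state < 3 else "watch"


# Hypothetical conference space: presenting moves the state and pays 1.
def toy_space(state, action):
    if action == "present":
        return state + 1, 1
    return state, 0


history = run_interaction(0, toy_policy, toy_space, n_steps=5)
total_reward = sum(r for _, _, r in history)
```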

A general update rule (recurrence) for this kind of reinforcement learning is, for example, as follows.

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left\{ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right\}$$

Q(s_t, a_t) is the "value" of selecting action a in a given state s at time t. In the field of reinforcement learning, "value" is also called the "expected gain". The update rule means that, during learning by the server 1, the action a with the highest value Q(s, a) in a given state s is selected as the optimal action at time t, that the same is done at the subsequent time t+1, and that this is repeated. The term with max in the update rule is the value Q obtained when the action a with the highest value Q (as known at that point) is chosen in state s_{t+1}, multiplied by the discount rate γ (0 < γ ≤ 1); α is the learning rate (0 < α ≤ 1).

This update rule shows that, as the action a_t changes the state to s_{t+1}, the value Q(s_t, a_t) of action a in state s is increased if the value Q(s_{t+1}, a_{t+1}) of the best action in the next state s_{t+1} is larger than it, and decreased if that value is smaller. At the outset, the correct value Q(s, a) is not known for any combination of state s and action a (because there are no rules to act as a teacher). The initial values Q(s, a) are therefore set at random for every pair s, a of a state and an action available in it. This reinforcement learning brings the value Q(s, a) of an action a in a state s closer to the value of the best action in the resulting next state. The difference between the two depends on the discount rate γ and the reward r_{t+1}, but in essence the algorithm propagates the best action value in a given state back to the value of the action in the state immediately preceding it.
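A single application of a tabular update of this kind, in the standard Q-learning form where the value of the pair (s_t, a_t) is the one being revised, might look like the sketch below. The state labels, action names, and the values chosen for α and γ are assumptions for illustration. Unlike the random initialization described above, the sketch starts unseen values at 0.0 so that the arithmetic can be followed by hand.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new value.

    q: mapping from (state, action) to value.
    alpha: learning rate, gamma: discount rate (illustrative values).
    """
    # Value of the best action currently known in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


actions = ["watch", "present_news"]  # hypothetical action names
q = defaultdict(float)               # unseen pairs start at 0.0 here
q[("s1", "present_news")] = 2.0
new_value = q_update(q, "s0", "watch", reward=1.0,
                     next_state="s1", actions=actions)
```

With these numbers the bracketed term is 1.0 + 0.9 × 2.0 − 0 = 2.8, so the value of ("s0", "watch") moves from 0 to about 1.4, which illustrates how the best value in the next state propagates back to the preceding state.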

To use the reinforcement learning model described above, the server receives, in step S601 of FIG. 6, the values at time t of the indices detected by the various sensors such as the microphone 4 and the cameras 3 and 8 (speech amount: P_a, activity: P_b, diversity: P_c, uniformity: P_d, number of times participants tapped a presented word: P_e, number of times participants tapped a presented recommendation: P_f, and so on), and defines the state s and the reward r. In the simplest example, the state s is defined as the tuple (P_a, P_b, P_c, P_d, P_e, P_f) of index values; if the speech amount P_a is 4 on the five-level scale, the activity P_b is 2, and so on, the reward r is calculated by adding or multiplying the values of these indices. If the conference places more weight on the "diversity" index than on the others, for example, the "diversity" value may be multiplied by a weighting factor greater than 1 before the index values are added or multiplied. Because the conference mode (divergence or convergence) is also estimated automatically or set manually when the index values are calculated, the reward may be calculated with the conference mode taken into account. For example, in divergence mode the reward is decreased by 1 when the "uniformity" index at time t+1 falls, and in convergence mode the reward is increased by 1 when the "activity" index at time t+1 rises.
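The additive variant of this reward calculation, with an optional weighting factor and the mode-dependent adjustment from the example, could be sketched as follows. The function names, dictionary keys, and the weight of 2.0 are assumptions; the multiplicative variant mentioned in the text is not shown.

```python
def weighted_reward(indices, weights=None):
    """Sum the index values, applying a weight where one is given."""
    weights = weights or {}
    return sum(value * weights.get(name, 1.0)
               for name, value in indices.items())


def mode_adjustment(mode, previous, current):
    """Adjustment following the two examples given in the text."""
    if mode == "divergence" and current["uniformity"] < previous["uniformity"]:
        return -1
    if mode == "convergence" and current["activity"] > previous["activity"]:
        return 1
    return 0


indices = {"speech_amount": 4, "activity": 2, "diversity": 3, "uniformity": 3}
# Hypothetical weight: this conference values diversity twice as much.
base = weighted_reward(indices, weights={"diversity": 2.0})
adjust = mode_adjustment("divergence",
                         previous={"uniformity": 3, "activity": 2},
                         current={"uniformity": 2, "activity": 2})
```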

Next, step S603 of FIG. 6 is a process for accepting rewards from the conference participants. As described above, the reward r is basically calculated automatically from the index values generated from the sensing data detected by the various sensors in the conference room. However, when the participants have a good impression of a recommendation selected by the reinforcement learning algorithm (for example, the presentation of specific information), a reward from the participants may be added. Conversely, when the participants feel that the presented information was inappropriate in content or timing, a negative reward is applied that lowers the automatically calculated reward r, so that the information becomes less likely to be chosen in subsequent selections by the reinforcement learning algorithm.

Finally, the total reward amount at time t is calculated (step S604) and set as the reward r_t in the update rule of the reinforcement learning algorithm (step S605).

Next, FIG. 7 is a detailed flowchart of the recommendation determination process shown in step S205 of FIG. 2.
The server 1 of the conference support system selects an action a in a given state s at time t according to the learning algorithm based on the reinforcement learning model described above; a typical action a is the recommendation of information. The server picks out the necessary information resources, such as news, research papers, and books, and extracts specific information from them.

First, the conference ID is used to identify whether the conference is new or a continuation of an earlier conference (step S701). The conference ID is preferably input to the server 1 before the discussion begins. For a new conference, the initial values Q(s, a) used in the update rule of the reinforcement learning model are set to arbitrary default values, either predetermined or random (step S703). When a conference with the same conference ID has already been held, setting random default values would waste what the server 1 has learned from the earlier sessions. The values Q(s, a) of actions a in states s and the rewards r obtained so far are therefore read out and set as initial values, so that the conference resumes as an extension of the earlier sessions and learning (step S702).
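The branch between steps S702 and S703 could be sketched as below. The in-memory `saved_tables` dictionary stands in for whatever storage the server 1 uses, and the function name, the `seed` parameter, and the conference IDs are hypothetical.

```python
import random


def initial_q_table(conference_id, saved_tables, states, actions, seed=None):
    """Return the starting Q table for a conference.

    saved_tables: mapping from conference ID to a previously learned table
    (a hypothetical stand-in for persistent storage).
    """
    if conference_id in saved_tables:
        # Continuation of an earlier conference: reuse what was learned (S702).
        return dict(saved_tables[conference_id])
    # New conference: random default values for every state-action pair (S703).
    rng = random.Random(seed)
    return {(s, a): rng.random() for s in states for a in actions}


saved = {"conf-1": {("s0", "watch"): 0.7}}
resumed = initial_q_table("conf-1", saved, ["s0"], ["watch"])
fresh = initial_q_table("conf-2", saved, ["s0", "s1"],
                        ["watch", "present"], seed=0)
```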

Next, the purpose of the conference is set (step S704). What the server 1 recommends (the information it presents and the instructions it issues to change the conference room environment) differs depending on whether the current conference aims to converge the discussion, to diverge it, to deepen it, and so on; the purpose of the conference is therefore set in step S704. In the present embodiment, the server 1 displays an input screen for selecting the purpose and a participant makes the selection, but setting the purpose automatically without participant input is not excluded. For example, the purpose may be set automatically to diverging or deepening the discussion when the conference has been held only a few times, and to converging it when the number of sessions exceeds a predetermined number. The purpose may also be decided according to attribute information about the participants. For example, if the participants are a group of recent hires or have little experience, priority can be given to fostering free and lively communication, and the conference can be set automatically as one aimed at divergence.
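The automatic setting of the purpose could follow a rule such as the one sketched below. The threshold of three sessions, the `mostly_newcomers` flag, the precedence given to that flag, and the function name `auto_purpose` are all assumptions; the embodiment only says that a predetermined number of sessions and participant attributes may be used.

```python
def auto_purpose(sessions_held, session_threshold=3, mostly_newcomers=False):
    """Pick a conference purpose without participant input (a sketch)."""
    if mostly_newcomers:
        # Favor free and lively communication for inexperienced groups.
        return "diverge"
    if sessions_held > session_threshold:
        # Many sessions already held: aim to bring the discussion together.
        return "converge"
    return "diverge"


first_session = auto_purpose(sessions_held=1)
late_session = auto_purpose(sessions_held=5)
newcomer_session = auto_purpose(sessions_held=5, mostly_newcomers=True)
```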

Next, the server 1 sets a plurality of actions in line with the purpose of the conference set in step S704 (step S705). Specific examples include the following actions.
1. Quietly watch the situation
2. Present common-sense associations
3. Present co-occurring things and matters
4. Present a dictionary (Wikipedia) article
5. Present a news article
6. Present tweets from an SNS (twitter)
7. Present images from social media
8. Present related papers
9. Present related books
Needless to say, actions other than the above may be set, and more or fewer actions may be set.

Assuming that actions 1 to 9 above are the action list set for, for example, a conference that aims at divergence, a separate action list is set for a conference that aims at convergence and for one that aims at deepening the discussion. Because one action is selected from a group of actions suited to the purpose of each conference, the recommendation quality improves. Some actions may be common to several purposes.
It is also possible to set one common action list regardless of the conference purpose instead of a different list for each purpose. With a common list, however, the amount of computation in the learning routine described later increases exponentially, the number of trial-and-error iterations grows while many of the computed actions clearly do not match the purpose of the conference and are wasted, and the action list becomes harder to manage. Considering these points, setting a different action list for each purpose will often be preferable.

Next, the specific processing by which the server 1 selects the action it judges most appropriate, using the value (expected gain) of each action for a tuple of state variables, is described with reference to FIGS. 8(A) to 8(D).

FIG. 8(A) shows an example in which the situation of the conference room at time t, meaning the interval (time t-1, time t], is expressed with five indicators: “activity”, “idea amount”, “uniformity”, “agreement”, and “interaction amount”. As shown below, each indicator is encoded as “H” if its value is greater than a predetermined threshold set for that indicator, and as “L” otherwise.
if activity > P1 then “H” otherwise “L”
if idea amount > P2 then “H” otherwise “L”
if uniformity > P3 then “H” otherwise “L”
if agreement > P4 then “H” otherwise “L”
if interaction amount > P5 then “H” otherwise “L”
(Here, P1, P2, ..., P5 are predetermined thresholds (constants).)
In other embodiments, other indicators such as “turn-taking amount”, “diversity”, “speech amount”, and “utterance interval” may be used, and the number of indicators used is arbitrary.
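The threshold encoding above can be sketched as a small function. The indicator keys and the dictionary-based interface are illustrative assumptions; only the rule "greater than the threshold gives H, otherwise L" comes from the text.

```python
# Indicator order follows the tuple used in this embodiment.
INDICATORS = ("activity", "idea_amount", "uniformity",
              "agreement", "interaction_amount")

def encode_state(values, thresholds):
    """Encode each indicator as "H" if it exceeds its threshold, else "L".

    values and thresholds are dicts keyed by the names in INDICATORS;
    the thresholds play the role of P1 to P5.
    """
    return tuple("H" if values[name] > thresholds[name] else "L"
                 for name in INDICATORS)
```

A value exactly equal to its threshold is encoded as "L" here, following the "otherwise" branch of the rule.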

In the present embodiment, one state S of the conference room situation is represented by a tuple whose variables are the “H” or “L” values of the five indicators, so the number of possible states equals the number of combinations of the variables (2^5 = 32). The left table in FIG. 8(A) shows that each tuple of the five indicators corresponds to one of the 32 states S(1) to S(32). Instead of forming a tuple, each indicator may be expressed as a number, and a number obtained by adding, subtracting, multiplying, or dividing those numbers may be associated with each state S(i).
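As an illustration of mapping the 32 tuples to state numbers, the sketch below uses one numbering that happens to reproduce both examples given in this description, (L, L, L, L, H) as state 16 and (H, H, L, H, H) as state 5. The numbering itself is an assumption; the actual ordering of the left table in FIG. 8(A) is not reproduced in the text and may differ.

```python
def state_index(code):
    """Map a 5-tuple of "H"/"L" codes to a state number from 1 to 32.

    Assumed numbering: an "L" in position k (0-based, first indicator
    first) contributes 2**k, and 1 is added so that states start at 1.
    """
    if len(code) != 5 or any(c not in ("H", "L") for c in code):
        raise ValueError("code must be five 'H'/'L' values")
    return 1 + sum(1 << k for k, c in enumerate(code) if c == "L")
```

Under this assumed numbering, the all-"H" tuple is state 1 and the all-"L" tuple is state 32.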

The right table in FIG. 8(A) shows the value (expected gain) Q(i, j), i = 1 to 32, j = 1 to 9, of each action a that the server 1 can select. As described above, the actions a in the present embodiment are: 1. quietly watch the situation, 2. present common-sense associations, 3. present co-occurring things and matters, 4. present a dictionary (Wikipedia) article, 5. present a news article, 6. present tweets from an SNS (twitter), 7. present images from social media, 8. present related papers, and 9. present related books.

At the start of the conference, the server 1 has not yet performed reward-based learning, so default values are set for the values (expected gains) Q(i, j) of the actions a. FIG. 8(B) shows this setting of default values in the present embodiment.

Next, how the server 1 selects a specific action is described with reference to FIG. 8(C), which shows the values (expected gains) Q(i, j) after learning has been performed to some extent. Suppose that the encoded indicator values at time t, based on the feature quantities obtained from the sensors, are (activity, idea amount, uniformity, agreement, interaction amount) = (L, L, L, L, H). The state of the conference room at time t then corresponds to “state 16” in FIG. 8(C). The expected gains that the server 1 can obtain in the future by selecting any of actions 1 to 9 in state 16, that is, the values Q(i, j) of the actions, are the shaded values.

The server selects an action with a probability proportional to the expected gains of actions a₁ to a₉. That is, the probability that action a₁, “quietly watch the situation”, is selected is 3.57/43.22 ≈ 0.0826, that of action a₂, “present common-sense associations”, is 4.34/43.22 ≈ 0.1004, ..., and that of action a₉, “present related books”, is 4.31/43.22 ≈ 0.0997. Here, 43.22 is the sum of the expected gains of the actions aᵢ (i = 1 to 9).
In the present embodiment, an example was shown in which, when the state S of the conference room situation is represented from the indicator values, each value is compared with a threshold and encoded as “H” or “L”. Alternatively, the state S may be expressed by a number obtained by adding, subtracting, multiplying, or dividing indicators that are each quantified on, for example, a five-level scale (1 to 5).
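Selection in proportion to the expected gains can be sketched as below. The function names are illustrative, and the sketch assumes all expected gains in a row are non-negative with a positive sum, which the text does not state explicitly. Actions are numbered from 1 to match a₁ to a₉.

```python
import random

def selection_probabilities(q_row):
    """Return each action's selection probability, Q divided by the row sum."""
    total = sum(q_row)
    return [q / total for q in q_row]

def select_proportional(q_row, rng=random):
    """Draw one action number (1-based) with probability proportional to Q.

    rng may be the random module or a random.Random instance.
    """
    return rng.choices(range(1, len(q_row) + 1), weights=q_row, k=1)[0]
```

For the values quoted in the text, `3.57 / 43.22` evaluates to roughly 0.0826; the full row of state 16 is only shown in FIG. 8(C), so it is not reproduced here.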

Furthermore, the server 1 uses, for example, the Epsilon-Greedy algorithm to select an action at random at a rate of probability ε. In the Epsilon-Greedy algorithm, an action is selected at random from among actions a₁, a₂, and so on with a small probability ε, and the action with the largest Q(i, j) is selected with probability 1-ε. For example, when ε = 0.1, the action with the largest Q(i, j) is selected with high probability (1-ε = 0.9), and an action is selected at random with low probability (0.1). In most cases, therefore, an appropriate action backed by a high value (expected gain) is recommended, but an action that would normally never be selected because of its low value is also tried and recommended to the participants with a small probability. This prevents learning from stalling through local convergence, for example because the initially assigned random Q values dominate when only the action with the largest Q(i, j) is ever chosen. At appropriate times in the learning process, another action a′ with a lower value is deliberately selected instead of the action a with the largest Q(i, j).
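A minimal sketch of Epsilon-Greedy selection over one row of Q values follows. The function name and the 1-based action numbering are illustrative choices; ε = 0.1 is the example value from the text.

```python
import random

def epsilon_greedy(q_row, epsilon=0.1, rng=random):
    """Return a 1-based action number chosen by the Epsilon-Greedy rule.

    With probability epsilon an action is drawn uniformly at random
    (exploration); otherwise the action with the largest Q is returned.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row)) + 1
    best = max(range(len(q_row)), key=q_row.__getitem__)
    return best + 1
```

Note that the random branch can also return the best action, so the best action is chosen slightly more often than 1-ε of the time.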

In other embodiments, the action of the server 1 may be selected by the Softmax method or another method that likewise selects actions probabilistically, instead of the Epsilon-Greedy algorithm. The Softmax method defines a predetermined function for a given state s, regards it as a probability distribution over the actions to be selected, and selects an action by, for example, the inverse transform method. As a result, as with the Epsilon-Greedy algorithm, the larger the Q value of an action, the larger the value of its Softmax function and the more likely that action is to be selected. Nothing restricts the use of methods other than Epsilon-Greedy and Softmax, or of a combination of several methods.
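The text does not specify the "predetermined function", so the sketch below assumes the common Boltzmann form exp(Q/T) normalized over the actions, with a hypothetical temperature parameter T, and pairs it with inverse transform sampling from a uniform number u in [0, 1).

```python
import math
import random

def softmax_probabilities(q_row, temperature=1.0):
    """Return softmax probabilities of one row of Q values.

    Assumed form: exp(Q / temperature), normalized to sum to 1. The
    maximum is subtracted first for numerical stability.
    """
    m = max(q_row)
    exps = [math.exp((q - m) / temperature) for q in q_row]
    total = sum(exps)
    return [e / total for e in exps]

def sample_inverse_transform(probs, u):
    """Return the 1-based action whose cumulative probability first exceeds u."""
    cumulative = 0.0
    for index, p in enumerate(probs, start=1):
        cumulative += p
        if u < cumulative:
            return index
    return len(probs)  # guard against floating-point rounding

def softmax_select(q_row, temperature=1.0, rng=random):
    """Draw one action number using softmax probabilities."""
    probs = softmax_probabilities(q_row, temperature)
    return sample_inverse_transform(probs, rng.random())
```

A smaller temperature concentrates the probability on the action with the largest Q, while a larger one makes the choice closer to uniform.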

The example of FIG. 8(C) shows the case in which action a₆, “present tweets from twitter”, which has the largest value among the possible actions, is selected (step S706). The number of selected actions need not be one. For example, action a₇, “present images from social media”, which has the next largest value, may be selected as well.

Having selected the optimum action a₆, the server 1 outputs a command to execute action a₆ in the communication space, here the conference room (step S707). Because action a₆ is “present tweets from twitter”, the server 1 presents tweet information in the conference room as recommendation information. In doing so, the server 1 judges whether the recommendation should go to all participants or only to specific participants, and selects the corresponding output medium to present it. For example, if the selected action is highly person-dependent, the recommendation information is directed to an individual rather than to all participants. The tweets of action a₆ depend strongly on who shares them, and participants who do not share them may find some of the content incomprehensible. Therefore, when the server 1 issues the instruction for action a₆, it instructs that the tweets be presented only on the screens of the mobile terminals 2 or the wearable terminals (such as the glasses 6 with an information display function) of the participants who share the tweets. If instead action a₂, “present common-sense associations”, had been selected as the action with the largest value in the above example, the server 1 would instruct that the information be displayed on the shared display 7 installed in the conference room or on specific wall surfaces 9 and 10 visible to everyone, as a recommendation to all participants.

Besides the person dependence described above, presentation only to a specific participant is also appropriate when a word included in action a is highly specific and appears only in the utterances of that participant. FIG. 9 is a flowchart showing an example of making a personal recommendation based on the specificity of a word.

A personal recommendation is basically determined from the voice data and image data of each individual.
First, when it is determined that the individual has spoken little (Yes in step S901), it is checked, from the past conference data obtained via the conference ID, whether the topic currently under discussion (CT: Current Topic) has appeared before (step S902). If it has, the individual's history data is searched and the tf-idf value of the topic is calculated in order to judge whether the individual has spoken about the topic (step S903). The tf-idf value is used in fields such as information retrieval and document summarization and is calculated from two measures: term frequency and inverse document frequency. It determines the importance of a word on the assumption that a word that occurs more often is more important, while a word that occurs across many documents is less important.
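For reference, one common tf-idf variant can be sketched as follows. The text does not say which weighting formula is used, so the relative term frequency and the logarithmic inverse document frequency below are assumptions, as is the representation of each document as a list of tokens.

```python
import math

def tf_idf(term, document, corpus):
    """Return the tf-idf weight of term in document relative to corpus.

    document is a list of tokens and corpus is a list of such documents.
    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    if not document:
        return 0.0
    tf = document.count(term) / len(document)
    df = sum(1 for doc in corpus if term in doc)  # documents containing term
    if df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)
```

With this variant, a word that appears in every document gets weight 0, which matches the idea that words spread across many documents carry little importance.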

Accordingly, the number of times the topic word appears in the individual's past conference data and the number of times it appears across multiple personal resources such as other conferences, e-mails, and tweets are detected. If the calculated tf-idf value exceeds a predetermined level, it is determined that the individual very likely knows about the topic, and the server 1 does nothing (step S907). If the tf-idf value is at or below the predetermined level, it is determined that the individual likely has no knowledge of the topic. In that case, the meaning and definition of the topic are retrieved from related resources (step S905) and output to the individual's personal device (step S906). The personal device may be an ordinary mobile terminal 2, but in the present embodiment each participant wears glasses 6 with an information display function. When the server determines from the tf-idf value that some information is unnecessary for the other participants but is needed by a specific participant to take an active part in the discussion, it can therefore control the display so that the information appears only on that participant's glasses 6.
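The decision flow of FIG. 9 described above can be sketched as a single function. The function and parameter names are hypothetical, and the text does not state what happens when the answer at step S901 or S902 is "No"; the sketch assumes nothing is presented in those cases.

```python
def personal_recommendation(speaks_little, topic_seen_before,
                            tfidf_value, level, lookup_definition):
    """Return text for the participant's personal device, or None.

    lookup_definition is a hypothetical callable that retrieves the
    meaning or definition of the current topic from related resources.
    """
    if not speaks_little:        # S901: No (assumed: nothing to do)
        return None
    if not topic_seen_before:    # S902: topic never appeared (assumed: nothing to do)
        return None
    if tfidf_value > level:      # S903: participant likely knows the topic
        return None              # S907: the server does nothing
    # S905: retrieve the meaning; the caller outputs it to the device (S906).
    return lookup_definition()
```

Returning None for every "do nothing" branch lets the caller treat step S907 and the unspecified branches uniformly.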

The thinking behind the selection of recommendation information is summarized here.
(i) The conference support system of the present embodiment does more than help a single conference held on a given day succeed. To support intermittent communication over a long period comprehensively, it uses the recommendations that the server 1 learned and presented in past conferences in future conferences. Appropriate recommendations are therefore extracted through learning accumulated on the basis of previous conferences.
(ii) The conference support system of the present embodiment selects recommendation information with an understanding of the experience and knowledge of the individuals taking part in the communication. Using e-mails, posts on social media, and past remarks in various media, it selects the information with which each individual should be supported from the tf-idf values described above and the like. It thus provides accurate, pinpoint information, unlike standard, uniform information that applies to every participant.
(iii) The conference support system of the present embodiment selects recommendation information using a wide range of information from the world at large. In general, information related to keywords, phrases, and topics is extracted from resources prepared in advance. In the present embodiment, information not directly related to the conference content is also consulted and put to use in the communication. For example, from all kinds of big data such as POS data, audience rating data, climate change data, stock price information, and traffic congestion information, hidden rules and trends are found in the factual data by any statistical analysis method, such as market basket analysis, and surprising correlation data that is free of existing preconceptions is extracted as recommendation information.

Thus, the selection of action a_t by the server 1 of the conference support system of the present embodiment covers not only presenting common information to all participants or changing the room environment as a whole, for example by playing music in the conference room, but also presenting specific information that can benefit each individual based on that participant's personal attributes (experience, knowledge, career, past remarks, and so on), and producing an environmental change for an individual participant, such as adjusting the wind direction so that cool air reaches only a certain participant. Note that this goes beyond the mere provision of text information found in existing conference support systems and includes issuing instructions that change the environment of the communication space both globally and locally.

Returning to the flowchart of FIG. 7: when the instruction for the optimum action a_t with the largest value at time t is output in step S707, the server 1 next acquires the feature quantities of the conference room at time t+1 from the sensors (step S708) and generates the tuple of state variables at time t+1 in the same way as at time t. As in the example above, suppose that in the conference room space that was in state 16 at time t, the server 1 instructed execution of action a₆ and the tuple of state variables changed to (activity, idea amount, uniformity, agreement, interaction amount) = (H, H, L, H, H). This corresponds to state 5 in FIG. 8(C). Suppose also that, under a predetermined reward calculation algorithm, the reward totals +2 because both the interaction amount and the turn-taking amount increased.

At this time, the expected gain Q(16, 6) of action 6 in state 16 is updated as follows using
Q(s_t, a_t) ← Q(s_t, a_t) + α{r_{t+1} + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t)}
where the learning rate α = 0.3 and the discount rate γ = 0.9.
Among actions a₁ to a₉ in state 5 shown in FIG. 8(C), action a₉ has the largest value (expected gain), 7.96, so max_a Q(5, a) = 7.96 is set in the above formula (step S709).

That is,
Q(16, 6) ← Q(16, 6) + 0.3{2 + 0.9 max_a Q(5, a) − Q(16, 6)} = 8.56 + 0.3{2 + 0.9 × 7.96 − 8.56} ≈ 8.74
The selected action a₆ caused a transition to the new state 5 at time t+1, and based on the expected gains of the actions in the new state, the expected gain of action a₆ in state 16 at time t is updated to the higher value 8.74 (step S710).
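The update in steps S709 and S710 can be checked numerically with a short sketch. It uses the standard Q-learning form, in which the value of the state and action at time t is updated, with the figures from the worked example (α = 0.3, γ = 0.9, reward 2, Q(16, 6) = 8.56, max Q(5, a) = 7.96). The nested-dictionary table and the function name are illustrative, and only the two Q values quoted in the text are filled in.

```python
def q_update(q, state, action, reward, next_state, alpha=0.3, gamma=0.9):
    """Apply one Q-learning update in place and return the new value.

    q maps state -> {action: value}. The target is the reward plus the
    discounted largest value available in the next state.
    """
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]

# Only the values quoted in the text; the rest of FIG. 8(C) is omitted.
q = {16: {6: 8.56}, 5: {9: 7.96}}
new_value = q_update(q, state=16, action=6, reward=2, next_state=5)
print(round(new_value, 2))  # 8.74
```

The unrounded result is 8.56 + 0.3 × 0.604 = 8.7412, which rounds to the 8.74 given in the description.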

Steps S706 to S710 form the routine that is repeated. The server 1 continues presenting recommendations until recommendation ends (for example, at the end of the conference or after a predetermined time has elapsed).

FIG. 10 is a detailed flowchart of the recommendation presentation processing shown in step S206 of FIG. 2.
As described above, in the present embodiment the presentation of recommendation information is not limited to conventional methods such as displaying text on a monitor in front of the participant, and a variety of ways of providing information become possible. The entire space of the conference room can serve as a medium for outputting information. Specifically, the walls, floor, ceiling, furniture, and so on of the space shown in FIG. 1 are used as output media. The participants' glasses 6 with an information display function described above are of course also output media. The aim is to make active use of the elements that make up a space, such as furniture and wall surfaces, as display media, so that unprecedented ways of presenting information encourage shifts and convergence in human thinking.

Based on the set purpose of the conference, the estimated topic, and the content of each selected action a, the server 1 identifies whether the information to be recommended in the conference room belongs, for example, to the context of ideation, the context of attention, or the context of originality (step S1001). Next, the server 1 assigns an output medium to each context of recommendation information (step S1002). For example, the wall surface 9 is assigned to the ideation context, and the server 1 presents keywords from the words exchanged during the conference on the wall surface 9 in the expectation of prompting new ideas. Another wall surface 10 is assigned to the attention context. When the table 13 that the participants face is assigned to the originality context, important concepts are picked out of the words exchanged during the conference and presented on the table 13. When a participant touches presented recommendation information, a signal corresponding to the touch position is returned to the server. The server then selects various information resources related to the touched recommendation information and displays the related information on the wall surface 10, the attention context, so that new insights arise from that information. The projector 7 may be used instead of the wall surfaces 9 and 10.

Furthermore, as described above, when the server determines that the words exchanged during the conference include a keyword that a participant does not know or is unfamiliar with, it presents the definition on that participant's glasses 6 with an information display function. The participant's glasses 6, together with the personal information terminal 2, are thus assigned as that participant's personal context.
For example, with thresholds TH1 < TH2 < TH3, if the tf-idf value of a word in an utterance of participant X is larger than TH3, the word is judged to be a word specific to participant X, and information is presented only on participant X's glasses 6. If the tf-idf value lies between TH2 and TH3 (TH2 < tf-idf value < TH3), the word is judged to be an important word and is presented on the table 13 so that the other participants can also see it. If the tf-idf value lies between TH1 and TH2 (TH1 < tf-idf value < TH2), the word is judged to be a meaningful word and is presented on the wall surface 9 so that the other participants can also see it. If the tf-idf value is smaller than TH1 (0 < tf-idf value < TH1), the word is discarded as meaningless and is not presented.
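The threshold-based routing can be sketched as below, assuming the thresholds are ordered TH1 < TH2 < TH3, which is the reading under which every band in the text is non-empty. The function name, the returned labels, and the handling of values exactly equal to a threshold (assigned to the lower band) are illustrative assumptions.

```python
def route_word(tfidf_value, th1, th2, th3):
    """Return the output medium for a word, or None if it is not presented.

    Assumes th1 < th2 < th3. Values exactly on a threshold fall into the
    lower band, which the text leaves unspecified.
    """
    if not th1 < th2 < th3:
        raise ValueError("thresholds must satisfy th1 < th2 < th3")
    if tfidf_value > th3:
        return "glasses 6 of participant X"  # word specific to participant X
    if tfidf_value > th2:
        return "table 13"                    # important word
    if tfidf_value > th1:
        return "wall surface 9"              # meaningful word
    return None                              # meaningless word, discarded
```

Checking the highest threshold first keeps each branch a single comparison.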

Furthermore, in other embodiments, the action a that the server 1 performs in a given state s is output not as a recommendation of information but as a command that changes the conference room space as a whole. For example, if there is an audio device that outputs sound into the space, the server selects music and adjusts its volume and direction. In the convergence mode, the conference has often reached a deadlock, so uplifting music is output to the whole space. In the divergence mode, possible actions include changing the color of the light from the illumination 11 to a soothing color, lowering the intensity of the illumination 11 to dim the room slightly, and changing the color of the wall surfaces 9 and 10. The actions of the server 1 also include adjusting the scent (type, amount, direction), airflow (volume and direction), temperature, humidity, and so on from the air conditioning equipment 12.
In another embodiment, vital data such as pulse and body temperature may be detected by a non-contact sensor installed in the chair 5 or the like, or by a watch or wristband with a built-in heart rate sensor, and used to adjust temperature and humidity. In yet another embodiment, the command may change the height of the chair 5 on which a participant sits or of the table 13 in front of the participant, or may cause the chair 5 or the floor to vibrate. If such commands for changing the conference room environment are given per participant based on the sensed data, namely each participant's image data and voice data, they can be expected to create openings for successful communication that conventional uniform information recommendation could not achieve.

The example described above applies the present invention to a conference room, a relatively small space in a company or the like, but the invention is not limited to this. It can be applied in any place where multiple people communicate. For example, it may be applied in large spaces such as lecture halls and concert venues, and also when communication takes place outdoors rather than in a closed indoor space. Application in a virtual space, that is, in a remote conference, is also included.

The present embodiment has been described with the server 1 of the idea support system performing overall control. The role played by the server 1 corresponds to so-called artificial intelligence (AI). It is comprehensive, multifaceted communication support in which the space itself acts as the AI, sharing and conveying the thoughts of multiple people so that consensus forms among the many options generated. Recommendation information that is useful and timely for each participant is provided in a way that fits the conference content and state. A particular advantage is that the reinforcement learning executed as the AI can be designed without complete prior understanding of the situation.

DESCRIPTION OF SYMBOLS
1 Server
2 Portable terminal
3 Camera
4 Microphone
5 Chair
6 Glasses with an information display function
7 Projector
8 Camera
9 Wall surface
10 Wall surface
11 Illumination
12 Air conditioning equipment
13 Table

Claims

1. A support system for communication by a plurality of participants, comprising:
at least one situation data acquisition means for grasping the situation of a communication space by at least one of audio data and image data; and
a server that controls any output medium in the communication space using sensing data obtained by the situation data acquisition means,
wherein the server is configured to repeat, at predetermined time intervals:
digitizing or encoding the sensing data to generate the state of the communication space as a state variable at time t;
selecting, in order to support communication in the state of the communication space at time t, at least one support item from a plurality of communication support items related to the state variable, based on an expected value of each support item; and
after outputting a command to execute the selected support item to the output medium, updating the expected value of the support item selected at time t using the expected value of the communication support item related to the state variable at time t + 1, which is generated based on the sensing data acquired by the situation data acquisition means.

2. The support system according to claim 1, wherein the updating of the expected value of the support item is based on a reinforcement learning algorithm.

3. The support system according to claim 1, wherein the state variable is a set of indices corresponding to each item of sensing data obtained by the at least one situation data acquisition means, or a numerical value calculated from the indices.

4. The support system as described, wherein the plurality of indices include: a diversity indicating whether the communication content includes a plurality of viewpoints; a uniformity indicating the degree of each participant's involvement in the communication; an activity indicating whether communication is performed alternately among the plurality of participants; a degree of consent indicating agreement with the speech or attitude of other participants; and a degree of idea indicating whether communication is performed using words of different categories.

5. The support system according to claim 1, wherein the server selects the output medium according to a communication context and outputs a command to execute the selected support item to the selected output medium.

6. The support system according to any one of the above, wherein the output medium includes at least one of a shared object, including one or more wall surfaces and a table in the communication space, and a personal information terminal, including each participant's portable terminal and a glasses-type display device.

7. The support system according to claim 1, wherein the shared object including one or more wall surfaces and a table in the communication space functions as an interface that transmits to the server a touch operation by a participant who has acted in response to information output on the surfaces of the wall surface and the table.

8. The support system according to any one of claims 1 to 7, wherein the plurality of communication support items include changing the output amount and type of at least one of sound, light, wind, temperature, humidity, odor, and vibration directed toward a specific participant among the participants, or toward a part or the whole of the communication space.

9. A table used for communication support by a plurality of participants, wherein
a server, to which sensing data relating to sound or images acquired using at least one situation data acquisition means is transmitted, repeats at predetermined time intervals:
digitizing or encoding the sensing data to generate the state of the communication space as a state variable at time t;
selecting, in order to support communication in the state of the communication space at time t, at least one support item from a plurality of communication support items related to the state variable, based on an expected value of each support item; and
after execution of the selected support item, updating the expected value of the support item selected at time t using the expected value of the communication support item related to the state variable at time t + 1, which is generated based on the sensing data acquired by the situation data acquisition means,
and wherein the table displays information extracted by the server according to the selected support item during the repetition.

10. The table according to claim 9, wherein, when a participant performs a touch operation in response to information displayed on the table surface, the table is controlled to return a signal corresponding to the touch position to the server.