JP7453714B2

JP7453714B2 - Argument analysis device and method

Info

Publication number: JP7453714B2
Application number: JP2023075775A
Authority: JP
Inventors: 武志水本
Original assignee: Hylable Inc
Current assignee: Hylable Inc
Priority date: 2019-03-14
Filing date: 2023-05-01
Publication date: 2024-03-21
Anticipated expiration: 2039-03-14
Also published as: JP7279928B2; JP2020148931A; JP2023109786A

Description

The present invention relates to a discussion analysis device and a discussion analysis method for analyzing a discussion among a plurality of participants.

The Harkness method is known as a method for analyzing discussions in group learning and meetings (see, for example, Non-Patent Document 1). In the Harkness method, transitions between the participants who speak in a discussion (the speakers) are recorded as lines. This makes it possible to analyze each participant's contribution to the discussion and each participant's relationships with the others. The Harkness method can also be applied effectively to active learning, in which students take the initiative in their own learning.

Non-Patent Document 1: Paul Sevigny, "Extreme Discussion Circles: Preparing ESL Students for 'The Harkness Method'," Polyglossia, Ritsumeikan Asia Pacific University Language Education Center, October 2012, No. 23, pp. 181-191

In the Harkness method, a recorder must keep recording the discussion at all times, which places a heavy burden on the recorder. It is therefore conceivable to capture the participants' voices with a sound collection device and analyze them by computer in order to detect speaker transitions automatically. However, a computer may detect irregular sounds that occur while a participant is speaking, such as the impact sound of objects or brief backchannel responses (aizuchi) from other participants, as utterances by participants, and may therefore fail to detect speaker transitions correctly.

The present invention has been made in view of these points, and an object thereof is to improve the accuracy of detecting speaker transitions in a discussion.

A discussion analysis device according to a first aspect of the present invention includes: an information acquisition unit that acquires the amount of speech of each of a plurality of participants in a discussion in which the plurality of participants participate; a maximum speaker identification unit that identifies, for each first time range in the discussion, the maximum speaker, that is, the participant whose amount of speech is the largest among the plurality of participants; and an output unit that outputs transition information indicating speaker transitions that occurred among the plurality of participants, based on changes in the maximum speaker across the first time ranges.

When a first participant who is the maximum speaker in one time range differs from a second participant who is the maximum speaker in the time range that follows it, the output unit may output the transition information indicating the transition from the first participant to the second participant.
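The transition rule described above can be illustrated with a minimal Python sketch. The function name `detect_transitions`, the participant labels, and the use of `None` for a time range in which nobody spoke are assumptions made for illustration only; the description does not specify how silent time ranges are handled, and here they are simply skipped.

```python
from collections import Counter


def detect_transitions(max_speakers):
    """Return (from, to) pairs wherever the maximum speaker changes.

    max_speakers holds one entry per first time range, in time order.
    None marks a range with no speech (an assumption of this sketch);
    such ranges are skipped, so A, None, B still yields A -> B.
    """
    transitions = []
    last = None
    for speaker in max_speakers:
        if speaker is None:
            continue
        if last is not None and speaker != last:
            transitions.append((last, speaker))
        last = speaker
    return transitions


# Hypothetical sequence of maximum speakers, one per first time range.
frames = ["A", "A", "B", None, "B", "C", "A", "A"]
transitions = detect_transitions(frames)
transition_counts = Counter(transitions)
```

Because only a change in the participant with the largest amount of speech counts as a transition, a short noise or backchannel response inside another participant's long utterance does not produce one.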

The discussion analysis device may further include a phase division unit that divides the discussion into one or more phases based on the time-series similarity of the transition information.

The output unit may output the transition information indicating the number of transitions for each second time range that is longer than the first time range. The phase division unit may cluster the transition information for each second time range based on the time-series similarity of the transition information, and may determine the one or more phases constituting the discussion based on the times, within the discussion, of the second time ranges corresponding to the transition information included in the generated clusters.
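The description does not name a clustering algorithm or a similarity measure. The following Python sketch is therefore a deliberately simplified stand-in rather than the claimed method: it groups consecutive second time ranges greedily, starting a new phase whenever the cosine similarity between neighboring transition-count vectors falls below a threshold. The names `split_into_phases` and `threshold`, and the sample data, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        # Two windows without any transition are treated as identical.
        return 1.0 if nu == nv else 0.0
    return dot / (nu * nv)


def split_into_phases(window_vectors, threshold=0.5):
    """Group consecutive windows into phases (lists of window indices)."""
    if not window_vectors:
        return []
    phases = [[0]]
    for i in range(1, len(window_vectors)):
        if cosine(window_vectors[i - 1], window_vectors[i]) >= threshold:
            phases[-1].append(i)   # similar to the previous window
        else:
            phases.append([i])     # dissimilar: a new phase starts here
    return phases


# One row per second time range; columns are hypothetical transition
# counts for A->B, B->A, A->C, C->A.
windows = [
    [3, 2, 0, 0],
    [4, 3, 0, 0],
    [0, 0, 2, 3],
    [0, 1, 3, 3],
]
phases = split_into_phases(windows)
```

A general clustering method could also place non-adjacent windows in the same cluster; this sketch only merges neighbors, which keeps each phase contiguous in time.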

The discussion analysis device may further include a pattern selection unit that generates a plurality of patterns, each indicating the presence or absence of the transition for each combination of the plurality of participants, and selects, from among the plurality of patterns, a pattern whose similarity to the transition information satisfies a predetermined condition.
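As a rough illustration of pattern generation and selection, the sketch below enumerates every presence/absence pattern over unordered pairs of participants and keeps the one most similar to the observed transition counts. Several points are assumptions not taken from the description: pairs are treated as unordered, cosine similarity is used as the similarity measure, and "satisfies a predetermined condition" is read as "has the highest similarity". The function name `select_pattern` is hypothetical.

```python
import math
from itertools import combinations, product


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def select_pattern(participants, transition_counts):
    """Pick the binary pair pattern closest to the observed transitions."""
    pairs = list(combinations(participants, 2))
    # Merge both directions of each pair into one observed count.
    observed = [
        transition_counts.get((a, b), 0) + transition_counts.get((b, a), 0)
        for a, b in pairs
    ]
    best_bits, best_score = None, -1.0
    # Enumerate all 2**len(pairs) presence/absence patterns.
    for bits in product((0, 1), repeat=len(pairs)):
        score = cosine(bits, observed)
        if score > best_score:
            best_bits, best_score = bits, score
    return dict(zip(pairs, best_bits)), best_score


# Hypothetical directed transition counts for three participants.
counts = {("A", "B"): 3, ("B", "A"): 2, ("B", "C"): 4, ("A", "C"): 1}
pattern, score = select_pattern(["A", "B", "C"], counts)
```

The number of patterns grows as 2 to the power of the number of pairs, so exhaustive enumeration of this kind is only practical for small groups.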

The pattern selection unit may further generate a plurality of sub-patterns by changing a part of the selected pattern, and may select, from among the plurality of sub-patterns, a sub-pattern whose similarity to the transition information satisfies a predetermined condition.

The output unit may determine the roles of the plurality of participants based on the pattern selected by the pattern selection unit, and may output each of the plurality of participants in association with that participant's role.

The output unit may output the behavior of the plurality of participants as text, based on the pattern selected by the pattern selection unit.

The output unit may output, in association with one participant among the plurality of participants, information on the amount of speech of that participant in those discussions that satisfy a predetermined condition, among a plurality of discussions in which that participant took part.

The output unit may output, in association with a predetermined group, information on the amount of speech of the plurality of participants belonging to the group in a plurality of discussions in which those participants took part.

The output unit may output, in association with each other, the ranking by amount of speech of the plurality of participants belonging to the group in a first discussion and the ranking by amount of speech of the plurality of participants belonging to the group in a second discussion different from the first discussion.

A discussion analysis method according to a second aspect of the present invention includes the following steps, executed by a processor: acquiring the amount of speech of each of a plurality of participants in a discussion in which the plurality of participants participate; identifying, for each first time range in the discussion, the maximum speaker whose amount of speech is the largest among the plurality of participants; and outputting transition information indicating speaker transitions that occurred among the plurality of participants, based on changes in the maximum speaker across the first time ranges.

The present invention has the effect of improving the accuracy of detecting speaker transitions in a discussion.

The drawings are as follows, in order:
A schematic diagram of the discussion analysis system according to the embodiment.
A block diagram of the discussion analysis system according to the embodiment.
A schematic diagram of the method by which the discussion analysis device detects speaker transitions in a discussion.
A schematic diagram of the method by which the discussion analysis device divides a discussion into one or more phases.
A schematic diagram of the method by which the discussion analysis device selects a pattern similar to the transition information.
A schematic diagram of the method by which the discussion analysis device selects a pattern similar to the transition information.
A front view of the display unit displaying a discussion report screen.
A front view of the display unit displaying a personal report screen.
A front view of the display unit displaying a course report screen.
A flowchart of the discussion analysis method performed by the discussion analysis device.

[Overview of discussion analysis system SS]
FIG. 1 is a schematic diagram of a discussion analysis system SS according to this embodiment. The discussion analysis system SS includes a discussion analysis device 1, a communication terminal 2, and a sound collection device 3. The number of communication terminals 2 and sound collection devices 3 included in the discussion analysis system SS is not limited. The discussion analysis system SS may also include other equipment such as servers and terminals.

The sound collection device 3 includes a microphone array that has a plurality of sound collection units (microphones) arranged in different orientations. For example, the microphone array includes eight microphones arranged at equal intervals on a single circle in a plane horizontal to the ground. By using such a microphone array, the discussion analysis device 1 can identify which participant U is the speaker (sound source) based on the voices uttered by the plurality of participants U surrounding the sound collection device 3. The sound collection device 3 transmits the audio acquired with the microphone array to the discussion analysis device 1 as data.

The communication terminal 2 is a computer capable of communication. The communication terminal 2 is, for example, a computer terminal such as a personal computer, or a mobile terminal such as a smartphone. The communication terminal 2 sets analysis conditions on the discussion analysis device 1 and displays information received from the discussion analysis device 1.

The discussion analysis device 1 is a computer that analyzes a discussion using the audio acquired by the sound collection device 3. The discussion analysis device 1 is configured by, for example, a single computer or a cloud, which is a collection of computer resources.

The discussion analysis device 1 is connected to the communication terminal 2 and the sound collection device 3 by wire or wirelessly via a network N such as a local area network or the Internet. The discussion analysis device 1 may instead be connected directly, without going through the network N, to at least one of the communication terminal 2 and the sound collection device 3.

An outline of the processing executed by the discussion analysis device 1 is given below. First, the discussion analysis device 1 acquires, from the sound collection device 3, the audio of a discussion in which a plurality of participants U participate. Using the acquired audio, the discussion analysis device 1 acquires the amount of speech of each of the plurality of participants U in the discussion. For each predetermined time range, the discussion analysis device 1 identifies the participant U with the largest amount of speech (that is, the maximum speaker). The discussion analysis device 1 then outputs transition information indicating the speaker transitions that occurred among the plurality of participants U, based on how the maximum speaker of each predetermined time range changes over time.

In the discussion analysis system SS according to this embodiment, the discussion analysis device 1 detects speaker transitions based on changes in the participant U with the largest amount of speech. This suppresses the detection of speaker transitions caused by sounds that are not utterances, such as the impact sound of objects or backchannel responses by participants U, and thereby improves the accuracy of detecting speaker transitions in a discussion.

[Configuration of discussion analysis system SS]
FIG. 2 is a block diagram of the discussion analysis system SS according to this embodiment. In FIG. 2, arrows indicate the main data flows, and there may be data flows that are not shown in FIG. 2. Each block in FIG. 2 represents a functional unit rather than a hardware (device) unit. The blocks shown in FIG. 2 may therefore be implemented in a single device or distributed across multiple devices. Data may be exchanged between blocks via any means, such as a data bus, a network, or a portable storage medium.

The discussion analysis device 1 includes a control unit 11 and a storage unit 12. The control unit 11 includes an information acquisition unit 111, a maximum speaker identification unit 112, a transition detection unit 113, a phase division unit 114, a pattern selection unit 115, and an output unit 116. The storage unit 12 includes a discussion information storage unit 121 and a participant information storage unit 122.

The storage unit 12 is a storage medium including a ROM (Read Only Memory), a RAM (Random Access Memory), a hard disk drive, and the like. The storage unit 12 stores in advance the program to be executed by the control unit 11. The storage unit 12 may be provided outside the discussion analysis device 1, in which case it may exchange data with the control unit 11 via a network.

The discussion information storage unit 121 stores discussion information, which is information about discussions. The participant information storage unit 122 stores participant information, which is information about the participants in discussions. The discussion information storage unit 121 and the participant information storage unit 122 may each be a storage area on the storage unit 12 or a database configured on the storage unit 12.

The control unit 11 is, for example, a processor such as a CPU (Central Processing Unit). By executing the program stored in the storage unit 12, it functions as the information acquisition unit 111, the maximum speaker identification unit 112, the transition detection unit 113, the phase division unit 114, the pattern selection unit 115, and the output unit 116. At least some of the functions of the control unit 11 may be performed by an electric circuit. At least some of the functions of the control unit 11 may also be performed by a program executed via a network.

The communication terminal 2 includes a control unit 21, a storage unit 22, and a display unit 23. The control unit 21 includes a receiving unit 211. The display unit 23 includes a display device capable of displaying information, such as a liquid crystal display. A touch screen capable of detecting the position of a person's touch may be used as the display unit 23.

The storage unit 22 is a storage medium including a ROM, a RAM, a hard disk drive, and the like. The storage unit 22 stores in advance the program to be executed by the control unit 21. The storage unit 22 may be provided outside the communication terminal 2, in which case it may exchange data with the control unit 21 via a network.

The control unit 21 is, for example, a processor such as a CPU, and functions as the receiving unit 211 by executing the program stored in the storage unit 22. At least some of the functions of the control unit 21 may be performed by an electric circuit. At least some of the functions of the control unit 21 may also be performed by a program executed via a network.

The discussion analysis device 1 and the communication terminal 2 according to this embodiment are not limited to the specific configuration shown in FIG. 2. Neither the discussion analysis device 1 nor the communication terminal 2 is limited to a single device; each may be configured by two or more physically separate devices connected by wire or wirelessly.

[Description of discussion analysis method]
The discussion analysis method performed by the discussion analysis device 1 according to this embodiment is described below. When holding a discussion, the plurality of participants sit around one sound collection device 3. A participant in the discussion, or an analyst who analyzes the discussion, sets the analysis conditions by operating the communication terminal 2. The analysis conditions are, for example, information indicating the number of participants in the discussion to be analyzed and the direction in which each of the participants is located relative to the sound collection device 3 (that is, each participant's relative position). In the discussion analysis device 1, the information acquisition unit 111 receives the set analysis conditions from the communication terminal 2 and stores them in the discussion information storage unit 121 in association with identification information for identifying the discussion (for example, a discussion ID). The discussion ID may be assigned to the discussion automatically or may be entered by a participant or the analyst.

Next, when the discussion is to begin, a participant or the analyst instructs the start of the discussion by operating the communication terminal 2. In the discussion analysis device 1, when the information acquisition unit 111 receives a signal instructing the start of the discussion from the communication terminal 2, it transmits a signal instructing the acquisition of audio to the sound collection device 3. When the sound collection device 3 receives the signal instructing the acquisition of audio from the discussion analysis device 1, it starts acquiring audio.

The sound collection device 3 acquires audio at each of the plurality of sound collection units and records it internally as the audio of the channel corresponding to each sound collection unit. The sound collection device 3 then transmits the acquired multi-channel audio to the discussion analysis device 1. The sound collection device 3 may transmit the acquired audio continuously as it is acquired, or may transmit it in units of a predetermined amount or a predetermined duration. The sound collection device 3 may also transmit the audio from the start to the end of acquisition all at once. In the discussion analysis device 1, the information acquisition unit 111 receives the audio from the sound collection device 3 and stores it in the discussion information storage unit 121 in association with the discussion ID.

When the discussion is to end, a participant or the analyst instructs the end of the discussion by operating the communication terminal 2. In the discussion analysis device 1, when the information acquisition unit 111 receives a signal instructing the end of the discussion from the communication terminal 2, it transmits a signal instructing the end of audio acquisition to the sound collection device 3. When the sound collection device 3 receives the signal instructing the end of audio acquisition from the discussion analysis device 1, it stops acquiring audio.

The subsequent processing is triggered by the end of audio acquisition or by a predetermined instruction given by the analyst on the communication terminal 2. Alternatively, the subsequent processing may be performed incrementally, starting when audio acquisition begins. The information acquisition unit 111 performs sound source localization based on the multi-channel audio received from the sound collection device 3. Sound source localization is a process of estimating, at each time step (for example, every 10 to 100 milliseconds), the direction of the sound source contained in the audio acquired by the information acquisition unit 111. The information acquisition unit 111 associates the sound source direction estimated at each time step with the direction of each of the plurality of participants indicated by the analysis conditions stored in the discussion information storage unit 121.
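The association between an estimated sound source direction and a participant can be sketched in Python as a nearest-direction lookup on a circle. This is only one plausible reading: the description does not state how the association is made. The function names, the use of degrees, and the `max_error_deg` tolerance (beyond which an estimate is left unassigned) are assumptions for illustration.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def assign_participant(source_deg, participant_dirs, max_error_deg=30.0):
    """Map an estimated source direction to the nearest participant.

    participant_dirs maps a participant label to the direction (degrees)
    set in the analysis conditions. Returns None when no participant is
    within max_error_deg (a hypothetical tolerance).
    """
    name, direction = min(
        participant_dirs.items(),
        key=lambda item: angular_distance(source_deg, item[1]),
    )
    if angular_distance(source_deg, direction) > max_error_deg:
        return None
    return name


# Hypothetical seating: four participants around the sound collection device.
dirs = {"A": 0.0, "B": 90.0, "C": 180.0, "D": 270.0}
estimates = [350.0, 95.0, 135.5, 200.0]
labels = [assign_participant(e, dirs) for e in estimates]
```

Using the circular distance matters near 0 degrees: an estimate of 350 degrees is 10 degrees from participant A, not 350.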

The information acquisition unit 111 may use any known sound source localization method, such as the MUSIC (Multiple Signal Classification) method or a beamforming method, as long as the method can identify the direction of the sound source from the acquired audio.

Next, based on the acquired audio and the estimated sound source directions, the information acquisition unit 111 determines which participant spoke at each predetermined time step (for example, every 10 to 100 milliseconds) in the discussion. The information acquisition unit 111 identifies the continuous period from when one participant starts speaking until that participant stops as an utterance period. When multiple participants speak at the same time, their utterance periods overlap at least in part. The information acquisition unit 111 stores the utterance periods identified in the discussion in the discussion information storage unit 121 in association with the discussion ID.
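Turning per-time-step speaking decisions into utterance periods amounts to merging consecutive active steps. The Python sketch below assumes that one participant's decisions are available as a list of booleans, one per time step of `step_ms` milliseconds; that representation and the function name `utterance_periods` are illustrative choices, not taken from the description.

```python
def utterance_periods(active_flags, step_ms=100):
    """Merge consecutive active time steps into (start_ms, end_ms) periods.

    active_flags[i] is truthy if the participant was judged to be
    speaking during time step i.
    """
    periods = []
    start = None
    for i, active in enumerate(active_flags):
        if active and start is None:
            start = i                                   # utterance begins
        elif not active and start is not None:
            periods.append((start * step_ms, i * step_ms))  # utterance ends
            start = None
    if start is not None:
        # The participant was still speaking at the end of the data.
        periods.append((start * step_ms, len(active_flags) * step_ms))
    return periods


# Hypothetical decisions for one participant, one flag per 100 ms step.
flags = [0, 1, 1, 1, 0, 0, 1, 1]
periods = utterance_periods(flags)
```

Running this once per participant yields period lists that may overlap across participants, which corresponds to several participants speaking at the same time.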

The information acquisition unit 111 may exclude a specific time range within the discussion when identifying utterance periods. In this case, a participant or the analyst performs a predetermined operation on the communication terminal 2 or the sound collection device 3 during the time range to be excluded. The participant or the analyst may keep performing an operation, such as holding a button, throughout the time range to be excluded, or may perform an operation, such as pressing a button, once at the start and once at the end of that time range. The communication terminal 2 or the sound collection device 3 transmits information indicating the time range to be excluded to the discussion analysis device 1.

When the information acquisition unit 111 receives the information indicating the time range to be excluded, it stores in the discussion information storage unit 121 the audio obtained by removing that time range from the acquired audio, and identifies utterance periods using that audio. This allows a participant or the analyst to keep a time range in which confidential matters are discussed out of the analysis.

In this embodiment, the information acquisition unit 111 identifies utterance periods based on the audio acquired by the sound collection device 3, but it may identify them by other methods. For example, the information acquisition unit 111 may identify a participant's utterance periods based on the voice uttered by the participant in a voice call or a video call (also called a video conference or video chat). As another example, the information acquisition unit 111 may read out utterance periods stored in advance in the storage unit 12.

As a further example, the information acquisition unit 111 may identify participants' utterance periods based on images that include the participants' faces during the discussion. In this case, an imaging device is placed near the plurality of participants holding the discussion, instead of or in addition to the sound collection device 3. The information acquisition unit 111 acquires time-series images, captured by the imaging device during the discussion, that include the faces of the plurality of participants. The information acquisition unit 111 may also acquire time-series images, including the faces of the plurality of participants, that are transmitted and received between communication terminals in a video call. By applying known face recognition processing to the acquired images, the information acquisition unit 111 determines whether each of the plurality of participants is speaking based on the state of the face (for example, whether the mouth is open), and identifies the utterance periods of each participant.

Next, the method by which the discussion analysis device 1 detects speaker transitions in a discussion is described. FIG. 3 is a schematic diagram of this method. Based on the identified utterance periods, the information acquisition unit 111 acquires the time-series amount of speech (also called the amount of utterance) of each of the plurality of participants in the discussion.

Specifically, the information acquisition unit 111 divides the discussion into first frames (that is, first time ranges), each having a predetermined window width w1 (for example, 30 seconds). The first frames are offset from one another by a predetermined shift width s1 (for example, 10 seconds) that is shorter than the window width w1, so adjacent first frames partially overlap in time.

The information acquisition unit 111 then calculates, as the amount of speech for each first frame, the total length of a participant's utterance periods within that first frame (the total speaking time) divided by the window width w1. The information acquisition unit 111 calculates the amount of speech for each first frame, from the start time to the end time of the discussion, for each of the plurality of participants. The information acquisition unit 111 stores information indicating the amount of speech of each participant for each first frame in the discussion information storage unit 121 in association with the discussion ID.
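The per-frame calculation can be sketched in Python as follows, using the example values w1 = 30 seconds and s1 = 10 seconds. The function names, the decision to drop a trailing frame that would extend past the end of the discussion, and the use of `None` when nobody speaks in a frame are assumptions of this sketch. It also assumes that one participant's own utterance periods do not overlap each other.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def speech_amounts(periods, discussion_end, w1=30.0, s1=10.0):
    """Amount of speech per first frame for one participant.

    periods: list of (start, end) utterance periods in seconds.
    Each frame spans [t, t + w1) and frames start every s1 seconds.
    """
    amounts = []
    t = 0.0
    while t + w1 <= discussion_end:
        spoken = sum(overlap(p0, p1, t, t + w1) for p0, p1 in periods)
        amounts.append(spoken / w1)   # total speaking time / window width
        t += s1
    return amounts


def max_speakers(amounts_by_participant):
    """Participant with the largest amount of speech in each frame."""
    names = list(amounts_by_participant)
    n_frames = len(next(iter(amounts_by_participant.values())))
    result = []
    for i in range(n_frames):
        best = max(names, key=lambda n: amounts_by_participant[n][i])
        result.append(best if amounts_by_participant[best][i] > 0 else None)
    return result


# Hypothetical utterance periods (seconds) in a 60-second discussion.
periods = {"A": [(0.0, 20.0)], "B": [(20.0, 45.0), (50.0, 60.0)]}
amounts = {n: speech_amounts(p, discussion_end=60.0) for n, p in periods.items()}
top = max_speakers(amounts)
```

With these inputs the frames start at 0, 10, 20, and 30 seconds, and the maximum speaker changes from A to B between the first and second frames.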

図３の上段の図は、複数の参加者の時系列の発話量のグラフＧを示している。グラフＧは、複数の参加者の発話量を積み上げグラフとして表している。グラフＧの横軸は時間、縦軸は発話量である。グラフＧの領域には、複数の参加者それぞれに応じて異なる模様が表されている。 The upper diagram in FIG. 3 shows a graph G of the amount of speech by a plurality of participants over time. Graph G represents the amount of speech by a plurality of participants as a stacked graph. The horizontal axis of the graph G is time, and the vertical axis is the amount of speech. In the area of graph G, different patterns are displayed depending on each of the plurality of participants.

さらに情報取得部１１１は、取得した発話期間及び発話量に基づいて、複数の参加者それぞれの割り込み量及び盛り上げ量を算出する。具体的には、情報取得部１１１は、２人の参加者の発話期間が時系列で互いに重複している場合に、発話期間が重複している部分の長さを、該２人の参加者のうち発話期間の開始時刻が遅い方の参加者の割り込み量として算出する。情報取得部１１１は、議論の開始から終了までの複数の参加者それぞれの割り込み量を算出する。 Further, the information acquisition unit 111 calculates the interruption amount and the excitement amount of each of the plurality of participants based on the acquired utterance periods and speech amounts. Specifically, when the utterance periods of two participants overlap in time, the information acquisition unit 111 calculates the length of the overlapping portion as the interruption amount of whichever of the two participants began speaking later. The information acquisition unit 111 calculates the interruption amount of each of the plurality of participants from the start to the end of the discussion.
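As an illustration of the interruption amount, the sketch below credits each overlap between two utterance periods to the participant who started later. The function name, the dictionary layout, and the tie-breaking rule for identical start times are assumptions made for this example; the source does not specify them.

```python
def interruption_amounts(periods_by_speaker):
    """periods_by_speaker: dict mapping a name to a list of (begin, end)."""
    totals = {name: 0.0 for name in periods_by_speaker}
    names = list(periods_by_speaker)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for a_begin, a_end in periods_by_speaker[a]:
                for b_begin, b_end in periods_by_speaker[b]:
                    overlap = min(a_end, b_end) - max(a_begin, b_begin)
                    if overlap <= 0:
                        continue  # the two periods do not overlap
                    # Credit the overlap to whoever started speaking later.
                    # Assumption: on identical start times, b is credited.
                    later = a if a_begin > b_begin else b
                    totals[later] += overlap
    return totals
```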

また、情報取得部１１１は、１人の参加者の１つの発話期間の前及び後それぞれの所定時間（例えば２０秒間）における複数の参加者全員の発話量を合計し、該発話期間の後の合計発話量から該発話期間の前の合計発話量を減算した量（すなわち、該発話期間の前から後の合計発話量の増分）を、盛り上げ量として算出する。情報取得部１１１は、議論の開始から終了まで複数の参加者それぞれの盛り上げ量を算出する。情報取得部１１１は、１人の参加者の全ての発話期間の数のうち、盛り上げ量が０より大きい発話期間の回数を、盛り上げ回数として算出してもよい。情報取得部１１１は、複数の参加者それぞれの割り込み量及び盛り上げ量（又は盛り上げ回数）を、議論ＩＤと関連付けて議論情報記憶部１２１に記憶させる。 In addition, for each utterance period of a participant, the information acquisition unit 111 totals the speech amounts of all the participants over a predetermined time (for example, 20 seconds) before the utterance period and over the same predetermined time after it. It then calculates, as the excitement amount, the total speech amount after the utterance period minus the total speech amount before it (that is, the increase in total speech amount from before to after the utterance period). The information acquisition unit 111 calculates the excitement amount of each of the plurality of participants from the start to the end of the discussion. The information acquisition unit 111 may instead calculate, as the excitement count, the number of a participant's utterance periods whose excitement amount is greater than 0. The information acquisition unit 111 causes the discussion information storage unit 121 to store the interruption amount and the excitement amount (or excitement count) of each of the plurality of participants in association with the discussion ID.
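The excitement amount and excitement count can be sketched as below. This is a simplified reading: it measures the total speaking time of all participants in seconds rather than the per-frame speech amount, and the names `spoken_time`, `excitement`, `excitement_count`, and `margin` are hypothetical.

```python
def spoken_time(periods_by_speaker, t0, t1):
    """Total speaking time of all participants inside [t0, t1)."""
    return sum(
        max(0.0, min(e, t1) - max(b, t0))
        for periods in periods_by_speaker.values()
        for b, e in periods
    )


def excitement(periods_by_speaker, period, margin=20.0):
    """Increase in everyone's speaking time from before to after `period`."""
    begin, end = period
    before = spoken_time(periods_by_speaker, begin - margin, begin)
    after = spoken_time(periods_by_speaker, end, end + margin)
    return after - before


def excitement_count(periods_by_speaker, speaker, margin=20.0):
    """Number of the speaker's utterance periods with positive excitement."""
    return sum(
        1
        for period in periods_by_speaker[speaker]
        if excitement(periods_by_speaker, period, margin) > 0
    )
```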

最大発話者特定部１１２は、情報取得部１１１が取得した発話量に基づいて、第１フレームごとに複数の参加者のうち発話量が最大である最大発話者を特定する。最大発話者特定部１１２は、議論の最初の第１フレームから最後の第１フレームまでの最大発話者の配列を出力する。 Based on the speech amounts acquired by the information acquisition unit 111, the maximum speaker identification unit 112 identifies, for each first frame, the maximum speaker, that is, the participant with the largest speech amount among the plurality of participants. The maximum speaker identification unit 112 outputs an array of maximum speakers covering the discussion from its first first frame to its last first frame.

図３の中段の図は、時系列の最大発話者を帯Ｓとして示している。時系列の最大発話者の帯Ｓは、時系列の発話量のグラフＧに基づいて生成されており、横軸はグラフＧの時間に対応している。最大発話者の帯Ｓには、複数の参加者それぞれに応じて異なる模様が表されており、グラフＧの領域の模様に対応している。 The middle diagram in FIG. 3 shows the maximum speaker in the time series as band S. The time-series maximum speaker band S is generated based on the time-series speech volume graph G, and the horizontal axis corresponds to the time of the graph G. The band S of the largest speaker shows different patterns depending on each of the plurality of participants, and corresponds to the pattern of the area of the graph G.

遷移検出部１１３は、最大発話者特定部１１２が特定した第１フレームごとの最大発話者の変化に基づいて、複数の参加者の間で発生した話者の遷移を検出する。具体的には、議論を所定の窓幅ｗ２の第２フレーム（すなわち第２の時間範囲）に分割する。第２フレームの窓幅ｗ２は、第１フレームの窓幅ｗ１よりも長い。すなわち、第２フレームは、複数の第１フレームを含む。窓幅ｗ２は、窓幅ｗ１の所定の倍数（例えば窓幅ｗ１の１００倍）として定義されてもよく、あるいは所定の時間（例えば３０００秒）として定義されてもよい。 The transition detection unit 113 detects a transition in speakers that occurs between a plurality of participants, based on the change in the maximum speaker for each first frame specified by the maximum speaker identification unit 112. Specifically, the discussion is divided into a second frame (ie, a second time range) of a predetermined window width w2. The window width w2 of the second frame is longer than the window width w1 of the first frame. That is, the second frame includes a plurality of first frames. The window width w2 may be defined as a predetermined multiple of the window width w1 (for example, 100 times the window width w1), or may be defined as a predetermined time (for example, 3000 seconds).

第２フレームは窓幅ｗ２より短い所定のシフト幅ｓ２ずつずらされており、隣接する２つの第２フレームの一部同士が時系列で互いに重複している。シフト幅ｓ２は、窓幅ｗ１の所定の倍数（例えば窓幅ｗ１の５倍）として定義されてもよく、あるいは所定の時間（例えば１５０秒）として定義されてもよい。 The second frames are shifted by a predetermined shift width s2 that is shorter than the window width w2, and parts of two adjacent second frames overlap each other in time series. The shift width s2 may be defined as a predetermined multiple of the window width w1 (for example, 5 times the window width w1), or may be defined as a predetermined time (for example, 150 seconds).

そして遷移検出部１１３は、１つの第１フレームにおける最大発話者である第１の参加者と、該第１フレームに続く第１フレームにおける最大発話者である第２の参加者とが異なる場合に、該第１の参加者から該第２の参加者への遷移を検出する。遷移検出部１１３は、１つの第２フレームについて、該第２フレームの最初の第１フレームから最後の第１フレームまで、遷移の検出を繰り返し、参加者の組み合わせ（すなわち第１の参加者及び第２の参加者の組み合わせ）ごとに検出した遷移の回数を示す遷移行列を生成する。複数の参加者の数をＤとすると、遷移行列はＤ×Ｄの行列となる。 When the first participant, who is the maximum speaker in one first frame, differs from the second participant, who is the maximum speaker in the following first frame, the transition detection unit 113 detects a transition from the first participant to the second participant. For each second frame, the transition detection unit 113 repeats this detection from the first to the last first frame contained in that second frame, and generates a transition matrix indicating the number of transitions detected for each combination of participants (that is, each combination of a first participant and a second participant). Where D is the number of participants, the transition matrix is a D×D matrix.

さらに遷移検出部１１３は、議論の最初の第２フレームから最後の第２フレームまで、遷移行列の生成を繰り返す。第２フレームの数をＮとすると、遷移検出部１１３は、Ｎ個の遷移行列を生成する。遷移検出部１１３は、第２フレームごとに生成した遷移行列を示す情報を、遷移情報として議論情報記憶部１２１に記憶させる。 Further, the transition detection unit 113 repeatedly generates the transition matrix from the first second frame of the discussion to the last second frame. When the number of second frames is N, the transition detection unit 113 generates N transition matrices. The transition detection unit 113 causes the discussion information storage unit 121 to store information indicating the transition matrix generated for each second frame as transition information.
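The two steps above (picking the maximum speaker per first frame, then counting transitions inside each sliding second frame) might be sketched as follows. The window and shift are expressed as counts of first frames, matching the "multiple of w1" definition in the text. The function names, the tie-breaking rule, and the treatment of silent frames (the first listed participant wins) are assumptions.

```python
def max_speakers(amounts_by_speaker):
    """amounts_by_speaker: dict name -> list of per-first-frame amounts."""
    names = list(amounts_by_speaker)
    n_frames = len(amounts_by_speaker[names[0]])
    # Assumption: on ties (including silence) the first listed name wins.
    return [
        max(names, key=lambda name: amounts_by_speaker[name][f])
        for f in range(n_frames)
    ]


def transition_matrices(speakers, names, frames_per_window, frames_per_shift):
    """Build one D x D transition-count matrix per second frame.

    Rows are the transition source, columns the transition destination.
    """
    index = {name: i for i, name in enumerate(names)}
    matrices = []
    last_start = len(speakers) - frames_per_window
    for start in range(0, last_start + 1, frames_per_shift):
        window = speakers[start:start + frames_per_window]
        matrix = [[0] * len(names) for _ in names]
        # Compare each first frame with the one that follows it.
        for src, dst in zip(window, window[1:]):
            if src != dst:
                matrix[index[src]][index[dst]] += 1
        matrices.append(matrix)
    return matrices
```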

図３の下段の図は、例示的な遷移行列Ｍを示している。図３の例では、参加者はＵ１、Ｕ２及びＵ３の３人であり、時系列の最大発話者の帯Ｓに基づいて複数の遷移行列Ｍが生成されている。遷移行列Ｍの行は遷移元の参加者を示しており、列は遷移先の参加者を示している。このように、議論分析装置１は、最大発話者の変化に基づいて話者の遷移を検出するため、物体の衝突音や参加者の相槌等の発言ではない音によって話者の遷移を検出することを抑えることができ、議論における話者の遷移の検出精度を向上できる。 The lower diagram of FIG. 3 shows exemplary transition matrices M. In the example of FIG. 3, there are three participants, U1, U2, and U3, and a plurality of transition matrices M are generated based on the time-series band S of maximum speakers. The rows of a transition matrix M indicate the transition-source participants, and the columns indicate the transition-destination participants. Because the discussion analysis device 1 detects speaker transitions based on changes in the maximum speaker, it can avoid detecting spurious transitions caused by sounds that are not statements, such as objects colliding or participants' brief interjections, which improves the accuracy of detecting speaker transitions in a discussion.

次に、議論分析装置１が議論を１つ以上のフェーズに分割する方法を説明する。図４は、議論分析装置１が議論を１つ以上のフェーズに分割する方法の模式図である。フェーズ分割部１１４は、第２フレームごとに生成された遷移情報（遷移行列）の時系列の類似性に基づいて、議論を１つ以上のフェーズに分割する。ここでフェーズ分割部１１４は、１つのフェーズの中で遷移情報が類似するように、すなわち１つのフェーズに含まれる２つの第２フレームの遷移情報間の類似性が、異なる２つのフェーズに含まれる２つの第２フレームの遷移情報間の類似性よりも高くなるように、議論を１つ又は複数のフェーズに分割する。フェーズ分割部１１４は、遷移情報の時系列の類似性に基づいて議論を１つ以上のフェーズに分割することが可能な既知の方法を用いる。 Next, a method by which the discussion analysis device 1 divides the discussion into one or more phases will be described. FIG. 4 is a schematic diagram of this method. The phase division unit 114 divides the discussion into one or more phases based on the time-series similarity of the transition information (transition matrices) generated for each second frame. The phase division unit 114 divides the discussion so that the transition information is similar within each phase. In other words, the similarity between the transition information of two second frames in the same phase is higher than the similarity between the transition information of two second frames in different phases. The phase division unit 114 uses a known method capable of dividing the discussion into one or more phases based on the time-series similarity of the transition information.

例えばフェーズ分割部１１４は、以下に説明するポアソン混合モデルを用いたクラスタリングを行うことによって、議論を１つ以上のフェーズに分割する。まずフェーズ分割部１１４は、遷移検出部１１３が生成した遷移行列を取得する。ここで、計算のために、フェーズ分割部１１４は、第２フレームごとの遷移行列の要素を縦一列に並べることによって、参加者の組み合わせごとの遷移の回数を要素とするＤ^２×１の縦ベクトルに変換する。これにより、フェーズ分割部１１４は、Ｄ^２次元の非負ベクトルが時系列でＮ個並んだＤ^２×Ｎの行列を得る。 For example, the phase division unit 114 divides the discussion into one or more phases by clustering with the Poisson mixture model described below. First, the phase division unit 114 obtains the transition matrices generated by the transition detection unit 113. For the calculation, the phase division unit 114 arranges the elements of the transition matrix for each second frame in a single column, converting it into a D²×1 column vector whose elements are the numbers of transitions for each combination of participants. The phase division unit 114 thereby obtains a D²×N matrix in which N non-negative D²-dimensional vectors are arranged in time series.

各参加者の組み合わせは異なる遷移の傾向を有するため、遷移行列を変換したＤ^２×Ｎの行列は、式（１）に示すポアソン分布の混合分布となる。

Since each combination of participants has a different transition tendency, the D²×N matrix obtained by converting the transition matrices follows the mixture of Poisson distributions shown in Equation (1).

ここで、Ｐｏｉはポアソン分布の関数を表し、ｘは参加者の組み合わせごとの遷移が起こった回数（すなわち遷移行列の各要素）を表し、λ_ｄは参加者の組み合わせごとの遷移が起こる平均回数を表し、ｄは縦ベクトルの次元（１～Ｄ^２）を表す。 Here, Poi denotes the Poisson distribution function, x denotes the number of times a transition occurred for each combination of participants (that is, each element of the transition matrix), λ_d denotes the average number of times a transition occurs for each combination of participants, and d denotes the dimension of the column vector (1 to D²).
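The image of Equation (1) is not reproduced in this text. One form that is consistent with the symbols defined above is given below as a reconstruction. It assumes that the D² elements of the column vector are modeled as independent Poisson variables, which the source does not state explicitly, and it may differ from the published equation.

```latex
p(\mathbf{x} \mid \boldsymbol{\lambda})
  = \prod_{d=1}^{D^2} \mathrm{Poi}(x_d \mid \lambda_d),
\qquad
\mathrm{Poi}(x \mid \lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!}
```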

議論をＫ個（Ｋは２以上の所定の数）のクラスタに分けることを考えると、上述のλ_ｄの値のセットがＫ個できる。これにより、フェーズ分割部１１４は、式（２）のようなＫ個のポアソン分布の混合分布を生成する。

If the discussion is to be divided into K clusters (K is a predetermined number of 2 or more), there are K sets of the λ_d values described above. The phase division unit 114 therefore generates a mixture of K Poisson distributions as shown in Equation (2).

ここで、フェーズ分割部１１４は、Ｎ個の遷移行列のうち、第ｎ番目の遷移行列がいずれのクラスタに所属するかを示す行列である隠れ変数ｚ_ｎｋ（ｚ_ｎｋは０又は１）を定義する。隠れ変数ｚ_ｎｋは、第ｎ番目の遷移行列が第ｋクラスタに所属するときのみ１となり、それ以外のとき０となる。 Here, the phase division unit 114 defines a hidden variable z_nk (z_nk is 0 or 1), a matrix indicating which cluster the n-th of the N transition matrices belongs to. The hidden variable z_nk is 1 only when the n-th transition matrix belongs to the k-th cluster, and 0 otherwise.

これにより、フェーズ分割部１１４は、式（２）の分布を式（３）に示す１つの分布にまとめる。

Thereby, the phase dividing unit 114 combines the distribution of equation (2) into one distribution shown in equation (3).
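The image of Equation (3) is likewise not reproduced in this text. A conventional way to combine K Poisson components using a one-hot hidden variable is sketched below. The notation x_{nd} (element d of the n-th vector) and λ_{kd} (rate of element d in cluster k) is introduced here for illustration, and the published equation may be written differently.

```latex
p(\mathbf{x}_n \mid \boldsymbol{\lambda}, \mathbf{z}_n)
  = \prod_{k=1}^{K}
    \left[ \prod_{d=1}^{D^2} \mathrm{Poi}(x_{nd} \mid \lambda_{kd}) \right]^{z_{nk}}
```

Because z_nk is 1 for exactly one k, only the factor of the cluster to which the n-th transition matrix belongs contributes to the product.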

フェーズ分割部１１４は、式（３）のモデルを用いてベイズ推定を行うことによって、ｘとなる確率が所定の条件（例えば、ｘとなる確率が最大値であること）を満たすパラメータλ及びｚを算出する。これにより、フェーズ分割部１１４は、Ｎ個の遷移行列それぞれがＫ個のクラスタのうちいずれに割り当てられるかを判定する。 By performing Bayesian estimation using the model of Equation (3), the phase division unit 114 calculates the parameters λ and z for which the probability of x satisfies a predetermined condition (for example, that the probability of x is at its maximum). The phase division unit 114 thereby determines which of the K clusters each of the N transition matrices is assigned to.
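To make the clustering step concrete, the sketch below fits a Poisson mixture to the flattened transition vectors. Note that it uses the expectation-maximization (EM) algorithm, a maximum-likelihood substitute for the Bayesian estimation named in the text, chosen only because it fits in a short standard-library example. The function names, the deterministic initialization from evenly spaced vectors, and the smoothing constant `eps` are all assumptions.

```python
import math


def poisson_log_pmf(x, lam):
    """Log of the Poisson probability mass function."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def fit_poisson_mixture(vectors, k, iterations=50, eps=1e-6):
    """Cluster count vectors with EM; return one cluster index per vector."""
    n, dim = len(vectors), len(vectors[0])
    # Assumption: initialise each cluster from an evenly spaced vector.
    rates = [
        [v + 0.5 for v in vectors[(2 * c + 1) * n // (2 * k)]]
        for c in range(k)
    ]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: responsibility of each cluster for each vector.
        resp = []
        for x in vectors:
            logs = [
                math.log(weights[c])
                + sum(poisson_log_pmf(x[d], rates[c][d]) for d in range(dim))
                for c in range(k)
            ]
            top = max(logs)
            probs = [math.exp(value - top) for value in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: update mixing weights and Poisson rates (smoothed by eps).
        for c in range(k):
            mass = sum(r[c] for r in resp)
            weights[c] = (mass + eps) / (n + k * eps)
            for d in range(dim):
                weighted = sum(r[c] * x[d] for r, x in zip(resp, vectors))
                rates[c][d] = (weighted + eps) / (mass + eps)
    # Hard assignment: the most responsible cluster for each vector.
    return [max(range(k), key=lambda c: r[c]) for r in resp]
```

Each input vector would be one flattened D²-element transition matrix, in time order, and the returned list plays the role of the hidden variable z.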

このとき、フェーズ分割部１１４は、Ｋ個のクラスタのうち、割り当てられた遷移行列の数が所定の閾値以下のクラスタを削除してもよい。この場合に、削除されたクラスタに割り当てられた遷移行列は、該クラスタの前又は後のクラスタに割り当てられる。その結果、最終的に生成されるクラスタの数は、Ｋ個以下となる。これにより、フェーズ分割部１１４は、割り当てられた遷移行列が多い、クラスタだけを残して議論を１つ以上のフェーズに分割できる。 At this time, the phase division unit 114 may delete clusters in which the number of assigned transition matrices is equal to or less than a predetermined threshold value from among the K clusters. In this case, the transition matrix assigned to the deleted cluster is assigned to the cluster before or after the deleted cluster. As a result, the number of clusters finally generated is K or less. Thereby, the phase division unit 114 can divide the discussion into one or more phases, leaving only clusters that have been assigned a large number of transition matrices.

本実施形態において、フェーズ分割部１１４は、時系列を考慮せずに複数の遷移行列を複数のクラスタに割り当てているため、理論的には複数の遷移行列の時系列とクラスタの時系列とが一致しない可能性がある。しかしながら、遷移検出部１１３は、第２フレームを時系列で重複させながらシフトさせているため、検出された遷移の回数は時系列の移動平均となっている。そのため、時間的に近い複数の遷移行列は、互いに類似する。これにより、通常の状況では、複数の遷移行列の時系列と、フェーズ分割部１１４が生成した複数のクラスタの時系列とは一致する。 In this embodiment, the phase division unit 114 assigns the transition matrices to clusters without considering their time order, so in theory the time order of the transition matrices and the time order of the clusters may not match. However, because the transition detection unit 113 shifts the second frames so that they overlap in time, the detected transition counts behave like a moving average over time. Transition matrices that are close in time are therefore similar to one another. As a result, under normal circumstances the time order of the transition matrices matches the time order of the clusters generated by the phase division unit 114.

フェーズ分割部１１４は、複数のクラスタを生成した場合に、複数のクラスタそれぞれに含まれている遷移行列に対応する第２フレームの議論中の時刻に基づいて、議論を複数のフェーズに分割する。具体的には、フェーズ分割部１１４は、１つのクラスタに含まれている遷移行列に対応する第２フレームのうち最後の第２フレームの終了時刻を、フェーズの終了時刻として特定することによって、議論を構成する複数のフェーズを決定する。 When a plurality of clusters are generated, the phase division unit 114 divides the discussion into a plurality of phases based on the times, within the discussion, of the second frames corresponding to the transition matrices included in each cluster. Specifically, the phase division unit 114 determines the phases that make up the discussion by identifying, for each cluster, the end time of the last second frame among those corresponding to the transition matrices in that cluster as the end time of the phase.
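The mapping from cluster labels to phase boundaries can be illustrated as follows. The sketch assumes that the labels form contiguous runs in time order (the normal case described above), that the n-th second frame spans `[start + n * s2, start + n * s2 + w2)`, and that each phase begins where the previous one ends. The function name and these conventions are assumptions.

```python
def phases_from_labels(labels, w2=3000.0, s2=150.0, start=0.0):
    """Turn per-second-frame cluster labels into (label, begin, end) phases."""
    phases = []
    run_start = 0
    for n in range(1, len(labels) + 1):
        if n == len(labels) or labels[n] != labels[run_start]:
            # The run covers second frames run_start .. n - 1; the phase
            # ends where the last of those second frames ends.
            begin = phases[-1][2] if phases else start
            end = start + (n - 1) * s2 + w2
            phases.append((labels[run_start], begin, end))
            run_start = n
    return phases
```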

また、フェーズ分割部１１４は、１つのクラスタを生成した場合に、議論の全体を１つのフェーズとして決定する。フェーズ分割部１１４は、決定した議論のフェーズを示す情報を、議論の識別情報と関連付けて議論情報記憶部１２１に記憶させる。 Furthermore, when one cluster is generated, the phase dividing unit 114 determines the entire discussion as one phase. The phase dividing unit 114 causes the discussion information storage unit 121 to store information indicating the determined phase of the discussion in association with discussion identification information.

単純に議論を時間によって前半、中盤、後半のようなフェーズに分割すると、議論の内容が考慮されないため、議論が分割される位置は実態に即さない。それに対して本実施形態に係る議論分析装置１は、遷移情報の時系列の類似性に基づいて議論を１つ以上のフェーズに分割するため、議論を実態に即した単位で分割できる。 If the discussion is simply divided into phases such as the first half, middle, and second half based on time, the content of the discussion will not be taken into account, and the positions at which the discussion will be divided will not correspond to the actual situation. On the other hand, the discussion analysis device 1 according to the present embodiment divides the discussion into one or more phases based on the similarity of the time series of transition information, so that the discussion can be divided into units that match the actual situation.

図４の下段の図は、例示的なフェーズ分割部が決定した議論のフェーズを示している。図４の例では、議論はフェーズＰＨ１、ＰＨ２及びＰＨ３の３つに分割されている。フェーズＰＨ１、ＰＨ２及びＰＨ３それぞれにおいて話者の遷移の傾向が類似している。議論は３つ以外のフェーズに分割されてもよい。 The lower diagram of FIG. 4 shows exemplary phases of the discussion determined by the phase division unit. In the example of FIG. 4, the discussion is divided into three phases: PH1, PH2, and PH3. Within each of the phases PH1, PH2, and PH3, the speaker transition tendencies are similar. The discussion may be divided into a number of phases other than three.

次に、議論分析装置１が遷移情報に類似するパターンを選択する方法を説明する。図５、図６は、議論分析装置１が遷移情報に類似するパターンを選択する方法の模式図である。まずパターン選択部１１５は、フェーズ分割部１１４が決定した議論のフェーズごとに、遷移検出部１１３が生成した遷移行列Ｍ（遷移情報）を取得する。フェーズごとの遷移行列Ｍは、例えばフェーズに含まれる遷移行列Ｍの統計値（平均値、中央値等）であってもよく、あるいはフェーズに含まれる所定の位置（最初、中央又は最後等）の遷移行列Ｍであってもよい。 Next, a method by which the discussion analysis device 1 selects a pattern similar to the transition information will be described. FIGS. 5 and 6 are schematic diagrams of this method. First, for each phase of the discussion determined by the phase division unit 114, the pattern selection unit 115 obtains the transition matrix M (transition information) generated by the transition detection unit 113. The transition matrix M for a phase may be, for example, a statistic (such as the mean or median) of the transition matrices M included in the phase, or the transition matrix M at a predetermined position (such as the first, middle, or last) within the phase.

パターン選択部１１５は、複数の参加者の各組み合わせにおける遷移の有無を示す複数のパターンを生成する。ここでは、２人の参加者の組み合わせにおいて遷移が有る又は相対的に多い場合を該２人の参加者が「接続されている」と表現し、遷移が無い又は相対的に少ない場合を該２人の参加者が「接続されていない」と表現する。パターン選択部１１５は、複数の参加者の数をＤとすると、中心となる１人がその他の全員と接続されているパターンと、ｉ人（ｉ＝２～Ｄ）が相互に接続されているパターンとからなるＤ種類のパターンを生成する。 The pattern selection unit 115 generates a plurality of patterns indicating the presence or absence of transitions in each combination of the plurality of participants. Here, two participants are described as "connected" when transitions between them are present or relatively frequent, and as "not connected" when transitions are absent or relatively infrequent. Where D is the number of participants, the pattern selection unit 115 generates D types of patterns: one pattern in which a single central person is connected to everyone else, and patterns in which i people (i = 2 to D) are connected to each other.

図５の例では、パターン選択部１１５が生成するパターンは、中心となる１人がその他の全員と接続されているパターンＰ１と、２人が相互に接続されているパターンＰ２と、３人が相互に接続されているパターンＰ３とからなる。図５に図示していないが、パターンＰ１は中心となる１人をＵ１、Ｕ２及びＵ３に変えたパターンを含み、パターンＰ２は相互に接続される２人をＵ１、Ｕ２及びＵ３のうち２人の全ての組み合わせに変えたパターンを含む。 In the example of FIG. 5, the patterns generated by the pattern selection unit 115 consist of a pattern P1 in which one central person is connected to everyone else, a pattern P2 in which two people are connected to each other, and a pattern P3 in which three people are connected to each other. Although not shown in FIG. 5, pattern P1 includes the variants in which the central person is U1, U2, or U3, and pattern P2 includes the variants for every combination of two people out of U1, U2, and U3.

パターン選択部１１５は、生成した複数のパターンそれぞれの行列を生成する。パターンの行列は、接続されている参加者の組み合わせの要素を１とし、接続されていない参加者の組み合わせの要素を０とした遷移行列である。また、パターン選択部１１５は、フェーズごとの遷移行列の各要素を、０～１の範囲に正規化する。 The pattern selection unit 115 generates a matrix for each of the plurality of generated patterns. The pattern matrix is a transition matrix in which elements of combinations of connected participants are set to 1 and elements of combinations of unconnected participants are set to 0. Furthermore, the pattern selection unit 115 normalizes each element of the transition matrix for each phase to a range of 0 to 1.

そしてパターン選択部１１５は、生成した複数のパターンそれぞれの行列と、正規化したフェーズごとの遷移行列との間の類似度を算出する。類似度は、例えば行列間距離であるが、その他の値を用いてもよい。そしてパターン選択部１１５は、複数のパターンのうち、算出した類似度が所定の条件（例えば行列間距離が最小）を満たすパターンを選択する。パターン選択部１１５は、フェーズ分割部１１４が決定した１つ以上のフェーズそれぞれについて、パターンを選択する。 Then, the pattern selection unit 115 calculates the degree of similarity between the matrix of each of the plurality of generated patterns and the normalized transition matrix for each phase. The similarity is, for example, the distance between matrices, but other values may also be used. Then, the pattern selection unit 115 selects a pattern whose calculated degree of similarity satisfies a predetermined condition (for example, the distance between matrices is the minimum) from among the plurality of patterns. The pattern selection unit 115 selects a pattern for each of the one or more phases determined by the phase division unit 114.
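Pattern generation and selection can be sketched as below. Several details are assumptions made for this example because the source leaves them open: connections are treated as symmetric, the transition matrix is normalized by dividing by its largest element, and the distance between matrices is the Frobenius norm of their difference. The pattern keys `("hub", ...)` and `("mutual", ...)` are hypothetical labels.

```python
import math
from itertools import combinations


def pattern_matrix(size, pairs):
    """Matrix with 1 for each connected pair (symmetric) and 0 elsewhere."""
    matrix = [[0.0] * size for _ in range(size)]
    for a, b in pairs:
        matrix[a][b] = matrix[b][a] = 1.0
    return matrix


def generate_patterns(size):
    patterns = {}
    # One central person connected to everyone else.
    for hub in range(size):
        pairs = [(hub, other) for other in range(size) if other != hub]
        patterns[("hub", hub)] = pattern_matrix(size, pairs)
    # i people (i = 2 .. D) connected to each other.
    for i in range(2, size + 1):
        for group in combinations(range(size), i):
            pairs = combinations(group, 2)
            patterns[("mutual", group)] = pattern_matrix(size, pairs)
    return patterns


def normalize(matrix):
    """Scale elements into the range 0..1 (assumed: divide by the maximum)."""
    peak = max(max(row) for row in matrix)
    return [[value / peak if peak else 0.0 for value in row] for row in matrix]


def distance(a, b):
    """Frobenius distance between two equally sized matrices."""
    return math.sqrt(
        sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    )


def select_pattern(transition_matrix):
    """Return the key of the pattern closest to the normalized matrix."""
    target = normalize(transition_matrix)
    patterns = generate_patterns(len(target))
    return min(patterns, key=lambda key: distance(patterns[key], target))
```

For three participants, a phase in which most transitions run between the first participant and each of the others would select `("hub", 0)` under these assumptions.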

さらにパターン選択部１１５は、フェーズごとに選択したパターンに変更を加えた複数のサブパターンを生成する。具体的には、パターン選択部１１５は、選択したパターンそのものに加えて、選択したパターンに含まれているいずれか１つの接続を削除したパターン、及び選択したパターンに含まれていない１つの接続を追加したパターンを、サブパターンとして生成する。パターン選択部１１５は、選択したパターンにその他の変更を加えたサブパターンを生成してもよい。 Furthermore, the pattern selection unit 115 generates a plurality of sub-patterns by modifying the pattern selected for each phase. Specifically, the pattern selection unit 115 generates as sub-patterns the selected pattern itself, patterns in which any one connection included in the selected pattern is deleted, and patterns in which one connection not included in the selected pattern is added. The pattern selection unit 115 may also generate sub-patterns by making other changes to the selected pattern.

図６は、図５においてパターンＰ１が選択された場合の例示的なサブパターンを示している。この場合に、パターン選択部１１５が生成するサブパターンは、パターンＰ１そのものであるサブパターンＳＰ１と、パターンＰ１に含まれている１つの接続を削除したサブパターンＳＰ２と、パターンＰ１に含まれていない１つの接続を追加したサブパターンＳＰ３とからなる。図６において、削除された接続は破線で表されており、追加された接続は一点鎖線で表されている。サブパターンＳＰ２は別の接続を削除したパターンを含み、サブパターンＳＰ３は別の接続を追加したパターンを含む。 FIG. 6 shows exemplary sub-patterns for the case where pattern P1 is selected in FIG. 5. In this case, the sub-patterns generated by the pattern selection unit 115 consist of a sub-pattern SP1 that is pattern P1 itself, a sub-pattern SP2 in which one connection included in pattern P1 is deleted, and a sub-pattern SP3 in which one connection not included in pattern P1 is added. In FIG. 6, deleted connections are represented by broken lines, and added connections are represented by dash-dot lines. Sub-pattern SP2 also includes the variants in which a different connection is deleted, and sub-pattern SP3 also includes the variants in which a different connection is added.

パターン選択部１１５は、生成した複数のサブパターンそれぞれの行列を生成する。サブパターンの行列は、接続されている参加者の組み合わせの要素を１とし、接続されていない参加者の組み合わせの要素を０とした遷移行列である。また、パターン選択部１１５は、フェーズごとの遷移行列の各要素を、０～１の範囲に正規化する。 The pattern selection unit 115 generates a matrix for each of the plurality of generated sub-patterns. The subpattern matrix is a transition matrix in which elements of combinations of connected participants are set to 1 and elements of combinations of unconnected participants are set to 0. Furthermore, the pattern selection unit 115 normalizes each element of the transition matrix for each phase to a range of 0 to 1.

そしてパターン選択部１１５は、生成した複数のサブパターンそれぞれの行列と、正規化したフェーズごとの遷移行列との間の類似度を算出する。類似度は、例えば行列間距離であるが、その他の値を用いてもよい。そしてパターン選択部１１５は、複数のサブパターンのうち、算出した類似度が所定の条件（例えば行列間距離が最小）を満たすサブパターンを選択する。パターン選択部１１５は、フェーズ分割部１１４が決定した１つ以上のフェーズそれぞれについて、サブパターンを選択する。 Then, the pattern selection unit 115 calculates the degree of similarity between the matrix of each of the plurality of generated sub-patterns and the normalized transition matrix for each phase. The similarity is, for example, the distance between matrices, but other values may also be used. Then, the pattern selection unit 115 selects a subpattern whose calculated degree of similarity satisfies a predetermined condition (for example, the distance between matrices is the minimum) from among the plurality of subpatterns. The pattern selection unit 115 selects sub-patterns for each of the one or more phases determined by the phase division unit 114.
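Sub-pattern generation amounts to enumerating single-connection edits of the selected pattern. The sketch below represents a pattern as a set of undirected connections, each a `frozenset` of two participant indices. This representation and the function name `sub_patterns` are assumptions for illustration only.

```python
from itertools import combinations


def sub_patterns(size, pairs):
    """Enumerate sub-patterns of a pattern.

    pairs: set of frozenset({a, b}) connections in the selected pattern.
    Returns the pattern itself, every variant with one connection removed,
    and every variant with one missing connection added.
    """
    result = [set(pairs)]  # the selected pattern itself
    for pair in sorted(pairs, key=sorted):
        result.append(set(pairs) - {pair})  # one connection deleted
    for a, b in combinations(range(size), 2):
        pair = frozenset((a, b))
        if pair not in pairs:
            result.append(set(pairs) | {pair})  # one connection added
    return result
```

Each returned set can then be converted to a 0/1 matrix and compared with the normalized transition matrix of the phase, in the same way as the patterns.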

パターン選択部１１５は、選択したパターン及びサブパターンを示す情報を、議論の識別情報と関連付けて議論情報記憶部１２１に記憶させる。パターン選択部１１５は、サブパターンの選択を行わず、パターンのみを選択して議論情報記憶部１２１に記憶させてもよい。 The pattern selection unit 115 causes the discussion information storage unit 121 to store information indicating the selected pattern and subpattern in association with discussion identification information. The pattern selection unit 115 may select only patterns and store them in the discussion information storage unit 121 without selecting sub-patterns.

議論における話者の遷移をグラフ等でそのまま表示するのみでは、遷移の傾向の解釈は分析者に任されるため、分析者によって解釈が異なってしまう場合がある。それに対して本実施形態に係る議論分析装置１は、遷移情報をパターン及びサブパターンと比較して選択することによって、複数の参加者を遷移の傾向によって自動的に分類することができ、また複数の参加者の関係性を自動的に文章として出力することが可能になる。 If the speaker transitions in a discussion are merely displayed as they are in a graph or the like, interpreting the transition tendencies is left to the analyst, so interpretations may differ from one analyst to another. In contrast, the discussion analysis device 1 according to the present embodiment compares the transition information with patterns and sub-patterns and selects among them. This allows it to classify the participants automatically according to their transition tendencies and to output the relationships among the participants automatically as text.

出力部１１６は、情報取得部１１１、遷移検出部１１３、フェーズ分割部１１４及びパターン選択部１１５が議論情報記憶部１２１に記憶させた情報に基づいて、議論に関する情報を出力する。例えば出力部１１６は、図７、図８及び図９に示す画面を通信端末２の表示部２３に表示させることによって議論に関する情報を出力する。 The output unit 116 outputs information regarding the discussion based on the information stored in the discussion information storage unit 121 by the information acquisition unit 111, transition detection unit 113, phase division unit 114, and pattern selection unit 115. For example, the output unit 116 outputs information regarding the discussion by displaying screens shown in FIGS. 7, 8, and 9 on the display unit 23 of the communication terminal 2.

出力部１１６は、情報取得部１１１、遷移検出部１１３、フェーズ分割部１１４及びパターン選択部１１５の処理が終了したことを契機として、又は分析者が通信端末２に対して所定の指示を行ったことを契機として、議論情報記憶部１２１に記憶されている情報に基づいて議論に関する情報を表示するための表示情報を生成し、通信端末２へ送信する。通信端末２の受信部２１１は、議論分析装置１から受信した表示情報に基づいて、図７、図８及び図９に示す画面を表示部２３上に表示する。 The output unit 116 generates display information for displaying information about the discussion, based on the information stored in the discussion information storage unit 121, and transmits it to the communication terminal 2. It does so either when the processing of the information acquisition unit 111, the transition detection unit 113, the phase division unit 114, and the pattern selection unit 115 has finished, or when the analyst gives a predetermined instruction on the communication terminal 2. The receiving unit 211 of the communication terminal 2 displays the screens shown in FIGS. 7, 8, and 9 on the display unit 23 based on the display information received from the discussion analysis device 1.

図７は、ディスカッションレポート画面Ａを表示している表示部２３の前面図である。ディスカッションレポート画面Ａは、１つの議論に関する情報を表示する画面である。ディスカッションレポート画面Ａは、サマリー情報Ａ１と、参加者情報Ａ２と、フェーズ情報Ａ３と、総合評価情報Ａ４とを含む。サマリー情報Ａ１は、議論における時系列の発話量の概要とともに、分析条件として設定された複数の参加者の配置を示す情報である。発話量の概要は、例えば複数の参加者の合計発話量が最大のフェーズの時間範囲を表す文字列である。 FIG. 7 is a front view of the display unit 23 displaying the discussion report screen A. Discussion report screen A is a screen that displays information regarding one discussion. Discussion report screen A includes summary information A1, participant information A2, phase information A3, and comprehensive evaluation information A4. The summary information A1 is information indicating the outline of the amount of speech in the discussion in chronological order, as well as the arrangement of a plurality of participants set as analysis conditions. The summary of the amount of speech is, for example, a character string representing the time range of the phase in which the total amount of speech by a plurality of participants is maximum.

参加者情報Ａ２は、所定の条件を満たす参加者を示す情報である。例えば参加者情報Ａ２は、複数の参加者のうち、発話量が最大の参加者、割り込み量が最大の参加者、及び盛り上げ量（盛り上げ回数でもよい）が最大の参加者を表す。さらに、参加者情報Ａ２は、パターン選択部１１５が選択されたパターンにおいて接続されている参加者を、議論の中心になった人物として表す。 Participant information A2 is information indicating participants who meet predetermined conditions. For example, participant information A2 represents, among the plurality of participants, the participant with the largest amount of speech, the participant with the largest amount of interruptions, and the participant with the largest amount of excitement (which may also be the number of times of excitement). Further, the participant information A2 represents the participant connected in the pattern selected by the pattern selection unit 115 as the person who became the center of the discussion.

フェーズ情報Ａ３は、議論におけるフェーズの時間範囲Ａ３１と、フェーズごとの参加者の役割Ａ３２とを含む。フェーズの時間範囲Ａ３１は、議論における複数の参加者の発話量の積み上げグラフ上に重畳された矢印によって、各フェーズの時間範囲を示す情報である。 The phase information A3 includes a time range A31 of a phase in the discussion and a role A32 of participants for each phase. The phase time range A31 is information that indicates the time range of each phase by an arrow superimposed on a cumulative graph of the amount of speech by a plurality of participants in the discussion.

参加者の役割Ａ３２は、パターン選択部１１５が選択したパターンに基づいて判定された複数の参加者それぞれの役割を示す情報である。役割は、議論における参加者の行動の傾向であり、例えばリーダー又はフォロワーである。 Participant role A32 is information indicating the role of each of the plurality of participants determined based on the pattern selected by the pattern selection unit 115. A role is a behavior tendency of a participant in a discussion, for example, a leader or a follower.

具体的には、参加者の役割Ａ３２を表示する場合に、出力部１１６は、パターン選択部１１５が選択したパターンに基づいて、議論のフェーズごとに複数の参加者それぞれの役割を判定する。例えば出力部１１６は、パターン選択部１１５が選択したパターンにおいて互いに接続されている複数の参加者のうち、発話量が最大の参加者を「リーダー」の役割と判定し、その他の参加者を「フォロワー」の役割と判定する。また、出力部１１６は、パターン選択部１１５が選択したパターンにおいて接続されていない参加者を「役割なし」と判定する。出力部１１６は、パターン選択部１１５が選択したパターンに基づいて、その他の役割を判定してもよい。 Specifically, when displaying the participant roles A32, the output unit 116 determines the role of each of the plurality of participants for each phase of the discussion based on the pattern selected by the pattern selection unit 115. For example, among the participants connected to each other in the selected pattern, the output unit 116 determines that the participant with the largest speech amount has the role of "leader" and that the other connected participants have the role of "follower". The output unit 116 also determines that participants who are not connected in the selected pattern have "no role". The output unit 116 may determine other roles based on the pattern selected by the pattern selection unit 115.
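The role rule described above is simple enough to express directly. In this sketch the function name, the role strings, and the input shapes (a set of connected participant names and a dictionary of per-phase speech amounts) are assumptions. The source does not say how ties in speech amount are broken, and the sketch leaves that undefined as well.

```python
def assign_roles(connected, total_speech):
    """Assign a role to every participant for one phase.

    connected: set of participants connected in the selected pattern.
    total_speech: dict mapping each participant to a speech amount.
    """
    roles = {name: "no role" for name in total_speech}
    if connected:
        # The connected participant who spoke the most is the leader.
        leader = max(connected, key=lambda name: total_speech[name])
        for name in connected:
            roles[name] = "leader" if name == leader else "follower"
    return roles
```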

そして出力部１１６は、フェーズごとの複数の参加者それぞれの役割を示す情報を、通信端末２へ送信する。通信端末２の受信部２１１は、議論分析装置１から受信したフェーズごとの複数の参加者それぞれの役割を、フェーズの時間範囲Ａ３１の近傍に参加者の役割Ａ３２として表示させる。図７の例では、フェーズの時間範囲Ａ３１の下方において、リーダーと判定された参加者に関連付けて実線が表示され、フォロワーと判定された参加者に関連付けて破線が表示されている。参加者の役割Ａ３２は、その他の方法によって参加者の役割を表してもよい。これにより、分析者は、議論分析装置１が遷移の傾向のパターンに基づいて自動的に判定した複数の参加者それぞれの役割を知ることができる。 Then, the output unit 116 transmits information indicating the roles of each of the plurality of participants for each phase to the communication terminal 2. The receiving unit 211 of the communication terminal 2 displays the roles of the plurality of participants for each phase received from the discussion analysis device 1 as participant roles A32 near the time range A31 of the phase. In the example of FIG. 7, below the phase time range A31, a solid line is displayed in association with a participant determined to be a leader, and a broken line is displayed in association with a participant determined to be a follower. Participant role A32 may represent the participant role in other ways. Thereby, the analyst can know the roles of each of the plurality of participants automatically determined by the discussion analysis device 1 based on the pattern of transition trends.

総合評価情報Ａ４は、パターン選択部１１５が選択したパターンに基づいて生成された、議論のフェーズごとの参加者の行動を文章として表す情報である。具体的には、総合評価情報Ａ４を表示する場合に、出力部１１６は、パターン選択部１１５が選択したパターンを取得する。そして出力部１１６は、所定の規則に基づいて、パターンに対応する文章を生成する。所定の規則は、記憶部１２に予め定義された、パターンに対応するテンプレートである。 The comprehensive evaluation information A4 is information that is generated based on the pattern selected by the pattern selection unit 115 and represents the behavior of the participants in each phase of the discussion as a sentence. Specifically, when displaying the comprehensive evaluation information A4, the output unit 116 acquires the pattern selected by the pattern selection unit 115. The output unit 116 then generates a sentence corresponding to the pattern based on a predetermined rule. The predetermined rule is a template that is predefined in the storage unit 12 and corresponds to a pattern.

例えばパターン選択部１１５が選択したパターンにおいて、参加者Ｕ１及び参加者Ｕ２が互いに接続されており、参加者Ｕ１の発話量が参加者Ｕ２の発話量よりも大きい場合に、出力部１１６は、「Ｕ１を中心に、Ｕ２も参加して議論が行われました。」という文章を生成する。記憶部１２は、パターン選択部１１５が生成し得る各パターンに対応するテンプレートを予め記憶している。ここに示したパターンに基づいて文章を生成する方法は一例であり、出力部１１６は、パターン選択部１１５が生成し得る各パターンに基づいて文章を生成可能な既知の方法を用いることができる。 For example, when participant U1 and participant U2 are connected to each other in the pattern selected by the pattern selection unit 115 and the amount of speech by participant U1 is larger than that by participant U2, the output unit 116 generates the sentence "A discussion was held centered around U1, with U2 also participating." The storage unit 12 stores in advance a template corresponding to each pattern that the pattern selection unit 115 can generate. The method shown here for generating sentences from a pattern is only an example, and the output unit 116 can use any known method capable of generating sentences from each pattern that the pattern selection unit 115 can generate.
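
A template lookup of this kind might look like the sketch below. The template key `"pair"`, the dictionary `TEMPLATES`, and the function `describe_pair` are hypothetical names introduced for illustration; the text only states that templates corresponding to patterns are stored in advance. The handling of equal amounts of speech is also an assumption.

```python
# Hypothetical template store: pattern key -> sentence template.
TEMPLATES = {
    "pair": "A discussion was held centered around {main}, with {other} also participating.",
}


def describe_pair(pattern_key, a, b, speech):
    """Fill the template for two connected participants.

    The participant with the larger amount of speech becomes the center.
    On a tie, the first argument is treated as the center (an assumption).
    """
    main, other = (a, b) if speech[a] >= speech[b] else (b, a)
    return TEMPLATES[pattern_key].format(main=main, other=other)


sentence = describe_pair("pair", "U2", "U1", {"U1": 120.0, "U2": 45.0})
```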

これにより、分析者は、議論分析装置１が遷移の傾向のパターンに基づいて自動的に生成した複数の参加者の関係性を文章として知ることができ、該関係性の理解が容易になる。 Thereby, the analyst can know the relationships between the plurality of participants automatically generated by the discussion analysis device 1 based on the pattern of transition trends in the form of text, and it becomes easier to understand the relationships.

さらに出力部１１６は、パターン選択部１１５が選択したパターンに加えてサブパターンに基づいて、文章を生成してもよい。例えばパターン選択部１１５が選択したパターンにおいて参加者Ｕ１、参加者Ｕ２及び参加者Ｕ３が互いに接続されており、パターン選択部１１５が選択したサブパターンにおいて、参加者Ｕ３と参加者Ｕ１との間の接続が削除された場合には、「すべてのメンバーが議論に参加しました。発言のやり取りは主にＵ１とＵ２を中心に行われました。」という文章を生成する。 Furthermore, the output unit 116 may generate sentences based on a sub-pattern in addition to the pattern selected by the pattern selection unit 115. For example, when participant U1, participant U2, and participant U3 are connected to each other in the selected pattern, and the connection between participant U3 and participant U1 has been deleted in the sub-pattern selected by the pattern selection unit 115, the output unit 116 generates the sentences "All members participated in the discussion. The exchange of remarks was mainly centered around U1 and U2."

これにより、分析者は、議論分析装置１が遷移の傾向のパターンをさらに細分化したサブパターンに基づいて自動的に生成した複数の参加者の関係性を文章として知ることができる。 Thereby, the analyst can know the relationships among the plurality of participants, which are automatically generated by the discussion analysis device 1 based on sub-patterns obtained by further subdividing the transition tendency pattern, as sentences.

図８は、個人レポート画面Ｂを表示している表示部２３の前面図である。個人レポート画面Ｂは、１人の表示対象の参加者が過去に参加した複数の議論に関する情報を表示する画面である。個人レポート画面Ｂは、参加者の傾向情報Ｂ１と、参加者の経過情報Ｂ２と、議論情報Ｂ３とを含む。 FIG. 8 is a front view of the display unit 23 displaying the personal report screen B. Personal report screen B is a screen that displays information regarding multiple discussions in which one participant to be displayed has participated in the past. The personal report screen B includes participant trend information B1, participant progress information B2, and discussion information B3.

参加者の傾向情報Ｂ１は、表示対象の参加者の特性と、表示対象の参加者の議論における行動とに基づいて生成された文章として、表示対象の参加者の傾向を表す情報である。具体的には、参加者情報記憶部１２２は、参加者の特性を示す情報を予め記憶している。参加者の特性を示す情報は、例えば参加者に対して行われた心理テストの結果である。 The participant tendency information B1 is information representing the tendency of the participant to be displayed as a sentence generated based on the characteristics of the participant to be displayed and the behavior of the participant to be displayed in the discussion. Specifically, the participant information storage unit 122 stores in advance information indicating the characteristics of the participants. The information indicating the characteristics of the participants is, for example, the result of a psychological test conducted on the participants.

参加者の傾向情報Ｂ１を表示する場合に、出力部１１６は、参加者の特性を示す情報と、参加者が過去に参加した複数の議論についてパターン選択部１１５が選択したパターンとを取得する。出力部１１６は、パターン選択部１１５が選択したパターンに基づいて、上述の方法により、複数の議論それぞれにおける参加者の役割（すなわち行動の傾向）を判定する。出力部１１６は、判定した役割のうち１つの役割（例えば最も頻度が高い役割）を選択する。そして出力部１１６は、所定の規則に基づいて、参加者の特性と、選択した参加者の役割とに対応する文章を生成する。所定の規則は、記憶部１２に予め定義された、参加者の特性及び参加者の役割に対応するテンプレートである。 When displaying participant trend information B1, the output unit 116 acquires information indicating characteristics of the participants and patterns selected by the pattern selection unit 115 for multiple discussions in which the participants have participated in the past. Based on the pattern selected by the pattern selection unit 115, the output unit 116 determines the role (that is, behavioral tendency) of the participant in each of the plurality of discussions using the method described above. The output unit 116 selects one role (for example, the role with the highest frequency) from among the determined roles. Then, the output unit 116 generates a sentence corresponding to the participant's characteristics and the selected participant's role based on a predetermined rule. The predetermined rule is a template that is predefined in the storage unit 12 and corresponds to the characteristics of the participants and the roles of the participants.

例えば参加者の特性が高い独自性を示しており、選択した参加者が「役割なし」である場合に、出力部１１６は、「自分の独自性を出すのが得意な一方で、人に冷たく接しがちなところがあります。」という文章を生成する。記憶部１２は、様々な参加者の特性及び参加者の役割に対応するテンプレートを予め記憶している。ここに示した参加者の特性及び参加者の役割に基づいて文章を生成する方法は一例であり、出力部１１６は、参加者の特性及び参加者の役割に基づいて文章を生成可能な既知の方法を用いることができる。 For example, when the participant's characteristics indicate high uniqueness and the role selected for the participant is "no role", the output unit 116 generates the sentence "Good at expressing their own uniqueness, but tends to be cold toward others." The storage unit 12 stores in advance templates corresponding to various participant characteristics and participant roles. The method shown here for generating sentences from the participant's characteristics and role is only an example, and the output unit 116 can use any known method capable of generating sentences from the participant's characteristics and role.
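
The selection of the most frequent role and the lookup of a matching template can be sketched as below. The trait label `"high uniqueness"`, the dictionary `TRAIT_ROLE_TEMPLATES`, and the function `tendency_sentence` are hypothetical; the text only says that templates keyed by characteristics and role are stored in advance and that the most frequent role is one possible choice.

```python
from collections import Counter

# Hypothetical template store: (trait, role) -> sentence.
TRAIT_ROLE_TEMPLATES = {
    ("high uniqueness", "no role"):
        "Good at expressing their own uniqueness, but tends to be cold toward others.",
}


def tendency_sentence(trait, roles_per_discussion):
    """Pick the most frequent role across discussions and look up a sentence.

    Returns None when no template is registered for the combination.
    """
    role, _count = Counter(roles_per_discussion).most_common(1)[0]
    return TRAIT_ROLE_TEMPLATES.get((trait, role))


text = tendency_sentence("high uniqueness", ["no role", "follower", "no role"])
```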

これにより、分析者は、予め収集された参加者の特性と、参加者の議論における行動の傾向とを対比させて認識することができる。 This allows the analyst to compare and recognize the characteristics of the participants collected in advance and the behavioral trends of the participants in the discussion.

参加者の経過情報Ｂ２は、表示対象の参加者が過去に参加した複数の議論における、表示対象の参加者の発話量、割り込み量、盛り上げ量及び役割を示す情報である。図８の例では、参加者の経過情報Ｂ２は、議論ごとの参加者の発話量を棒グラフとして表し、該議論の全ての参加者の平均発話量を該棒グラフの上に重畳して表している。また、参加者の経過情報Ｂ２は、参加者が所定の役割（例えばリーダー又はフォロワー）となった議論の回次を表している。また、参加者の経過情報Ｂ２は、参加者の発話量、割り込み量及び盛り上げ量がそれぞれ所定の条件を満たした議論の回次を表している。 Participant progress information B2 is information indicating the amount of speech, amount of interruptions, amount of excitement, and role of the displayed participant in a plurality of discussions in which that participant has participated in the past. In the example of FIG. 8, participant progress information B2 represents the participant's amount of speech in each discussion as a bar graph, with the average amount of speech of all participants in that discussion superimposed on the bar graph. Participant progress information B2 also indicates the sessions in which the participant took on a predetermined role (for example, leader or follower), and the sessions in which the participant's amount of speech, amount of interruptions, and amount of excitement each satisfied a predetermined condition.

また、参加者の経過情報Ｂ２は、発話量が所定の条件を満たした議論のフェーズ（例えば参加者の平均発話量が最も高いフェーズ）を表している。また、参加者の経過情報Ｂ２は、表示対象の参加者が特定の他の参加者と同じ議論に参加している際に表示対象の参加者の発話量が増加した場合の、該他の参加者を表している。また、参加者の経過情報Ｂ２は、第１の議論（例えば最初の議論）における発話量と比較して、第１の議論とは異なる第２の議論（例えば最後の議論）における発話量が増加しているか否かを表している。 Participant progress information B2 also indicates the phase of a discussion in which the amount of speech satisfied a predetermined condition (for example, the phase in which the participant's average amount of speech was highest). In addition, when the displayed participant's amount of speech increases while participating in the same discussion as a specific other participant, participant progress information B2 indicates that other participant. Participant progress information B2 also indicates whether the amount of speech in a second discussion (for example, the last discussion), which is different from a first discussion (for example, the first discussion), has increased compared with the amount of speech in the first discussion.

これにより、分析者は、１人の参加者について、過去に参加した議論における行動の傾向を一覧で見ることができる。ここに示した参加者の経過情報Ｂ２は一例であり、参加者が過去に参加した複数の議論における、参加者の発話量、割り込み量、盛り上げ量及び役割に基づいてその他の情報を表してもよい。 This allows the analyst to see at a glance the behavioral tendencies of one participant in the discussions that participant has joined in the past. The participant progress information B2 shown here is only an example, and it may represent other information based on the participant's amount of speech, amount of interruptions, amount of excitement, and role in the plurality of past discussions.

議論情報Ｂ３は、表示対象の参加者が過去に参加した複数の議論のうち、所定の条件を満たす議論における表示対象の参加者の発話量を示す情報である。図８の例では、議論情報Ｂ３は、所定の条件を満たす議論それぞれについての発話量のグラフを含む。議論情報Ｂ３のグラフは、斜線の領域によって１つの議論における表示対象の参加者の発話量の時系列の変化を表しており、白抜きの領域によって該議論における全ての参加者の合計発話量の時系列の変化を表している。 Discussion information B3 is information indicating the amount of speech by a participant to be displayed in a discussion that satisfies a predetermined condition among a plurality of discussions in which the participant to be displayed has participated in the past. In the example of FIG. 8, discussion information B3 includes a graph of the amount of speech for each discussion that satisfies a predetermined condition. In the graph of discussion information B3, the shaded area represents the time-series change in the amount of speech of the displayed participant in one discussion, and the white area represents the total amount of speech of all participants in the discussion. It represents changes over time.

図８の例において、議論情報Ｂ３に表示する議論は、時間順（回次順）に複数の議論である。これにより、分析者は、参加者の発話量の傾向が時間順でどのように変わったかを一覧で見ることができる。 In the example of FIG. 8, the discussions displayed in discussion information B3 are a plurality of discussions arranged in chronological order (session order). This allows the analyst to see at a glance how the tendency of the participant's amount of speech has changed over time.

また、議論情報Ｂ３に表示する議論は、互いに類似する複数の議論又は互いに類似しない複数の議論であってもよい。この場合に、出力部１１６は表示対象の参加者が参加した複数の議論の複数の遷移行列の間の行列間距離を算出し、行列間距離が所定値よりも小さい複数の議論を互いに類似する複数の議論として特定し、又は行列間距離が所定値よりも大きい複数の議論を互いに類似しない複数の議論として特定する。これにより、分析者は、参加者が参加している議論のうち、話者の遷移の傾向が似ている又は似ていない議論における参加者の発話量の傾向を一覧で見ることができる。 The discussions displayed in discussion information B3 may instead be a plurality of discussions that are similar to each other or a plurality of discussions that are dissimilar to each other. In this case, the output unit 116 calculates the inter-matrix distances between the transition matrices of the discussions in which the displayed participant took part, and identifies discussions whose inter-matrix distance is smaller than a predetermined value as mutually similar discussions, or discussions whose inter-matrix distance is larger than a predetermined value as mutually dissimilar discussions. This allows the analyst to see at a glance the tendency of the participant's amount of speech in discussions whose speaker-transition tendencies are similar or dissimilar.
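
The similarity test on transition matrices can be illustrated as follows. The text does not say which inter-matrix distance is used, so this sketch assumes the Frobenius (element-wise Euclidean) distance; the function names, the discussion IDs, and the threshold value are likewise hypothetical.

```python
import math
from itertools import combinations


def matrix_distance(a, b):
    """Frobenius distance between two equally sized matrices (assumed metric)."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


def similar_pairs(matrices, threshold):
    """Return pairs of discussion IDs whose distance is below the threshold."""
    return [
        (id_a, id_b)
        for (id_a, a), (id_b, b) in combinations(sorted(matrices.items()), 2)
        if matrix_distance(a, b) < threshold
    ]


matrices = {
    "d1": [[0.0, 0.5], [0.5, 0.0]],
    "d2": [[0.0, 0.4], [0.6, 0.0]],
    "d3": [[0.0, 1.0], [0.0, 0.0]],
}
pairs = similar_pairs(matrices, threshold=0.2)
```

Reversing the comparison (`>` instead of `<`) would give the mutually dissimilar discussions instead.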

図９は、コースレポート画面Ｃを表示している表示部２３の前面図である。コースレポート画面Ｃは、表示対象のグループに属する複数の参加者が過去に参加した複数の議論に関する情報を表示する画面である。例えばグループは、同一のコースを受講している複数の参加者、同一の講師による指導を受けている複数の参加者等である。コースレポート画面Ｃは、参加者の分布情報Ｃ１と、コースの経過情報Ｃ２と、コースの統計情報Ｃ３と、順位情報Ｃ４とを含む。 FIG. 9 is a front view of the display unit 23 displaying the course report screen C. The course report screen C is a screen that displays information regarding multiple discussions in which multiple participants belonging to the group to be displayed have participated in the past. For example, a group may be multiple participants taking the same course, multiple participants receiving instruction from the same instructor, or the like. The course report screen C includes participant distribution information C1, course progress information C2, course statistical information C3, and ranking information C4.

参加者の分布情報Ｃ１は、表示対象のグループに属する複数の参加者の発話量及び割り込み量の分布を示す情報である。図９の例では、分布情報Ｃ１は、横軸を発話量とし、縦軸を割り込み量として、表示対象のグループに属する複数の参加者の発話量及び割り込み量の組み合わせをプロットとして表している。 The participant distribution information C1 is information indicating the distribution of the amount of speech and the amount of interruptions of a plurality of participants belonging to the group to be displayed. In the example of FIG. 9, the distribution information C1 represents a combination of the amount of speech and the amount of interruptions of a plurality of participants belonging to the group to be displayed as a plot, with the horizontal axis representing the amount of speech and the vertical axis representing the amount of interruptions.

これにより、分析者は、表示対象のグループに属する複数の参加者の傾向を知ることができる。例えば分布情報Ｃ１の右上の領域にプロットされた参加者は、発話量及び割り込み量がともに大きいため、議論をリードする傾向がある。分布情報Ｃ１の左上の領域にプロットされた参加者は、割り込み量が大きいが発話量が小さいため、議論において他人に同調する傾向がある。分布情報Ｃ１の右下の領域にプロットされた参加者は、発話量が大きいが割り込み量が小さいため、議論において行儀が良い傾向がある。分布情報Ｃ１の左下の領域にプロットされた参加者は、発話量及び割り込み量がともに小さいため、議論への参加に消極的である傾向がある。 This allows the analyst to know the tendencies of multiple participants belonging to the group to be displayed. For example, participants plotted in the upper right area of the distribution information C1 tend to lead the discussion because they have a large amount of speech and a large amount of interruptions. Participants plotted in the upper left area of the distribution information C1 have a large amount of interruptions but a small amount of speech, so they tend to tune in to others in the discussion. Participants plotted in the lower right area of the distribution information C1 have a large amount of speech but a small amount of interruptions, and therefore tend to behave well in discussions. The participants plotted in the lower left area of the distribution information C1 have a small amount of speech and a small amount of interruptions, so they tend to be reluctant to participate in the discussion.
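
The four regions of the plot can be expressed as a simple classification. The text does not define where the boundary between "large" and "small" lies, so the split values below are assumptions passed in by the caller (for example, group averages), and `classify_tendency` is a hypothetical name.

```python
def classify_tendency(speech, interruptions, speech_split, interrupt_split):
    """Map a participant's position in distribution information C1 to a tendency.

    speech_split and interrupt_split are assumed boundaries between
    small and large values on each axis.
    """
    high_speech = speech >= speech_split
    high_interrupt = interruptions >= interrupt_split
    if high_speech and high_interrupt:
        return "leads the discussion"        # upper right
    if high_interrupt:
        return "goes along with others"      # upper left
    if high_speech:
        return "well-mannered"               # lower right
    return "reluctant to participate"        # lower left
```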

コースの経過情報Ｃ２は、表示対象のグループに属する複数の参加者が過去に参加した複数の議論における発話量、割り込み量及び盛り上げ量の経過を示す情報である。図９の例では、コースの経過情報Ｃ２は、表示対象のグループに属する複数の参加者が過去に参加した複数の議論のうち、前期、中期及び後期それぞれにおける発話量、割り込み量及び盛り上げ量を積み上げた棒グラフを表している。例えば出力部１１６は、複数の議論を最初の議論から最後の議論まで順に１／３ずつを前期、中期及び後期に分類し、各分類において合計又は平均の発話量、割り込み量及び盛り上げ量を算出して出力する。これにより、分析者は、表示対象のグループにおける議論の傾向の変化を知ることができる。 The course progress information C2 is information indicating the progress of the amount of speech, amount of interruptions, and amount of excitement in a plurality of discussions in which the participants belonging to the displayed group have participated in the past. In the example of FIG. 9, the course progress information C2 shows a stacked bar graph of the amount of speech, amount of interruptions, and amount of excitement for each of the early, middle, and late periods of those discussions. For example, the output unit 116 divides the discussions, in order from the first to the last, into thirds classified as the early, middle, and late periods, and calculates and outputs the total or average amount of speech, amount of interruptions, and amount of excitement for each period. This allows the analyst to know how the tendency of discussions in the displayed group has changed.
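
The split into early, middle, and late periods can be sketched as below for a single quantity such as the amount of speech. How a number of discussions that is not divisible by three is handled is not stated in the text; the integer boundaries used here are an assumption, and `term_averages` is a hypothetical name.

```python
from statistics import mean


def term_averages(values):
    """Average a per-discussion quantity over early, middle, and late thirds.

    values: amounts per discussion, in chronological order.
    """
    n = len(values)
    bounds = [0, n // 3, (2 * n) // 3, n]  # assumed split for uneven counts
    names = ["early", "middle", "late"]
    return {
        name: mean(values[lo:hi]) if hi > lo else 0.0
        for name, lo, hi in zip(names, bounds, bounds[1:])
    }


averages = term_averages([10, 20, 30, 40, 50, 60])
```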

コースの統計情報Ｃ３は、表示対象のグループに属する複数の参加者の発話量、割り込み量、盛り上げ量及びそれらの合計量（総合活動量）の統計値を示す情報である。図９の例では、コースの統計情報Ｃ３は、横軸を発話量、割り込み量、盛り上げ量及び総合活動量とし、縦軸を参加者の人数として棒グラフを表している。さらにコースの統計情報Ｃ３は、発話量、割り込み量、盛り上げ量及び総合活動量それぞれの平均値に該当する棒グラフの表示態様（例えば色）を、他の棒グラフの表示態様とは異なるように表している。これにより、分析者は、表示対象のグループに属する複数の参加者について、発話量、割り込み量、盛り上げ量及び総合活動量ごとの人数の分布と、発話量、割り込み量、盛り上げ量及び総合活動量の統計値とを知ることができる。 The course statistical information C3 is information indicating statistical values of the amount of speech, amount of interruptions, amount of excitement, and their total (total activity amount) for the participants belonging to the displayed group. In the example of FIG. 9, the course statistical information C3 shows bar graphs with the amount of speech, amount of interruptions, amount of excitement, and total activity amount on the horizontal axis and the number of participants on the vertical axis. Furthermore, the course statistical information C3 displays the bar corresponding to the average value of each of these quantities in a display mode (for example, a color) different from that of the other bars. This allows the analyst to know, for the participants belonging to the displayed group, the distribution of the number of participants for each of the amount of speech, amount of interruptions, amount of excitement, and total activity amount, as well as the statistical values of those quantities.

順位情報Ｃ４は、表示対象のグループに属する複数の参加者の発話量の順位を示す情報である。図９の例では、順位情報Ｃ４は、第１の議論（例えば最初の議論）における複数の参加者の一覧を該複数の参加者の発話量に応じて順位付けして（例えば順位の昇順で）表すとともに、第１の議論とは異なる第２の議論（例えば最後の議論）における複数の参加者の一覧を該複数の参加者の発話量に応じて順位付けして（例えば順位の昇順で）表す。 The ranking information C4 is information indicating the ranking of the amount of speech of the participants belonging to the displayed group. In the example of FIG. 9, the ranking information C4 shows a list of the participants in a first discussion (for example, the first discussion) ranked according to their amount of speech (for example, in ascending order of rank), together with a list of the participants in a second discussion (for example, the last discussion), different from the first discussion, ranked in the same way.

また、順位情報Ｃ４は、第１の議論においてある参加者に対応する位置と、第２の議論において該参加者に対応する位置とを結ぶ線を表してもよい。さらに順位情報Ｃ４は、第１の議論と比較した第２の議論の参加者の順位の変動を、変動の量を示す数値及び変動の向き（上又は下）を示す矢印によって表してもよい。これにより分析者は、複数の参加者それぞれの発話量が２つの議論の間でどのように変わったかを知ることができる。 Furthermore, the ranking information C4 may represent a line connecting a position corresponding to a certain participant in the first discussion and a position corresponding to the participant in the second discussion. Furthermore, the ranking information C4 may represent a change in the ranking of participants in the second discussion compared to the first discussion using a numerical value indicating the amount of change and an arrow indicating the direction of the change (up or down). This allows the analyst to know how the amount of speech by each of the multiple participants changed between the two discussions.
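
The rank change between two discussions can be computed as in the sketch below. It assumes that rank 1 is the participant with the largest amount of speech and that only participants present in both discussions are compared; `rank_by_speech` and `rank_changes` are hypothetical names.

```python
def rank_by_speech(speech):
    """Return participant -> rank, where rank 1 has the largest amount of speech."""
    ordered = sorted(speech, key=lambda p: speech[p], reverse=True)
    return {p: i + 1 for i, p in enumerate(ordered)}


def rank_changes(first, second):
    """Return participant -> (amount of change, direction) between two discussions."""
    r1, r2 = rank_by_speech(first), rank_by_speech(second)
    changes = {}
    for p in r1.keys() & r2.keys():
        delta = r1[p] - r2[p]  # positive means the rank improved
        arrow = "up" if delta > 0 else "down" if delta < 0 else "same"
        changes[p] = (abs(delta), arrow)
    return changes


changes = rank_changes(
    {"U1": 100, "U2": 60, "U3": 20},
    {"U1": 50, "U2": 30, "U3": 80},
)
```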

図７～図９に示したディスカッションレポート画面Ａ、個人レポート画面Ｂ及びコースレポート画面Ｃは一例であり、情報の内容、外観及び配置は変更されてもよい。また、図７～図９に示したディスカッションレポート画面Ａ、個人レポート画面Ｂ及びコースレポート画面Ｃのうち少なくとも一部は、１つの画面に統合されてもよく、さらに複数の画面に分割されてもよい。 The discussion report screen A, personal report screen B, and course report screen C shown in FIGS. 7 to 9 are examples, and the content, appearance, and arrangement of the information may be changed. At least some of these screens may be integrated into one screen or further divided into a plurality of screens.

出力部１１６は、画面の表示に限らず、プリンタを用いて紙に印刷すること、記憶媒体にデータとして記憶させること、又は通信回線を介して外部へ送信することによって、議論に関する情報を出力してもよい。 The output unit 116 is not limited to displaying information on a screen; it may also output information about the discussion by printing it on paper using a printer, storing it as data in a storage medium, or transmitting it to the outside via a communication line.

出力部１１６は、分析者（閲覧者）ごとに内容を切り替えて、議論に関する情報を出力してもよい。この場合に、議論分析装置１は、分析者ごとに出力内容の設定を予め受け付け、分析者に関連付けて設定情報として記憶部１２に記憶させる。出力内容の設定は、例えば出力内容を示すプラグインの選択によって行われる。図７の例では、サマリー情報Ａ１、参加者情報Ａ２、フェーズ情報Ａ３及び総合評価情報Ａ４の４つのプラグインが定義されている。分析者又は議論分析装置１の管理者は、分析者に対して出力させるプラグインを選択することによって、出力内容を設定する。 The output unit 116 may output information regarding the discussion by switching the content for each analyst (viewer). In this case, the discussion analysis device 1 receives the output content settings for each analyst in advance, and stores them in the storage unit 12 as setting information in association with the analyst. Setting the output content is performed, for example, by selecting a plug-in that indicates the output content. In the example of FIG. 7, four plug-ins are defined: summary information A1, participant information A2, phase information A3, and comprehensive evaluation information A4. The analyst or the administrator of the discussion analysis device 1 sets output content by selecting a plug-in to be output to the analyst.

議論分析装置１において、出力部１１６は、議論に関する情報を出力する際に、出力対象の分析者を特定し、該分析者に関連付けられた設定情報を取得する。そして出力部１１６は、議論情報記憶部１２１に記憶された情報に基づいて、設定情報（プラグイン）が示す内容を出力する。これにより、議論分析装置１は、分析者ごとに異なる種類の情報を出力することができる。 In the discussion analysis device 1, when outputting information regarding the discussion, the output unit 116 specifies an analyst to be output and acquires setting information associated with the analyst. Then, the output unit 116 outputs the content indicated by the setting information (plug-in) based on the information stored in the discussion information storage unit 121. Thereby, the discussion analysis device 1 can output different types of information for each analyst.
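
The per-analyst selection of plug-ins can be pictured as a lookup from analyst to a list of plug-in names. Everything in this sketch is hypothetical: the plug-in keys, the analyst names, the fields of the `discussion` dictionary, and the rendering functions are placeholders for whatever the device actually stores as setting information.

```python
# Hypothetical plug-ins, one per block of the discussion report screen A.
PLUGINS = {
    "A1_summary": lambda d: f"summary: {d['title']}",
    "A2_participants": lambda d: f"participants: {', '.join(d['participants'])}",
    "A3_phases": lambda d: f"phases: {d['phase_count']}",
    "A4_evaluation": lambda d: f"evaluation: {d['evaluation']}",
}

# Hypothetical setting information: analyst -> selected plug-ins.
SETTINGS = {
    "teacher": ["A1_summary", "A4_evaluation"],
    "student": ["A2_participants"],
}


def render_report(analyst, discussion):
    """Render only the plug-ins selected for the given analyst."""
    return [PLUGINS[name](discussion) for name in SETTINGS.get(analyst, [])]


report = render_report(
    "teacher",
    {"title": "Week 1", "participants": ["U1", "U2"], "phase_count": 3,
     "evaluation": "U1 led"},
)
```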

［議論分析方法のフロー］ [Flow of the discussion analysis method]
図１０は、議論分析装置１が行う議論分析方法のフローチャートを示す図である。議論分析装置１において、情報取得部１１１は、議論における複数の参加者それぞれの時系列の発話量を取得する（Ｓ１１）。情報取得部１１１は、議論における複数の参加者それぞれの第１フレームごとの発話量を示す情報を、議論ＩＤと関連付けて議論情報記憶部１２１に記憶させる。 FIG. 10 is a flowchart of the discussion analysis method performed by the discussion analysis device 1. In the discussion analysis device 1, the information acquisition unit 111 acquires the time-series amount of speech of each of the plurality of participants in the discussion (S11). The information acquisition unit 111 stores, in the discussion information storage unit 121, information indicating the amount of speech of each participant for each first frame in association with the discussion ID.

このとき、情報取得部１１１は、集音装置３が取得した議論の音声に対して、音源定位を行い、複数の参加者それぞれの発話期間を特定することによって、発話量を取得する。別の方法として、情報取得部１１１は、記憶部１２に予め記憶された発話期間を読み出して取得することによって、発話量を取得してもよい。あるいは情報取得部１１１は、議論における参加者の顔を含む画像に基づいて、参加者の発話期間を特定することによって、発話量を取得してもよい。 At this time, the information acquisition unit 111 performs sound source localization on the discussion audio acquired by the sound collection device 3, and acquires the amount of speech by identifying the speech period of each of the plurality of participants. As another method, the information acquisition unit 111 may acquire the amount of speech by reading and acquiring the speech period stored in the storage unit 12 in advance. Alternatively, the information acquisition unit 111 may acquire the amount of speech by identifying the speech period of the participant based on an image including the face of the participant in the discussion.

最大発話者特定部１１２は、情報取得部１１１が取得した発話量に基づいて、第１フレームごとに複数の参加者のうち発話量が最大である最大発話者を特定する（Ｓ１２）。遷移検出部１１３は、最大発話者特定部１１２が特定した第１フレームごとの最大発話者の変化に基づいて、複数の参加者の間で発生した話者の遷移を検出する（Ｓ１３）。遷移検出部１１３は、第２フレームごとに生成した遷移行列を示す情報を、遷移情報として議論情報記憶部１２１に記憶させる。 The maximum speaker identification unit 112 identifies, for each first frame, the maximum speaker, that is, the participant with the largest amount of speech among the plurality of participants, based on the amount of speech acquired by the information acquisition unit 111 (S12). The transition detection unit 113 detects speaker transitions that occurred among the plurality of participants based on changes in the maximum speaker for each first frame identified by the maximum speaker identification unit 112 (S13). The transition detection unit 113 stores information indicating the transition matrix generated for each second frame in the discussion information storage unit 121 as transition information.
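
Steps S12 and S13 can be sketched as follows for one second frame. This is a simplified illustration under stated assumptions: each first frame is a dictionary of amounts of speech, a frame in which nobody speaks yields no maximum speaker and is skipped, and ties are resolved by dictionary order. None of these details, nor the function names, come from the text.

```python
def max_speakers(frames):
    """Identify the maximum speaker of each first frame (None if all silent)."""
    result = []
    for frame in frames:
        speaker = max(frame, key=frame.get)
        result.append(speaker if frame[speaker] > 0 else None)
    return result


def transition_matrix(speakers, participants):
    """Count transitions between consecutive, different maximum speakers.

    matrix[i][j] is the number of transitions from participants[i]
    to participants[j] within the given window.
    """
    index = {p: i for i, p in enumerate(participants)}
    matrix = [[0] * len(participants) for _ in participants]
    previous = None
    for s in speakers:
        if s is None:
            continue  # assumed: silent frames do not break the current turn
        if previous is not None and s != previous:
            matrix[index[previous]][index[s]] += 1
        previous = s
    return matrix


frames = [
    {"U1": 0.8, "U2": 0.1},  # U2's short sound does not exceed U1
    {"U1": 0.6, "U2": 0.2},
    {"U1": 0.1, "U2": 0.7},
    {"U1": 0.0, "U2": 0.0},
    {"U1": 0.9, "U2": 0.1},
]
speakers = max_speakers(frames)
matrix = transition_matrix(speakers, ["U1", "U2"])
```

In the first two frames U2 makes a small sound while U1 is speaking, but because only the maximum speaker of each frame is considered, no transition is counted there.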

フェーズ分割部１１４は、遷移検出部１１３が検出した遷移を示す遷移情報の時系列の類似性に基づいて、議論を１つ以上のフェーズに分割する（Ｓ１４）。フェーズ分割部１１４は、決定した議論のフェーズを示す情報を、議論の識別情報と関連付けて議論情報記憶部１２１に記憶させる。 The phase dividing unit 114 divides the discussion into one or more phases based on the similarity in the time series of the transition information indicating the transition detected by the transition detecting unit 113 (S14). The phase dividing unit 114 causes the discussion information storage unit 121 to store information indicating the determined phase of the discussion in association with discussion identification information.

パターン選択部１１５は、複数の参加者の各組み合わせにおける遷移の有無を示す複数のパターンを生成する。パターン選択部１１５は、生成した複数のパターンそれぞれの行列と、正規化したフェーズごとの遷移行列（遷移情報）との間の類似度を算出する。そしてパターン選択部１１５は、複数のパターンのうち、フェーズ分割部１１４が決定した１つ以上のフェーズそれぞれについて、算出した類似度が所定の条件を満たすパターンを選択する（Ｓ１５）。 The pattern selection unit 115 generates a plurality of patterns indicating the presence or absence of a transition in each combination of a plurality of participants. The pattern selection unit 115 calculates the degree of similarity between the matrix of each of the plurality of generated patterns and the normalized transition matrix (transition information) for each phase. Then, the pattern selection unit 115 selects, from among the plurality of patterns, a pattern whose calculated degree of similarity satisfies a predetermined condition for each of the one or more phases determined by the phase division unit 114 (S15).
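
Step S15 can be illustrated with the sketch below. The text leaves the similarity measure and the selection condition open, so this example assumes cosine similarity between flattened matrices, normalization of the transition matrix by its total count, symmetric 0/1 patterns over participant pairs, and selection of the pattern with the highest similarity. All function names are hypothetical.

```python
import math
from itertools import combinations, product


def generate_patterns(n):
    """Yield every symmetric 0/1 matrix marking which pairs have transitions."""
    pairs = list(combinations(range(n), 2))
    for bits in product([0, 1], repeat=len(pairs)):
        pattern = [[0] * n for _ in range(n)]
        for (i, j), bit in zip(pairs, bits):
            pattern[i][j] = pattern[j][i] = bit
        yield pattern


def cosine(a, b):
    """Cosine similarity between two matrices treated as flat vectors."""
    fa = [x for row in a for x in row]
    fb = [x for row in b for x in row]
    dot = sum(x * y for x, y in zip(fa, fb))
    norm = math.sqrt(sum(x * x for x in fa)) * math.sqrt(sum(y * y for y in fb))
    return dot / norm if norm else 0.0


def select_pattern(transitions):
    """Select the pattern most similar to the normalized transition matrix."""
    total = sum(map(sum, transitions)) or 1
    normalized = [[x / total for x in row] for row in transitions]
    return max(generate_patterns(len(transitions)),
               key=lambda p: cosine(p, normalized))


best = select_pattern([[0, 5, 0], [4, 0, 1], [0, 0, 0]])
```

In this example most transitions occur between the first two participants, so the selected pattern connects only that pair; the single transition to the third participant is too weak to add a connection under the assumed measure.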

さらにパターン選択部１１５は、フェーズごとに選択したパターンに変更を加えたサブパターンを選択してもよい。パターン選択部１１５は、選択したパターン及びサブパターンを示す情報を、議論の識別情報と関連付けて議論情報記憶部１２１に記憶させる。 Further, the pattern selection unit 115 may select a sub-pattern that is a modified pattern of the pattern selected for each phase. The pattern selection unit 115 causes the discussion information storage unit 121 to store information indicating the selected pattern and subpattern in association with discussion identification information.

出力部１１６は、情報取得部１１１、遷移検出部１１３、フェーズ分割部１１４及びパターン選択部１１５が議論情報記憶部１２１に記憶させた情報に基づいて、議論に関する情報を出力する（Ｓ１６）。例えば出力部１１６は、図７、図８及び図９に示す画面を通信端末２の表示部２３に表示させることによって議論に関する情報を出力する。 The output unit 116 outputs information regarding the discussion based on the information stored in the discussion information storage unit 121 by the information acquisition unit 111, transition detection unit 113, phase division unit 114, and pattern selection unit 115 (S16). For example, the output unit 116 outputs information regarding the discussion by displaying screens shown in FIGS. 7, 8, and 9 on the display unit 23 of the communication terminal 2.

［本実施形態の効果］ [Effects of this embodiment]
単純に音が発生した向きに基づいて自動的に話者の遷移を検出すると、参加者が話している際に発生した発言ではない音を参加者の発言として検出してしまい、話者の遷移を正しく検出できない場合がある。すなわち、議論の音声の中に物体の衝突音や他の参加者の相槌等の短い音が含まれている場合に、短い音を分析に必要な音か否かを判別するのは困難である。例えば隣接するグループの声が背景雑音として多く混ざる状況で、参加者がペンで机を叩くなどの音を出した場合、分離音には背景雑音が混ざる。この場合に、ペンの音を「音声ではないから不要」と判別するのは難しい。また、音の長さによって短い音を除外しようとしても、「うーん」や「ほー」等の長い相槌を除外することができず、逆に「違う」や「確かに」等の重要な意味のある発言を除外してしまうおそれがある。 If speaker transitions are detected automatically based simply on the direction from which sound arrives, a sound that is not an utterance but occurs while a participant is speaking may be detected as a participant's utterance, and speaker transitions may not be detected correctly. That is, when the audio of a discussion includes short sounds such as objects colliding or back-channel responses from other participants, it is difficult to determine whether a short sound is needed for the analysis. For example, if a participant makes a sound such as tapping a desk with a pen in a situation where the voices of adjacent groups are heavily mixed in as background noise, the background noise is mixed into the separated sound. In this case, it is difficult to judge the pen sound as "unnecessary because it is not a voice". Moreover, even if short sounds are excluded based on their length, long back-channel responses such as "hmm" and "oh" cannot be excluded, while important, meaningful remarks such as "no" and "certainly" may be excluded.

それに対して、本実施形態に係る議論分析装置１は、発話量が最大の参加者Ｕの変化に基づいて話者の遷移を検出する。そのため、議論分析装置１は、発言ではない音によって話者の遷移を検出することを抑えることができ、議論における話者の遷移の検出精度を向上できる。 In contrast, the discussion analysis device 1 according to the present embodiment detects a change in speakers based on a change in the participant U who has the largest amount of speech. Therefore, the discussion analysis device 1 can suppress the detection of speaker transitions based on sounds that are not utterances, and can improve the accuracy of detecting speaker transitions in discussions.

本実施形態に係る議論分析システムＳＳは、学生が行うアクティブ・ラーニングの分析や、組織における会議の分析に、好適に用いられる。また、議論分析システムＳＳは、組織における採用活動において、候補者同士で行われるグループディスカッションの分析にも好適に用いられる。従来、これらの議論には多数の参加者がいるため、議論の分析のために非常に大きな時間及び費用のコストが掛かっていた。それに対して、議論分析装置１は、これらの議論を自動的にかつ高い精度で分析できるため、分析のためのコストを大幅に削減できる。 The discussion analysis system SS according to this embodiment is suitably used for analyzing active learning conducted by students and for analyzing meetings in organizations. The discussion analysis system SS is also suitably used to analyze group discussions held among candidates during recruitment activities in an organization. Traditionally, these discussions involve a large number of participants, resulting in significant time and expense costs for analyzing the discussions. On the other hand, the discussion analysis device 1 can analyze these discussions automatically and with high accuracy, so the cost for analysis can be significantly reduced.

以上、本発明を実施の形態を用いて説明したが、本発明の技術的範囲は上記実施の形態に記載の範囲には限定されず、その要旨の範囲内で種々の変形及び変更が可能である。例えば、装置の全部又は一部は、任意の単位で機能的又は物理的に分散・統合して構成することができる。また、複数の実施の形態の任意の組み合わせによって生じる新たな実施の形態も、本発明の実施の形態に含まれる。組み合わせによって生じる新たな実施の形態の効果は、もとの実施の形態の効果を併せ持つ。 Although the present invention has been described above using embodiments, the technical scope of the present invention is not limited to the scope described in those embodiments, and various modifications and changes are possible within the scope of its gist. For example, all or part of the device can be functionally or physically distributed or integrated in arbitrary units. New embodiments resulting from any combination of a plurality of embodiments are also included in the embodiments of the present invention. A new embodiment resulting from such a combination has the effects of the original embodiments as well.

議論分析装置１のプロセッサは、図１０に示す議論分析方法に含まれる各ステップ（工程）の主体となる。すなわち、議論分析装置１のプロセッサは、図１０に示す議論分析方法を実行するためのプログラムを記憶部から読み出し、該プログラムを実行して議論分析装置１の各部を制御することによって、図１０に示す議論分析方法を実行する。図１０に示す議論分析方法に含まれるステップは一部省略されてもよく、ステップ間の順番が変更されてもよく、複数のステップが並行して行われてもよい。 The processor of the discussion analysis device 1 performs each step (process) included in the discussion analysis method shown in FIG. 10. That is, the processor reads a program for executing the discussion analysis method shown in FIG. 10 from the storage unit and executes the program to control each unit of the discussion analysis device 1, thereby carrying out the discussion analysis method shown in FIG. 10. Some of the steps included in the discussion analysis method shown in FIG. 10 may be omitted, the order of the steps may be changed, and a plurality of steps may be performed in parallel.

ＳＳ議論分析システム
１議論分析装置
１１制御部
１１１情報取得部
１１２最大発話者特定部
１１３遷移検出部
１１４フェーズ分割部
１１５パターン選択部
１１６出力部

SS Discussion analysis system
1 Discussion analysis device
11 Control unit
111 Information acquisition unit
112 Maximum speaker identification unit
113 Transition detection unit
114 Phase division unit
115 Pattern selection unit
116 Output unit

Claims

A discussion analysis device comprising:
an information acquisition unit that acquires the amount of speech of each of a plurality of participants in a discussion in which the plurality of participants participate;
a maximum speaker identification unit that identifies, for each first time range in the discussion, a maximum speaker having the maximum amount of speech among the plurality of participants;
a transition detection unit that generates transition information indicating transitions of speakers that have occurred among the plurality of participants, based on changes in the maximum speaker across the first time ranges; and
an output unit that outputs, in association with one participant among the plurality of participants, information regarding the amount of speech of the one participant in a plurality of the discussions in which the one participant participated and for which the degree of similarity of the transition information is greater than a predetermined value.

A discussion analysis device comprising:
an information acquisition unit that acquires the amount of speech of each of a plurality of participants in a discussion in which the plurality of participants participate;
a maximum speaker identification unit that identifies, for each first time range in the discussion, a maximum speaker having the maximum amount of speech among the plurality of participants;
a transition detection unit that generates transition information indicating transitions of speakers that have occurred among the plurality of participants, based on changes in the maximum speaker across the first time ranges; and
an output unit that outputs, in association with one participant among the plurality of participants, information regarding the amount of speech of the one participant in a plurality of the discussions in which the one participant participated and for which the degree of similarity of the transition information is smaller than a predetermined value.

When a first participant who is the maximum speaker in one time range differs from a second participant who is the maximum speaker in the time range following the one time range, the transition detection unit generates the transition information indicating the transition from the first participant to the second participant.

The discussion analysis device according to any one of claims 1 to 3, further comprising a phase division unit that divides the discussion into one or more phases based on chronological similarity of the transition information.

The discussion analysis device according to claim 4, wherein the transition detection unit generates the transition information indicating the number of transitions for each second time range that is longer than the first time range, and
the phase division unit clusters the transition information for each of the second time ranges based on the chronological similarity of the transition information, and determines the one or more phases constituting the discussion based on the times, within the discussion, of the second time ranges corresponding to the transition information included in each of the plurality of generated clusters.
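
The phase division described in this claim can be illustrated with a short sketch. The claim does not fix a similarity measure or a clustering algorithm, so the cosine similarity, the simple leader clustering, the threshold of 0.8, and the sample transition counts below are all assumptions chosen for brevity. Each vector holds the hypothetical number of transitions per speaker pair within one second time range.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, threshold=0.8):
    """Leader clustering: assign each vector to the first cluster whose first
    member (the leader) is at least `threshold` similar, otherwise start a
    new cluster. Returns one cluster label per vector."""
    leaders, labels = [], []
    for vector in vectors:
        for index, leader in enumerate(leaders):
            if cosine(vector, leader) >= threshold:
                labels.append(index)
                break
        else:
            leaders.append(vector)
            labels.append(len(leaders) - 1)
    return labels


def phases(labels):
    """Merge consecutive time ranges with the same cluster label into
    (start_index, end_index_exclusive, label) tuples."""
    result, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            result.append((start, i, labels[start]))
            start = i
    return result


# Hypothetical transition counts per second time range for the speaker pairs
# (A->B, B->A, A->C, C->A).
counts = [
    [3, 3, 0, 0],
    [4, 2, 0, 0],
    [0, 0, 3, 3],
    [0, 1, 2, 4],
    [3, 2, 0, 0],
]
labels = cluster(counts)
print(labels)          # [0, 0, 1, 1, 0]
print(phases(labels))  # [(0, 2, 0), (2, 4, 1), (4, 5, 0)]
```

In this example the discussion splits into three phases: an exchange between A and B, an exchange between A and C, and a return to the first kind of exchange.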

The discussion analysis device according to any one of claims 1 to 5, further comprising a pattern selection unit that generates a plurality of patterns indicating the presence or absence of the transition for each combination of the plurality of participants, and selects, from among the plurality of patterns, a pattern whose degree of similarity with the transition information satisfies a predetermined condition.
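
As a rough illustration of this pattern selection, the sketch below enumerates every pattern of present or absent transitions over the ordered pairs of participants and selects the one most similar to the observed transition counts. The use of cosine similarity, "highest similarity" as the predetermined condition, the function name `select_pattern`, and the sample counts are assumptions made for the example. The number of patterns grows as 2^(n(n-1)) for n participants, so exhaustive enumeration is only practical for small groups.

```python
import math
from itertools import permutations, product


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_pattern(participants, transition_counts):
    """Enumerate all 0/1 patterns over ordered participant pairs and return
    the set of pairs in the pattern most similar to the observed counts,
    together with its similarity score."""
    pairs = list(permutations(participants, 2))
    observed = [transition_counts.get(pair, 0) for pair in pairs]
    best_bits, best_score = None, -1.0
    for bits in product((0, 1), repeat=len(pairs)):
        score = cosine(observed, bits)
        if score > best_score:
            best_bits, best_score = bits, score
    selected = {pair for pair, bit in zip(pairs, best_bits) if bit}
    return selected, best_score


# Hypothetical transition counts: A and B alternate often, and A hands over
# to C only once.
counts = {("A", "B"): 5, ("B", "A"): 4, ("A", "C"): 1}
selected, score = select_pattern(["A", "B", "C"], counts)
print(sorted(selected))  # [('A', 'B'), ('B', 'A')]
```

With these sample counts the selected pattern keeps only the two frequent transitions between A and B, and the single transition from A to C is treated as absent.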

The discussion analysis device according to claim 6, wherein the pattern selection unit further generates a plurality of sub-patterns in which a part of the selected pattern is changed, and selects, from among the plurality of sub-patterns, a sub-pattern whose degree of similarity with the transition information satisfies a predetermined condition.

The discussion analysis device according to claim 6 or 7, wherein the output unit determines the roles of the plurality of participants based on the pattern selected by the pattern selection unit, and outputs each of the plurality of participants in association with that participant's role.

The discussion analysis device according to any one of claims 6 to 8, wherein the output unit outputs the actions of the plurality of participants as sentences based on the pattern selected by the pattern selection unit.

10. The discussion analysis device according to claim 1, wherein the output unit outputs information regarding the amount of speech of the plurality of participants in a plurality of the discussions in which the plurality of participants belonging to a predetermined group participated, in association with the group.

11. The discussion analysis device according to claim 10, wherein the output unit outputs a ranking of the amounts of speech of the plurality of participants belonging to the group in a first discussion in association with a ranking of the amounts of speech of the plurality of participants belonging to the group in a second discussion different from the first discussion.

A discussion analysis method in which a processor executes:
acquiring the amount of speech of each of a plurality of participants in a discussion in which the plurality of participants participate;
identifying, for each first time range in the discussion, a maximum speaker having the maximum amount of speech among the plurality of participants;
generating transition information indicating transitions of speakers that have occurred among the plurality of participants, based on changes in the maximum speaker across the first time ranges; and
outputting, in association with one participant among the plurality of participants, information regarding the amount of speech of the one participant in a plurality of the discussions in which the one participant participated and for which the degree of similarity of the transition information is greater than a predetermined value.

A discussion analysis method in which a processor executes:
acquiring the amount of speech of each of a plurality of participants in a discussion in which the plurality of participants participate;
identifying, for each first time range in the discussion, a maximum speaker having the maximum amount of speech among the plurality of participants;
generating transition information indicating transitions of speakers that have occurred among the plurality of participants, based on changes in the maximum speaker across the first time ranges; and
outputting, in association with one participant among the plurality of participants, information regarding the amount of speech of the one participant in a plurality of the discussions in which the one participant participated and for which the degree of similarity of the transition information is smaller than a predetermined value.