JP2020148930A

JP2020148930A - Voice analyzer, voice analysis system and voice analysis method

Info

Publication number: JP2020148930A
Application number: JP2019046989A
Authority: JP
Inventors: 武志水本; Takeshi Mizumoto; 哲也菅原; Tetsuya Sugawara
Original assignee: Hylable Inc
Current assignee: Hylable Inc
Priority date: 2019-03-14
Filing date: 2019-03-14
Publication date: 2020-09-17
Anticipated expiration: 2039-03-14
Also published as: JP7261462B2

Abstract

To reduce the cost of analyzing voice in discussions involving multiple participants. SOLUTION: A voice analyzer 1 according to an embodiment of the present invention analyzes voices emitted by multiple participants around a sound collecting device, and includes: a receiving unit 111 which receives, from a communication terminal, sound collecting device identification information which can identify the sound collecting device, participant identification information which can identify each participant, and position designation information which specifies a position with respect to the sound collecting device, all of which are obtained by the communication terminal; a position setting unit 112 which sets the position of each of the multiple participants with respect to the sound collecting device, based on the sound collecting device identification information, the participant identification information, and the position designation information received by the receiving unit 111; a voice acquisition unit 113 which acquires the voice from the sound collecting device; and a voice analysis unit 114 which, based on the positions of the multiple participants set by the position setting unit 112, analyzes the voice emitted by each of the multiple participants included in the voice acquired by the voice acquisition unit 113. SELECTED DRAWING: Figure 2

Description

The present invention relates to a voice analyzer, a voice analysis system, and a voice analysis method for analyzing the voice of a discussion among a plurality of participants.

Patent Document 1 describes a system for discussions such as conferences with a plurality of participants, in which a device equipped with a camera and a microphone is placed on a table. The system identifies the direction of the speaker using face images captured by the camera, identifies the utterance content using the voice acquired by the microphone, and records the speaker and the utterance content in association with each other.

Japanese Unexamined Patent Publication No. 2005-274707

The system described in Patent Document 1 requires a device integrating a camera and a microphone for each discussion group, so it is costly to introduce. When analyzing the voice of active learning by students, or of meetings in an organization, many groups are expected to hold discussions in parallel, so there is a need to enable voice analysis at low cost.

The present invention has been made in view of these points, and its object is to reduce the cost of analyzing voice in a discussion in which a plurality of participants take part.

A voice analyzer according to a first aspect of the present invention analyzes voices emitted by a plurality of participants around a sound collecting device, and includes: a receiving unit that receives, from a communication terminal, sound collecting device identification information capable of identifying the sound collecting device, participant identification information capable of identifying the participants, and position designation information designating a position with respect to the sound collecting device, all acquired by the communication terminal; a position setting unit that sets the position of each of the plurality of participants with respect to the sound collecting device based on the sound collecting device identification information, the participant identification information, and the position designation information received by the receiving unit; a voice acquisition unit that acquires the voice from the sound collecting device; and a voice analysis unit that, based on the positions of the plurality of participants set by the position setting unit, analyzes the voice emitted by each of the plurality of participants included in the voice acquired by the voice acquisition unit.

The receiving unit may receive, from the communication terminal, the sound collecting device identification information acquired by the communication terminal reading a first identification information presenting unit that is attached to the sound collecting device and is capable of identifying the sound collecting device.

The receiving unit may receive, from the communication terminal, the participant identification information acquired by the communication terminal reading a second identification information presenting unit that is held by each of the plurality of participants and is capable of identifying that participant.

The receiving unit may receive, from the communication terminal, the participant identification information acquired by the communication terminal reading biometric information of each of the plurality of participants.

Based on the position designation information, the position setting unit may select the seat position of each of the plurality of participants from a preset seat arrangement around the sound collecting device, and set the selected seat position as the position of that participant.

The receiving unit may receive, as the position designation information, the order in which the communication terminal acquired the participant identification information of each of the plurality of participants, and the position setting unit may select the seat position of each of the plurality of participants in the seat arrangement based on the order received by the receiving unit.

The receiving unit may receive an operation on the communication terminal as the position designation information, and the position setting unit may select the seat position of each of the plurality of participants in the seat arrangement based on the operation received by the receiving unit.

A voice analysis system according to a second aspect of the present invention includes a voice analyzer that analyzes voices emitted by a plurality of participants around a sound collecting device, and a communication terminal capable of communicating with the voice analyzer. The communication terminal includes: an acquisition unit that acquires sound collecting device identification information capable of identifying the sound collecting device, participant identification information capable of identifying the participants, and position designation information designating a position with respect to the sound collecting device; and a transmission unit that transmits the sound collecting device identification information, the participant identification information, and the position designation information. The voice analyzer includes: a receiving unit that receives, from the communication terminal, the sound collecting device identification information, the participant identification information, and the position designation information acquired by the communication terminal; a position setting unit that sets the position of each of the plurality of participants with respect to the sound collecting device based on the sound collecting device identification information, the participant identification information, and the position designation information received by the receiving unit; a voice acquisition unit that acquires the voice from the sound collecting device; and a voice analysis unit that, based on the positions of the plurality of participants set by the position setting unit, analyzes the voice emitted by each of the plurality of participants included in the voice acquired by the voice acquisition unit.

A voice analysis method according to a third aspect of the present invention analyzes voices emitted by a plurality of participants around a sound collecting device, and includes the following steps executed by a processor: a step of receiving, from a communication terminal, sound collecting device identification information capable of identifying the sound collecting device, participant identification information capable of identifying the participants, and position designation information designating a position with respect to the sound collecting device, all acquired by the communication terminal; a step of setting the position of each of the plurality of participants with respect to the sound collecting device based on the sound collecting device identification information, the participant identification information, and the position designation information received in the receiving step; a step of acquiring the voice from the sound collecting device; and a step of analyzing, based on the positions of the plurality of participants set in the setting step, the voice emitted by each of the plurality of participants included in the voice acquired in the acquiring step.

According to the present invention, the cost of analyzing voice in a discussion in which a plurality of participants take part can be reduced.

FIG. 1 is a schematic diagram of the voice analysis system according to the embodiment.
FIG. 2 is a block diagram of the voice analysis system according to the embodiment.
FIG. 3 is a front view of the communication terminal displaying the seat arrangement setting screen.
FIG. 4 is a schematic diagram for explaining a method of registering the sound collecting device used to acquire the voice of a discussion.
FIG. 5 is a schematic diagram for explaining a method of registering the participants who take part in a discussion.
FIG. 6 is a front view of the communication terminal displaying the participant registration screen.
FIG. 7 is a front view of the communication terminal displaying the analysis result screen.
FIG. 8 is a flowchart of the voice analysis method performed by the voice analysis system.

[Overview of voice analysis system SS]
FIG. 1 is a schematic diagram of a voice analysis system SS according to the present embodiment. The voice analysis system SS includes a voice analyzer 1, a communication terminal 2, and a sound collecting device 3. The number of communication terminals 2 and sound collecting devices 3 included in the voice analysis system SS is not limited. The voice analysis system SS may include other devices such as servers and terminals.

The sound collecting device 3 includes a microphone array made up of a plurality of sound collecting units (microphones) arranged in different orientations. For example, the microphone array includes eight microphones arranged at equal intervals on the same circumference in a plane horizontal to the ground. By using such a microphone array, the voice analyzer 1 can identify which participant U is the speaker (sound source) based on the voices emitted by the plurality of participants U surrounding the sound collecting device 3. The sound collecting device 3 transmits the voice acquired with the microphone array to the voice analyzer 1 as data.
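The microphone layout described above can be sketched as follows. This is only an illustration: the function name `microphone_positions` and the 5 cm radius are assumptions, since the text states only that eight microphones sit at equal intervals on one circle in a horizontal plane.

```python
import math


def microphone_positions(num_mics=8, radius_m=0.05):
    """Return (x, y) coordinates of microphones evenly spaced on a circle.

    radius_m is a hypothetical value; the source does not give a radius.
    """
    positions = []
    for i in range(num_mics):
        # Equal angular spacing: 360 / num_mics degrees between neighbours.
        angle = 2.0 * math.pi * i / num_mics
        positions.append((radius_m * math.cos(angle), radius_m * math.sin(angle)))
    return positions
```

With eight microphones, neighbouring microphones are 45 degrees apart, which is what lets an array of this kind distinguish sound arriving from different directions around the device.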

The communication terminal 2 is a computer capable of communication. The communication terminal 2 is, for example, a computer terminal such as a personal computer, or a mobile terminal such as a smartphone. The communication terminal 2 sets analysis conditions for the voice analyzer 1 and displays information received from the voice analyzer 1.

The voice analyzer 1 is a computer that analyzes voice using the voice acquired by the sound collecting device 3. The voice analyzer 1 is composed of, for example, a single computer or a cloud, which is a collection of computer resources.

The voice analyzer 1 is connected to the communication terminal 2 and the sound collecting device 3 by wire or wirelessly via a network N such as a local area network or the Internet. The voice analyzer 1 may be connected directly to at least one of the communication terminal 2 and the sound collecting device 3 without going through the network N.

The sound collecting device 3 may communicate over a wireless LAN (Local Area Network) with a management terminal placed in each room or building where discussions are held, and exchange data with the voice analyzer 1 via the management terminal. Alternatively, the sound collecting device 3 may include a SIM (Subscriber Identity Module) for mobile communication and communicate with the voice analyzer 1 by mobile communication. In this case, no wireless LAN needs to be set up between the sound collecting device 3 and the management terminal, so the system is less affected by the network configuration and radio congestion at the place where the sound collecting device 3 is installed.

An overview of the processing executed by the voice analysis system SS is given below. Before starting a discussion, a participant U has the communication terminal 2 read the tag T attached to the sound collecting device 3. The participant U also has the communication terminal 2 read a card C of the participant U, such as a student ID card or an employee ID card.

The communication terminal 2 transmits to the voice analyzer 1 the identification information of the sound collecting device 3 indicated by the read tag T, the identification information of the participant U indicated by the read card C, and information designating a position with respect to the sound collecting device 3. The position with respect to the sound collecting device 3 is indicated by the order in which the communication terminal 2 read the cards C of the participants U, or by an operation performed by a participant U on the communication terminal 2.
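As a rough sketch of what the terminal might send in the case where position is indicated by card reading order, the following builds a plain dictionary. The function name `build_registration` and the field names `collector_id`, `participant_id`, and `read_order` are hypothetical; the source does not define a message format.

```python
def build_registration(collector_id, card_reads):
    """Build a registration message for the voice analyzer.

    collector_id: ID read from the tag on the sound collecting device.
    card_reads: participant IDs, listed in the order the cards were read.
    All field names here are illustrative assumptions.
    """
    participants = []
    for order, participant_id in enumerate(card_reads, start=1):
        # The reading order doubles as the position designation information.
        participants.append({"participant_id": participant_id, "read_order": order})
    return {"collector_id": collector_id, "participants": participants}
```

For example, `build_registration("collector-01", ["u-17", "u-04"])` records `u-17` with reading order 1 and `u-04` with reading order 2.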

Based on the identification information of the sound collecting device 3, the identification information of the participants U, and the information designating positions with respect to the sound collecting device 3 received from the communication terminal 2, the voice analyzer 1 sets the position of each of the plurality of participants U with respect to the sound collecting device 3. The voice analyzer 1 then analyzes the voice acquired from the sound collecting device 3 based on the set positions of the plurality of participants U, and outputs the analysis result using the communication terminal 2.
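One plausible way to use the set positions is sketched below, under stated assumptions: seats are represented as angles in degrees around the sound collecting device, the n-th registered participant takes the n-th seat, and a sound whose estimated direction is already known is attributed to the participant at the nearest seat angle. The source does not specify the attribution rule at this point, and the function names are hypothetical.

```python
def set_positions(seat_angles_deg, participant_ids_in_order):
    """Assign the n-th registered participant to the n-th seat angle."""
    if len(participant_ids_in_order) > len(seat_angles_deg):
        raise ValueError("more participants than seats")
    return dict(zip(participant_ids_in_order, seat_angles_deg))


def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def nearest_participant(positions, source_angle_deg):
    """Return the participant whose seat angle is closest to the sound source."""
    return min(
        positions,
        key=lambda pid: angular_distance(positions[pid], source_angle_deg),
    )
```

The wrap-around handling in `angular_distance` matters: a source at 350 degrees is 10 degrees from a seat at 0 degrees, not 350.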

The voice analysis system SS according to the present embodiment identifies the positions of the plurality of participants U relative to the sound collecting device 3 based on information acquired with the communication terminal 2, and, based on the identified positions, analyzes the voice acquired with the sound collecting device 3 to analyze the utterances of each of the plurality of participants U. Because the voice analysis system SS does not need a camera on the sound collecting device 3, it can reduce the cost of analyzing voice in a discussion in which a plurality of participants U take part. In addition, because the voice analysis system SS does not need a microphone for each participant U, it removes the effort of setting up a separate microphone for every participant U.

[Configuration of voice analysis system SS]
FIG. 2 is a block diagram of the voice analysis system SS according to the present embodiment. In FIG. 2, the arrows indicate the main data flows, and there may be data flows not shown in FIG. 2. Each block in FIG. 2 represents a functional unit, not a hardware (device) unit. The blocks shown in FIG. 2 may therefore be implemented in a single device or split across a plurality of devices. Data may be exchanged between blocks by any means, such as a data bus, a network, or a portable storage medium.

The voice analyzer 1 has a control unit 11 and a storage unit 12. The control unit 11 has a receiving unit 111, a position setting unit 112, a voice acquisition unit 113, a voice analysis unit 114, and an output unit 115. The storage unit 12 has a setting information storage unit 121 and an analysis result storage unit 122.

The storage unit 12 is a storage medium including a ROM (Read Only Memory), a RAM (Random Access Memory), a hard disk drive, and the like. The storage unit 12 stores in advance the program executed by the control unit 11. The storage unit 12 may be provided outside the voice analyzer 1, in which case it may exchange data with the control unit 11 via a network.

The setting information storage unit 121 stores setting information used for voice analysis. The analysis result storage unit 122 stores voice analysis results. The setting information storage unit 121 and the analysis result storage unit 122 may each be a storage area on the storage unit 12, or a database configured on the storage unit 12.

The control unit 11 is, for example, a processor such as a CPU (Central Processing Unit), and functions as the receiving unit 111, the position setting unit 112, the voice acquisition unit 113, the voice analysis unit 114, and the output unit 115 by executing the program stored in the storage unit 12. At least some of the functions of the control unit 11 may be performed by an electric circuit. At least some of the functions of the control unit 11 may also be performed by a program executed via a network.

The communication terminal 2 has a control unit 21, a storage unit 22, a reading unit 23, and a display unit 24. The control unit 21 has an acquisition unit 211, a transmission unit 212, and a receiving unit 213. The display unit 24 includes a display device capable of displaying information, such as a liquid crystal display. A touch screen capable of detecting the position of a human touch may be used as the display unit 24.

The reading unit 23 is configured according to the method, described later, by which the sound collecting device ID and the participant ID are acquired. When the sound collecting device ID and the participant ID are recorded on IC (Integrated Circuit) chips mounted on the tag T and the card C, the reading unit 23 includes a short-range wireless communication device capable of reading the information recorded on the IC chips by short-range wireless communication. The short-range wireless communication is, for example, NFC (Near Field Communication). When the sound collecting device ID and the participant ID are represented by codes on the tag T and the card C, the reading unit 23 includes an imaging device capable of capturing images of the codes. The code is, for example, a bar code or a two-dimensional code.

When the participant ID is acquired by recognizing a participant's face, the reading unit 23 includes an imaging device capable of capturing an image of the participant's face. When the participant ID is acquired by recognizing a participant's fingerprint, the reading unit 23 includes a fingerprint scanner.

The storage unit 22 is a storage medium including a ROM, a RAM, a hard disk drive, and the like. The storage unit 22 stores in advance the program executed by the control unit 21. The storage unit 22 may be provided outside the communication terminal 2, in which case it may exchange data with the control unit 21 via a network.

The control unit 21 is, for example, a processor such as a CPU, and functions as the acquisition unit 211, the transmission unit 212, and the receiving unit 213 by executing the program stored in the storage unit 22. At least some of the functions of the control unit 21 may be performed by an electric circuit. At least some of the functions of the control unit 21 may also be performed by a program executed via a network.

The voice analyzer 1 and the communication terminal 2 according to the present embodiment are not limited to the specific configuration shown in FIG. 2. Each of the voice analyzer 1 and the communication terminal 2 is not limited to a single device, and may be configured by connecting two or more physically separate devices by wire or wirelessly.

[Description of voice analysis method]
The voice analysis method performed by the voice analysis system SS according to the present embodiment is described below. Before a discussion starts, when a predetermined operation is performed, the communication terminal 2 displays on the display unit 24 a seat arrangement setting screen for setting the seat arrangement around the sound collecting device 3. On the communication terminal 2, the voice analyst performs an operation on the seat arrangement setting screen to set the seat arrangement around the sound collecting device 3. A participant in the discussion, rather than the analyst, may operate the communication terminal 2.

FIG. 3 is a front view of the communication terminal 2 displaying the seat arrangement setting screen. The communication terminal 2 displays a setting area 241. The setting area 241 is a virtual circle centered on the sound collecting device 3, and angles measured about the sound collecting device 3 are shown around the setting area 241. Inside the setting area 241, each seat position is represented by a circle, and a seat number 242 and a cancel button 243 are shown near each seat position. Seat numbers 242 are assigned to the configured seats sequentially in a predetermined direction (for example, clockwise) with respect to the sound collecting device 3.
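The sequential seat numbering can be illustrated with a short sketch. It assumes that each seat is stored as an angle in degrees that increases clockwise from a fixed reference direction, so that sorting by angle yields clockwise order; the function name `number_seats` and that angle convention are assumptions, not details from the source.

```python
def number_seats(seat_angles_deg):
    """Number seats 1, 2, 3, ... in order of increasing (clockwise) angle.

    Returns a dict mapping seat number to seat angle in [0, 360).
    """
    ordered = sorted(angle % 360.0 for angle in seat_angles_deg)
    return {number: angle for number, angle in enumerate(ordered, start=1)}
```

Because the numbers are derived from the current set of seats, adding or removing a seat simply means calling the function again to renumber the rest.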

When the analyst presses a point inside the setting area 241, the communication terminal 2 sets a new seat at that position and adds a circle indicating the seat position to the setting area 241. When the analyst presses one of the cancel buttons 243, the communication terminal 2 deletes or disables the seat corresponding to that cancel button 243, and erases or disables the circle indicating the seat position in the setting area 241.

When the analyst presses the completion button 244 on the seat arrangement setting screen, the acquisition unit 211 of the communication terminal 2 acquires the configured seat arrangement (for example, the angle of each of the plurality of seats about the sound collecting device 3). The operations shown here are only an example, and the communication terminal 2 may set the seat arrangement through other operations by the analyst.
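The press, cancel, and complete operations above could be modelled roughly as follows. This is a sketch under assumptions: screen coordinates with y growing downward, angles measured clockwise from the upward direction on the screen, and a hypothetical class name `SeatLayoutEditor`. None of these conventions are given in the source.

```python
import math


class SeatLayoutEditor:
    """Collects seat angles from presses around the centre of the setting area."""

    def __init__(self, center_x, center_y):
        self.center = (center_x, center_y)
        self.angles = []

    def press(self, x, y):
        """Add a seat at the pressed point and return its angle in degrees."""
        dx = x - self.center[0]
        dy = self.center[1] - y  # flip y so that "up" on the screen is positive
        # atan2(dx, dy) gives 0 for straight up and grows clockwise.
        angle = math.degrees(math.atan2(dx, dy)) % 360.0
        self.angles.append(angle)
        return angle

    def cancel(self, index):
        """Delete the seat at the given index (the cancel button case)."""
        del self.angles[index]

    def complete(self):
        """Return the configured seat angles, as when the completion button is pressed."""
        return sorted(self.angles)
```

Only the angle is kept for each seat, which matches the example in the text where the acquired seat arrangement is the angle of each seat about the sound collecting device.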

The transmission unit 212 of the communication terminal 2 transmits information indicating the seat arrangement acquired by the acquisition unit 211 to the voice analyzer 1. The receiving unit 111 of the voice analyzer 1 stores the information indicating the seat arrangement received from the communication terminal 2 in the setting information storage unit 121.

The seat arrangement may be set for all sound collecting devices 3 at once. In this case, the information indicating the seat arrangement is associated with the sound collecting device IDs of all sound collecting devices 3 in the setting information storage unit 121. Alternatively, the seat arrangement may be set for each sound collecting device 3. In this case, the analyst performs an operation on the seat arrangement setting screen to designate one or more sound collecting device IDs to be configured. The information indicating the seat arrangement is then associated with the designated sound collecting device IDs in the setting information storage unit 121.
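The two ways of associating a seat arrangement with sound collecting device IDs can be sketched with an in-memory store. The class `SettingStore` and its method names are hypothetical stand-ins for the setting information storage unit 121, which the source says may be a storage area or a database.

```python
class SettingStore:
    """Maps each known sound collecting device ID to a list of seat angles."""

    def __init__(self, collector_ids):
        self._layouts = {cid: None for cid in collector_ids}

    def set_for_all(self, seat_angles_deg):
        """Associate one seat arrangement with every sound collecting device."""
        for cid in self._layouts:
            self._layouts[cid] = list(seat_angles_deg)

    def set_for(self, collector_ids, seat_angles_deg):
        """Associate a seat arrangement with the designated devices only."""
        for cid in collector_ids:
            if cid not in self._layouts:
                raise KeyError(cid)
            self._layouts[cid] = list(seat_angles_deg)

    def layout_of(self, collector_id):
        return self._layouts[collector_id]
```

A typical use would be to apply a default arrangement with `set_for_all` and then override individual tables with `set_for`.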

When a discussion starts, the plurality of participants sit around one sound collecting device 3. One of the participants in the discussion performs an operation on the communication terminal 2 to register the sound collecting device 3 and the participants. The analyst, rather than a participant, may operate the communication terminal 2. To register the sound collecting device 3 and the participants, the participant may use the same communication terminal 2 that was used to set the seat arrangement described above, or a different communication terminal 2.

First, the voice analysis system SS executes a process of registering the sound collecting device 3 used to acquire the voice of the discussion. FIG. 4 is a schematic diagram for explaining a method of registering the sound collecting device 3 used to acquire the voice of the discussion. Before the discussion starts, when a predetermined operation is performed, the communication terminal 2 displays on the display unit 24 a sound collecting device registration screen for registering the sound collecting device 3. While the communication terminal 2 is displaying the sound collecting device registration screen, the participant has the communication terminal 2 read the tag T attached to the sound collecting device 3.

The upper part of FIG. 4 shows the process in which the communication terminal 2 reads the tag T attached to the sound collecting device 3. The tag T is attached to the upper part of the sound collecting device 3. The tag T is an identification information presenting unit that presents sound collecting device identification information (a sound collecting device ID) capable of identifying the sound collecting device 3. For example, the sound collecting device ID may be a number or character string assigned to the sound collecting device 3 in advance by the analyst, or may be a serial number or MAC address (Media Access Control address) unique to the sound collecting device 3.

The tag T may carry an IC chip for short-range wireless communication (NFC or the like) on which the sound collecting device ID is recorded. In this case, when the participant brings the communication terminal 2 close to the tag T, a short-range wireless communication device serving as the reading unit 23 of the communication terminal 2 reads the information recorded on the IC chip of the tag T by short-range wireless communication. The acquisition unit 211 of the communication terminal 2 acquires the sound collecting device ID indicated by the information that the reading unit 23 read from the IC chip of the tag T.

The tag T may instead show on its surface a code (a barcode, a two-dimensional code, or the like) generated by encoding the sound collecting device ID according to a predetermined rule. In this case, when the participant brings the communication terminal 2 close to the tag T, an imaging device serving as the reading unit 23 of the communication terminal 2 captures an image of the code on the tag T. The acquisition unit 211 of the communication terminal 2 acquires the sound collecting device ID by decoding, according to the predetermined rule, the code of the tag T contained in the image captured by the reading unit 23.

The lower part of FIG. 4 shows the communication terminal 2 displaying the sound collecting device registration screen. On the sound collecting device registration screen, the communication terminal 2 displays the same setting area 241 as in FIG. 3. When the acquisition unit 211 has acquired a sound collecting device ID, the communication terminal 2 displays, in the center of the setting area 241, a label 245 representing the acquired sound collecting device ID. The label 245 may show the whole sound collecting device ID or only a part of it.

When the participant presses the completion button 244 on the sound collecting device registration screen, the voice analysis system SS proceeds to the process of registering the participants in the discussion. FIGS. 5(a) to 5(c) are schematic diagrams for explaining methods of registering the participants in the discussion. When the completion button 244 is pressed on the sound collecting device registration screen shown in FIG. 4, the communication terminal 2 displays on the display unit 24 a participant registration screen for registering, for the sound collecting device ID acquired by the acquisition unit 211, the plurality of participants in the discussion. While the communication terminal 2 is displaying the participant registration screen, the participant has the communication terminal 2 read information for identifying each participant by any of the methods shown in FIGS. 5(a) to 5(c).

FIG. 5(a) shows the process in which the communication terminal 2 reads a card C held by a participant. The card C is an identification information presenting unit that presents participant identification information (a participant ID) capable of identifying the participant. For example, the participant ID may be a number or character string assigned to the participant in advance by the analyst, or may be the employee number of the company to which the participant belongs or the student number of the school to which the participant belongs. The voice analyzer 1 stores the participant ID in the storage unit 12 in advance, in association with the participant information (for example, name and affiliation).

The card C may carry an IC chip for short-range wireless communication (NFC or the like) on which the participant ID is recorded. In this case, when the participant brings the communication terminal 2 close to the card C, the short-range wireless communication device serving as the reading unit 23 of the communication terminal 2 reads the information recorded on the IC chip of the card C by short-range wireless communication. The acquisition unit 211 of the communication terminal 2 acquires the participant ID indicated by the information that the reading unit 23 read from the IC chip of the card C.

The card C may instead show on its surface a code (a barcode, a two-dimensional code, or the like) generated by encoding the participant ID according to a predetermined rule. In this case, when the participant brings the communication terminal 2 close to the card C, the imaging device serving as the reading unit 23 of the communication terminal 2 captures an image of the code on the card C. The acquisition unit 211 of the communication terminal 2 acquires the participant ID by decoding, according to the predetermined rule, the code of the card C contained in the image captured by the reading unit 23.

FIG. 5(b) shows the process in which the communication terminal 2 reads a participant's face. The communication terminal 2 acquires the participant ID by using known face recognition processing. In this case, the individuals identified by the face recognition processing are associated with participant IDs in advance in the storage unit 22 of the communication terminal 2. The imaging device serving as the reading unit 23 of the communication terminal 2 captures an image of an area containing the face of one participant. The acquisition unit 211 of the communication terminal 2 performs face recognition processing on the image captured by the reading unit 23 and thereby identifies the individual participant contained in the image. The acquisition unit 211 then acquires, from the storage unit 22, the participant ID associated with the individual identified by the face recognition processing.

The face recognition processing may be performed by the voice analyzer 1 instead of the communication terminal 2. In this case, the communication terminal 2 transmits the captured image of the area containing the participant's face to the voice analyzer 1, and receives from the voice analyzer 1 the participant ID that the voice analyzer 1 acquired based on that image.

FIG. 5(c) shows the process in which the communication terminal 2 reads a participant's fingerprint. The communication terminal 2 acquires the participant ID by using known fingerprint authentication processing. In this case, the individuals identified by the fingerprint authentication processing are associated with participant IDs in advance in the storage unit 22 of the communication terminal 2. A fingerprint scanner serving as the reading unit 23 of the communication terminal 2 captures an image of the fingerprint of one participant. The acquisition unit 211 of the communication terminal 2 performs fingerprint authentication processing on the image captured by the reading unit 23 and thereby identifies the individual whose fingerprint is contained in the image. The acquisition unit 211 then acquires, from the storage unit 22, the participant ID associated with the individual identified by the fingerprint authentication processing.

The fingerprint authentication processing may be performed by the voice analyzer 1 instead of the communication terminal 2. In this case, the communication terminal 2 transmits the captured image of the area containing the participant's fingerprint to the voice analyzer 1, and receives from the voice analyzer 1 the participant ID that the voice analyzer 1 acquired based on that image.

The configurations shown in FIGS. 5(b) and 5(c), which read biometric information such as a participant's face or fingerprint, have the advantage that the voice analysis system SS can identify the participant ID even when the participant does not have the card C.

If the participant information associated with a participant ID acquired as in FIGS. 5(a) to 5(c) is not registered in the voice analyzer 1, the communication terminal 2 may accept registration of the participant information. In this case, the communication terminal 2 asks the voice analyzer 1 whether the acquired participant ID is stored in the storage unit 12. If the participant ID is not stored in the storage unit 12 of the voice analyzer 1, the communication terminal 2 displays on the display unit 24 a participant information input screen (not shown) for entering the participant information (name, affiliation, and the like).

The participant enters the participant information into the communication terminal 2 displaying the participant information input screen. The transmission unit 212 of the communication terminal 2 transmits the participant information entered on the participant information input screen, together with the participant ID, to the voice analyzer 1. The receiving unit 111 of the voice analyzer 1 stores the participant ID and the participant information received from the communication terminal 2 in the storage unit 12 in association with each other. As a result, even a participant who is not yet registered in the voice analyzer 1 can easily take part in the discussion.

Alternatively, if the participant information associated with the participant ID is not registered in the voice analyzer 1, the communication terminal 2 may refuse to register the participant ID without accepting registration of the participant information. In this case, the communication terminal 2 asks the voice analyzer 1 whether the acquired participant ID is stored in the storage unit 12. If the participant ID is not stored in the storage unit 12 of the voice analyzer 1, the communication terminal 2 displays on the display unit 24 information indicating that the participant ID is not registered, and refuses to register the participant ID.
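The two ways of handling an unregistered participant ID described above (accepting new participant information, or refusing the ID) can be illustrated with a minimal Python sketch. The names `ParticipantStore` and `handle_acquired_id`, the `allow_registration` flag, and the returned status strings are hypothetical and do not appear in the specification; the in-memory dictionary merely stands in for the storage unit 12.

```python
from dataclasses import dataclass, field


@dataclass
class ParticipantStore:
    """Hypothetical stand-in for storage unit 12: participant ID -> info."""
    records: dict = field(default_factory=dict)

    def is_registered(self, participant_id):
        return participant_id in self.records


def handle_acquired_id(store, participant_id, allow_registration, entered_info=None):
    """Return 'known', 'registered', or 'rejected' for an acquired ID.

    allow_registration selects between the two behaviours in the text:
    True  -> accept info entered on the participant information input screen
    False -> refuse to register an unknown participant ID
    """
    if store.is_registered(participant_id):
        return "known"
    if allow_registration and entered_info is not None:
        # Store the ID and the entered info in association with each other.
        store.records[participant_id] = entered_info
        return "registered"
    return "rejected"
```

For example, with a store that already holds `"P001"`, calling `handle_acquired_id(store, "P002", False)` yields `"rejected"`, while passing `True` together with entered information stores the new participant.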

FIGS. 6(a) and 6(b) are front views of the communication terminal 2 displaying the participant registration screen. The voice analyzer 1 sets the positions of the plurality of participants based either on the order in which the acquisition unit 211 acquired the participant IDs or on operations that the participant performs on the communication terminal 2.

FIG. 6(a) shows the participant registration screen for the case where the voice analyzer 1 sets the positions of the plurality of participants based on the order in which the acquisition unit 211 acquired the participant IDs. On the participant registration screen, the communication terminal 2 displays the same setting area 241 as in FIG. 3.

In the setting area 241, the seat positions set on the seat arrangement setting screen of FIG. 3 are represented by circles. The communication terminal 2 assigns participant IDs 246 to the seats, in the order in which the acquisition unit 211 acquired the participant IDs, proceeding in a predetermined direction D (for example, clockwise) around the sound collecting device 3, and displays them.

When the participant presses the completion button 244 on the participant registration screen, the transmission unit 212 of the communication terminal 2 transmits to the voice analyzer 1, in association with one another, the sound collecting device ID acquired by the acquisition unit 211 (that is, the sound collecting device identification information), the plurality of participant IDs acquired by the acquisition unit 211 (that is, the participant identification information), and the order in which the acquisition unit 211 acquired the participant IDs of the respective participants (that is, the position designation information).

In the voice analyzer 1, the receiving unit 111 receives the sound collecting device ID acquired by the communication terminal 2, the participant IDs acquired by the communication terminal 2, and the order in which the communication terminal 2 acquired the participant IDs of the respective participants.

The position setting unit 112 acquires the seat arrangement associated with the sound collecting device ID in the setting information storage unit 121. Within the acquired seat arrangement, the position setting unit 112 then selects a seat for each of the plurality of participants, proceeding in the predetermined direction D around the sound collecting device 3 in the order in which the communication terminal 2 acquired the participant IDs. The position setting unit 112 then sets the position of each participant by storing, in the setting information storage unit 121, the sound collecting device ID, the plurality of participant IDs, and the position of the participant for each participant ID (that is, the position of the seat selected for each participant) in association with one another. A participant's position is represented, for example, by an angle in the horizontal plane centered on the sound collecting device 3. In addition, when there are more seats than participants, the position setting unit 112 stores in the setting information storage unit 121 information that marks the seats to which no participant was assigned as vacant.
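As an illustration of the order-based assignment just described, the following Python sketch assigns participant IDs to seats in acquisition order and marks the remaining seats as vacant. It assumes, purely for illustration, that each seat is represented by an angle in degrees that increases in the direction D and that assignment starts from the smallest angle; the function name `assign_seats_in_order` and this data layout are hypothetical and are not taken from the specification.

```python
def assign_seats_in_order(seat_angles, participant_ids):
    """Map each seat angle (degrees) to a participant ID, or None if vacant.

    seat_angles     -- seat positions as angles around the device
                       (assumed to increase in the direction D)
    participant_ids -- participant IDs in the order they were acquired
    """
    if len(participant_ids) > len(seat_angles):
        raise ValueError("more participants than seats")
    ordered = sorted(seat_angles)  # walk the seats in direction D
    # Every seat starts out vacant; assigned seats are overwritten below.
    assignment = {angle: None for angle in ordered}
    for angle, participant_id in zip(ordered, participant_ids):
        assignment[angle] = participant_id
    return assignment
```

With four seats at 0, 90, 180, and 270 degrees and three acquired IDs, the first three seats receive the IDs in order and the seat at 270 degrees is left as `None`, corresponding to a vacant seat.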

FIG. 6(b) shows the participant registration screen for the case where the voice analyzer 1 sets the positions of the plurality of participants based on operations that the participant performs on the communication terminal 2. On the participant registration screen, the communication terminal 2 displays the same setting area 241 as in FIG. 3 together with the participant IDs 247 acquired by the acquisition unit 211. Among the participant IDs 247 acquired by the acquisition unit 211, participant IDs whose seats have not yet been set and participant IDs whose seats have been set are displayed in different styles.

In the setting area 241, the seat positions set on the seat arrangement setting screen of FIG. 3 are represented by circles. When the participant performs an operation on the communication terminal 2 to designate the position of one of the seats, the communication terminal 2 assigns one participant ID 246 to the designated seat (for example, the topmost participant ID among those participant IDs 247 acquired by the acquisition unit 211 whose seats have not yet been set) and displays it. The participant repeats the seat designation operation for each of the plurality of participant IDs.

When the participant presses the completion button 244 on the participant registration screen, the transmission unit 212 of the communication terminal 2 transmits to the voice analyzer 1, in association with one another, the sound collecting device ID acquired by the acquisition unit 211 (that is, the sound collecting device identification information), the plurality of participant IDs acquired by the acquisition unit 211 (that is, the participant identification information), and information indicating the seat designation operation for each of the participant IDs (that is, the position designation information).

In the voice analyzer 1, the receiving unit 111 receives the sound collecting device ID acquired by the communication terminal 2, the participant IDs acquired by the communication terminal 2, and the information indicating the seat designation operation for each of the participant IDs.

The position setting unit 112 acquires the seat arrangement associated with the sound collecting device ID in the setting information storage unit 121. Within the acquired seat arrangement, the position setting unit 112 then selects a seat for each of the plurality of participants as indicated by the seat designation operation for each participant ID. The position setting unit 112 then sets the position of each participant by storing, in the setting information storage unit 121, the sound collecting device ID, the plurality of participant IDs, and the position of the participant for each participant ID (that is, the position of the seat selected for each participant) in association with one another. A participant's position is represented, for example, by an angle in the horizontal plane centered on the sound collecting device 3. In addition, when there are more seats than participants, the position setting unit 112 stores in the setting information storage unit 121 information that marks the seats to which no participant was assigned as vacant.
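The operation-based variant can be sketched in the same way. In the Python sketch below, each seat designation operation is modelled as a hypothetical `(participant_id, seat_index)` pair, and seats are again represented by angles in degrees; both representations, the function name `assign_seats_by_designation`, and the choice to reject a seat that is designated twice are assumptions made for illustration only.

```python
def assign_seats_by_designation(seat_angles, designations):
    """Return (positions, vacant_angles) from explicit seat designations.

    seat_angles  -- seat positions as angles in degrees (assumed layout)
    designations -- (participant_id, seat_index) pairs, one per operation
    """
    positions = {}
    taken = set()
    for participant_id, seat_index in designations:
        if not 0 <= seat_index < len(seat_angles):
            raise ValueError(f"no such seat: {seat_index}")
        if seat_index in taken:
            # Assumption: one seat cannot hold two participants.
            raise ValueError(f"seat {seat_index} designated twice")
        taken.add(seat_index)
        positions[participant_id] = seat_angles[seat_index]
    # Seats that no operation designated are reported as vacant.
    vacant = [a for i, a in enumerate(seat_angles) if i not in taken]
    return positions, vacant
```

The returned `positions` mapping plays the role of the participant positions stored with the sound collecting device ID, and `vacant` lists the seats that would be recorded as vacant.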

In this way, the voice analysis system SS sets the positions of the plurality of participants relative to the sound collecting device 3 based on the sound collecting device identification information, the participant identification information, and the position designation information acquired by using the communication terminal 2. The voice analysis system SS can therefore easily set the positions of the plurality of participants without providing a camera on the sound collecting device 3, which reduces the cost of analyzing the voice in a discussion in which a plurality of participants U take part.

When starting the discussion, the participant or the analyst instructs the start of the discussion by operating the communication terminal 2. In the voice analyzer 1, when the voice acquisition unit 113 receives a signal instructing the start of the discussion from the communication terminal 2, it transmits a signal instructing voice acquisition to the sound collecting device 3. The sound collecting device 3 starts acquiring voice when it receives the signal instructing voice acquisition from the voice analyzer 1.

The sound collecting device 3 acquires voice at each of its plurality of sound collecting units and records it internally as the voice of the channel corresponding to each sound collecting unit. The sound collecting device 3 then transmits the acquired voice of the plurality of channels to the voice analyzer 1. The sound collecting device 3 may transmit the acquired voice sequentially, or may transmit it in units of a predetermined amount or a predetermined duration. The sound collecting device 3 may also transmit the voice from the start to the end of acquisition all at once. In the voice analyzer 1, the voice acquisition unit 113 receives the voice from the sound collecting device 3 and stores it in the analysis result storage unit 122 in association with identification information for identifying the discussion (for example, a discussion ID). The discussion ID may be assigned to the discussion automatically, or may be entered by the participant or the analyst.

When ending the discussion, the participant or the analyst instructs the end of the discussion by operating the communication terminal 2. In the voice analyzer 1, when the voice acquisition unit 113 receives a signal instructing the end of the discussion from the communication terminal 2, it transmits a signal instructing the end of voice acquisition to the sound collecting device 3. The sound collecting device 3 ends the voice acquisition when it receives the signal instructing the end of voice acquisition from the voice analyzer 1.

The subsequent processing is triggered either by the end of voice acquisition or by the analyst giving a predetermined instruction to the communication terminal 2. The voice analysis unit 114 acquires, from the setting information storage unit 121, the plurality of participant IDs associated with the sound collecting device ID of the sound collecting device 3 from which the voice was acquired, and the position of the participant for each of those participant IDs. The voice analysis unit 114 performs sound source localization based on the multi-channel voice received from the sound collecting device 3. Sound source localization is a process of estimating, at each time step (for example, every 10 to 100 milliseconds), the direction of the sound source contained in the voice acquired by the voice acquisition unit 113. The voice analysis unit 114 associates the sound source direction estimated at each time step with the positions of the plurality of participants acquired from the setting information storage unit 121.

The voice analysis unit 114 can use any known sound source localization method, such as the MUSIC (Multiple Signal Classification) method or a beamforming method, as long as the method can identify the direction of the sound source based on the acquired voice.
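The association between an estimated sound source direction and the stored participant positions can be pictured with the Python sketch below. It does not perform sound source localization itself; it assumes a direction in degrees has already been estimated and picks the participant whose stored angle is closest. The nearest-angle rule, the `tolerance_deg` parameter, and the function names are assumptions for illustration and are not stated in the specification.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def nearest_participant(direction_deg, positions, tolerance_deg=30.0):
    """Return the participant ID closest to the estimated direction.

    positions     -- participant ID -> angle in degrees around the device
    tolerance_deg -- hypothetical cut-off; directions farther than this
                     from every participant are attributed to nobody
    """
    best_id, best_dist = None, tolerance_deg
    for participant_id, angle in positions.items():
        dist = angular_distance(direction_deg, angle)
        if dist <= best_dist:
            best_id, best_dist = participant_id, dist
    return best_id
```

Using a circular distance means that a direction of 350 degrees is matched to a participant seated at 0 degrees rather than to one at 180 degrees.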

Next, based on the acquired voice and the estimated sound source directions, the voice analysis unit 114 determines which participant spoke at each predetermined time step (for example, every 10 to 100 milliseconds) of the discussion. The voice analysis unit 114 identifies, as an utterance period, each continuous period from when one participant starts speaking until that participant stops. When several participants speak at the same time, their utterance periods overlap at least in part. The voice analysis unit 114 stores the utterance periods identified in the discussion in the analysis result storage unit 122 in association with the discussion ID and the participant IDs.
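One possible way to turn per-step speaker decisions into utterance periods is sketched below in Python. The input format (one set of speaking participant IDs per time step) and the function name `utterance_periods` are hypothetical; the specification does not say how the per-step decisions are represented or whether short pauses are bridged, and this sketch simply closes a period as soon as a participant is absent from a step.

```python
def utterance_periods(active_per_step, step_ms):
    """Merge consecutive active steps into (start_ms, end_ms) periods.

    active_per_step -- list of sets; each set holds the IDs of the
                       participants judged to be speaking in that step
    step_ms         -- length of one time step in milliseconds
    Returns a dict: participant ID -> list of (start_ms, end_ms).
    """
    periods = {}
    open_start = {}  # participant ID -> start time of the ongoing period
    for i, active in enumerate(active_per_step):
        t = i * step_ms
        # Close periods of participants who stopped speaking at this step.
        for participant_id in list(open_start):
            if participant_id not in active:
                start = open_start.pop(participant_id)
                periods.setdefault(participant_id, []).append((start, t))
        # Open periods for participants who just started speaking.
        for participant_id in active:
            open_start.setdefault(participant_id, t)
    end = len(active_per_step) * step_ms
    for participant_id, start in open_start.items():
        periods.setdefault(participant_id, []).append((start, end))
    return periods
```

Because each participant is tracked independently, simultaneous speech naturally produces overlapping periods for different participants.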

Furthermore, based on the identified utterance periods, the voice analysis unit 114 obtains the time series of the utterance amount (also called the speech amount) of each of the plurality of participants in the discussion. Specifically, the voice analysis unit 114 divides the discussion into frames (time ranges) of a predetermined window width (for example, 30 seconds). The frames are offset from one another by a predetermined shift width (for example, 10 seconds) that is shorter than the window width, so that adjacent frames partly overlap in time.

The voice analysis unit 114 then calculates, as the utterance amount for each frame, the length of the participant's utterance periods within the frame (the total utterance time) divided by the window width. The voice analysis unit 114 calculates the utterance amount for each frame from the start time to the end time of the discussion for each of the plurality of participants. The voice analysis unit 114 stores information indicating the utterance amount for each frame of each participant in the discussion in the analysis result storage unit 122 in association with the discussion ID and the participant ID.
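The per-frame calculation described above amounts to, for a frame starting at time t with window width W, (total utterance time within [t, t + W]) / W. A minimal Python sketch for one participant follows. The default values mirror the 30-second window and 10-second shift given as examples in the text; the function name, the assumption that a participant's own utterance periods do not overlap one another, and the handling of frames that extend past the end of the discussion (they are kept and still divided by the full window width) are illustrative choices not specified in the source.

```python
def utterance_amounts(periods, discussion_end, window=30.0, shift=10.0):
    """Utterance amount per frame for one participant.

    periods        -- list of (start_sec, end_sec) utterance periods,
                      assumed not to overlap one another
    discussion_end -- end time of the discussion in seconds
                      (the discussion is taken to start at 0)
    Returns one value in [0, 1] per frame, frames starting every `shift`.
    """
    amounts = []
    frame_start = 0.0
    while frame_start < discussion_end:
        frame_end = frame_start + window
        # Sum the parts of each utterance period that fall inside the frame.
        spoken = sum(
            max(0.0, min(end, frame_end) - max(start, frame_start))
            for start, end in periods
        )
        amounts.append(spoken / window)
        frame_start += shift
    return amounts
```

For a participant who speaks from 0 to 15 seconds, the first frame (0 to 30 seconds) has an utterance amount of 0.5, and the overlapping frames that follow taper off as the period leaves the window.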

The output unit 115 outputs the voice analysis results. For example, the output unit 115 outputs, as an analysis result, information indicating the utterance amount of each of the plurality of participants. The output unit 115 may also output, as an analysis result, other information that can be derived from the voice acquired by the sound collecting device 3.

The output unit 115 may display the analysis results on the same communication terminal 2 that was used to register the sound collecting device 3 and the participants as described above, or on a different communication terminal 2. The output unit 115 is not limited to on-screen display, and may output the analysis results by printing them on paper with a printer, storing them as data in a storage medium, or transmitting them to an external destination via a communication line.

When the processing of the voice analysis unit 114 has finished, or when the analyst has instructed the communication terminal 2 to output the analysis results, the output unit 115 generates display information for displaying the analysis results based on the information stored in the analysis result storage unit 122, and transmits it to the communication terminal 2. The receiving unit 213 of the communication terminal 2 receives the display information transmitted by the voice analyzer 1 and displays the analysis result screen shown in FIG. 7 on the display unit 24.

FIG. 7 is a front view of the communication terminal 2 displaying the analysis result screen. The analysis result screen displays information on one discussion. The analysis result screen includes a graph 248 of the time-series utterance amounts of the plurality of participants. The graph 248 represents the utterance amounts of the plurality of participants as a stacked graph. The horizontal axis of the graph 248 is time, and the vertical axis is the utterance amount. Each region of the graph 248 is drawn with a different pattern for each of the plurality of participants. The analyst can thereby see the result of the voice analysis performed by the voice analyzer 1.

[Flow of voice analysis method]
FIG. 8 is a flowchart of the voice analysis method performed by the voice analysis system SS. Before the discussion starts, the acquisition unit 211 of the communication terminal 2 acquires the seat arrangement set on the seat arrangement setting screen for setting the seat arrangement. The transmission unit 212 of the communication terminal 2 transmits information indicating the seat arrangement acquired by the acquisition unit 211 to the voice analyzer 1. The receiving unit 111 of the voice analyzer 1 stores the information indicating the seat arrangement around the sound collecting device 3, received from the communication terminal 2, in the setting information storage unit 121 (S11).

When the discussion starts, the plurality of participants sit around one sound collecting device 3. One of the participants in the discussion performs, on the communication terminal 2, an operation for registering the sound collecting device 3 and the participants.

The acquisition unit 211 of the communication terminal 2 acquires the sound collecting device ID indicated by the information that the reading unit 23 has read from the IC chip of the tag T, or acquires the sound collecting device ID by decoding, according to a predetermined rule, the code of the tag T included in an image captured by the reading unit 23.

The acquisition unit 211 of the communication terminal 2 also acquires the participant ID indicated by the information that the reading unit 23 has read from the IC chip of the card C, or acquires the participant ID by decoding, according to a predetermined rule, the code of the card C included in an image captured by the reading unit 23. Alternatively, the acquisition unit 211 of the communication terminal 2 may acquire the participant ID by reading the biometric information of the participant using face recognition processing or fingerprint authentication processing.

The transmission unit 212 of the communication terminal 2 transmits the sound collecting device identification information (sound collecting device ID), the participant identification information (participant ID), and the position designation information acquired by the acquisition unit 211 to the voice analyzer 1 in association with one another. The position designation information is information indicating the order in which the acquisition unit 211 acquired the participant IDs of the plurality of participants, or information indicating an operation of designating a seat for each of the plurality of participant IDs.
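One way to picture the associated information sent by the transmission unit 212 is as a single structured message. The sketch below is only an illustration under assumptions: the specification does not define a message format, so the JSON encoding, the class names `ParticipantEntry` and `RegistrationMessage`, and all field names and ID values are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class ParticipantEntry:
    participant_id: str
    # Position designation information, variant 1: the order in which
    # the terminal acquired this participant ID (0-based here).
    acquisition_order: int
    # Position designation information, variant 2: a seat designated by
    # an operation on the terminal, if any (hypothetical seat label).
    designated_seat: Optional[str] = None


@dataclass
class RegistrationMessage:
    sound_collecting_device_id: str
    participants: List[ParticipantEntry]


def build_message(device_id: str, participant_ids: List[str]) -> RegistrationMessage:
    """Associate the device ID with participant IDs in acquisition order."""
    entries = [ParticipantEntry(pid, i) for i, pid in enumerate(participant_ids)]
    return RegistrationMessage(device_id, entries)


# Hypothetical IDs, read in this order by the terminal.
msg = build_message("mic-001", ["p-17", "p-03", "p-42"])
payload = json.dumps(asdict(msg))
```

Keeping the three kinds of information in one message means the receiving side never has to guess which participants belong to which sound collecting device.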

In the voice analyzer 1, the receiving unit 111 receives the sound collecting device identification information, the participant identification information, and the position designation information acquired by the communication terminal 2 (S12). The position setting unit 112 selects a seat for each of the plurality of participants based on the sound collecting device identification information, the participant identification information, and the position designation information received by the receiving unit 111, and sets the position of each of the plurality of participants using the position of the selected seat (S13).
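The seat selection in S13 can be sketched as below. This is an illustration under stated assumptions rather than the embodiment's implementation: seats are represented as angles in degrees around the sound collecting device, seats are filled in list order when the acquisition order is used, and the function name `set_positions` and its parameters are hypothetical.

```python
from typing import Dict, List, Optional


def set_positions(
    seat_angles: List[float],
    participant_ids_in_order: List[str],
    designated: Optional[Dict[str, int]] = None,
) -> Dict[str, float]:
    """Map each participant ID to a position (seat angle in degrees).

    seat_angles: preset seat arrangement around the sound collecting
                 device (assumed representation).
    participant_ids_in_order: participant IDs in acquisition order.
    designated: optional participant ID -> seat index chosen by an
                operation on the terminal; overrides the order.
    """
    if len(participant_ids_in_order) > len(seat_angles):
        raise ValueError("more participants than seats")
    positions = {}
    for order, pid in enumerate(participant_ids_in_order):
        if designated is not None and pid in designated:
            seat_index = designated[pid]
        else:
            # Assumption: the n-th acquired ID takes the n-th seat.
            seat_index = order
        positions[pid] = seat_angles[seat_index]
    return positions


# Four seats placed every 90 degrees; three participants registered in order.
by_order = set_positions([0.0, 90.0, 180.0, 270.0], ["p-17", "p-03", "p-42"])
# The same participants with seats designated explicitly.
by_operation = set_positions(
    [0.0, 90.0, 180.0, 270.0],
    ["p-17", "p-03", "p-42"],
    designated={"p-17": 3, "p-03": 0, "p-42": 1},
)
```

The sketch does not check whether two participants were designated the same seat; a real system would need to reject or resolve such conflicts.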

After the discussion starts, the voice acquisition unit 113 receives the voice of the discussion from the sound collecting device 3 (S14). The voice analysis unit 114 acquires, from the setting information storage unit 121, the plurality of participant IDs associated with the sound collecting device ID of the sound collecting device 3 from which the voice was acquired, and the position of the participant for each of those participant IDs.

The voice analysis unit 114 analyzes the received voice of the discussion by performing sound source localization using the received voice and the acquired positions of the participants, and by calculating the utterance periods and the utterance amount of each of the plurality of participants (S15). The output unit 115 outputs the result of the voice analysis performed by the voice analysis unit 114 (S16).
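The following sketch illustrates one possible way to turn localization results into per-participant utterance periods in S15. It does not implement sound source localization itself, and it is not presented as the method of the embodiment. It assumes that a localization step has already produced one estimated direction (in degrees) per time hop, or `None` for silence, that at most one source is active per hop, and that a direction is attributed to the nearest participant position within a hypothetical `tolerance`.

```python
from typing import Dict, List, Optional, Tuple


def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def attribute_speech(
    directions: List[Optional[float]],
    positions: Dict[str, float],
    hop: float,
    tolerance: float = 30.0,
) -> Dict[str, List[Tuple[float, float]]]:
    """Return utterance periods (begin, end) in seconds per participant."""
    periods: Dict[str, List[Tuple[float, float]]] = {pid: [] for pid in positions}
    current: Optional[str] = None
    begin = 0.0
    # A trailing None closes a period that is still open at the end.
    for i, direction in enumerate(list(directions) + [None]):
        speaker = None
        if direction is not None:
            nearest = min(
                positions,
                key=lambda p: angular_distance(direction, positions[p]),
            )
            if angular_distance(direction, positions[nearest]) <= tolerance:
                speaker = nearest
        if speaker != current:
            if current is not None:
                periods[current].append((begin, i * hop))
            current, begin = speaker, i * hop
    return periods


# Two participants seated opposite each other; one direction every 0.5 s.
result = attribute_speech(
    [5.0, 355.0, None, 170.0, 185.0, 185.0],
    {"p-17": 0.0, "p-03": 180.0},
    hop=0.5,
)
```

Here `p-17` is credited with the period from 0.0 to 1.0 seconds and `p-03` with the period from 1.5 to 3.0 seconds; these periods are the kind of input from which per-frame utterance amounts can then be computed.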

[Effect of this embodiment]
The voice analysis system SS according to the present embodiment sets the positions of the plurality of participants relative to the sound collecting device 3 based on the sound collecting device identification information, the participant identification information, and the position designation information acquired using the communication terminal 2. With this configuration, the voice analysis system SS can set the positions of the plurality of participants without requiring a camera on the sound collecting device 3, unlike the system described in Patent Document 1, and can therefore reduce the cost of analyzing the voice in a discussion in which a plurality of participants take part.

In addition, it takes considerable effort for a single analyst to set the positions of the participants for a plurality of discussion groups. In the voice analysis system SS according to the present embodiment, the participants themselves can set the positions of the participants in the discussion using the communication terminal 2, so the effort required to set the positions of the participants can be reduced.

Although the present invention has been described above using embodiments, the technical scope of the present invention is not limited to the scope described in the above embodiments, and various modifications and changes are possible within the scope of its gist. For example, all or part of the device can be configured by being functionally or physically distributed or integrated in arbitrary units. New embodiments resulting from any combination of a plurality of embodiments are also included in the embodiments of the present invention. The effect of a new embodiment resulting from such a combination includes the effects of the original embodiments.

The processors of the voice analyzer 1 and the communication terminal 2 carry out each step (process) included in the voice analysis method shown in FIG. 8. That is, the processors of the voice analyzer 1 and the communication terminal 2 read a program for executing the voice analysis method shown in FIG. 8 from a storage unit, and execute the program to control each unit of the voice analyzer 1 and the communication terminal 2, thereby executing the voice analysis method shown in FIG. 8. Some of the steps included in the voice analysis method shown in FIG. 8 may be omitted, the order of the steps may be changed, and a plurality of steps may be performed in parallel.

SS Voice analysis system
1 Voice analyzer
11 Control unit
111 Receiving unit
112 Position setting unit
113 Voice acquisition unit
114 Voice analysis unit
2 Communication terminal
21 Control unit
211 Acquisition unit
212 Transmission unit
3 Sound collecting device

Claims

A voice analyzer that analyzes voices emitted by a plurality of participants around a sound collector, the voice analyzer comprising:
a receiving unit that receives, from a communication terminal, sound collector identification information that can identify the sound collector, participant identification information that can identify the participants, and position designation information that designates positions with respect to the sound collector, all of which have been acquired by the communication terminal;
a position setting unit that sets the position of each of the plurality of participants with respect to the sound collector based on the sound collector identification information, the participant identification information, and the position designation information received by the receiving unit;
a voice acquisition unit that acquires the voice from the sound collector; and
a voice analysis unit that analyzes, based on the positions of the plurality of participants set by the position setting unit, the voice emitted by each of the plurality of participants included in the voice acquired by the voice acquisition unit.

The voice analyzer according to claim 1, wherein the receiving unit receives, from the communication terminal, the sound collector identification information acquired by the communication terminal reading a first identification information presenting unit that is attached to the sound collector and can identify the sound collector.

The voice analyzer according to claim 1 or 2, wherein the receiving unit receives, from the communication terminal, the participant identification information acquired by the communication terminal reading second identification information presenting units that are possessed by the plurality of participants and can identify each of the plurality of participants.

The voice analyzer according to claim 1 or 2, wherein the receiving unit receives, from the communication terminal, the participant identification information acquired by the communication terminal reading the biometric information of each of the plurality of participants.

The voice analyzer according to any one of claims 1 to 4, wherein the position setting unit selects, based on the position designation information, the seat position of each of the plurality of participants in a preset seat arrangement around the sound collector, and sets the selected seat position as the position of the participant.

The voice analyzer according to claim 5, wherein the receiving unit receives, as the position designation information, the order in which the communication terminal acquired the participant identification information of each of the plurality of participants, and
the position setting unit selects the seat position of each of the plurality of participants in the seat arrangement based on the order received by the receiving unit.

The voice analyzer according to claim 5, wherein the receiving unit receives, as the position designation information, an operation performed on the communication terminal, and
the position setting unit selects the seat position of each of the plurality of participants in the seat arrangement based on the operation received by the receiving unit.

A voice analysis system comprising a voice analyzer that analyzes voices emitted by a plurality of participants around a sound collector, and a communication terminal capable of communicating with the voice analyzer, wherein
the communication terminal has:
an acquisition unit that acquires sound collector identification information that can identify the sound collector, participant identification information that can identify the participants, and position designation information that designates positions with respect to the sound collector; and
a transmission unit that transmits the sound collector identification information, the participant identification information, and the position designation information,
and the voice analyzer has:
a receiving unit that receives, from the communication terminal, the sound collector identification information, the participant identification information, and the position designation information acquired by the communication terminal;
a position setting unit that sets the position of each of the plurality of participants with respect to the sound collector based on the sound collector identification information, the participant identification information, and the position designation information received by the receiving unit;
a voice acquisition unit that acquires the voice from the sound collector; and
a voice analysis unit that analyzes, based on the positions of the plurality of participants set by the position setting unit, the voice emitted by each of the plurality of participants included in the voice acquired by the voice acquisition unit.

A voice analysis method for analyzing voices emitted by a plurality of participants around a sound collector, the method comprising the following steps executed by a processor:
a step of receiving, from a communication terminal, sound collector identification information that can identify the sound collector, participant identification information that can identify the participants, and position designation information that designates positions with respect to the sound collector, all of which have been acquired by the communication terminal;
a step of setting the position of each of the plurality of participants with respect to the sound collector based on the sound collector identification information, the participant identification information, and the position designation information received in the receiving step;
a step of acquiring the voice from the sound collector; and
a step of analyzing, based on the position of each of the plurality of participants set in the setting step, the voice emitted by each of the plurality of participants included in the voice acquired in the acquiring step.