JP2021132317A - Remote conference support device and program

Info

Publication number: JP2021132317A
Application number: JP2020027074A
Authority: JP
Inventors: Yuki Tajima (田島 祐貴)
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2020-02-20
Filing date: 2020-02-20
Publication date: 2021-09-09

Abstract

PROBLEM TO BE SOLVED: To identify the destination of a question asked by an attendee at a remote conference. SOLUTION: A remote conference support device comprises: an utterance information acquisition unit that acquires, as an analysis result of voice data of two or more first attendees at a first base and one or more second attendees at a second base, the speaker of each utterance and the attributes of the words contained in the utterance; a storage unit that stores the speaker and the attributes acquired by the utterance information acquisition unit in association with each utterance; a destination identification unit that, when a second attendee makes an utterance accompanied by a question to another attendee, identifies the destination of the question from among the two or more first attendees on the basis of a comparison between the attributes of the words contained in the utterance accompanied by the question and the attributes stored for each utterance in the storage unit; and a transmission unit that transmits, to a first communication terminal, a notification requesting an answer from the destination identified by the destination identification unit. SELECTED DRAWING: Figure 2

Description

The present invention relates to a remote conference support device and a program.

In recent years, with the spread of broadband and cloud environments and the development of communication tools accompanying advances in information and communication technology, video conferencing, which is realized by sharing video and audio between mutually remote bases, has been spreading.

Patent Documents 1 to 3 below disclose techniques related to such video conferencing. For example, Patent Document 1 discloses a technique that, when a conference is held among a plurality of bases, determines the target of an utterance from an event performed together with the speaker's utterance. Patent Document 2 discloses a technique for organizing conversations in a video conferencing system, which evaluates the relevance of utterances based on keywords and treats utterances that contain the same keyword and are close in time as relevance candidates. Patent Document 3 discloses a dialogue system between a person and a computer, which extracts person names from dialogue sentences and stores them and, when a person name reappears, refers to the past dialogue to determine the person to whom the name refers.

Patent Document 1: Japanese Unexamined Patent Publication No. 2017-201737
Patent Document 2: Japanese Unexamined Patent Publication No. 2000-78298
Patent Document 3: Japanese Unexamined Patent Publication No. 2004-334591

In a video conferencing system, an attendee may make an utterance that poses a question or a request for confirmation to a remote attendee. In such an utterance, however, the name of the attendee to whom the question is addressed is not always stated explicitly. For example, attendees who are meeting for the first time find it difficult to address each other by name. As a result, it can be difficult for the attendees to grasp who the destination of the question is. Patent Documents 1 to 3 above do not disclose a technique for identifying the destination of a question.

The present invention has been made in view of the above problem, and an object of the present invention is to provide a new and improved remote conference support device and program capable of identifying the destination of a question asked by an attendee in a remote conference.

In order to solve the above problem, according to one aspect of the present invention, there is provided a remote conference support device comprising: an utterance information acquisition unit that acquires, as an analysis result of voice data of two or more first attendees attending a remote conference, transmitted from a first communication terminal provided at a first base, and voice data of one or more second attendees attending the remote conference, transmitted from a second communication terminal provided at a second base, the speaker of each utterance and the attributes of the words contained in the utterance; a storage unit that stores the speaker and the attributes acquired by the utterance information acquisition unit in association with each utterance; a destination identification unit that, when a second attendee makes an utterance accompanied by a question to another attendee, identifies the destination of the question from among the two or more first attendees based on a comparison between the attributes of the words contained in the utterance accompanied by the question and the attributes stored for each utterance in the storage unit; and a transmission unit that transmits, to the first communication terminal, a notification requesting an answer from the destination identified by the destination identification unit.

For each utterance whose attributes are stored in the storage unit, the destination identification unit may calculate an evaluation value based on a comparison between the attributes of the words contained in the utterance accompanied by the question and the attributes stored for that utterance in the storage unit, and may identify, as the destination, the speaker associated with the utterance having the largest evaluation value.

The evaluation value may be the number of attributes, among the attributes stored for each utterance in the storage unit, that match the attributes of the words contained in the utterance accompanied by the question.

Among the utterances whose attributes are stored in the storage unit, the destination identification unit may use two or more utterances satisfying a predetermined condition as the targets of the attribute comparison.

The predetermined condition may include that the utterance is within a predetermined number of utterances counted back from the latest utterance, or that the utterance was made within a predetermined time.
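The restriction of comparison targets described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `StoredUtterance` and `select_candidates`, the `timestamp` field, and the choice to apply both limits together when both are supplied are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class StoredUtterance:
    utterance_id: str
    speaker: str
    timestamp: float  # seconds since the conference started (assumed field)


def select_candidates(history, now, max_count=None, max_age_seconds=None):
    """Return the stored utterances that are eligible for attribute comparison.

    `history` is assumed to be ordered from oldest to newest.
    """
    candidates = list(history)
    if max_age_seconds is not None:
        # Keep only utterances made within the given time before `now`.
        candidates = [u for u in candidates if now - u.timestamp <= max_age_seconds]
    if max_count is not None:
        # Keep only the most recent `max_count` utterances.
        candidates = candidates[max(0, len(candidates) - max_count):]
    return candidates


history = [
    StoredUtterance("V1", "A1", 10.0),
    StoredUtterance("V2", "A2", 50.0),
    StoredUtterance("V3", "B1", 90.0),
]
recent = select_candidates(history, now=100.0, max_count=2)
```

Here `recent` holds the two latest utterances; passing `max_age_seconds` instead filters by elapsed time.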

The destination identification unit need not identify a destination when the evaluation value of no utterance exceeds a predetermined reference.

The attributes may include the part of speech of a word or the semantic classification of a word.

The remote conference support device may further comprise a receiving unit that receives the voice data of the two or more first attendees and the voice data of the one or more second attendees, and the utterance information acquisition unit may acquire the speaker of each utterance and the attributes of the words contained in the utterance by analyzing the voice data received by the receiving unit.

The remote conference support device may further comprise an utterance type determination unit that determines, based on the analysis result of the voice data acquired by the utterance information acquisition unit, whether or not the utterance indicated by the voice data is an utterance accompanied by the question, and the destination identification unit may identify the destination for an utterance determined by the utterance information acquisition unit to be an utterance accompanied by the question.

Further, in order to solve the above problem, according to another aspect of the present invention, there is provided a program for causing a computer to function as: an utterance information acquisition unit that acquires, as an analysis result of voice data of two or more first attendees attending a remote conference, transmitted from a first communication terminal provided at a first base, and voice data of one or more second attendees attending the remote conference, transmitted from a second communication terminal provided at a second base, the speaker of each utterance and the attributes of the words contained in the utterance; a storage unit that stores the speaker and the attributes acquired by the utterance information acquisition unit in association with each utterance; a destination identification unit that, when a second attendee makes an utterance accompanied by a question to another attendee, identifies the destination of the question from among the two or more first attendees based on a comparison between the attributes of the words contained in the utterance accompanied by the question and the attributes stored for each utterance in the storage unit; and a transmission unit that transmits, to the first communication terminal, a notification requesting an answer from the destination identified by the destination identification unit.

According to the present invention described above, it is possible to identify the destination of a question asked by an attendee in a remote conference.

FIG. 1 is an explanatory diagram showing the configuration of a remote conference system according to an embodiment of the present invention.
FIG. 2 is an explanatory diagram showing the configuration of the cloud server 20 according to an embodiment of the present invention.
FIG. 3 is an explanatory diagram showing a specific example of utterances made by the attendees.
FIG. 4 is an explanatory diagram showing a specific example of the utterance analysis data stored in the storage unit 240.
FIG. 5 is an explanatory diagram showing a first specific example of identifying the destination.
FIG. 6 is an explanatory diagram showing a second specific example of identifying the destination.
FIG. 7 is an explanatory diagram showing a third specific example of identifying the destination.
FIG. 8 is an explanatory diagram showing a specific example of a display requesting an answer.
FIG. 9 is a flowchart showing the operation of the cloud server 20 according to an embodiment of the present invention.
FIG. 10 is a flowchart showing the details of the collection and analysis of voice data.
FIG. 11 is a block diagram showing the hardware configuration of the cloud server 20.

Embodiments of the present invention will be described in detail below with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are given the same reference numerals, and duplicate description is omitted.

In this specification and the drawings, a plurality of components having substantially the same functional configuration may also be distinguished by appending different letters to the same reference numeral. For example, a plurality of configurations having substantially the same functional configuration or logical significance are distinguished as necessary, as in remote conference terminals 10A and 10B. However, when there is no particular need to distinguish each of a plurality of components having substantially the same functional configuration, only the common reference numeral is used. For example, when there is no particular need to distinguish the remote conference terminals 10A and 10B, each is simply referred to as the remote conference terminal 10.

<1. Overview of the remote conference system>
One embodiment of the present invention relates to a remote conference system that realizes a remote conference between remote bases. An overview of the remote conference system according to an embodiment of the present invention is described below with reference to FIG. 1.

FIG. 1 is an explanatory diagram showing the configuration of a remote conference system according to an embodiment of the present invention. As shown in FIG. 1, the remote conference system includes a remote conference terminal 10A provided at base A, a remote conference terminal 10B provided at base B, and a cloud server 20. The remote conference terminal 10A, the remote conference terminal 10B, and the cloud server 20 are connected by a network 12. The network 12 may include public networks such as a telephone network, the Internet, and a satellite communication network, as well as a LAN (Local Area Network), a WAN (Wide Area Network), and the like. The network 12 may also include a dedicated network such as an IP-VPN (Internet Protocol-Virtual Private Network).

Base A and base B are remote from each other. FIG. 1 shows attendee A1 and attendee A2 as the attendees of the remote conference at base A, and attendee B1 and attendee B2 as the attendees of the remote conference at base B. Here, "remote" is not limited to base A and base B being located far apart: base A and base B may be on different floors of the same building, or in different rooms on the same floor.

(Remote conference terminal)
The remote conference terminal 10 is a communication terminal shared by the attendees of the remote conference at each base. For example, base A is an example of the first base, and the remote conference terminal 10A shared at base A is an example of the first communication terminal. Similarly, base B is an example of the second base, and the remote conference terminal 10B shared at base B is an example of the second communication terminal.

The remote conference terminal 10 has various functions for realizing a remote conference. For example, the remote conference terminal 10 has an imaging function that captures the base where the terminal is located and acquires video data of the base, a sound collection function that picks up the voices of the attendees present at that base and acquires voice data, a function for exchanging video data and voice data with the remote conference terminal 10 at another base, a display function that displays video data received from the remote conference terminal 10 at another base, and a voice output function that outputs voice data received from the remote conference terminal 10 at another base.

Specifically, the remote conference terminal 10A transmits the video data and voice data of base A to the remote conference terminal 10B, and the remote conference terminal 10B displays the video data and outputs the voice data. The remote conference terminal 10A also receives the video data and voice data of base B from the remote conference terminal 10B, displays the video data, and outputs the voice data. This allows attendees A1 and A2 at base A and attendees B1 and B2 at base B to converse while seeing each other's video.

In addition, the remote conference terminal 10 according to an embodiment of the present invention transmits the voice data acquired by the sound collection function to the cloud server 20. When the remote conference terminal 10 receives an answer request notification, described in detail later, from the cloud server 20, it outputs the answer request notification by display or by voice.

Although FIG. 1 shows a desktop PC (Personal Computer) as the remote conference terminal 10, the remote conference terminal 10 may be another information processing device such as a notebook PC, a tablet terminal, or a terminal equipped with a large display.

(Cloud server)
The cloud server 20 is an example of a remote conference support device that supports a remote conference. The cloud server 20 receives voice data from the remote conference terminal 10A and the remote conference terminal 10B and analyzes the voice data. When the voice data represents an utterance accompanied by a question to someone, such as an inquiry or a request for confirmation, the cloud server 20 identifies the destination of the question and transmits an answer request notification, which requests an answer from the destination, to the remote conference terminal 10A or the remote conference terminal 10B. The configuration and operation of the cloud server 20 are described in detail in turn below.

Although FIG. 2 shows a single server as the cloud server 20, the functions of the cloud server 20 described below may be implemented in a distributed manner across a server group composed of a plurality of servers.

<2. Configuration of the cloud server>
FIG. 2 is an explanatory diagram showing the configuration of the cloud server 20 according to an embodiment of the present invention. As shown in FIG. 2, the cloud server 20 includes a communication unit 220, a voice data analysis unit 230, a storage unit 240, an utterance type determination unit 250, and a destination identification unit 260.

(Communication unit)
The communication unit 220 exchanges various data with the remote conference terminal 10. For example, the communication unit 220 functions as a receiving unit that receives voice data from the remote conference terminal 10A and the remote conference terminal 10B, and as a transmission unit that transmits an answer request notification to the remote conference terminal 10A or the remote conference terminal 10B.

(Voice data analysis unit)
The voice data analysis unit 230 is an example of the utterance information acquisition unit. As an analysis result of the voice data received by the communication unit 220, it acquires utterance analysis data such as the speaker of each utterance, the words contained in each utterance, and the attributes of those words. For example, the voice data analysis unit 230 may extract the characteristic words contained in each utterance by analyzing the voice data received by the communication unit 220 with natural language processing, and may determine the attributes of those words with artificial intelligence. The attributes of a word include its part of speech and its semantic classification, which is a classification of the word by meaning. For example, the word "next month" has the part of speech "common noun" and the semantic classification "time". The voice data may instead be analyzed by another server, with the cloud server 20 acquiring the utterance analysis data as the result of that analysis; in this case, the cloud server 20 need not have a voice data analysis function.

(Storage unit)
The storage unit 240 stores various data used for the operation of the cloud server 20. For example, the storage unit 240 may store the name and voice features of each attendee. The storage unit 240 according to an embodiment of the present invention also stores the utterance analysis data acquired by the voice data analysis unit 230. A specific example of the utterance analysis data stored in the storage unit 240 is described below with reference to FIGS. 3 and 4.

FIG. 3 is an explanatory diagram showing a specific example of utterances made by the attendees. In the example shown in FIG. 3, attendee A1 makes utterance V1, "XX will be released next month."; attendee A2 then makes utterance V2, "Next month XX will be in stores only in Kanto."; and attendee B1 then makes utterance V3, "When will XX be in stores nationwide?". In this case, for example, the utterance analysis data shown in FIG. 4 is acquired and stored in the storage unit 240.

FIG. 4 is an explanatory diagram showing a specific example of the utterance analysis data stored in the storage unit 240. As shown in FIG. 4, the utterance analysis data associates, for each utterance, an utterance ID, a speaker, words, parts of speech, and semantic classifications. For example, the utterance analysis data shown in FIG. 4 indicates that the speaker of utterance V1 is attendee A1, and that the words "XX" (part of speech: proper noun, semantic classification: article), "release" (part of speech: common noun, semantic classification: event), and "next month" (part of speech: common noun, semantic classification: time) were extracted from utterance V1.
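One way to picture a record of this utterance analysis data is the sketch below. The class and field names (`WordEntry`, `UtteranceRecord`, `store`) are hypothetical and chosen only for illustration; the values are those given for utterance V1 in the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WordEntry:
    word: str
    part_of_speech: str
    semantic_class: str  # the "semantic classification" attribute


@dataclass
class UtteranceRecord:
    utterance_id: str
    speaker: str
    words: list = field(default_factory=list)


# Utterance V1 by attendee A1, as described for FIG. 4.
v1 = UtteranceRecord(
    utterance_id="V1",
    speaker="A1",
    words=[
        WordEntry("XX", "proper noun", "article"),
        WordEntry("release", "common noun", "event"),
        WordEntry("next month", "common noun", "time"),
    ],
)

# An in-memory stand-in for the storage unit, keyed by utterance ID (assumption).
store = {v1.utterance_id: v1}
```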

(Utterance type determination unit)
The utterance type determination unit 250 determines, based on the utterance analysis data stored in the storage unit 240, whether or not each utterance is an utterance accompanied by a question. The method of making this determination is not particularly limited. For example, the utterance type determination unit 250 may determine that an utterance is accompanied by a question when the content of the utterance can be rewritten as a 5W1H sentence. Alternatively, the utterance type determination unit 250 may compare the content of the utterance with correct-answer data, that is, reference utterances known to be accompanied by a question, and determine that the utterance is accompanied by a question when its content matches or is similar to the correct-answer data.
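The second approach, comparison against correct-answer data, might look like the following sketch. The source does not specify a similarity measure; Python's `difflib.SequenceMatcher` is used here purely as a stand-in, and the reference sentences, the function name `is_question`, and the 0.8 threshold are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical "correct-answer data": utterances known to contain a question.
REFERENCE_QUESTIONS = [
    "when will it be available",
    "who is in charge of this",
    "is that correct",
]


def is_question(utterance, references=REFERENCE_QUESTIONS, threshold=0.8):
    """Return True if the utterance matches or resembles a reference question."""
    text = utterance.lower().strip(" ?.!")
    for ref in references:
        if text == ref:
            return True  # exact match with the reference data
        if SequenceMatcher(None, text, ref).ratio() >= threshold:
            return True  # similar enough to the reference data
    return False
```

A production system would more plausibly use a trained classifier; character-level similarity is only the simplest thing that demonstrates the "match or similar" rule.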

(Destination identification unit)
The destination identification unit 260 identifies the destination of the question for an utterance that the utterance type determination unit 250 has determined to be accompanied by a question. For example, for each utterance whose utterance analysis data is stored in the storage unit 240, the destination identification unit 260 may calculate an evaluation value based on a comparison between the semantic classifications of the words contained in the utterance accompanied by the question and the semantic classifications of the words stored for that utterance in the storage unit 240, and may identify, as the destination, the speaker associated with the utterance having the largest evaluation value. Specific examples of identifying the destination are described below with reference to FIGS. 5 to 7.

FIG. 5 is an explanatory diagram showing a first specific example of identifying the destination. More specifically, FIG. 5 shows an example of identifying the destination when utterance V3 by attendee B1 shown in FIG. 3 is determined to be an utterance accompanied by a question, and the utterance analysis data shown in FIG. 4 has been acquired for utterance V3.

In this case, as shown in FIG. 5, in utterance V1 the semantic classification "article" of the word "XX" matches the semantic classification "article" of the word "XX" contained in utterance V3, and the semantic classification "time" of the word "next month" matches the semantic classification "time" of the word "when" contained in utterance V3. The destination identification unit 260 therefore calculates "2" as the evaluation value for utterance V1, this being the number of semantic classifications of the words contained in utterance V1 that match the semantic classifications of the words contained in utterance V3, the utterance accompanied by the question.

In utterance V2, the semantic classification "article" of the word "XX" matches the semantic classification "article" of the word "XX" contained in utterance V3, the semantic classification "time" of the word "next month" matches the semantic classification "time" of the word "when" contained in utterance V3, and the semantic classification "district" of the word "Kanto" matches the semantic classification "district" of the word "nationwide" contained in utterance V3. The destination identification unit 260 therefore calculates "3" as the evaluation value for utterance V2, this being the number of semantic classifications of the words contained in utterance V2 that match the semantic classifications of the words contained in utterance V3.

In the example shown in FIG. 5, the evaluation value "3" of utterance V2 is the largest, so the destination identification unit 260 identifies attendee A2, who is associated with utterance V2, as the destination.
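The calculation in this first example can be reproduced with a short sketch. The function name and data layout are assumptions, and the word lists contain only the words named in the text (the actual FIG. 4 and FIG. 5 data may include more). Counting each stored word separately, rather than each distinct classification, is also an assumption where the source is silent.

```python
def evaluation_value(stored_classes, question_classes):
    """Count the stored semantic classifications that also occur in the question."""
    question_set = set(question_classes)
    return sum(1 for c in stored_classes if c in question_set)


# Semantic classifications of the words named in the text for V1 and V2.
stored = {
    "V1": {"speaker": "A1", "classes": ["article", "event", "time"]},
    "V2": {"speaker": "A2", "classes": ["article", "time", "district"]},
}
# Question utterance V3: "XX" (article), "nationwide" (district), "when" (time).
question_v3 = ["article", "district", "time"]

scores = {
    uid: evaluation_value(rec["classes"], question_v3)
    for uid, rec in stored.items()
}
best_id = max(scores, key=scores.get)
destination = stored[best_id]["speaker"]
print(scores, destination)  # {'V1': 2, 'V2': 3} A2
```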

FIG. 6 is an explanatory diagram showing a second specific example of identifying the destination. More specifically, FIG. 6 shows an example of identifying the destination when utterance V3 by attendee B1 shown in FIG. 3 is determined to be an utterance accompanied by a question, and the utterance analysis data shown in FIG. 4 has been acquired for utterance V3. For utterance V1 and utterance V2, on the other hand, the utterance analysis data shown in FIG. 6 is assumed to have been acquired.

In the example shown in FIG. 6, the evaluation values of utterance V1 and utterance V2 are both the maximum value "3". When a plurality of utterances share the same maximum evaluation value in this way, the destination identification unit 260 identifies all attendees associated with those utterances as destinations or as destination candidates. Accordingly, in the example shown in FIG. 6, both attendee A1 and attendee A2 are identified as destinations.

FIG. 7 is an explanatory diagram showing a third specific example of identifying the destination. More specifically, FIG. 7 shows an example of destination identification for the case where utterance V3 by attendee B1 shown in FIG. 3 is determined to be an utterance accompanied by a question and the utterance analysis data shown in FIG. 4 has been acquired for utterance V3. For utterances V1 and V2, on the other hand, the utterance analysis data shown in FIG. 7 is assumed to have been acquired.

In the example shown in FIG. 7, the evaluation values of utterance V1 and utterance V2 are both "0". When the evaluation value of no utterance exceeds a predetermined criterion, for example when the evaluation value of no utterance exceeds "1", the destination identification unit 260 does not identify a destination. Accordingly, in the example shown in FIG. 7, neither attendee A1 nor attendee A2 is identified as the destination.
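The tie handling of FIG. 6 and the threshold handling of FIG. 7 can be combined into one selection step. The following is a minimal sketch under stated assumptions: the function name `identify_destinations` and the input format are hypothetical, and the default threshold of 1 simply mirrors the example criterion given above.

```python
def identify_destinations(evaluated, threshold=1):
    """Return the speakers tied for the maximum evaluation value.

    evaluated: list of (speaker, evaluation_value) pairs, one per stored utterance.
    threshold: the maximum value must exceed this criterion (assumed default: 1).
    An empty list means that no destination is identified.
    """
    if not evaluated:
        return []
    best = max(value for _, value in evaluated)
    if best <= threshold:  # no evaluation value exceeds the criterion (FIG. 7)
        return []
    destinations = []
    for speaker, value in evaluated:
        # keep every speaker tied for the maximum, without duplicates (FIG. 6)
        if value == best and speaker not in destinations:
            destinations.append(speaker)
    return destinations


print(identify_destinations([("A1", 3), ("A2", 3)]))  # FIG. 6: ['A1', 'A2']
print(identify_destinations([("A1", 0), ("A2", 0)]))  # FIG. 7: []
```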

(Specific example of the answer request notification)
When the destination identification unit 260 identifies a destination, the communication unit 220 transmits an answer request notification for that destination to the remote conference terminal 10 at the base where the destination is located. On receiving the answer request notification, the remote conference terminal 10 requests the destination, by display or by audio output, to answer the question. An example in which the answer is requested by display is described below with reference to FIG. 8.

FIG. 8 is an explanatory diagram showing a specific example of a display requesting an answer. FIG. 8 shows a dialogue screen 40 displayed by the remote conference terminal 10A. The dialogue screen 40 includes video of attendee B1 and attendee B2, who are at base B. The dialogue screen 40 also includes a system message 42 as a display requesting an answer from attendee A1. The system message 42 includes, for example, the message "This is a question for A1.", which indicates that a question has been directed to attendee A1, and the message "A1, please answer.", which requests an answer from attendee A1. Based on the system message 42, attendee A1 can recognize that utterance V3 is a question directed to himself or herself and can answer it.

<3. Operation of the cloud server>
The configuration of the cloud server 20 according to an embodiment of the present invention has been described above. Next, the operation of the cloud server 20 according to the embodiment is summarized with reference to FIGS. 9 and 10.

FIG. 9 is a flowchart showing the operation of the cloud server 20 according to an embodiment of the present invention. As shown in FIG. 9, the cloud server 20 first collects and analyzes voice data (S310). The utterance type determination unit 250 then determines the type of the utterance represented by the voice data (S320). If the utterance is not an utterance accompanied by a question (S330 / No), the processing from S310 is repeated.

If, on the other hand, the utterance is an utterance accompanied by a question (S330 / Yes), the processing proceeds to S340. That is, the destination identification unit 260 identifies the destination of the question based on the utterance analysis data of each utterance stored in the storage unit 240 (S340). If one destination is identified (S350 / one person), the communication unit 220 transmits an answer request notification for that one destination (S360). If a plurality of destinations are identified (S350 / multiple people), the communication unit 220 transmits answer request notifications for the plurality of destinations (S370). If no destination is identified (S350 / zero people), no answer request notification is transmitted. However, because the utterance does put a question to someone, the communication unit 220 may transmit a notification indicating that a question has been asked. Thereafter, the processing from S310 is repeated for as long as the function continues (S380 / Yes).
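The branching of steps S320 to S370 can be sketched as follows. This is an illustrative sketch only: the function names, the notification tuples, and the callables `is_question` and `identify` (stand-ins for the utterance type determination unit 250 and the destination identification unit 260) are assumptions, not part of the described embodiment.

```python
def notifications_for(destinations):
    """Map the S350 branch of FIG. 9 to a list of (kind, recipient) notifications."""
    if len(destinations) == 1:   # S350 / one person -> S360
        return [("answer_request", destinations[0])]
    if len(destinations) > 1:    # S350 / multiple people -> S370
        return [("answer_request", d) for d in destinations]
    # S350 / zero people: no answer request; optionally notify that a question was asked
    return [("question_notice", None)]


def process_utterance(utterance, history, is_question, identify):
    """One pass of S320 to S370 for an utterance already analyzed in S310."""
    if not is_question(utterance):               # S330 / No: back to S310
        return []
    destinations = identify(utterance, history)  # S340
    return notifications_for(destinations)
```

For example, `notifications_for(["A1", "A2"])` yields one answer request per destination, which corresponds to S370.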

FIG. 10 is a flowchart showing the details of the collection and analysis of voice data. First, the communication unit 220 waits to receive voice data (S311). When the communication unit 220 receives voice data (S312 / Yes), the voice data analysis unit 230 analyzes the voice data by, for example, natural language processing (S313). The storage unit 240 then stores the utterance analysis data obtained by the analysis, which includes the speaker of each utterance, the words included in each utterance, and the attributes of those words (S314). Thereafter, the processing from S311 is repeated for as long as the function continues (S315 / Yes). In parallel, the processing from S320 described with reference to FIG. 9 proceeds.
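One possible shape for the utterance analysis data stored in S314 is sketched below. The record layout, the names `analyze` and `on_voice_data`, and the lookup table `SEMANTIC_CLASSES` are all hypothetical; in particular, the table lookup is only a stand-in for the natural language processing of S313, and the input is assumed to be already transcribed and split into words.

```python
# Hypothetical word-to-classification table standing in for real language analysis.
SEMANTIC_CLASSES = {"XX": "article", "next month": "time", "Kanto": "district"}

storage = []  # stand-in for the storage unit 240


def analyze(speaker, words):
    """S313: build utterance analysis data (speaker, words, and word attributes)."""
    return {
        "speaker": speaker,
        "words": [
            {"word": w, "semantic_class": SEMANTIC_CLASSES.get(w)} for w in words
        ],
    }


def on_voice_data(speaker, words):
    """S312 to S314: analyze received data and store the result per utterance."""
    record = analyze(speaker, words)
    storage.append(record)
    return record


record = on_voice_data("A2", ["XX", "next month", "Kanto"])
print(record["speaker"], [w["semantic_class"] for w in record["words"]])
```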

<4. Effects>
As described above, according to an embodiment of the present invention, when an utterance accompanied by a question is made in a remote conference, the attendee who should answer the question can be easily identified. This reduces back-and-forth exchanges to confirm who should answer and spares the speaker the trouble of checking the destination's name in advance, which in turn makes the remote conference more efficient.

Further, in an embodiment of the present invention, the evaluation value is calculated based on the number of matching semantic classifications of the words included in the utterances, not on the number of matching words. Accordingly, even when the utterance accompanied by the question does not contain the same words as those used in past utterances, the evaluation value can be calculated appropriately and the destination can be identified.

<5. Modifications>
An embodiment of the present invention has been described above. Several modifications of the embodiment are described below. Each modification may be applied to the embodiment on its own or in combination with other modifications. Each modification may also be applied in place of a configuration described in the embodiment or in addition to it.

(First modification)
In the example described above, the destination identification unit 260 calculates the number of matching semantic classifications as the evaluation value, but the evaluation value may be calculated by other methods. For example, the destination identification unit 260 may assign a weight to each matching semantic classification and calculate the sum of the weights as the evaluation value. In this case, giving larger weights to the semantic classifications of distinctive words is expected to yield a more appropriate evaluation value.
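The weighted variant can be sketched as a sum over the matching classifications. The function name, the weight table, and the default weight of 1.0 are assumptions made for illustration; the description does not specify how weights are chosen.

```python
def weighted_evaluation(past_classes, question_classes, weights, default=1.0):
    """Sum the weights of the semantic classifications shared by both utterances."""
    matched = set(past_classes) & set(question_classes)
    return sum(weights.get(c, default) for c in matched)


# Hypothetical weights: the classification "article" is treated as more distinctive.
weights = {"article": 2.0}
score = weighted_evaluation(
    {"article", "time"}, {"article", "time", "district"}, weights
)
print(score)
```

With all weights equal to 1.0 this reduces to the plain match count used in the embodiment.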

(Second modification)
In the example described above, the destination identification unit 260 calculates the evaluation value of each utterance using the same criterion, but the evaluation values of different utterances may be calculated using different criteria. For example, when an attendee puts a question to another attendee in connection with that attendee's earlier utterance, the utterance that prompted the question is likely to have been made close in time to the utterance asking the question. The destination identification unit 260 may therefore give greater weight to more recent utterances when calculating the evaluation values. This configuration can improve the accuracy of destination identification.
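One way to favor recent utterances is to scale each match count by a factor that shrinks with the age of the utterance. The exponential decay and the factor 0.8 below are assumptions chosen for illustration; the description only states that newer utterances may be given greater weight.

```python
def recency_weighted_scores(match_counts, decay=0.8):
    """Scale each match count by decay ** age, where the newest utterance has age 0.

    match_counts: list of (speaker, match_count) pairs ordered oldest to newest.
    """
    newest = len(match_counts) - 1
    return [
        (speaker, count * decay ** (newest - i))
        for i, (speaker, count) in enumerate(match_counts)
    ]


# Both utterances match two classifications, but the one by A2 is more recent.
scores = recency_weighted_scores([("A1", 2), ("A2", 2)])
print(scores)
```

In this sketch a tie such as the one in FIG. 6 is resolved in favor of the more recent speaker.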

(Third modification)
In connection with the second modification, an utterance made long before the utterance asking the question is unlikely to be the one that prompted it. The destination identification unit 260 may therefore calculate evaluation values only for two or more utterances that satisfy a predetermined condition among the utterances whose utterance analysis data is stored in the storage unit 240. Examples of the predetermined condition include being within a predetermined number of utterances counted back from the latest utterance, and having been made within a predetermined time. This configuration can also improve the accuracy of destination identification.
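Restricting the candidates can be sketched as a filter applied before the evaluation values are calculated. The record layout (a `time` field in seconds), the parameter names, and the choice to apply whichever limit is supplied are assumptions for illustration.

```python
def select_candidates(history, now, max_count=None, max_age_s=None):
    """Keep only the stored utterances that satisfy the predetermined condition.

    history: list of records ordered oldest to newest, each with a "time" in seconds.
    max_count: keep at most this many of the latest utterances, if given.
    max_age_s: keep only utterances made within this many seconds of now, if given.
    """
    recent = list(history)
    if max_age_s is not None:
        recent = [u for u in recent if now - u["time"] <= max_age_s]
    if max_count is not None:
        recent = recent[-max_count:] if max_count > 0 else []
    return recent


history = [
    {"speaker": "A1", "time": 0.0},
    {"speaker": "A2", "time": 50.0},
    {"speaker": "A1", "time": 95.0},
]
print(select_candidates(history, now=100.0, max_count=2))
print(select_candidates(history, now=100.0, max_age_s=10.0))
```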

<6. Hardware configuration>
An embodiment of the present invention and its modifications have been described above. Information processing such as the analysis of voice data and the identification of the destination described above is realized through cooperation between software and the hardware of the cloud server 20 described below. The hardware configuration described below is also applicable to the remote conference terminal 10.

FIG. 11 is a block diagram showing the hardware configuration of the cloud server 20. The cloud server 20 includes a CPU (Central Processing Unit) 201, a ROM (Read Only Memory) 202, a RAM (Random Access Memory) 203, and a host bus 204. The cloud server 20 also includes a bridge 205, an external bus 206, an interface 207, an input device 208, a display device 209, an audio output device 210, a storage device (HDD) 211, a drive 212, and a network interface 215.

The CPU 201 functions as an arithmetic processing device and a control device, and controls overall operation within the cloud server 20 in accordance with various programs. The CPU 201 may be a microprocessor. The ROM 202 stores programs, operation parameters, and the like used by the CPU 201. The RAM 203 temporarily stores programs used in execution by the CPU 201 and parameters that change as appropriate during that execution. These components are connected to one another by the host bus 204, which includes a CPU bus or the like. Through cooperation between the CPU 201, the ROM 202, the RAM 203, and software, functions such as the voice data analysis unit 230, the utterance type determination unit 250, and the destination identification unit 260 described with reference to FIG. 2 can be realized.

The host bus 204 is connected via the bridge 205 to the external bus 206, such as a PCI (Peripheral Component Interconnect/Interface) bus. The host bus 204, the bridge 205, and the external bus 206 do not necessarily have to be configured separately; their functions may be implemented on a single bus.

The input device 208 includes input means with which a user enters information, such as a mouse, a keyboard, a touch panel, buttons, a microphone, sensors, switches, and levers, and an input control circuit that generates an input signal based on the user's input and outputs it to the CPU 201. By operating the input device 208, a user of the cloud server 20 can input various data to the cloud server 20 and instruct it to perform processing operations.

The display device 209 includes, for example, display devices such as a liquid crystal display (LCD) device, a projector device, an OLED (Organic Light Emitting Diode) device, and a lamp. The audio output device 210 includes audio output devices such as a speaker and headphones.

The storage device 211 is a data storage device configured as an example of the storage unit of the cloud server 20 according to the present embodiment. The storage device 211 may include a storage medium, a recording device that records data on the storage medium, a reading device that reads data from the storage medium, a deleting device that deletes data recorded on the storage medium, and the like. The storage device 211 is composed of, for example, an HDD (Hard Disk Drive), an SSD (Solid State Drive), or a memory having equivalent functions. The storage device 211 drives the storage and stores programs executed by the CPU 201 and various data.

The drive 212 is a reader/writer for storage media, and is built into or externally attached to the cloud server 20. The drive 212 reads information recorded on a mounted removable storage medium 24, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, and outputs it to the RAM 203 or the storage device 211. The drive 212 can also write information to the removable storage medium 24.

The network interface 215 is, for example, a communication interface composed of a communication device or the like for connecting to the network 12. The network interface 215 may be a communication device compatible with wireless LAN (Local Area Network) or a wired communication device that performs communication over a cable.

<7. Supplement>
Preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to these examples. It is clear that a person having ordinary knowledge in the technical field to which the present invention belongs could conceive of various changes or modifications within the scope of the technical ideas described in the claims, and these are naturally understood to fall within the technical scope of the present invention.

For example, the steps in the processing of the cloud server 20 described in this specification do not necessarily have to be performed in time series in the order shown in the flowcharts. The steps in the processing of the cloud server 20 may be performed in an order different from that shown in the flowcharts, or may be performed in parallel.

It is also possible to create a computer program that causes hardware such as the CPU, ROM, and RAM built into the remote conference terminal 10 and the cloud server 20 to perform functions equivalent to those of the components of the remote conference terminal 10 and the cloud server 20 described above. A storage medium storing the computer program is also provided.

10 Remote conference terminal
20 Cloud server
220 Communication unit
230 Voice data analysis unit
240 Storage unit
250 Utterance type determination unit
260 Destination identification unit

Claims

A remote conference support device comprising:
an utterance information acquisition unit that acquires the speaker of each utterance and attributes of the words included in the utterance, as a result of analyzing voice data of two or more first attendees attending a remote conference, transmitted from a first communication terminal provided at a first base, and voice data of one or more second attendees attending the remote conference, transmitted from a second communication terminal provided at a second base;
a storage unit that stores the speaker and the attributes acquired by the utterance information acquisition unit in association with each utterance;
a destination identification unit that, when the second attendee makes an utterance accompanied by a question to another attendee, identifies the destination of the question from among the two or more first attendees based on a comparison between the attributes of the words included in the utterance accompanied by the question and the attributes stored for each utterance in the storage unit; and
a transmission unit that transmits, to the first communication terminal, a notification requesting an answer from the destination identified by the destination identification unit.

The remote conference support device according to claim 1, wherein the destination identification unit calculates, for each utterance whose attributes are stored in the storage unit, an evaluation value based on a comparison between the attributes of the words included in the utterance accompanied by the question and the attributes stored for that utterance in the storage unit, and identifies the speaker associated with the utterance having the maximum evaluation value as the destination.

The remote conference support device according to claim 2, wherein the evaluation value is the number of attributes stored for each utterance in the storage unit that match the attributes of the words included in the utterance accompanied by the question.

The remote conference support device according to claim 2 or 3, wherein the destination identification unit uses, as the targets of the attribute comparison, two or more utterances that satisfy a predetermined condition among the utterances whose attributes are stored in the storage unit.

The remote conference support device according to claim 4, wherein the predetermined condition includes being an utterance within a predetermined number of utterances from the latest utterance, or being an utterance made within a predetermined time.

The remote conference support device according to any one of claims 2 to 5, wherein the destination identification unit does not specify the destination when the evaluation value of any utterance does not exceed a predetermined standard.

The remote conference support device according to any one of claims 1 to 5, wherein the attributes include a part of speech of a word or a semantic classification of a word.

The remote conference support device according to any one of claims 1 to 7, further comprising a receiving unit that receives the voice data of the two or more first attendees and the voice data of the one or more second attendees,
wherein the utterance information acquisition unit acquires the speaker of each utterance and the attributes of the words included in the utterance by analyzing the voice data received by the receiving unit.

The remote conference support device according to claim 8, further comprising an utterance type determination unit that determines, based on the analysis result of the voice data acquired by the utterance information acquisition unit, whether or not the utterance represented by the voice data is an utterance accompanied by the question,
wherein the destination identification unit identifies the destination for an utterance determined by the utterance type determination unit to be an utterance accompanied by the question.

A program for causing a computer to function as:
an utterance information acquisition unit that acquires the speaker of each utterance and attributes of the words included in the utterance, as a result of analyzing voice data of two or more first attendees attending a remote conference, transmitted from a first communication terminal provided at a first base, and voice data of one or more second attendees attending the remote conference, transmitted from a second communication terminal provided at a second base;
a storage unit that stores the speaker and the attributes acquired by the utterance information acquisition unit in association with each utterance;
a destination identification unit that, when the second attendee makes an utterance accompanied by a question to another attendee, identifies the destination of the question from among the two or more first attendees based on a comparison between the attributes of the words included in the utterance accompanied by the question and the attributes stored for each utterance in the storage unit; and
a transmission unit that transmits, to the first communication terminal, a notification requesting an answer from the destination identified by the destination identification unit.