JP2009053342A - Minutes preparation apparatus

Info

Publication number: JP2009053342A
Application number: JP2007218441A
Authority: JP
Inventors: Junichi Shibuya; 純一澁谷
Original assignee: Individual
Current assignee: Individual
Priority date: 2007-08-24
Filing date: 2007-08-24
Publication date: 2009-03-12

Abstract

PROBLEM TO BE SOLVED: To provide a minutes preparation apparatus capable of reducing the amount of data required for minutes while improving processing efficiency when the minutes are prepared.

SOLUTION: The minutes preparation apparatus includes: an attendant indication section 11 for indicating, among the attendants of an electronic meeting, information on target attendants whose speech is recorded and information on attendants whose speech is not recorded; a speech data receiving section 12 for receiving speech data when an attendant speaks; a speech data detection section 13 for detecting, in the speech data received by the speech data receiving section 12, the speech data corresponding to a target attendant indicated through the attendant indication section 11; a speech data storage section 14 for storing the speech data detected by the speech data detection section 13; a speech recognition section 15 for converting the speech data stored by the speech data storage section 14 into text data; and a minutes preparation section 16 for preparing minutes based on the text data converted by the speech recognition section 15.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a minutes creation apparatus that creates minutes.

Conventionally, the minutes of a meeting were produced by a minute-taker who noted down what the participants said and, after the meeting, wrote out a clean copy from those notes or typed them up on a word processor so that they took the proper form of minutes. Because participants usually speak freely, taking notes of what is said requires a great deal of effort, and the minute-taker may be unable to take part in the meeting. Even when the minute-taker records everything said in the meeting and prepares the minutes afterwards while listening to the recording, the task still places a heavy burden on the minute-taker.

To address this, techniques have been disclosed in which a computer uses speech recognition to recognize what the participants say, converts the recognized remarks into text, and creates electronic minutes (see, for example, Patent Documents 1 and 2).

In a known conventional electronic conference system, a client device for each participant and a conference server device that manages the electronic conference are connected via a network. Each participant uses the client device to transmit data of his or her own remarks during the conference to the conference server device, and the conference server device records the remark data as the proceedings progress (see, for example, Patent Document 3). The remark data is audio data transmitted and received over the network.

The conference server device creates minutes from the history of remarks by converting the recorded remark data of the participants into text data using speech recognition. When several participants speak at the same time and their remarks are recorded as a single piece of audio data, it becomes difficult to convert that audio into text data that a person can read correctly, so the conference server device generally records the remarks separately for each participant.

Patent Document 1: Japanese Patent Laid-Open No. 02-206825
Patent Document 2: Japanese Patent Laid-Open No. 07-146919
Patent Document 3: Japanese Patent Laid-Open No. 2006-050500

However, a minutes creation apparatus such as the conference server device described above records the remarks of every participant. Meetings often include participants who make only remarks unrelated to the agenda, or participants whose remarks are of no particular interest. If a conventional minutes creation apparatus records the remarks of such participants as well and creates minutes from them, the amount of data required for the minutes becomes enormous and the processing efficiency of creating the minutes falls.

An object of the present invention is therefore to provide a minutes creation apparatus that can reduce the amount of data required for minutes and improve processing efficiency when the minutes are created.

The minutes creation apparatus of the present invention comprises: a participant designation unit that has a user designate, among the participants of an electronic conference, information on target participants whose remarks are recorded and information on participants whose remarks are not recorded; an audio data receiving unit that receives audio data produced when a participant speaks; an audio data detection unit that detects, among the audio data received by the audio data receiving unit, the audio data corresponding to a target participant designated through the participant designation unit; an audio data storage unit that stores the audio data detected by the audio data detection unit; a speech recognition unit that converts the audio data stored by the audio data storage unit into text data; and a minutes creation unit that creates minutes based on the text data converted by the speech recognition unit.

With this configuration, the user designates the target participants whose remarks are to be recorded, only the received audio data corresponding to those target participants is stored, and the minutes are created by running speech recognition on the stored audio data. Audio data of participants whose remarks are not recorded is never stored, which reduces the amount of data required for the minutes and improves processing efficiency when the minutes are created.

In the minutes creation apparatus of the present invention, the participant designation unit may also have the user designate registered voice data corresponding to a target participant when designating that participant's information, and the speech recognition unit converts the audio data into text data in accordance with the registered voice data of the target participant designated through the participant designation unit.

With this configuration, the audio data is converted into text data in accordance with the registered voice data of the target participant, which improves the accuracy of speech recognition.

In the minutes creation apparatus of the present invention, the participant designation unit may also have the user designate a maximum storage capacity for a target participant's audio data when designating that participant's information, and the audio data storage unit stores the audio data detected by the audio data detection unit so that it stays within the maximum storage capacity designated through the participant designation unit.

With this configuration, the detected audio data is stored within the maximum storage capacity designated for it, so the amount of data available for the minutes is used effectively.
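The per-participant storage limit described above can be pictured with a minimal sketch. The class name `CappedStore` and the truncation policy are assumptions made for illustration; the text only states that stored audio stays within the designated maximum, not what happens to data that would exceed it.

```python
class CappedStore:
    """Keeps one participant's audio within a byte budget.

    Assumption: data beyond the budget is truncated and dropped. The source
    does not say which policy the apparatus actually uses.
    """

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.chunks = []
        self.used = 0

    def append(self, chunk):
        """Store as much of chunk as still fits; return the bytes stored."""
        room = max(self.max_bytes - self.used, 0)
        kept = chunk[:room]
        if kept:
            self.chunks.append(kept)
            self.used += len(kept)
        return len(kept)


store = CappedStore(max_bytes=10)
print(store.append(b"12345678"))  # the whole chunk fits
print(store.append(b"abcdef"))    # only part of this chunk fits
```

Other policies, such as discarding the oldest audio first, would satisfy the same constraint.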

In the minutes creation apparatus of the present invention, the participant designation unit may also have the user designate a volume for the audio data corresponding to a target participant when designating that participant's information, and the audio data detection unit adjusts the volume of the audio data it has detected so that it matches the volume designated through the participant designation unit.

With this configuration, the volume of the detected audio data is adjusted to the designated volume, so audio data at an appropriate level is stored and the minutes can be created accurately.

The minutes creation apparatus of the present invention may also include an audio data playback unit that plays back audio data. In that case the audio data detection unit detects, among the audio data received by the audio data receiving unit, the audio data corresponding to a target participant designated through the participant designation unit, and also has the audio data playback unit play back the audio data received by the audio data receiving unit.

With this configuration, audio data is played back while the audio data corresponding to the designated target participants is being detected, so the minutes are created while the user listens to the proceedings.

As described above, the present invention provides a minutes creation apparatus that can reduce the amount of data required for minutes and improve processing efficiency when the minutes are created.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a diagram showing a conference system according to an embodiment of the present invention. In the conference system shown in FIG. 1, four conference terminals 10 are connected to one another via a network. Hereinafter, the terminals are referred to as conference terminals 10A to 10D when they need to be distinguished, and simply as conference terminal 10 when they do not.

When an electronic conference is held with participants A to D, participants A to D use conference terminals 10A to 10D, respectively. Unlike conventional systems, the conference system according to this embodiment has no dedicated client and server for managing the conference.

The conference terminal 10 is, for example, a personal computer. During an electronic conference, the conference terminals 10 connected to the network can exchange data with one another, as indicated by the broken lines. For example, during the conference a conference terminal 10 transmits audio data representing its participant's remarks over the network to every other participating conference terminal 10, and plays back the audio data of remarks received from the other conference terminals 10.

In this embodiment, data in the conference system of FIG. 1 is transmitted and received in accordance with IP (Internet Protocol), so an IP address is registered in each conference terminal 10.

The conference terminal 10 also creates minutes by using speech recognition to convert the audio data of remarks received from other conference terminals 10 into text data. Because the conference terminal 10 is one example of the minutes creation apparatus of the present invention, the embodiment is described in terms of the conference terminal 10.

FIG. 2 is a block diagram of the conference terminal according to the embodiment of the present invention. The conference terminal 10 has the configuration of a general-purpose computer. Specifically, it has a CPU (Central Processing Unit), ROM (Read Only Memory), RAM (Random Access Memory), and a hard disk, none of which are shown, as well as a network interface 21 for connecting to the network, a speaker 22 that outputs the voices of the conference participants, a microphone 23 that picks up the participant's voice, an input device 24 such as a keyboard or mouse through which the participant enters information, and a display 25 that displays information.

To keep ambient noise out of the call, the microphone 23 and the speaker 22 may be replaced with an earphone microphone in which a microphone and an earphone are integrated.

As shown in FIG. 2, the conference terminal 10 also includes a participant designation unit 11, an audio data receiving unit 12, an audio data detection unit 13, an audio data storage unit 14, a speech recognition unit 15, a minutes creation unit 16, an audio data playback unit 17, an audio conversion unit 18, and an audio data transmission unit 19. These functional units may be implemented, for example, as modules of a program executed by the CPU.

The participant designation unit 11 has the user of the conference terminal 10 designate information on at least one target participant, among the participants of the electronic conference, who is needed for recording the minutes. A target participant is a participant whose remarks are recorded. The target participants may be designated before the electronic conference starts or while it is in progress.

FIG. 3 illustrates the screen on which the user designates participant information. FIG. 3(A) shows a designation window 30 for this purpose, which the participant designation unit 11 displays on the display 25. The designation window 30 of FIG. 3(A) assumes, for example, that at most eight people take part in the electronic conference, and lets the user designate information on up to eight participants through the input device 24.

The designation window 30 has, for each participant, a designation item 31 for designating that participant's information. FIG. 3(B) is an enlarged view of a single designation item 31.

The items designated by the user are a participant name item 33 giving the participant's name, a registered voice data identification item 34 needed for speech recognition, an identification information item 35 identifying the device the participant uses, and a save-or-not item 36 specifying whether the participant's audio data is to be saved.

When the conference terminals 10 hold the electronic conference over an IP-compliant network, the identification information item 35 is the IP address of the participant's conference terminal 10. The participant designation unit 11 registers the designated identification information item 35 as the identification information of the control information shown in FIG. 4.

The control information shown in FIG. 4 is built from the designated participant information and is stored in the RAM or on the hard disk. FIG. 4(A) shows an example of the control information held by conference terminal 10D. In this embodiment, an IP address is registered as the identification information.

In the control information of FIG. 4(A), the participant field holds the value designated in the participant name item 33 shown in FIG. 3(B), and the registered voice data field holds the value designated in the registered voice data identification item 34, which identifies whether the voice is male or female. The save-or-not field likewise holds the value designated in the save-or-not item 36 shown in FIG. 3(B).

The control information of FIG. 4(B) contains, in addition to the fields of FIG. 4(A), a speaker output setting and a maximum storage capacity. Both are described later. A speaker output item specifying whether speaker output is on or off and a maximum storage capacity item specifying the maximum storage capacity are added to the designation item 31 (not shown), and the values designated there are registered in the speaker output and maximum storage capacity fields of the control information of FIG. 4(B).
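One way to picture the control information of FIG. 4 is as a small record per participant, keyed by the identification information. The sketch below is only illustrative: the field names, the `ControlEntry` class, and every value are hypothetical, and the addresses come from the 192.0.2.0/24 range reserved for documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlEntry:
    participant: str                  # participant name item 33
    identification: str               # identification information item 35 (an IP address here)
    registered_voice: str             # registered voice data identification item 34: "male" or "female"
    save: bool                        # save-or-not item 36
    speaker_output: bool = True       # FIG. 4(B) only
    max_storage_bytes: Optional[int] = None  # FIG. 4(B) only; None means no limit was designated


# Hypothetical values, keyed by IP address so a received packet can be looked up directly.
CONTROL_INFO = {
    entry.identification: entry
    for entry in [
        ControlEntry("A", "192.0.2.1", "male", save=True),
        ControlEntry("B", "192.0.2.2", "female", save=True, max_storage_bytes=20_000_000),
        ControlEntry("C", "192.0.2.3", "male", save=False, speaker_output=False),
        ControlEntry("D", "192.0.2.4", "female", save=True),
    ]
}

# Target participants are the entries whose save flag is set.
targets = [e.participant for e in CONTROL_INFO.values() if e.save]
print(targets)
```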

The audio data receiving unit 12 receives, through the network interface 21, the audio data produced when a participant speaks, and outputs the received audio data to the audio data detection unit 13. For example, when eight people take part in the electronic conference, the audio data receiving unit 12 receives the audio data of the seven participants other than the participant at its own conference terminal 10.

When the electronic conference is held over an IP-compliant network, the audio data is carried in the payload of IP packets and is received in real time using a technique conforming to RTP (Real-time Transport Protocol) or the like. The audio data receiving unit 12 outputs the audio data received in real time to the audio data detection unit 13.
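As an illustration of what receiving RTP-carried audio involves, the following sketch splits a datagram into the 12-byte fixed RTP header defined in RFC 3550 and the audio payload. It is an assumption that the terminal uses plain RTP (the text says "RTP or the like"); the names `RtpPacket` and `parse_rtp` are hypothetical, and header extensions and padding are deliberately not handled.

```python
import struct
from typing import NamedTuple


class RtpPacket(NamedTuple):
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    payload: bytes


def parse_rtp(datagram: bytes) -> RtpPacket:
    """Parse the fixed RTP header (RFC 3550) and return the audio payload.

    Header extensions and padding are not handled in this sketch.
    """
    if len(datagram) < 12:
        raise ValueError("datagram shorter than the fixed RTP header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", datagram[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    csrc_count = first & 0x0F
    header_len = 12 + 4 * csrc_count  # skip any CSRC identifiers
    return RtpPacket(second & 0x7F, sequence, timestamp, ssrc, datagram[header_len:])


# Build a made-up packet: version 2, payload type 0, sequence 1, timestamp 160.
header = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234)
packet = parse_rtp(header + b"\x01\x02")
print(packet.sequence, packet.timestamp, packet.payload)
```

The sequence number and timestamp are what allow a receiver to reorder packets and reconstruct timing before the audio is handed on for detection.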

The audio data detection unit 13 outputs the audio data received by the audio data receiving unit 12 to the audio data playback unit 17. The audio data playback unit 17 plays back the audio data output by the audio data detection unit 13, for example by digital-to-analog conversion, and outputs the result to the speaker 22.

Of the participants' remarks, the audio of every participant other than the one at the local conference terminal 10 is thus heard through the speaker 22.

The audio data detection unit 13 also detects, among the audio data received by the audio data receiving unit 12, the audio data corresponding to the target participants designated through the participant designation unit 11.

For example, when the electronic conference is held over an IP-compliant network, the participant designation unit 11 has passed the IP addresses and related settings to the audio data detection unit 13. The audio data detection unit 13 checks whether any IP address set as identification information in the control information of FIG. 4 matches the IP address in the header of the IP packet carrying the audio data. For a matching IP address, it then checks whether the corresponding save-or-not setting in the control information is "save", and has the audio data storage unit 14 store the audio data of the target participant, that is, a participant set to "save".

The audio data detection unit 13 likewise checks whether the save-or-not setting in the control information of FIG. 4 that corresponds to the audio data output by the audio conversion unit 18 is "save", and if so has the audio data storage unit 14 store that target participant's audio data.

The description above has the audio data detection unit 13 output all audio data received by the audio data receiving unit 12 to the audio data playback unit 17. With the control information of FIG. 4(B), however, it outputs only the audio data whose speaker output setting is "on". When the user does not want to hear a particular participant, the speaker output in the control information of FIG. 4(B) is registered as "off" and playback of that audio data is stopped. Omitting the unnecessary playback processing can in turn improve processing efficiency when the minutes are created.
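The two decisions made by the audio data detection unit 13, whether to play and whether to save, can be sketched as a lookup on the packet's source address. The function name `route`, the dictionary layout, and the addresses are hypothetical. The handling of an address that is not registered (played, not saved) is an assumption that follows the default behaviour described above, not something the text states.

```python
def route(control_info, source_ip):
    """Return (play, save) for audio arriving from source_ip."""
    entry = control_info.get(source_ip)
    if entry is None:
        # Assumption: audio from an unregistered address is played by default
        # but is never saved, because only designated participants are recorded.
        return (True, False)
    play = entry.get("speaker_output", True)  # field exists only in FIG. 4(B)
    save = entry["save"]
    return (play, save)


# Hypothetical control information keyed by IP address.
control_info = {
    "192.0.2.1": {"participant": "A", "save": True},
    "192.0.2.3": {"participant": "C", "save": False, "speaker_output": False},
}

print(route(control_info, "192.0.2.1"))  # target participant: played and saved
print(route(control_info, "192.0.2.3"))  # muted and not recorded
```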

The volume meter 32 in the designation item 31 shown in FIG. 3(B) indicates the volume of the target participant's audio data and is consulted, for example, when adjusting the volume before the electronic conference starts.

The volume meter 32 is used as follows. When the user selects a designation item 31 whose save-or-not item 36 is set to "save" and presses the start button 41, the volume level of remarks from the conference terminal 10 of the selected designation item 31 is displayed on the volume meter 32. When the start button 41 is pressed, the audio data detection unit 13, for example, detects the volume level of the target participant's audio data and displays the detected level on the volume meter 32 of the corresponding designation item 31.

While the volume level is displayed on the volume meter 32, the participant designation unit 11 has the user designate a volume with the volume adjustment button 43, and the audio data detection unit 13 adjusts the volume of the corresponding audio data to the volume designated through the participant designation unit 11. To stop the volume level display, the user presses the stop button 42.
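The volume adjustment can be sketched as scaling the samples by a gain chosen so that the measured level matches the designated one. The text does not say how volume is measured or how the audio is encoded; the sketch assumes signed 16-bit PCM samples and uses the root-mean-square level, and the function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def adjust_volume(samples, target_rms):
    """Scale samples so their RMS level matches target_rms.

    Assumption: samples are signed 16-bit PCM, so results are clipped to
    the range -32768..32767. Pure silence is returned unchanged.
    """
    current = rms(samples)
    if current == 0:
        return list(samples)
    gain = target_rms / current
    return [max(-32768, min(32767, round(s * gain))) for s in samples]


quiet = [1000, -1000, 1000, -1000]
print(adjust_volume(quiet, target_rms=2000))
```

The same `rms` value could drive a meter display such as the volume meter 32.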

The clear button 44 clears the values set in the various items of the selected designation item 31, and the save button 45 registers the values set in each designation item 31 in the control information shown in FIG. 4.

FIG. 5 illustrates the flow of audio data produced by each participant's remarks. The participant at conference terminal 10A speaks from time t1, then stops and remains silent, and speaks again from time t4. The participant at conference terminal 10B speaks from time t3, then stops and remains silent, and speaks again from time t6. The participant at conference terminal 10D speaks from time t2, then stops and remains silent, and speaks again from time t5. The remarks of the participant at conference terminal 10C are omitted because the control information of FIG. 4 is set not to save them.

The audio data storage unit 14 stores the audio data detected by the audio data detection unit 13. For example, when audio data from the remarks of each target participant is detected by the audio data detection unit 13 as shown in FIG. 5, the audio data storage unit 14 detects the silent portions in each target participant's remarks and stores the audio data of each period of speech as an electronic file.

One way to detect a silent portion is, for example, to treat the audio as silent when it fails to reach a certain volume level within a predetermined time.
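The silence test described above can be sketched as a check that no sample in the most recent window reaches a threshold. Measuring volume by absolute sample amplitude, and the particular parameter names, are assumptions made for illustration; the text does not fix the level, the duration, or the measure.

```python
def is_silent(samples, threshold, sample_rate, min_duration_s):
    """Return True if the last min_duration_s seconds never reach threshold.

    Assumption: volume is judged by absolute sample amplitude.
    """
    window = max(1, int(sample_rate * min_duration_s))
    if len(samples) < window:
        return False  # not enough audio yet to decide
    return all(abs(s) < threshold for s in samples[-window:])


# Toy example at 8 samples per second with a one-second window.
print(is_silent([900] + [0] * 8, threshold=500, sample_rate=8, min_duration_s=1.0))
print(is_silent([0] * 7 + [900], threshold=500, sample_rate=8, min_duration_s=1.0))
```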

For example, as shown in FIG. 6, the audio data storage unit 14 attaches the start time of a remark to its audio data and, on detecting a silent portion after the remark has started, stores the audio data from the start of the remark up to the silence as an electronic file in a form that identifies the target participant.
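Cutting a participant's audio into one file per remark amounts to finding the spans between silences. The sketch below works on hypothetical (time, level) frames and returns the start and end time of each remark; the frame representation, the function name, and the parameters are assumptions, not details from the text.

```python
def split_utterances(frames, threshold, max_silent_frames):
    """Group (time, level) frames into remarks.

    A remark starts at the first frame whose level reaches threshold and ends
    once max_silent_frames consecutive frames stay below it.
    Returns a list of (start_time, last_voiced_time) pairs.
    """
    utterances = []
    start = None
    last_voiced = None
    silent_run = 0
    for t, level in frames:
        if level >= threshold:
            if start is None:
                start = t  # this becomes the remark's start time
            last_voiced = t
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= max_silent_frames:
                utterances.append((start, last_voiced))
                start = None
                silent_run = 0
    if start is not None:
        utterances.append((start, last_voiced))  # remark still open at end of input
    return utterances


frames = [(0, 0), (1, 9), (2, 8), (3, 0), (4, 0), (5, 7), (6, 0)]
print(split_utterances(frames, threshold=5, max_silent_frames=2))
```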

会議端末１０Ａの対象参加者が時刻ｔ１から発言したときの音声データを音声データＡ１とする。音声データＡ１は、音声データ保存部１４によって電子ファイルＡ１としてハードディスクなどの記録媒体に記憶される。また、会議端末１０Ａの対象参加者が時刻ｔ４から発言したときの音声データを音声データＡ２とする。音声データＡ２は、音声データ保存部１４によって電子ファイルＡ２としてハードディスクなどの記録媒体に記憶される。 The audio data when the target participant of the conference terminal 10A speaks from time t1 is referred to as audio data A1. The audio data A1 is stored in a recording medium such as a hard disk by the audio data storage unit 14 as an electronic file A1. Also, the audio data when the target participant of the conference terminal 10A speaks from time t4 is set as audio data A2. The audio data A2 is stored in a recording medium such as a hard disk by the audio data storage unit 14 as an electronic file A2.

会議端末１０Ｂの対象参加者が時刻ｔ３から発言したときの音声データを音声データＢ１とする。音声データＢ１は、音声データ保存部１４によって電子ファイルＢ１としてハードディスクなどの記録媒体に記憶される。また、会議端末１０Ｂの対象参加者が時刻ｔ６から発言したときの音声データを音声データＢ２とする。音声データＢ２は、音声データ保存部１４によって電子ファイルＢ２としてハードディスクなどの記録媒体に記憶される。 The audio data when the target participant of the conference terminal 10B speaks from time t3 is referred to as audio data B1. The audio data B1 is stored in a recording medium such as a hard disk by the audio data storage unit 14 as an electronic file B1. Also, the audio data when the target participant of the conference terminal 10B speaks from time t6 is referred to as audio data B2. The audio data B2 is stored in a recording medium such as a hard disk by the audio data storage unit 14 as an electronic file B2.

会議端末１０Ｄの対象参加者が時刻ｔ２から発言したときの音声データを音声データＤ１とする。音声データＤ１は、音声データ保存部１４によって電子ファイルＤ１としてハードディスクなどの記録媒体に記憶される。また、会議端末１０Ｄの対象参加者が時刻ｔ５から発言したときの音声データを音声データＤ２とする。音声データＤ２は、音声データ保存部１４によって電子ファイルＤ２としてハードディスクなどの記録媒体に記憶される。 The audio data when the target participant of the conference terminal 10D speaks from time t2 is set as audio data D1. The audio data D1 is stored in a recording medium such as a hard disk by the audio data storage unit 14 as an electronic file D1. Also, the audio data when the target participant of the conference terminal 10D speaks from time t5 is referred to as audio data D2. The audio data D2 is stored in a recording medium such as a hard disk by the audio data storage unit 14 as an electronic file D2.

Note that the file name that the audio data storage unit 14 assigns to an electronic file may combine the start time with information identifying the participant.
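The naming scheme mentioned above could be sketched as follows. This is only an illustration: the function name `utterance_filename`, the timestamp layout, and the `.wav` extension are assumptions, since the specification does not fix a concrete file name format.

```python
from datetime import datetime


def utterance_filename(start: datetime, participant_id: str, ext: str = "wav") -> str:
    """Build a file name from the utterance start time and a participant identifier.

    Hypothetical helper: the layout (timestamp, underscore, identifier) is an
    assumption, not something the specification prescribes.
    """
    # Replace characters such as the dots of an IP address so the
    # identifier is safe to use inside a file name.
    safe_id = "".join(c if c.isalnum() else "_" for c in participant_id)
    return f"{start:%Y%m%d_%H%M%S}_{safe_id}.{ext}"
```

Because the timestamp comes first, listing such files in name order also lists them in order of start time.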

The voice recognition unit 15 recognizes the audio data stored by the audio data storage unit 14 and thereby converts it into text data. As shown in FIG. 7, the voice recognition unit 15 has a voice recognition engine with a learning function, and it improves recognition accuracy (the voice recognition rate) by recognizing as much of the same target participant's audio data as possible in a single pass.

For example, as shown in FIG. 7, the voice recognition unit 15 recognizes only the audio data of the target participant at the conference terminal 10A during one fixed period, only the audio data of the target participant at the conference terminal 10B during the next fixed period, and only the audio data of the target participant at the conference terminal 10D during the period after that, and converts the recognition results into text data.
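The per-participant batching described above can be sketched as follows. The sketch assumes each saved utterance is a `(participant, start_time, audio)` tuple and that `recognize` is a caller-supplied function standing in for the voice recognition engine; both are illustrative assumptions, not part of the specification.

```python
from collections import defaultdict


def batch_by_participant(utterances):
    """Group (participant, start_time, audio) tuples by participant."""
    batches = defaultdict(list)
    for participant, start, audio in utterances:
        batches[participant].append((start, audio))
    return dict(batches)


def recognize_all(utterances, recognize):
    """Recognize one participant's utterances back to back, then the next.

    `recognize(participant, audio)` is a hypothetical stand-in for the
    recognition engine; it returns the recognized text.
    """
    results = []
    for participant, items in batch_by_participant(utterances).items():
        for start, audio in items:
            results.append((start, participant, recognize(participant, audio)))
    return results
```

The results come back grouped by participant rather than in chronological order, so they still need to be sorted by start time before the minutes are assembled.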

The voice recognition unit 15 also holds the registered voice data indicated by the registered voice data identification item 34 set in the control information of FIG. 4B. It performs voice recognition based on the registered voice data of the target participant designated by the participant designation unit 11 and converts the result into text data, which further improves recognition accuracy (the voice recognition rate).

In FIG. 4, the registered voice data identification item 34 specifies information identifying whether a male voice or a female voice is to be used, in order to make voice recognition more efficient. For the most efficient voice recognition, however, it is desirable to register in advance registered voice data obtained by recording each participant's voice and to set that registered voice data in the registered voice data identification item 34. The voice recognition unit 15 then performs voice recognition based on each participant's registered voice data, which improves the recognition rate.

The minutes creation unit 16 creates minutes based on the text data that the voice recognition unit 15 converted from the audio data. As shown in FIG. 6, a start time is set for each piece of audio data stored by the audio data storage unit 14, so the minutes creation unit 16 creates the minutes by sorting the text data converted from the audio data in order of start time.
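The ordering step can be illustrated with a short sketch. The `Utterance` record and the `build_minutes` function are hypothetical names introduced here; the specification only states that the text data is arranged in order of start time.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Utterance:
    start: datetime   # start time set when the audio data was saved
    speaker: str      # participant name
    text: str         # text data produced by voice recognition


def build_minutes(utterances):
    """Return the utterances sorted by start time, earliest first."""
    return sorted(utterances, key=lambda u: u.start)
```

Python's `sorted` is stable, so utterances with identical start times keep the order in which they were supplied.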

FIG. 8 shows the minutes created by the minutes creation unit 16 as displayed on the display 25. FIG. 8A shows minutes in HTML (Hyper Text Markup Language) format, and FIG. 8B shows minutes in text format.

The minutes creation unit 16 may create the minutes in real time during the electronic conference, or at any other time, such as after the electronic conference has ended.

In FIG. 8A, each speaker is displayed together with the corresponding utterance content. In addition, pressing the playback button associated with an utterance plays back the audio data of that utterance. For example, when a playback button is pressed, the audio data reproduction unit 17 obtains the electronic file of the audio data corresponding to that button from the audio data storage unit 14 and plays it back.

In FIG. 8B, the speaker and the utterance start time are shown in parentheses, and each speaker is displayed together with the corresponding utterance content. The participant name designated in the participant name item 33 of FIG. 3B is displayed as the speaker in FIGS. 8A and 8B.
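A text-format rendering along these lines might look like the sketch below. The exact layout of FIG. 8B is not reproduced in this text, so the line format used here (speaker and start time in parentheses, followed by the utterance) and the function name `render_text_minutes` are assumptions.

```python
def render_text_minutes(entries):
    """Render (start_time, speaker, text) entries as plain-text minutes.

    `entries` is assumed to be already ordered by start time; the line
    layout is an illustrative guess at a text-format record.
    """
    lines = []
    for start, speaker, text in entries:
        # Speaker and start time go in parentheses ahead of the utterance.
        lines.append(f"({speaker} {start}) {text}")
    return "\n".join(lines)
```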

The voice conversion unit 18 performs analog-to-digital conversion on the audio signal picked up by the microphone 23 when a participant speaks, and outputs the resulting digital audio data to the audio data transmission unit 19 and the audio data detection unit 13.

When the voice conversion unit 18 outputs audio data, the audio data transmission unit 19 transmits that audio data via the network interface 21 to every conference terminal 10 participating in the electronic conference other than its own.

An example of the operation of the conference terminal 10 configured as described above is explained below with reference to FIG. 9. FIG. 9A is a flowchart showing the flow of the operation for registering information on the participants of an electronic conference before the conference starts.

First, before the electronic conference starts, the user of the conference terminal 10 designates the participants of the electronic conference, and information on those participants is registered in the conference terminal 10. For example, as shown in FIG. 3, the participant designation unit 11 displays the designation window 30 on the display 25, and the user enters information for the various items in the designation items 31 through the designation window 30 using the input device 24. The participant designation unit 11 registers the information for the designated items in the control information of FIG. 4 (step S1).

FIG. 9B is a flowchart showing the flow of operation when audio data is received after the electronic conference has started.

After the electronic conference has started, when the audio data receiving unit 12 receives audio data via the network interface 21 (step S11), it outputs the received audio data to the audio data detection unit 13. The audio data detection unit 13 obtains the IP address from the header of the IP packet containing that audio data (step S12).

When the IP address obtained in step S12 matches identification information in the control information shown in FIG. 4, the audio data detection unit 13 checks the save setting associated with that identification information to determine whether the audio data should be saved (step S13). If a match exists and the corresponding save setting is "save", the audio data detection unit 13 outputs the audio data to the audio data storage unit 14.
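The decision in step S13 amounts to a lookup keyed by the sender's IP address. The sketch below assumes the control information is held as a dictionary from IP address to a small record; `ControlEntry`, `should_save`, and the sample addresses are hypothetical names and values used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ControlEntry:
    name: str    # participant name
    save: bool   # True corresponds to the "save" setting


def should_save(source_ip, control_info):
    """Return True only if the sender is registered and set to "save"."""
    entry = control_info.get(source_ip)
    # Unregistered senders and senders set to "do not save" are both skipped.
    return entry is not None and entry.save
```

For example, with `control_info = {"192.168.0.11": ControlEntry("A", True), "192.168.0.13": ControlEntry("C", False)}`, only audio data arriving from 192.168.0.11 would be passed on for storage.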

When the audio data storage unit 14 detects a silent portion in a target participant's speech, it separates out the audio data of each period in which speech occurred (step S14) and sets the utterance start time on each separated piece of audio data (step S15). The audio data storage unit 14 then saves the audio data, with its time information set, to the recording medium (step S16).
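Steps S14 and S15 can be illustrated with a simple amplitude-threshold splitter. The specification does not say how silence is detected, so the threshold test, the `min_silence` run length, and the function name `split_on_silence` are all assumptions made for this sketch.

```python
def split_on_silence(samples, threshold=0.02, min_silence=3):
    """Return (start_index, segment) pairs, one per stretch of speech.

    A segment ends once `min_silence` consecutive samples fall below
    `threshold` in absolute value; the trailing quiet samples are dropped.
    `min_silence` is assumed to be at least 1.
    """
    segments = []
    current = []   # samples of the utterance in progress
    start = None   # index at which the current utterance began
    quiet = 0      # length of the current run of quiet samples
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            current.append(s)
            quiet = 0
        elif start is not None:
            current.append(s)
            quiet += 1
            if quiet >= min_silence:
                segments.append((start, current[:-quiet]))
                current, start, quiet = [], None, 0
    if start is not None:
        # Flush an utterance still in progress when the input ends.
        segments.append((start, current[:len(current) - quiet]))
    return segments
```

Dividing each `start_index` by the sampling rate and adding the result to the time at which recording began gives the utterance start time that step S15 attaches to the segment.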

If a value is set for the maximum storage capacity in the control information shown in FIG. 4B, the audio data storage unit 14 saves the target participant's audio data until that maximum storage capacity is reached. Once the amount of saved audio data reaches the maximum storage capacity in the control information for that target participant, the audio data storage unit 14 stops saving audio data. If the maximum storage capacity in the control information is set to "no limit", the audio data storage unit 14 saves the target participant's audio data without limit.
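A per-participant storage cap of this kind might be sketched as follows. The class name `CappedStore` is hypothetical, `None` is used here to represent the "no limit" setting, and rejecting a chunk that would push the total past the limit (rather than truncating it) is an assumption; the specification only states that saving stops at the maximum storage capacity.

```python
class CappedStore:
    """Keep audio chunks for one participant up to an optional byte limit."""

    def __init__(self, max_bytes=None):
        self.max_bytes = max_bytes   # None stands for "no limit"
        self.used = 0                # bytes saved so far
        self.chunks = []
        self.stopped = False         # set once the limit has been hit

    def save(self, chunk: bytes) -> bool:
        """Store the chunk and return True, or return False once saving has stopped."""
        if self.stopped:
            return False
        if self.max_bytes is not None and self.used + len(chunk) > self.max_bytes:
            # Saving stops for good; later chunks are not stored either.
            self.stopped = True
            return False
        self.chunks.append(chunk)
        self.used += len(chunk)
        return True
```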

Meanwhile, the audio data received by the audio data receiving unit 12 is reproduced by the audio data reproduction unit 17 regardless of the save determination in step S13 (step S17), and is output as sound through the speaker 22.

The speaker output setting in the control information shown in FIG. 4B indicates whether a participant's speech is output to the speaker 22. The audio data of a participant whose speaker output setting is "none" is not reproduced by the audio data reproduction unit 17.

FIG. 9C is a flowchart showing the flow of operation once audio data has been saved. It shows the operation for creating minutes in real time.

When the audio data storage unit 14 has stored audio data on the recording medium (step S21), the voice recognition unit 15 recognizes the stored audio data, based on the registered voice data indicated by the registered voice data identification item 34 set in the control information of FIG. 4B, and converts it into text data (step S22).

Next, the minutes creation unit 16 creates the minutes by sorting the text data converted from the audio data in order of start time, earliest first (step S23), and displays the created minutes on the display 25 as shown in FIG. 8 (step S24).

As described above, the conference terminal 10 lets the user designate which participants of the electronic conference are target participants whose speech is recorded, saves only the received audio data that corresponds to the designated target participants, and creates the minutes by performing voice recognition on the saved audio data. Because audio data for participants whose speech is not recorded is never saved, the amount of data needed for the minutes can be reduced.

In addition, because the conference terminal 10 creates the minutes only from the audio data of the participants whose speech is needed, it does not need to perform voice recognition on the audio data of other participants, which improves processing efficiency when creating the minutes.

Furthermore, the registered voice data of the target participants is registered in the conference terminal 10, and the voice recognition unit 15 performs voice recognition according to that registered voice data when converting audio data into text data, so the conference terminal 10 can improve recognition accuracy. Because only the registered voice data of the target participants needs to be registered, there is no need to register voice data for participants whose speech is not recorded.

Moreover, because the audio data storage unit 14 saves a target participant's audio data only until the set maximum storage capacity is reached and never beyond it, the conference terminal 10 can make effective use of the data capacity required for the minutes.

Also, the audio data detection unit 13 adjusts the volume of the relevant audio data to the volume designated through the participant designation unit 11, so that audio data at an appropriate level is saved and the conference terminal 10 can create accurate minutes.

In the embodiment of the present invention, all of the conference terminals 10A to 10D may take minutes, or only one or more of them may do so; the number of conference terminals 10 that take minutes is not limited. Because each conference terminal 10 according to the embodiment creates minutes only from the speech of the participants it needs, different minutes can be created on each conference terminal 10 even for the same conference. Alternatively, the conference terminal 10D may be the terminal that takes minutes while the conference terminals 10A to 10C are replaced with ordinary personal computers; that is, one of the terminals used for the conference may be a conference terminal 10 and the remaining terminals ordinary personal computers.

As shown in FIG. 10, the conference terminal 10 may also be connected to a USB hub 40, with each earphone microphone 50 connected to the USB hub 40. Although the identification information has been described as an IP address in the embodiment of the present invention, in the configuration shown in FIG. 10 the identification information is the device identifier of the earphone microphone 50.

In addition, during the electronic conference, the user can erase the information in the designation items 31 through the designation window 30, which causes the participant designation unit 11 to remove that participant from the control information.

Likewise, during the electronic conference, the user can change the save setting in the designation items 31 from "save" to "do not save" through the designation window 30, and the participant designation unit 11 registers the changed setting as "do not save". Conversely, the user can change the save setting from "do not save" to "save", and the participant designation unit 11 registers the changed setting as "save".

As described above, the present invention has the effect of reducing the amount of data required for minutes and improving processing efficiency when creating minutes, and is useful for personal computers used for electronic conferences and the like.

FIG. 1 is a diagram showing a conference system according to an embodiment of the present invention.
FIG. 2 is a block diagram of a conference terminal according to the embodiment of the present invention.
FIG. 3 is an illustration of a screen on which the user designates participant information.
FIG. 4 is a diagram showing control information based on the designated participant information.
FIG. 5 is a diagram showing the flow of audio data produced by each participant's speech.
FIG. 6 is a diagram showing how, when a silent portion is detected after speech has started, the audio data from the start of speech until silence is saved as an electronic file.
FIG. 7 is an illustration of recognizing as much of the same participant's audio data as possible in a single pass.
FIG. 8 is a diagram showing minutes in which the text data converted from audio data is sorted in order of start time.
FIG. 9 is a flowchart showing an example of the operation of the conference terminal according to the embodiment of the present invention.
FIG. 10 is a diagram showing a conference system according to a form different from the embodiment of the present invention.

Explanation of symbols

10 Conference terminal
11 Participant designation unit
12 Audio data receiving unit
13 Audio data detection unit
14 Audio data storage unit
15 Voice recognition unit
16 Minutes creation unit
17 Audio data reproduction unit
18 Voice conversion unit
19 Audio data transmission unit
21 Network interface
22 Speaker
23 Microphone
24 Input device
25 Display
30 Designation window
31 Designation items
32 Volume meter
33 Participant name item
34 Registered voice data identification item
35 Identification information item
36 Save setting item
40 USB hub
41 Start button
42 Stop button
43 Volume adjustment button
44 Clear button
45 Save button
50 Earphone microphone

Claims

1. A minutes creation apparatus comprising:
a participant designation unit that designates, among the participants of an electronic conference, information on target participants whose speech is recorded and information on participants whose speech is not recorded;
an audio data receiving unit that receives audio data produced when a participant speaks;
an audio data detection unit that detects, among the audio data received by the audio data receiving unit, the audio data corresponding to a target participant designated by the participant designation unit;
an audio data storage unit that stores the audio data detected by the audio data detection unit;
a voice recognition unit that converts the audio data stored by the audio data storage unit into text data; and
a minutes creation unit that creates minutes based on the text data converted by the voice recognition unit.

2. The minutes creation apparatus according to claim 1, wherein the participant designation unit designates registered voice data corresponding to a target participant when designating the information on that target participant, and
the voice recognition unit converts audio data into text data in accordance with the registered voice data of the target participant designated by the participant designation unit.

3. The minutes creation apparatus according to claim 1 or 2, wherein the participant designation unit designates a maximum storage capacity for saving a target participant's audio data when designating the information on that target participant, and
the audio data storage unit stores the audio data detected by the audio data detection unit so as to remain within the maximum storage capacity designated by the participant designation unit.

4. The minutes creation apparatus according to any one of the preceding claims, wherein the participant designation unit designates a volume for the audio data corresponding to a target participant when designating the information on that target participant, and
the audio data detection unit adjusts the volume of the audio data it has detected to the volume designated by the participant designation unit.

5. The minutes creation apparatus according to claim 1, further comprising an audio data reproduction unit that reproduces audio data,
wherein the audio data detection unit detects, among the audio data received by the audio data receiving unit, the audio data corresponding to a target participant designated by the participant designation unit, and the audio data reproduction unit reproduces the audio data received by the audio data receiving unit.