JPH07226931A

JPH07226931A - Multi-medium conference equipment

Info

Publication number: JPH07226931A
Application number: JP6018578A
Authority: JP
Inventors: Osamu Yamagishi; 治山岸; Nobuhiro Matsuda; 伸広松田; Toshihiko Wakahara; 俊彦若原; Motoi Sato; 基佐藤
Original assignee: Toshiba Corp; Nippon Telegraph and Telephone Corp
Current assignee: Toshiba Corp; Nippon Telegraph and Telephone Corp
Priority date: 1994-02-15
Filing date: 1994-02-15
Publication date: 1995-08-22

Abstract

PURPOSE: To provide multimedia conference equipment that facilitates retrieval and editing during the reprocessing in which minutes or materials are created after the end of a conference. CONSTITUTION: Utterances of conference participants 10-1 to 10-4 are detected based on the sound volume and continuous duration of the voice information received from microphones 14-1 to 14-4; text information relating to the utterances of the conference participants, received from a data terminal 15-0 of a moderator 10-0, is detected; the voice information and the video information received from the conference participants are stored in an information storage/retrieval device 100 with the same link information as the text information entered by the moderator; and the voice information and the video information received from the conference participants are retrieved from the information storage/retrieval device 100 based on the text information and the link information.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a multimedia conference apparatus that conducts conference communication among a plurality of conference participants using multimedia information including voice information and video information. In particular, it relates to a multimedia conference apparatus that stores the multimedia information input from each conference participant in real time in association with text information input by the moderator, thereby facilitating retrieval and editing during the reprocessing in which minutes or materials are created after the conference ends.

[0002]

[Prior Art] Conventionally, conference apparatuses that carry out a conference call by voice among a plurality of conference participants are known as conference apparatuses for providing a conference service.

[0003] With the development of the information society in recent years, conference apparatuses capable of communicating multimedia information have also been proposed, in which video information, facsimile information, and text information, in addition to voice, can be communicated among the conference moderators.

[0004] In this type of conference apparatus, the input voice of each conference participant, facsimile information, text information, and the like during the conference call are stored so that minutes or materials relating to the conference can be created after it ends. To make the creation of those minutes or materials easier, a configuration has also been considered in which the input voice, facsimile information, text information, and the like are stored together with memos entered by the conference moderator.

[0005] In this case, the problem is how to associate the memos entered by the conference moderator with the input voice, facsimile information, text information, and the like when storing them.

[0006] Conventionally, hypertext, in which voice and video are tied to text, is known as a service that fuses multiple media. Because hypertext can tie multiple pieces of information together with link information, information that was seen or heard in a fixed order can be retrieved at random by text or keyword, which is very convenient when reprocessing information to create minutes or materials after a conference ends. However, creating such hypertext takes a great deal of time, and it is difficult to apply it to a conference apparatus as it is.

[0007] That is, hypertext is suited to digitizing material such as an encyclopedia that can be shared by many people. Consider, for example, turning an encyclopedia into hypertext: the voice, video, and documents are edited separately (voice and video may conceivably be edited together), and link information is then added to them to generate the hypertext. In this case, each piece of information is stored as a single medium, and the stored information is edited over a period longer than the time taken to collect it.

[0008]

[Problem to Be Solved by the Invention] As described above, a conventional conference apparatus is configured to store the input voice of each conference participant, facsimile information, text information, and the like during a conference call so that minutes or materials relating to the conference can be created after it ends. In this case, however, the information seen or heard can only be retrieved in a fixed order, so creating the minutes or materials takes a great deal of effort. If hypertext is introduced, multiple pieces of information can be tied together with link information, so information seen or heard in a fixed order can be retrieved at random by text or keyword, which is very convenient when reprocessing information to create minutes or materials after the conference ends. However, creating the hypertext takes a great deal of time, and it is difficult to apply it to a conference apparatus as it is.

[0009] Accordingly, an object of the present invention is to provide a multimedia conference apparatus that facilitates retrieval and editing during the reprocessing in which minutes or materials are created after the end of a conference.

[0010]

[Means for Solving the Problem] To achieve the above object, the present invention provides a multimedia conference apparatus that conducts conference communication among at least one moderator and a plurality of conference participants using multimedia information including voice information and video information, the apparatus comprising: multimedia information input/output means provided for the moderator and for each of the conference participants; multimedia information storage means that detects an utterance of a conference participant from the volume and duration of the voice information input through the multimedia information input/output means, detects text information relating to the utterance of the conference participant entered by the moderator, and stores the multimedia information input from the conference participant together with the text information entered by the moderator, with the same link information attached; and multimedia information retrieval means that retrieves, based on the text information, the multimedia information input from the conference participant and outputs it from the multimedia information input/output means.

[0011]

[Operation] In the present invention, an utterance of a conference participant is detected from the volume and duration of the voice information input through the multimedia information input/output means, and text information relating to the utterance of the conference participant entered by the moderator is detected. The multimedia information input from the conference participant is stored in the multimedia information storage means together with the text information entered by the moderator, with the same link information attached. Afterwards, the multimedia information input from the conference participant is retrieved by the multimedia information retrieval means based on the text information and is output from the multimedia information input/output means. This facilitates retrieval and editing during the reprocessing in which minutes or materials are created after the end of the conference.

[0012]

[Embodiment] An embodiment of a multimedia conference apparatus according to the present invention is described below in detail with reference to the drawings.

[0013] FIG. 1 shows the overall configuration of a multimedia conference apparatus according to the present invention. The multimedia conference apparatus shown in FIG. 1 provides a conference communication service using voice and video among four conference participants 10-1, 10-2, 10-3, and 10-4 and one conference moderator 10-0, by way of an information storage/retrieval apparatus 100.

[0014] Here, monitors 11-1, 11-2, 11-3, and 11-4 for outputting video, cameras 12-1, 12-2, 12-3, and 12-4 for inputting video, speakers 13-1, 13-2, 13-3, and 13-4 for outputting voice, and microphones 14-1, 14-2, 14-3, and 14-4 for inputting voice are provided for the conference participants 10-1, 10-2, 10-3, and 10-4, respectively. A monitor 11-0 for outputting video, a camera 12-0 for inputting video, a speaker 13-0 for outputting voice, a microphone 14-0 for inputting voice, and a data terminal 15-0 for inputting text are provided for the conference moderator 10-0. All of these are connected to the information storage/retrieval apparatus 100.

[0015] In this configuration, the conference participants 10-1, 10-2, 10-3, and 10-4 and the conference moderator 10-0 hold a conference through the information storage/retrieval apparatus 100, using the monitors 11-1, 11-2, 11-3, 11-4, and 11-0, the cameras 12-1, 12-2, 12-3, 12-4, and 12-0, the speakers 13-1, 13-2, 13-3, 13-4, and 13-0, and the microphones 14-1, 14-2, 14-3, 14-4, and 14-0, respectively.

[0016] In addition to the multimedia information storage/retrieval function described in detail later, the information storage/retrieval apparatus 100 has a function for exchanging multimedia information. It transmits the voice information input from the microphone and the video information input from the camera of each conference participant to the other conference participants and the conference moderator, where the voice is output from the speakers and the video information is output from the monitors corresponding to those other participants and the moderator.

[0017] That is, the information storage/retrieval apparatus 100 monitors the voice information input from the microphone corresponding to each participant. When this voice information is at or above a predetermined volume and lasts for a predetermined duration, the apparatus detects it as an utterance of the conference participant who input it, and carries out conference communication by transmitting the voice information input from that participant's microphone and the video information input from that participant's camera to the other conference participants and the conference moderator.
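The volume-and-duration test described above can be sketched as follows. This is a minimal illustration, not the circuit or algorithm of the patent: it assumes the microphone signal has already been reduced to one level value per frame, and the function name `detect_utterances` and the parameters `threshold` and `min_frames` are hypothetical.

```python
from typing import List, Tuple


def detect_utterances(levels: List[float], threshold: float,
                      min_frames: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges (end exclusive) in which the level
    stays at or above `threshold` for at least `min_frames` frames."""
    utterances = []
    start = None  # first frame of the run currently being tracked
    for i, level in enumerate(levels):
        if level >= threshold:
            if start is None:
                start = i
        else:
            # The run just ended; keep it only if it lasted long enough.
            if start is not None and i - start >= min_frames:
                utterances.append((start, i))
            start = None
    # Handle a run that continues to the end of the input.
    if start is not None and len(levels) - start >= min_frames:
        utterances.append((start, len(levels)))
    return utterances


levels = [0.1, 0.8, 0.9, 0.7, 0.1, 0.9, 0.1, 0.6, 0.6, 0.6]
print(detect_utterances(levels, threshold=0.5, min_frames=3))
```

The short burst at frame 5 is above the volume threshold but too brief, so it is not treated as an utterance; only the two longer runs are reported.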

[0018] The conference moderator 10-0 listens to what each participant says, summarizes it, and enters the summary as text information into the information storage/retrieval apparatus 100 using the data terminal 15-0, thereby entering memos of the conference proceedings.

[0019] When the voice information input from the microphone corresponding to a participant is at or above the predetermined volume and lasts for the predetermined duration, the information storage/retrieval apparatus 100 cuts out, based on that duration, the voice information corresponding to one utterance and the video information input from the camera, and stores them together with the text information entered at that time by the conference moderator 10-0 using the data terminal 15-0, attaching the same link information to all of them.
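The idea of attaching one shared piece of link information to an utterance's voice, video, and memo text can be illustrated with a small in-memory store. This is a sketch under stated assumptions: the class names `LinkedRecord` and `ConferenceStore`, the integer link identifiers, and the substring keyword search are all hypothetical choices, not details given in the patent.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LinkedRecord:
    link_id: int                 # link information shared by all three media
    voice: bytes                 # voice cut out for one utterance
    video: bytes                 # video captured for the same utterance
    memo: Optional[str] = None   # moderator's text, if any was entered


class ConferenceStore:
    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.records: Dict[int, LinkedRecord] = {}

    def store_utterance(self, voice: bytes, video: bytes,
                        memo: Optional[str] = None) -> int:
        """Store one utterance and return the link id assigned to it."""
        link_id = next(self._ids)
        self.records[link_id] = LinkedRecord(link_id, voice, video, memo)
        return link_id

    def find_by_memo(self, keyword: str) -> List[LinkedRecord]:
        """Return records whose memo text contains the keyword."""
        return [r for r in self.records.values()
                if r.memo is not None and keyword in r.memo]


store = ConferenceStore()
store.store_utterance(b"voice-1", b"video-1", "budget approved")
store.store_utterance(b"voice-2", b"video-2", "schedule slipped")
for record in store.find_by_memo("budget"):
    print(record.link_id, record.voice, record.video)
```

Because the memo and the media share one link id, a text search is enough to reach the recorded voice and video for that utterance.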

[0020] When the conference ends, the conference moderator 10-0 creates minutes, materials, and the like for the conference based on the text information entered as memos using the data terminal 15-0. If the moderator cannot recall the content of the proceedings from the memo text, the moderator confirms the content by retrieving the utterance (voice information, video information) that has the same link information from the information stored in the information storage/retrieval apparatus 100.

[0021] Because the text information entered as a memo using the data terminal 15-0 may be entered later or earlier than the utterance it refers to, the information storage/retrieval apparatus 100 has a function for searching utterances backward in time and a function for searching them forward in time.
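One way to picture the backward and forward search is to locate a memo's entry time among the utterance start times and then step to the neighbouring utterances. The sketch below assumes that each utterance carries a start time in seconds and that the list is sorted; the patent does not specify timestamps, so the function `neighbours` and its arguments are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def neighbours(utterance_starts: List[float],
               memo_time: float) -> Tuple[Optional[int], Optional[int]]:
    """Return (earlier, later): the index of the last utterance that started
    at or before `memo_time`, and the index of the first one that started
    after it. Either value is None when no such utterance exists."""
    pos = bisect_right(utterance_starts, memo_time)
    earlier = pos - 1 if pos > 0 else None
    later = pos if pos < len(utterance_starts) else None
    return earlier, later


starts = [0.0, 12.5, 30.0, 41.0]   # sorted start times, in seconds
print(neighbours(starts, 33.0))    # memo typed while utterance 2 was under way
```

From the returned pair, a reviewer can step backward (the memo was late) or forward (the memo was early) until the matching utterance is found.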

[0022] After creating the minutes, materials, and the like based on the memo text information, the conference moderator 10-0 again uses the data terminal 15-0 to store them in the information storage/retrieval apparatus 100. At that time as well, utterances are selected and link information is added to the minutes, materials, and the like. Here, it is also possible to give the selected utterance (voice information, video information) link information different from that of the memo text information.

[0023] As a result, when conference participants check the content of the proceedings, they can likewise retrieve the voice information and video information, which improves the work of preparing and checking the minutes.

[0024] In the above description, the utterances of the conference participants are identified automatically based on the volume and duration of the voice information input from each participant. Alternatively, the apparatus may be configured so that the conference moderator 10-0 identifies a participant's utterance by an instruction given through the data terminal 15-0, and link information is then added.

[0025] In the above embodiment, a microphone is provided for each conference participant. Therefore, if each microphone is associated with a conference participant at the start of the conference, the name of the speaker can also be stored as information in the information storage/retrieval apparatus 100.

[0026] FIG. 2 shows the detailed configuration of the information storage/retrieval apparatus 100 shown in FIG. 1. In FIG. 2, the information storage/retrieval apparatus 100 comprises a control unit 101, a storage/retrieval unit 102, and a selection/distribution unit 103. It further comprises a voice processing unit 104-0, a video processing unit 105-0, and a data processing unit 106-0 for the conference moderator, and voice processing units 104-1, 104-2, 104-3, and 104-4 and video processing units 105-1, 105-2, 105-3, and 105-4 for the respective conference participants.

[0027] The voice processing unit 104-0 is connected to the speaker 13-0 and the microphone 14-0 shown in FIG. 1, the video processing unit 105-0 is connected to the monitor 11-0 and the camera 12-0 shown in FIG. 1, and the data processing unit 106-0 is connected to the data terminal 15-0 shown in FIG. 1.

[0028] The voice processing units 104-1, 104-2, 104-3, and 104-4 are connected to the speakers 13-1, 13-2, 13-3, and 13-4 and the microphones 14-1, 14-2, 14-3, and 14-4 shown in FIG. 1, respectively. The video processing units 105-1, 105-2, 105-3, and 105-4 are connected to the monitors 11-1, 11-2, 11-3, and 11-4 and the cameras 12-1, 12-2, 12-3, and 12-4 shown in FIG. 1, respectively.

[0029] In response to a speaker notification from the voice processing units 104-0, 104-1, 104-2, 104-3, and 104-4, the control unit 101 instructs the selection/distribution unit 103 to select the voice information and video information of the speaker, and instructs the storage/retrieval unit 102 as to the link information to be attached to that voice information and video information.

[0030] If text information is input from the data processing unit 106-0 at this time, the control unit 101 instructs the storage/retrieval unit 102 to store that text information in the storage/retrieval unit 102 with the same link information attached.

[0031] When the moderator notifies the control unit 101 of a search instruction through the data processing unit 106-0, the control unit 101 instructs the storage/retrieval unit 102 to output to the moderator the text information, voice information, and video information that match the link information and search key information, and the matching information is output from the data processing unit 106-0.

[0032] Under the control of the control unit 101, the storage/retrieval unit 102 takes in the text information output from the data processing unit 106-0 and, by way of the selection/distribution unit 103, the voice information output from the voice processing units 104-1, 104-2, 104-3, and 104-4 and the video information output from the video processing units 105-1, 105-2, 105-3, and 105-4 in accordance with the participants' utterances, and stores each with the corresponding link information attached.

[0033] The storage/retrieval unit 102 also retrieves the matching text information, voice information, and video information based on the link information and search key information supplied by the control unit 101, and outputs them to the data processing unit 106-0, the voice processing units 104-0, 104-1, 104-2, 104-3, and 104-4, and the video processing units 105-0, 105-1, 105-2, 105-3, and 105-4.

[0034] Under the control of the control unit 101, the selection/distribution unit 103 selects the voice information output from the voice processing unit and the video information output from the video processing unit corresponding to the conference participant who is speaking, outputs them to the voice processing units and video processing units corresponding to the other conference participants and the conference moderator, and also outputs that voice information and video information to the storage/retrieval unit 102.
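The routing performed by the selection/distribution unit can be pictured as a small fan-out: the speaker's stream goes to everyone except the speaker, plus the storage side. The sketch below is illustrative only; the function name `distribute`, the use of the reference numerals as endpoint labels, and the `STORAGE` constant are assumptions made for this example.

```python
from typing import List

# Hypothetical label for the storage destination (storage/retrieval unit 102).
STORAGE = "storage/retrieval unit 102"


def distribute(speaker: str, endpoints: List[str]) -> List[str]:
    """Return every destination that should receive the speaker's voice and
    video: all other endpoints, followed by the storage destination."""
    recipients = [e for e in endpoints if e != speaker]
    recipients.append(STORAGE)
    return recipients


endpoints = ["10-0", "10-1", "10-2", "10-3", "10-4"]
print(distribute("10-2", endpoints))
```

Keeping storage as just another destination means the same selected stream serves both the live conference and the later retrieval.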

[0035] The voice processing units 104-0, 104-1, 104-2, 104-3, and 104-4 each output the voice input from the microphone of the conference moderator or the corresponding conference participant to the selection/distribution unit 103 as voice information, and each apply the voice information output from the selection/distribution unit 103 to the speaker of the conference moderator or the corresponding conference participant to output it as voice.

[0036] The voice processing units 104-0, 104-1, 104-2, 104-3, and 104-4 also detect the volume level of the voice input from the microphone of the conference moderator or the corresponding conference participant, and the period during which that level is at or above a fixed volume level, thereby detecting utterances of the conference moderator and of each conference participant, and notify the control unit 101 of the detection result.

[0037] The video processing units 105-0, 105-1, 105-2, 105-3, and 105-4 each convert the video input from the camera of the conference moderator or the corresponding conference participant into a digital signal (video information) and output it to the selection/distribution unit 103, and each apply the video information output from the selection/distribution unit 103 to the monitor of the conference moderator or the corresponding conference participant to output it as video.

[0038] The data processing unit 106-0 outputs the text information entered by the conference moderator using the data terminal 15-0 shown in FIG. 1 to the storage/retrieval unit 102, and outputs the text information retrieved from the storage/retrieval unit 102 to the data terminal 15-0 shown in FIG. 1.

[0039] The data processing unit 106-0 also uses control information to provide the interface between the control unit 101 and the data terminal 15-0 shown in FIG. 1.

[0040] FIG. 3 shows an example of the information format in the storage/retrieval unit 102 shown in FIG. 2.

[0041] As described above, the storage/retrieval unit 102 is used to store and retrieve text information, voice information, and video information, and comprises a text control area 131, a voice control area 132, and a video control area 133. Each area has fields for storing control information, link information, a storage start address, and a length, so that the cut-out information can be stored utterance by utterance.

[0042] Here, the control information consists of pointers to the preceding and following entries and information indicating whether the entry is valid or invalid.

[0043] The link information is used to associate the text information, voice information, and video information with one another for each cut-out utterance. The same link information is attached to everything belonging to one utterance.

[0044] The storage start address indicates the start address of the information storage area in which the text information, voice information, or video information is stored, and the length indicates the amount of that information.

[0045] With this configuration, voice information and video information can be retrieved from text information by using the link information. It is also possible to use the control information to retrieve the utterance that precedes or follows a given utterance in time.
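The information format of FIG. 3 can be pictured as one table of entries per medium plus a flat data area. The following is a sketch only: the field types, the use of list indices as the forward and backward pointers, and the names `ControlEntry` and `ControlArea` are assumptions made for illustration, not the layout actually used in the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ControlEntry:
    prev: Optional[int]   # control information: pointer to the previous entry
    next: Optional[int]   # control information: pointer to the next entry
    valid: bool           # control information: entry valid / invalid
    link: int             # link information shared across media
    start: int            # storage start address in the data area
    length: int           # amount of information stored


class ControlArea:
    """One control area (text, voice, or video) with its data area."""

    def __init__(self) -> None:
        self.entries: List[ControlEntry] = []
        self.data = bytearray()

    def append(self, link: int, payload: bytes) -> int:
        idx = len(self.entries)
        prev = idx - 1 if idx > 0 else None
        if prev is not None:
            self.entries[prev].next = idx   # chain entries in time order
        self.entries.append(
            ControlEntry(prev, None, True, link, len(self.data), len(payload)))
        self.data.extend(payload)
        return idx

    def find_by_link(self, link: int) -> Optional[int]:
        for i, entry in enumerate(self.entries):
            if entry.valid and entry.link == link:
                return i
        return None

    def read(self, idx: int) -> bytes:
        entry = self.entries[idx]
        return bytes(self.data[entry.start:entry.start + entry.length])


text, voice = ControlArea(), ControlArea()
text.append(1, b"budget approved")
voice.append(1, b"VOICE-1")
voice.append(2, b"VOICE-2")

i = voice.find_by_link(text.entries[0].link)   # text -> voice via the link
print(voice.read(i))
print(voice.read(voice.entries[i].next))       # step to the following utterance
```

The link field connects the three areas, while the previous and next pointers give the time-ordered traversal used when a memo does not line up exactly with its utterance.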

[0046] When utterances are detected from two or more conference participants at the same time, each participant's utterance is stored with its own separate link information. By searching the text information, which is one of the multiple pieces of information stored for each participant, the two or more simultaneous utterances can each be selected and output one at a time.

[0047] In this case, even if two or more people were talking at the same time, the content of their speech can be separated and played back one person at a time.
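Giving each simultaneous speaker a separate link can be sketched as below. The function `assign_links`, the shared counter, and the sorted ordering of speakers are hypothetical; the patent states only that each participant's utterance is stored with its own link information.

```python
from itertools import count
from typing import Dict, Iterable, Iterator


def assign_links(active_speakers: Iterable[str],
                 counter: Iterator[int]) -> Dict[str, int]:
    """Give each speaker detected in the same period a distinct link id."""
    return {speaker: next(counter) for speaker in sorted(active_speakers)}


links = count(1)
overlap = assign_links({"10-3", "10-1"}, links)   # two people speaking at once
single = assign_links({"10-2"}, links)            # a later, single speaker
print(overlap, single)
```

Because the two overlapping utterances carry different link ids, each one can later be selected and played back on its own.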

[0048] FIG. 4 shows another configuration of the information storage/retrieval apparatus 100 shown in FIG. 1. The information storage/retrieval apparatus 100 shown in FIG. 4 is formed by adding a voice/text conversion dictionary 107 to the information storage/retrieval apparatus 100 shown in FIG. 2. The rest of the configuration is the same as that shown in FIG. 2.

[0049] Here, the voice/text conversion dictionary 107 is a dictionary for converting the voice information input from a conference participant or the conference moderator into text information. It receives the voice information of the conference participant or conference moderator output from the selection/distribution unit 103, converts this voice information into text information, and outputs it to the storage/retrieval unit 102.

[0050] Under the control of the control unit 101, the storage/retrieval unit 102 takes in the text information output from the data processing unit 106-0 and, by way of the selection/distribution unit 103, the voice information output from the voice processing units 104-1, 104-2, 104-3, and 104-4 and the video information output from the video processing units 105-1, 105-2, 105-3, and 105-4 in accordance with the participants' utterances, and stores each with the corresponding link information attached. It also stores the text information corresponding to the voice information, input from the voice/text conversion dictionary 107, with the corresponding link information attached.

[0051] With this configuration, the voice information of a conference participant or the conference moderator can be stored in the storage/retrieval unit 102 as text information, and the text information corresponding to the relevant voice information can be retrieved based on the link information and search key information supplied by the control unit 101.

[0052] FIG. 5 shows yet another configuration of the information storage/retrieval apparatus 100 shown in FIG. 1. The information storage/retrieval apparatus 100 shown in FIG. 5 is formed by adding a personal authentication dictionary 108 to the information storage/retrieval apparatus 100 shown in FIG. 2. The rest of the configuration is the same as that shown in FIG. 2.

[0053] Here, the personal authentication dictionary 108 is a dictionary that authenticates an individual conference participant based on the video information input from the conference participant and converts the result into information indicating the speaker's name. It receives the video information of the conference participants output from the selection/distribution unit 103, authenticates the conference participant from this video information, and outputs information indicating the speaker's name to the storage/retrieval unit 102.

[0054] The personal authentication of a conference participant in the personal authentication dictionary 108 can be performed, for example, by extracting features of the image of the participant's face contained in the video information.
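One way to picture feature-based authentication is nearest-neighbour matching against enrolled feature vectors. The sketch below is only an assumption about how such a dictionary could work: the source does not specify the features or the matching rule, so the feature vectors, the Euclidean distance, the `max_distance` threshold, and the function name `identify_speaker` are all hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def identify_speaker(features: Sequence[float],
                     enrolled: Dict[str, Sequence[float]],
                     max_distance: float = 0.5) -> Optional[str]:
    """Return the enrolled name whose feature vector is closest to `features`.

    Returns None when even the closest enrolled vector is farther away than
    `max_distance`, i.e. when nobody is recognised.
    """
    best_name: Optional[str] = None
    best_dist = float("inf")
    for name, reference in enrolled.items():
        dist = math.dist(features, reference)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# Hypothetical enrolled face features, keyed by participant label.
enrolled_faces = {
    "participant 10-1": [0.1, 0.9, 0.3],
    "participant 10-2": [0.8, 0.2, 0.5],
}
```

The returned name is what would be passed on for storage under the link information of the utterance; extracting the features from the video itself is outside the scope of this sketch.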

[0055] Under the control of the control unit 101, the storage/retrieval unit 102 receives, via the selection/distribution unit 103, the text information output from the data processing unit 106-0 and, in response to an utterance by a participant, the voice information output from the voice processing units 104-1, 104-2, 104-3, 104-4 and the video information output from the video processing units 105-1, 105-2, 105-3, 105-4, and stores each with the corresponding link information added. It also stores the information indicating the speaker's name, which is input from the personal authentication dictionary 108, with the corresponding link information added.

[0056] With this configuration, information indicating the speaker's name can be stored in the storage/retrieval unit 102, and the information indicating the relevant speaker's name can be retrieved based on the link information and search key information given by the control unit 101.

[0057] In the above embodiment, personal authentication of a conference participant is performed based on the video information input from the conference participant, but it may instead be performed based on the voice information input from the conference participant. In that case, for example, features of the voice information input from the conference participant can be extracted to authenticate the participant.

[0058] FIG. 6 shows another embodiment of the multimedia conference apparatus of the present invention, configured to provide a remote conference service over a network such as the public telephone network. In this embodiment as well, as in the embodiment shown in FIG. 1, a conference communication service using audio and video is provided via the network 300 between four conference participants 10-1, 10-2, 10-3, 10-4 and one conference moderator 10-0.

[0059] That is, monitors 11-1, 11-2, 11-3, 11-4 for outputting video, cameras 12-1, 12-2, 12-3, 12-4 for inputting video, speakers 13-1, 13-2, 13-3, 13-4 for outputting audio, and microphones 14-1, 14-2, 14-3, 14-4 for inputting audio are provided for the conference participants 10-1, 10-2, 10-3, 10-4, respectively. A monitor 11-0 for outputting video, a camera 12-0 for inputting video, a speaker 13-0 for outputting audio, a microphone 14-0 for inputting audio, and a data terminal 15-0 for inputting text are provided for the conference moderator 10-0. The monitor 11-1, camera 12-1, speaker 13-1, and microphone 14-1 of the conference participant 10-1 are connected to the network 300 via the communication terminal 200-1. The monitor 11-2, camera 12-2, speaker 13-2, and microphone 14-2 of the conference participant 10-2 are connected to the network 300 via the communication terminal 200-2. The monitor 11-3, camera 12-3, speaker 13-3, and microphone 14-3 of the conference participant 10-3 are connected to the network 300 via the communication terminal 200-3. The monitor 11-4, camera 12-4, speaker 13-4, and microphone 14-4 of the conference participant 10-4 are connected to the network 300 via the communication terminal 200-4. The monitor 11-0, camera 12-0, speaker 13-0, microphone 14-0, and data terminal 15-0 of the conference moderator 10-0 are connected to the network 300 via the communication terminal 200-0.

[0060] In addition, an information storage/retrieval device 400 is connected to the network 300.

[0061] In this configuration, the conference participants 10-1, 10-2, 10-3, 10-4 and the conference moderator 10-0 hold a conference via the network 300 using the monitors 11-1, 11-2, 11-3, 11-4, 11-0, the cameras 12-1, 12-2, 12-3, 12-4, 12-0, the speakers 13-1, 13-2, 13-3, 13-4, 13-0, and the microphones 14-1, 14-2, 14-3, 14-4, 14-0, respectively.

[0062] The conference moderator 10-0 listens to what each participant says, summarizes it, and enters the summary as text information using the data terminal 15-0, thereby taking memo notes of the proceedings.

[0063] When the voice information input from the microphone of a participant is at or above a predetermined volume and lasts for a predetermined duration, the information storage/retrieval device 400 stores the voice information corresponding to one utterance, cut out on the basis of that duration, and the video information input from the camera, together with the text information entered at that time by the conference moderator 10-0 using the data terminal 15-0, with the same link information added to all of them.
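The volume-and-duration rule can be sketched as a scan over per-frame volume levels. This is a minimal illustration under stated assumptions: the source gives no frame length, threshold values, or level units, so `frame_sec`, `min_level`, `min_duration_sec`, and the function name `detect_utterances` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def detect_utterances(levels: Sequence[float],
                      frame_sec: float,
                      min_level: float,
                      min_duration_sec: float) -> List[Tuple[float, float]]:
    """Cut out utterances as (start_sec, end_sec) spans.

    A span is a run of consecutive frames whose level is at or above
    `min_level`; runs shorter than `min_duration_sec` are discarded.
    """
    utterances: List[Tuple[float, float]] = []
    run_start: Optional[int] = None
    for i, level in enumerate(levels):
        if level >= min_level:
            if run_start is None:
                run_start = i          # a loud run begins here
        elif run_start is not None:
            if (i - run_start) * frame_sec >= min_duration_sec:
                utterances.append((run_start * frame_sec, i * frame_sec))
            run_start = None           # the run ended on this quiet frame
    # Close a run that is still open when the input ends.
    if run_start is not None:
        if (len(levels) - run_start) * frame_sec >= min_duration_sec:
            utterances.append((run_start * frame_sec, len(levels) * frame_sec))
    return utterances


# One level per second; the single loud frame at index 1 is too short to count.
sample_levels = [0, 5, 0, 6, 7, 8, 0, 0, 9, 9]
spans = detect_utterances(sample_levels, frame_sec=1.0,
                          min_level=4, min_duration_sec=2.0)
```

Each returned span marks the voice and video segment that would be stored under one piece of link information together with the memo text entered at that time.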

[0064] When the conference ends, the conference moderator 10-0 prepares the minutes, materials, and other documents for the conference based on the text information entered as memos using the data terminal 15-0. If the moderator cannot recall the proceedings from the memo text alone, the moderator checks the content by retrieving the utterance (voice information and video information) having the same link information from the information stored in the information storage/retrieval device 400.

[0065] Because the memo text entered with the data terminal 15-0 may be entered later than, or ahead of, the utterance it refers to, the information storage/retrieval device 100 has a function for searching utterances backward in time and a function for searching them forward in time.
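The backward and forward search functions can be illustrated with timestamps. The sketch assumes that each stored utterance carries start and end times and that the memo has an entry time; the source does not describe how time is recorded, so the `Utterance` fields and the function names are hypothetical.

```python
from typing import List, NamedTuple, Sequence


class Utterance(NamedTuple):
    link_id: int
    start_sec: float
    end_sec: float


def search_backward(utterances: Sequence[Utterance],
                    memo_time_sec: float,
                    count: int = 1) -> List[Utterance]:
    """Utterances that ended at or before the memo time, most recent first."""
    earlier = [u for u in utterances if u.end_sec <= memo_time_sec]
    earlier.sort(key=lambda u: u.end_sec, reverse=True)
    return earlier[:count]


def search_forward(utterances: Sequence[Utterance],
                   memo_time_sec: float,
                   count: int = 1) -> List[Utterance]:
    """Utterances that started at or after the memo time, earliest first."""
    later = [u for u in utterances if u.start_sec >= memo_time_sec]
    later.sort(key=lambda u: u.start_sec)
    return later[:count]


timeline = [
    Utterance(1, 0.0, 10.0),
    Utterance(2, 15.0, 30.0),
    Utterance(3, 40.0, 55.0),
]
```

A memo entered at 35 seconds would, in this sketch, find utterance 2 when searching backward and utterance 3 when searching forward. An utterance still in progress at the memo time is returned by neither function and would be found through its shared link information instead.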

[0066] After preparing the minutes, materials, and so on based on the memo text, the conference moderator 10-0 again uses the data terminal 15-0 to store them in the information storage/retrieval device 400. At that time, too, the moderator selects utterances and adds link information to the minutes and materials. Here, the selected utterance (voice information and video information) may also be given link information different from that of the memo text.
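Giving an utterance a second, different piece of link information amounts to a many-to-many index between link information and stored items. The following sketch is an assumption about one possible data structure; the class `LinkIndex` and the string identifiers are hypothetical and do not appear in the source.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class LinkIndex:
    """Many-to-many mapping between link information and stored items."""

    def __init__(self) -> None:
        self._items_by_link: DefaultDict[str, Set[str]] = defaultdict(set)

    def attach(self, link_id: str, item_id: str) -> None:
        """Attach an item (text, voice, or video) to a piece of link information."""
        self._items_by_link[link_id].add(item_id)

    def items(self, link_id: str) -> List[str]:
        """All items stored under the given link information."""
        return sorted(self._items_by_link.get(link_id, set()))

    def links_of(self, item_id: str) -> List[str]:
        """All link information attached to the given item."""
        return sorted(link for link, items in self._items_by_link.items()
                      if item_id in items)


index = LinkIndex()
index.attach("memo-7", "memo-text-7")        # link created during the conference
index.attach("memo-7", "utterance-3")
index.attach("minutes-2", "minutes-text-2")  # link added when the minutes are stored
index.attach("minutes-2", "utterance-3")
```

Here the same utterance can be reached both from the original memo and from the finished minutes, which is the situation the paragraph describes.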

[0067] As a result, conference participants can likewise retrieve the voice information and video information when checking the proceedings, which improves the work of preparing and checking the minutes.

[0068] In the above description, the utterances of the conference participants are identified automatically based on the volume and duration of the voice information input from each participant. Alternatively, the utterance of a conference participant may be identified, and link information added, in response to an instruction given by the conference moderator 10-0 using the data terminal 15-0.

[0069] In the above embodiment, a microphone is provided for each conference participant. Therefore, if each microphone is associated with its conference participant at the start of the conference, the speaker's name can also be stored as information in the information storage/retrieval device 400.
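The microphone-to-participant association can be sketched as a simple lookup table filled in at the start of the conference. The class name `MicrophoneRegistry` and the participant names are placeholders invented for the example.

```python
from typing import Dict, Optional


class MicrophoneRegistry:
    """Maps a microphone reference numeral to the participant using it."""

    def __init__(self) -> None:
        self._name_by_mic: Dict[str, str] = {}

    def register(self, microphone: str, participant_name: str) -> None:
        """Record, at the start of the conference, who uses which microphone."""
        self._name_by_mic[microphone] = participant_name

    def speaker_name(self, microphone: str) -> Optional[str]:
        """Name to store with an utterance detected on this microphone."""
        return self._name_by_mic.get(microphone)


registry = MicrophoneRegistry()
registry.register("14-1", "Participant A")  # hypothetical names
registry.register("14-2", "Participant B")
```

When an utterance is detected on a given microphone, the looked-up name can be stored alongside the voice and video under the same link information.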

[0070] FIG. 7 shows the detailed configuration of the communication terminal 200-0 shown in FIG. 6. In FIG. 7, the communication terminal 200-0 comprises a voice processing unit 201, a video processing unit 202, a data processing unit 203, a control unit 204, and a communication processing unit 205.

[0071] Here, the voice processing unit 201 is connected to the speaker 13-0 and the microphone 14-0 shown in FIG. 6. It outputs the voice input from the microphone 14-0 to the communication processing unit 205 as voice information, and outputs the voice information from the communication processing unit 205 as sound from the speaker 13-0.

[0072] The voice processing unit 201 also detects the volume level of the voice input from the microphone 14-0 and the period during which it stays at or above a fixed volume level, thereby detecting an utterance by the conference moderator, and notifies the control unit 204 of the detection result.

[0073] The video processing unit 202 converts the video input from the camera 12-0 into a digital signal (video information) and outputs it to the communication processing unit 205. It also supplies the video information output from the communication processing unit 205 to the monitor 11-0 of the conference moderator, where it is output as video.

[0074] The data processing unit 203 outputs the text information entered by the conference moderator using the data terminal 15-0 shown in FIG. 6 to the communication processing unit 205, and outputs the text information from the communication processing unit 205 to the data terminal 15-0 shown in FIG. 6.

[0075] In response to the speaker notification from the voice processing unit 201, the control unit 204 controls the communication processing unit 205 so that the voice information output from the voice processing unit 201 and the video information output from the video processing unit 202 are sent out to the network 300 shown in FIG. 6.

[0076] The communication processing unit 205 outputs voice information received from the network 300 to the voice processing unit 201, video information received from the network 300 to the video processing unit 202, and text information received from the network 300 to the data processing unit 203.
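The routing performed on the receiving side can be pictured as a dispatch by media type. This is a sketch under assumptions: the source does not say how received information is tagged, so the string tags `"voice"`, `"video"`, and `"text"`, the byte payloads, and the function `make_dispatcher` are hypothetical.

```python
from typing import Callable, Dict, List

Handler = Callable[[bytes], None]


def make_dispatcher(handlers: Dict[str, Handler]) -> Callable[[str, bytes], None]:
    """Build a function that forwards each payload to the handler for its media type."""
    def dispatch(media_type: str, payload: bytes) -> None:
        if media_type not in handlers:
            raise ValueError(f"unknown media type: {media_type!r}")
        handlers[media_type](payload)
    return dispatch


# Lists stand in for the voice, video, and data processing units.
voice_in: List[bytes] = []
video_in: List[bytes] = []
text_in: List[bytes] = []

dispatch = make_dispatcher({
    "voice": voice_in.append,
    "video": video_in.append,
    "text": text_in.append,
})
dispatch("voice", b"pcm-frame")
dispatch("text", b"memo: budget approved")
```

A terminal without a data processing unit could be modelled by leaving the `"text"` handler out of the table.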

[0077] FIG. 7 shows the detailed configuration of the communication terminal 200-0 for the conference moderator 10-0. The communication terminals for the conference participants 10-1, 10-2, 10-3, 10-4 have the configuration of the communication terminal 200-0 shown in FIG. 7 with the data processing unit 203 removed; the rest of the configuration is the same as that shown in FIG. 7.

[0078] FIG. 8 shows the detailed configuration of the information storage/retrieval device 400 shown in FIG. 6. In FIG. 8, the information storage/retrieval device 400 comprises a control unit 401, a storage/retrieval unit 402, and a communication processing unit 403.

[0079] Here, the control unit 401 instructs the storage/retrieval unit 402 to store the voice information, video information, and text information received from the network 300 by the communication processing unit 403, with the same link information attached.

[0080] When a search instruction from the conference moderator 10-0 is delivered to the control unit 401 via the communication processing unit 403, the control unit 401 instructs the storage/retrieval unit 402 to retrieve the relevant voice information, video information, and text information.

[0081] In response to this instruction, the storage/retrieval unit 402 retrieves the relevant voice information, video information, and text information and outputs the retrieved information to the communication processing unit 403.

[0082] The communication processing unit 403 outputs the voice information, video information, and text information from the network 300 to the storage/retrieval unit 402, and sends the voice information, video information, and text information output from the storage/retrieval unit 402 out to the network 300.

[0083]

[Effects of the Invention] As described above, according to the present invention, an utterance of a conference participant is detected from the volume and duration of the voice information input by the multimedia information input/output means, and the text information relating to that utterance input by the moderator is detected. The multimedia information input from the conference participant is stored in the multimedia information storage means together with the text information input by the moderator, with the same link information attached. Thereafter, the multimedia information input from the conference participant is retrieved by the multimedia information retrieval means based on the text information and output from the multimedia information input/output means. This makes it easy to search and edit during the reprocessing performed after the conference to prepare the minutes or materials.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of the overall configuration of the multimedia conference apparatus according to the present invention.

FIG. 2 is a block diagram showing the detailed configuration of the information storage/retrieval device shown in FIG. 1.

FIG. 3 is a diagram showing an example of the information format in the storage/retrieval unit shown in FIG. 2.

FIG. 4 is a block diagram showing another configuration of the information storage/retrieval device shown in FIG. 1.

FIG. 5 is a block diagram showing still another configuration of the information storage/retrieval device shown in FIG. 1.

FIG. 6 is a block diagram showing another embodiment of the multimedia conference apparatus of the present invention, configured to provide a remote conference service over a network such as the public telephone network.

FIG. 7 is a block diagram showing the detailed configuration of the communication terminal shown in FIG. 6.

FIG. 8 is a block diagram showing the detailed configuration of the information storage/retrieval device shown in FIG. 6.

[Explanation of Reference Numerals]
10-0: conference moderator
10-1, 10-2, 10-3, 10-4: conference participants
11-1, 11-2, 11-3, 11-4, 11-0: monitors
12-1, 12-2, 12-3, 12-4, 12-0: cameras
13-1, 13-2, 13-3, 13-4, 13-0: speakers
14-1, 14-2, 14-3, 14-4, 14-0: microphones
100: information storage/retrieval device
101: control unit
102: storage/retrieval unit
103: selection/distribution unit
104-1, 104-2, 104-3, 104-4, 104-0: voice processing units
105-1, 105-2, 105-3, 105-4, 105-0: video processing units
106-0: data processing unit
131: text control area
132: voice control area
133: video control area
107: voice/text conversion dictionary
108: personal authentication dictionary
200-1, 200-2, 200-3, 200-4, 200-0: communication terminals
300: network
400: information storage/retrieval device
201: voice processing unit
202: video processing unit
203: data processing unit
204: control unit
205: communication processing unit
401: control unit
402: storage/retrieval unit
403: communication processing unit

Continuation of the front page

(51) Int. Cl.⁶: H04L 12/18; H04M 3/56 Z

(72) Inventor: Toshihiko Wakahara, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo

(72) Inventor: Motoi Sato, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo

Claims

[Claims]

1. A multimedia conference apparatus for conducting conference communication between at least one moderator and a plurality of conference participants using multimedia information including audio information and video information, comprising: multimedia information input/output means provided for each of the moderator and the conference participants; multimedia information storage means for detecting an utterance of a conference participant from the volume and duration of the voice information input by the multimedia information input/output means, detecting text information relating to the utterance of the conference participant input by the moderator, and storing the multimedia information input from the conference participant together with the text information input by the moderator, with the same link information attached; and multimedia information retrieval means for retrieving, based on the text information, the multimedia information input from the conference participant and outputting it from the multimedia information input/output means.

2. The multimedia conference apparatus according to claim 1, further comprising voice/text information conversion means for converting the voice information input by the multimedia information input/output means into text information, wherein the text information converted by the voice/text information conversion means is stored in the multimedia information storage means with the same link information as said link information attached.

3. The multimedia conference apparatus according to claim 1, further comprising personal authentication means for identifying, from the multimedia information input by the multimedia information input/output means, the conference participant who has spoken, wherein information on the conference participant identified by the personal authentication means is stored in the multimedia information storage means with the same link information as said link information attached.

4. The multimedia conference apparatus according to claim 3, wherein the personal authentication means identifies the conference participant who has spoken based on features of the voice information input by the multimedia information input/output means.

5. The multimedia conference apparatus, wherein the personal authentication means identifies the conference participant who has spoken from the facial features of the conference participant, based on the video information input by the multimedia information input/output means.

6. The multimedia conference apparatus according to claim 1, wherein the multimedia information input/output means, the multimedia information storage means, and the multimedia information retrieval means are connected to one another via a communication network.

7. The multimedia conference apparatus according to claim 1, wherein, when utterances of a plurality of conference participants are detected at the same time, the multimedia information storage means attaches different link information corresponding to each conference participant and stores the multimedia information input from those conference participants together with the text information input by the moderator.