JPH07123157A

JPH07123157A - Voice input device for multi-spot conference

Info

Publication number: JPH07123157A
Application number: JP5263968A
Authority: JP
Inventors: Masaki Yagyu; 正樹柳生
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1993-10-22
Filing date: 1993-10-22
Publication date: 1995-05-12

Abstract

PURPOSE: To provide a voice input device for multi-spot conference in which the noise component due to the environmental noise in a conference room included in a synthesized voice is suppressed. CONSTITUTION: An output control part 301 controls output of the voice signal of a speaker and the environmental noise in the conference room. An environmental noise extracting part 302 extracts the environmental noise in each conference room from the output of the output control part 301 before the start of the conference. An inverted signal generating part 303 inverts the polarity of the noise signal extracted by the environmental noise extracting part 302 to generate the inverted noise signal. An inverted signal storage part 304 stores the inverted noise signal after sampling it with a prescribed period. A voice synthesizing part 305 electrically synthesizes the inverted noise signal read out from the inverted signal storage part 304 and the conference voice including the environmental noise signal from the output control part 301. A noise control part 306 changes and holds the phase of the sampled inverted noise signal from the inverted signal storage part 304 so that the output level of the voice synthesizing part 305 is lowest.

Description

Detailed Description of the Invention

[Field of Industrial Application]

[0001] The present invention relates to a voice input device for multipoint conferences, and more particularly to a voice input device for multipoint conferences that inputs voice in each conference room when a video conference or audio conference is held over a communication network among conference rooms at three or more locations.

[Prior Art]

[0002] In conventional multipoint conferences, voice synthesis has been carried out by installing, in each conference room where attendees are present, a microphone for voice input, a loudspeaker for voice output, and a voice input device that feeds the voice from the microphone to the switching network. In video conference systems, which require video, a television camera and a monitor television are added to each conference room.

[0003] The voice from the voice input device in each conference room is then input, via the switching network, to the multipoint conference voice synthesizer, where it is synthesized.

[0004] In the video system, the image signal from each conference room is sent and distributed to the other conference rooms. In addition, the monitor television screen may be divided into multiple sub-screens (multi-screen display), or an arbitrarily selected image may be sent and distributed to each conference room.

[0005] In the audio system, on the other hand, a voice synthesis scheme is adopted that prevents a speaker's own voice from being returned to that speaker. In this synthesis, the sum of the attendees' voices is sent and distributed to all attendees except the speaker.

[0006] FIG. 3 is a block diagram showing an example of the configuration of the voice synthesis portion of a general multipoint conference system.

[0007] Referring to FIG. 3, in the multipoint conference system a plurality (N) of conference rooms 31, 32, ..., 3N in which attendees are present are connected to the multipoint conference voice synthesizer 1 through the switching network 2.

[0008] The voices v1, v2, ..., vN from the voice input devices of conference rooms 31, 32, ..., 3N are input to the multipoint conference voice synthesizer 1 through the switching network 2. For each conference room, the multipoint conference voice synthesizer 1 subtracts only the voice of the speaker in that room and sends and distributes the sum of the voices from all the other conference rooms.

[0009] For example, conference room 31 outputs the speaker's upstream voice v1 to the switching network 2, while the downstream voice from the multipoint conference voice synthesizer 1 to conference room 31 is the sum of the voices v2, ..., vN from conference rooms 32, ..., 3N, that is, everything except v1. Upstream and downstream voices are sent and distributed in the same way for the other conference rooms 32, ..., 3N.
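The distribution rule of paragraphs [0008] and [0009] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each room's voice is represented by a single sample value, and the function name `mix_minus` is hypothetical and does not appear in the specification.

```python
from typing import List


def mix_minus(upstream: List[float]) -> List[float]:
    """Return the downstream sample for each room.

    upstream[i] is the sample sent by room i (v1, v2, ..., vN in the text).
    Each room receives the sum of all rooms minus its own contribution,
    so a speaker never hears their own voice returned.
    """
    total = sum(upstream)
    return [total - own for own in upstream]


# Three rooms sending 1.0, 2.0 and 4.0: room 0 receives 2.0 + 4.0, and so on.
downstream = mix_minus([1.0, 2.0, 4.0])
print(downstream)
```

Computing the total once and subtracting each room's own sample gives the same result as summing the other N - 1 inputs separately for every room.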

[0010] This voice synthesis scheme, which prevents the speaker's own voice from being returned to the speaker, is applicable to both analog and digital voice.

[Problems to Be Solved by the Invention]

[0011] In this conventional multipoint conference voice synthesizer, voice synthesis always sends and distributes the sum of the attendees' voices to all conference rooms except the speaker's own.

[0012] Considering the conference environment, the voice from one attendee (the speaker) is distributed to all the other attendees, but that voice is sent together with the environmental noise of the speaker's conference room (for example, the noise generated by an air conditioner).

[0013] Now assume a video and audio conference in which the number of conference rooms with attendees is N = 8. The image signal and the speaker's voice signal v from one conference room are sent together with the video and audio environmental noise n of that room. To the speaker, the multipoint conference voice synthesizer supplies a synthesized signal consisting of the voice signals of the other attendees, excluding the speaker's own voice signal v, and the environmental noise of the seven conference rooms in which the other attendees are present.

[0014] Therefore, assuming for simplicity that the environmental noise level is the same in every conference room, the synthesized voice distributed to the speaker contains environmental noise at about seven times the level of the environmental noise n in the speaker's own conference room.
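The arithmetic behind this estimate can be written out as a short Python sketch. It follows the text's simplification that the noise levels of the individual rooms add directly; the function name `downstream_noise_level` and the unit level of 1.0 per room are assumptions made only for illustration.

```python
from typing import Sequence


def downstream_noise_level(room_noise_levels: Sequence[float], listener: int) -> float:
    """Sum the noise levels of every room except the listener's own.

    Levels are added directly, which mirrors the simplification in the text
    rather than a full acoustic model.
    """
    return sum(
        level for room, level in enumerate(room_noise_levels) if room != listener
    )


# N = 8 rooms, each with the same noise level n = 1.0 (arbitrary unit).
levels = [1.0] * 8
print(downstream_noise_level(levels, listener=0))  # 7.0, i.e. (N - 1) * n
```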

[0015] Naturally, when the number N of conference rooms with attendees is small, this synthesized noise is small and interferes little with the conference. As N increases, however, the speakers' conversation becomes difficult to hear, which has been a problem.

[0016] For image signals, on the other hand, the image noise that corresponds to environmental noise in the voice signal is noise mixed into or induced in the video equipment and the transmission path of the image signal. Its level is therefore very low, and given the characteristics of the video system it is not noticeable.

[0017] Accordingly, an object of the present invention is to provide a voice input device for multipoint conferences that suppresses the noise component contained in the synthesized voice due to environmental noise in the conference rooms.

[Means for Solving the Problems]

[0018] According to the present invention, there is provided a voice input device for multipoint conferences that holds a video or audio conference over a communication network among conference rooms at three or more locations, characterized by comprising environmental noise canceling means for canceling the environmental noise component in each of the conference rooms.

[0019] There is further provided a voice input device for multipoint conferences in which the environmental noise canceling means comprises: an environmental noise extraction unit that extracts the environmental noise in each conference room before the conference starts; an inverted signal generation unit that generates an inverted noise signal by inverting the polarity of the environmental noise signal extracted by the environmental noise extraction unit; an inverted signal storage unit that stores the inverted noise signal after it has been sampled at a predetermined period; a voice synthesizing unit that synthesizes the sampled inverted noise signal from the inverted signal storage unit with the conference voice containing the environmental noise; and a noise control unit that varies the phase of the sampled inverted noise signal from the inverted signal storage unit so that the output level of the voice synthesizing unit is minimized, holds the phase at the minimum point, and controls whether the conference voice containing the environmental noise is output to the noise extraction unit or to the voice synthesizing unit.

[0020] Furthermore, the voice synthesizing unit may take the conference voice containing the environmental noise as a first input and the sampled inverted noise signal stored in the inverted signal storage unit as a second input, and combine these first and second inputs either electrically or acoustically.

[0021] Furthermore, the voice synthesizing unit may be a differential amplifier that extracts and amplifies only the difference signal between the first and second inputs.

[Embodiment]

[0022] The present invention will now be described with reference to the drawings.

[0023] FIG. 1 is a block diagram showing an embodiment of the voice input device for multipoint conferences according to the present invention, and FIG. 2 is a waveform diagram showing the input and output waveforms of the voice synthesizing unit in FIG. 1.

[0024] Referring to FIG. 1, the voice input device 3 for multipoint conferences of this embodiment is installed in each of the conference rooms 31, 32, ..., 3N shown in FIG. 3 and comprises: an output control unit 301 that controls the output of the speaker's voice signal (conference voice) input from a microphone (not shown) and of the environmental noise in the conference room; an environmental noise extraction unit 302 that, before the conference starts, extracts the environmental noise in each conference room from the output of the output control unit 301; an inverted signal generation unit 303 that generates an inverted noise signal by inverting the polarity of the environmental noise signal extracted by the environmental noise extraction unit 302; an inverted signal storage unit 304 that stores the inverted noise signal from the inverted signal generation unit 303 after it has been sampled at a predetermined period; a voice synthesizing unit 305 that electrically synthesizes the sampled inverted noise signal, read from the inverted signal storage unit 304 through the inverted signal generation unit 303, with the conference voice containing the environmental noise signal from the output control unit 301; and a noise control unit 306 that performs control to select either the sampled inverted noise signal read from the inverted signal storage unit 304 or the inverted noise signal extracted by the environmental noise extraction unit 302.

[0025] The noise control unit 306 also varies the phase of the sampled inverted noise signal read from the inverted signal storage unit 304 so that the output level of the voice synthesizing unit 305 is minimized, and holds the phase at the minimum point. In addition, it sends a control signal to the output control unit 301 to control whether the input from the microphone is output to the noise extraction unit 302 or to the voice synthesizing unit 305.

[0026] Based on the control signal from the noise control unit 306, the output control unit 301 supplies the conference voice containing the original environmental noise to the + input of the voice synthesizing unit 305. The inverted signal generation unit 303 reads the sampled inverted noise signal stored in the inverted signal storage unit 304 and supplies it to the − input of the voice synthesizing unit 305, and the voice synthesizing unit 305 extracts only the difference signal between the two inputs, thereby canceling the noise component.

[0027] Next, the operation of this embodiment will be described.

[0028] First, before the conference starts, the voice control switch (not shown) in the noise control unit 306 is operated so that the environmental noise extraction unit 302 can extract the noise component input from the microphone (not shown) in each conference room.

[0029] Operating the voice control switch sets the noise control unit 306 to a scrambler mode, in which processing is performed to reduce the influence of the noise extracted by the environmental noise extraction unit 302, and the noise control unit 306 sends a control signal to the output control unit 301.

[0030] In response to this control signal, the environmental noise extraction unit 302 extracts the environmental noise from the output of the output control unit 301 and inputs it to the inverted signal generation unit 303.

[0031] The inverted signal generation unit 303 inverts the polarity of the input environmental noise signal, samples the inverted noise signal at a predetermined period, and then stores it in the inverted signal storage unit 304.
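As a rough software analogue of this step, the Python sketch below inverts the polarity of a noise signal, samples it at a fixed interval, and keeps the samples in a list that stands in for the inverted signal storage unit 304. The 50 Hz hum, the 1 ms sampling interval, and the function names are assumptions chosen for illustration; the specification does not give a noise source or a sampling period.

```python
import math
from typing import Callable, List


def sample_inverted_noise(
    noise: Callable[[float], float],
    duration_s: float,
    sample_interval_s: float,
) -> List[float]:
    """Sample the polarity-inverted noise at a fixed interval.

    The returned list plays the role of the stored inverted noise signal.
    """
    count = round(duration_s / sample_interval_s)
    return [-noise(k * sample_interval_s) for k in range(count)]


def hum(t: float) -> float:
    """Hypothetical environmental noise: a 50 Hz hum with amplitude 0.1."""
    return 0.1 * math.sin(2 * math.pi * 50 * t)


# Store one 20 ms period of the inverted hum, sampled every 1 ms.
stored = sample_inverted_noise(hum, duration_s=0.02, sample_interval_s=0.001)
print(len(stored))
```

Storing one full period is sufficient only if the noise is periodic, which is an assumption of this sketch.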

[0032] The inverted signal generation unit 303 reads the sampled inverted noise signal stored in the inverted signal storage unit 304 at a predetermined period and relays it to the − input of the voice synthesizing unit 305 as a noise signal source for comparison.

[0033] Through the inverted signal generation unit 303, the noise control unit 306 varies the read phase of the sampled inverted noise signal from the inverted signal storage unit 304, detects the phase at which the synthesized output level of the voice synthesizing unit 305 is minimized, and holds this phase.
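The phase search described here can be illustrated with a brute-force Python sketch. It assumes that the stored buffer holds exactly one period of periodic noise, that the read phase is an integer sample offset into that buffer, and that the synthesizing unit's net effect is to add the stored inverted noise to the live input. None of these details, nor the function names, are given in the specification.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def find_best_read_phase(live_noise: Sequence[float], stored_inverted: Sequence[float]) -> int:
    """Try every read offset and return the one giving the lowest output level."""
    period = len(stored_inverted)
    best_offset, best_level = 0, float("inf")
    for offset in range(period):
        # Combine the live noise with the stored inverted noise read at this offset.
        mixed = [
            x + stored_inverted[(i + offset) % period]
            for i, x in enumerate(live_noise)
        ]
        level = rms(mixed)
        if level < best_level:
            best_offset, best_level = offset, level
    return best_offset


# One stored period (20 samples) of inverted noise, and live noise that
# starts 7 samples into its period.
stored_inverted = [-math.sin(2 * math.pi * k / 20) for k in range(20)]
live_noise = [math.sin(2 * math.pi * (i + 7) / 20) for i in range(60)]
held_phase = find_best_read_phase(live_noise, stored_inverted)
print(held_phase)
```

An exhaustive search is used for clarity; the specification only states that the phase is varied until the output level is lowest and that this phase is then held.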

[0034] Once it has held this phase, the noise control unit 306 sends a control signal to the output control unit 301 to prohibit the environmental noise from being input to the environmental noise extraction unit 302 again, and completes the scramble operation.

[0035] After the scramble operation is complete, when the voice control switch in the noise control unit 306 is returned to the call mode, the noise control unit 306 sends a control signal to the output control unit 301, causing it to supply the conference voice containing the environmental noise (shown in FIG. 2) to the + input of the voice synthesizing unit 305.

[0036] Meanwhile, the inverted noise signal (shown in FIG. 2) stored in the inverted signal storage unit 304 continues to be supplied to the − input of the voice synthesizing unit 305 via the inverted signal generation unit 303, at the phase previously held by the noise control unit 306.

[0037] The voice synthesizing unit 305 is a differential amplifier that amplifies only the difference between its two inputs (+ input and − input) with a predetermined gain.

[0038] As shown in FIG. 2, the environmental noise in the + input and the inverted noise signal at the − input have the same amplitude and phase but opposite polarity, so the voice synthesizing unit 305 cancels the environmental noise component and outputs only the conference voice.
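The cancellation itself can be checked numerically with a minimal Python sketch. It models only the net effect described in the text, namely that combining the noisy conference voice with an equal-amplitude, phase-aligned, polarity-inverted copy of the noise leaves the voice. The addition used in `synthesize`, the unity gain, and the test signals are assumptions; the electrical behaviour of the differential amplifier is not modelled.

```python
import math
from typing import List, Sequence


def synthesize(noisy_voice: Sequence[float], inverted_noise: Sequence[float]) -> List[float]:
    """Combine the noisy conference voice with the aligned inverted noise."""
    return [v + n for v, n in zip(noisy_voice, inverted_noise)]


samples = range(200)
voice = [0.5 * math.sin(2 * math.pi * 3 * i / 200) for i in samples]   # hypothetical speech
noise = [0.1 * math.sin(2 * math.pi * 10 * i / 200) for i in samples]  # hypothetical hum
noisy_voice = [v + n for v, n in zip(voice, noise)]
inverted_noise = [-n for n in noise]

output = synthesize(noisy_voice, inverted_noise)
residual = max(abs(o - v) for o, v in zip(output, voice))
print(residual < 1e-12)
```

In practice the residual depends on how well the stored noise still matches the live noise, which this idealized example does not capture.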

[0039] The output of the voice synthesizing unit 305 is input, as the voice of one conference room of the multipoint conference, to the multipoint conference voice synthesizer 1 (shown in FIG. 3) via the switching network 2 (shown in FIG. 3).

[0040] If the voice input device 3 for multipoint conferences shown in FIG. 1 is installed in each conference room in which attendees of the multipoint conference are present, only the conference voices from conference rooms 31, 32, ..., 3N are input to the N input terminals of the multipoint conference voice synthesizer 1. The level of the environmental noise EMN = (N − 1)n sent and distributed to any one conference room therefore becomes very small, because the level of the environmental noise n is almost zero.

[0041] The dominant factor in the environmental noise EMN is the level of the environmental noise n in the conference voice output from each conference room in which attendees are present, and this embodiment makes it possible to reduce EMN to an extremely low level.

[0042] Although voice synthesis is performed electrically in this embodiment, an equivalent effect is obtained with a configuration in which scrambling is applied acoustically in the same manner using a loudspeaker or the like, the acoustic signal is converted into an electric signal at the point where the acoustic level reaches its minimum and stored in the inverted signal storage unit, and the stored signal is then read out for voice synthesis.

[Effects of the Invention]

[0043] As described above, in a voice input device for multipoint conferences that holds a video or audio conference over a communication network among conference rooms at N locations, the present invention cancels the environmental noise component in each conference room. The level of the environmental noise EMN sent and distributed from the multipoint conference voice synthesizer to each conference room can thereby be reduced to almost zero, which significantly improves the intelligibility of the conference voice.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of the voice input device for multipoint conferences according to the present invention.

FIG. 2 is a waveform diagram showing the input and output waveforms of the voice synthesizing unit in FIG. 1.

FIG. 3 is a block diagram showing an example of the configuration of the voice synthesis portion of a general multipoint conference system.

[Explanation of symbols]

1 Multipoint conference voice synthesizer
2 Switching network
3 Voice input device for multipoint conferences
31, 32, ..., 3N Conference rooms
301 Output control unit
302 Environmental noise extraction unit
303 Inverted signal generation unit
304 Inverted signal storage unit
305 Voice synthesizing unit
306 Noise control unit

Claims

1. A voice input device for multipoint conferences that holds a video or audio conference over a communication network among conference rooms at three or more locations, characterized by comprising environmental noise canceling means for canceling the environmental noise component in each of the conference rooms.

2. The voice input device for multipoint conferences according to claim 1, wherein the environmental noise canceling means comprises: an environmental noise extraction unit that extracts the environmental noise in each of the conference rooms before the conference starts; an inverted signal generation unit that generates an inverted noise signal by inverting the polarity of the environmental noise signal extracted by the environmental noise extraction unit; an inverted signal storage unit that stores the inverted noise signal after it has been sampled at a predetermined period; a voice synthesizing unit that synthesizes the sampled inverted noise signal from the inverted signal storage unit with the conference voice containing the environmental noise; and a noise control unit that varies the phase of the sampled inverted noise signal from the inverted signal storage unit so that the output level of the voice synthesizing unit is minimized, holds the phase at the minimum point, and controls whether the conference voice containing the environmental noise is output to the noise extraction unit or to the voice synthesizing unit.

3. The voice input device for multipoint conferences according to claim 2, wherein the voice synthesizing unit takes the conference voice containing the environmental noise as a first input and the sampled inverted noise signal stored in the inverted signal storage unit as a second input, and combines the first and second inputs either electrically or acoustically.

4. The voice input device for multipoint conferences according to claim 2, wherein the voice synthesizing unit is a differential amplifier that extracts and amplifies only the difference signal between the first and second inputs.