JPH02230898A - Voice reproduction system

Info

Publication number: JPH02230898A
Application number: JP5147589A
Authority: JP
Inventors: Naofumi Inmaki; 印牧　直文; Toshiharu Tanabe; 田邊　敏晴
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1989-03-03
Filing date: 1989-03-03
Publication date: 1990-09-13
Anticipated expiration: 2011-08-07
Also published as: JP2523366B2

Abstract

PURPOSE: To allow only the parties concerned to hear the content of a conversation, by combining the speech in the frequency bands other than those that affect the intelligibility of the conversation with a masking sound and reproducing the combined sound through an omnidirectional speaker. CONSTITUTION: The audio signal of a monaural sound X (= A + B) supplied from input terminal 10 is divided into frequency bands A and B by frequency division circuit 11. For example, speech A in the characteristic frequency region of interest is reproduced by directional speaker 12, and speech B at the remaining frequencies is passed to masking sound addition section 25. Masking sound addition section 25 adds the 'masking sound C for A' to speech B, and the combined sound (B + C) is reproduced by omnidirectional speaker 13. Directional speaker 12 thus keeps the conversation from being heard (understood) by non-parties nearby while making it audible to party 15.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application] This invention relates to a voice reproduction system used, for example, in teleconferencing systems.

[Prior Art] In teleconferencing systems such as audio conferences and video conferences, the playback equipment is often used for long periods because of the nature of a conference. The first problem is that a handset or earphones give the listener a sense of weight and pressure. Alternatively, a loudspeaker can be used instead of a handset or earphones, but this introduces a drawback that was not an issue with handsets or earphones.

Specifically, when a teleconference is held in an environment where people unrelated to the conference (non-parties) are nearby, a second problem arises: the reproduced conference content can be overheard by those non-parties.

One way to address this is to give the loudspeaker narrow directivity. This relies on the fact that, if only the parties are inside the listening area narrowed by the directivity, non-parties unrelated to the conference have difficulty hearing it. With this conventional approach, however, directivity is hard to achieve in the low-frequency range, where the wavelength of sound is long; forcing it requires a larger speaker diameter and therefore larger equipment.

To solve this problem, a system has been proposed in which the frequency bands that affect the intelligibility of the conversation (for example, the formant frequency bands) are reproduced by a narrow-directivity speaker and the remaining frequency bands are reproduced by an omnidirectional speaker. This invention improves on that system and further reduces the intelligibility of the conversation for surrounding non-parties, that is, it prevents the conversation from leaking to the surroundings.

The object of this invention is to eliminate the above drawbacks of the prior art by providing a voice reproduction system in which the frequency bands that affect the intelligibility of the conversation are reproduced by a narrow-directivity speaker, and a masking sound that exploits the masking effect to reduce intelligibility for surrounding non-parties is reproduced by an omnidirectional speaker.
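As a rough illustration of why low-frequency directivity drives up speaker size, the wavelength of a tone is λ = c / f. The sketch below assumes a speed of sound of 343 m/s (air at about 20 °C); that value and the example frequencies are illustrative assumptions, not figures taken from the prior-art discussion.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value: air at about 20 degrees C


def wavelength_m(frequency_hz: float) -> float:
    """Return the wavelength in metres of a tone at frequency_hz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_SOUND_M_PER_S / frequency_hz


if __name__ == "__main__":
    # Illustrative frequencies spanning low to high speech content.
    for f in (100.0, 300.0, 2000.0, 8000.0):
        print(f"{f:7.0f} Hz -> {wavelength_m(f):.3f} m")
```

A directional radiator generally needs dimensions comparable to the wavelength it is meant to steer, so metre-scale wavelengths at the low end are consistent with the size problem described above.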

[Means for Solving the Problem] The principal feature of this invention is that the frequency band is divided in two according to how strongly each part affects the intelligibility of the conversation. Speech in the band that affects intelligibility (the directional frequency band) is reproduced by a narrow-directivity speaker. Speech in the band that does not affect intelligibility (the omnidirectional frequency band) is combined with a masking sound, which masks the speech reproduced by the narrow-directivity speaker, and the combined sound is reproduced by an omnidirectional speaker. Frequency bands that affect the intelligibility of the conversation include the formant bands, the component sounds that characterize the five Japanese vowels, and bands that allow an individual speaker to be identified.

[Embodiments] Fig. 1 shows a first system example illustrating the features of this invention. The audio signal of a monaural sound X (= A + B) supplied from input terminal 10 is divided into frequency bands A and B by frequency division circuit 11. For example, speech A at the formant frequencies of interest (including nearby frequencies) is reproduced by directional speaker 12, and speech B at the remaining frequencies is passed to masking sound addition section 25.
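The band division X = A + B can be sketched in software as follows. This is only a minimal illustration: the source does not say how frequency division circuit 11 is implemented, so the brick-wall DFT mask, the function names, the 8000 Hz sample rate, and the 300 to 2000 Hz band edges used in the example are all assumptions.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for short examples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spec):
    """Inverse DFT, returning only the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def split_bands(x, sample_rate_hz, low_hz, high_hz):
    """Split x into (a, b): a holds content in [low_hz, high_hz], b = x - a."""
    n = len(x)
    kept = []
    for k, value in enumerate(dft(x)):
        # Bin k and bin n - k carry the same physical frequency for real input.
        freq = min(k, n - k) * sample_rate_hz / n
        kept.append(value if low_hz <= freq <= high_hz else 0j)
    a = idft_real(kept)
    # Defining b as the remainder guarantees a + b reconstructs x.
    b = [xi - ai for xi, ai in zip(x, a)]
    return a, b


if __name__ == "__main__":
    fs, n = 8000, 80
    x = [math.sin(2 * math.pi * 200 * t / fs) + math.sin(2 * math.pi * 1000 * t / fs)
         for t in range(n)]
    a, b = split_bands(x, fs, 300.0, 2000.0)
    print(max(abs(ai + bi - xi) for ai, bi, xi in zip(a, b, x)))
```

Computing B as the remainder X - A, rather than with a second independent filter, keeps the two outputs exactly complementary, which matches the relation X = A + B stated for Fig. 1.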

Masking sound addition section 25 adds the "masking sound C for A", supplied from masking sound input terminal 15, to speech B, and the combined sound (B + C) is reproduced by omnidirectional speaker 13. Masking sound C is, for example, another voice similar to speech A.
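The addition performed by masking sound addition section 25 amounts to a sample-by-sample sum. The sketch below is a hedged illustration: the function name and the `masking_gain` parameter are hypothetical, introduced only to show how the level of C could be scaled before it is added to B.

```python
def add_masking_sound(b, c, masking_gain=1.0):
    """Return the combined signal B + gain * C, sample by sample.

    b: samples of the omnidirectional-band speech B
    c: samples of the masking sound C (same length as b)
    masking_gain: hypothetical scale factor for the level of C
    """
    if len(b) != len(c):
        raise ValueError("b and c must have the same number of samples")
    return [bi + masking_gain * ci for bi, ci in zip(b, c)]


if __name__ == "__main__":
    print(add_masking_sound([0.1, -0.2], [0.5, 0.5], masking_gain=0.5))
```

With `masking_gain=0.0` the output is simply B, which corresponds to the earlier proposal without masking.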

Directional speaker 12 serves to make the conversation audible to party 15 while keeping it inaudible (unintelligible) to surrounding non-parties 14, and omnidirectional speaker 13 serves to add a sense of spaciousness and fullness to the sound. That is, party 15 hears sound X + C (= A + B + C) reproduced by speakers 12 and 13, while non-party 14 hears sound B + C reproduced by speaker 13. By controlling the volume of masking sound C appropriately, the intelligibility of party 15's conversation for non-party 14 can be reduced.

Fig. 2 shows a second system example illustrating the features of this invention. For the stereo audio signal (L signal and R signal) supplied from input terminal 10, the L signal and the R signal are each divided, following the reproduction operation described for Fig. 1, into speech in, for example, the formant frequency band and speech in the remaining frequency bands. The former is reproduced by L-signal directional speaker 12L and R-signal directional speaker 12R, respectively, and the latter (L signal and R signal) is passed to masking sound addition section 25. The masking sound addition section combines the stereo masking sound supplied from masking sound input terminal 15 with the stereo sound supplied from frequency division circuit 11, separately for the corresponding L and R signals, and the combined sounds are reproduced by L-signal omnidirectional speaker 13L and R-signal omnidirectional speaker 13R, respectively. Party 15 can thus hear the conversation with sound image localization.

In the prior art, directivity was applied to the entire frequency band, so speakers tended to become large. The major difference in this invention is that directivity is increased only for the frequency bands that affect the intelligibility of the conversation, and the masking effect is additionally used to enhance the effect on intelligibility.

Fig. 3 is a block diagram showing the configuration of an embodiment of this invention. On a command from control section 24, frequency band setting section 23 sends directional band setting data and omnidirectional band setting data, predetermined with male voices, female voices, conversational sound effects, and the like taken into account, to directional band extraction and reproduction section 21 and omnidirectional band extraction and reproduction section 22, respectively. Directional band extraction and reproduction section 21 performs initial setup based on the directional band setting data and, once setup is complete, sends a completion notice to frequency band setting section 23. At the same time, omnidirectional band extraction and reproduction section 22 performs initial setup based on the omnidirectional band setting data and, once setup is complete, sends a completion notice to frequency band setting section 23. After receiving the completion notices from both sections 21 and 22, frequency band setting section 23 issues a start command to sections 21 and 22. After that command, directional band extraction and reproduction section 21 extracts only the initially configured speech band from the audio signal supplied from input terminal 10 and reproduces that speech through directional speaker 12. Omnidirectional band extraction and reproduction section 22, for its part, extracts only the initially configured speech and passes it to masking sound addition section 25. Masking sound addition section 25 combines the speech passed from section 22 with the masking sound supplied from masking sound input terminal 15 and reproduces the combined sound through omnidirectional speaker 13.

[Effects of the Invention] As described above, with the voice reproduction system of this invention, speech in, for example, the formant frequency bands (including nearby frequencies) that affect the intelligibility of the conversation is reproduced through a directional speaker, while speech in the remaining frequency bands is combined with a masking sound and the combined sound is reproduced through an omnidirectional speaker. The listener is therefore hands-free, and thanks to the masking effect the conversation is heard only by the parties and not by surrounding non-parties. Furthermore, directivity is applied not to all frequencies in the human vocal range of 100 Hz to 8000 Hz but only to some frequency bands within the midrange of 300 Hz to 2000 Hz, so the speaker (speaker diameter) can be made smaller and less costly. Using an omnidirectional speaker also has the advantage of adding a sense of spaciousness and fullness to the sound.
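The start-up sequence described for Fig. 3 (setting data sent out, completion notices returned, start command issued only after both notices arrive) can be sketched as below. The class and function names are hypothetical, and the band values in the example are illustrative choices within the 100 Hz to 8000 Hz range mentioned in the text, not settings given by the source.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Band = Tuple[float, float]  # (low_hz, high_hz)


@dataclass
class BandExtractionSection:
    """Stands in for extraction and reproduction section 21 or 22."""
    name: str
    bands_hz: List[Band] = field(default_factory=list)
    running: bool = False

    def initialize(self, bands_hz: List[Band]) -> bool:
        """Store the band setting data; the return value acts as the completion notice."""
        self.bands_hz = list(bands_hz)
        return True

    def start(self) -> None:
        """Handle the start command; refuse to run before initial setup."""
        if not self.bands_hz:
            raise RuntimeError(f"{self.name} started before initial setup")
        self.running = True


def run_setup(section_21, section_22, directional_bands, omnidirectional_bands) -> bool:
    """Play the role of frequency band setting section 23."""
    # The text says both sections initialize at the same time;
    # this sketch simply calls them one after the other.
    notices = [
        section_21.initialize(directional_bands),
        section_22.initialize(omnidirectional_bands),
    ]
    if all(notices):
        # Both completion notices received: issue the start command.
        section_21.start()
        section_22.start()
        return True
    return False


if __name__ == "__main__":
    s21 = BandExtractionSection("section 21")
    s22 = BandExtractionSection("section 22")
    print(run_setup(s21, s22, [(300.0, 2000.0)], [(100.0, 300.0), (2000.0, 8000.0)]))
```

Waiting for both notices before starting ensures that neither speaker begins reproducing sound while the other path is still unconfigured.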

[Brief Description of the Drawings]

Fig. 1 is a block diagram showing a first system example illustrating the features of this invention, Fig. 2 is a block diagram showing a second system example illustrating the features of this invention, and Fig. 3 is a block diagram showing the configuration of an embodiment of this invention.

Claims

[Claims]

(1) A voice reproduction system having a directional speaker with directivity and an omnidirectional speaker without directivity, comprising: means for setting a directional frequency band, to which directivity is given when an input audio signal is reproduced, and an omnidirectional frequency band, to which no directivity is given; means for extracting the audio signal in the directional frequency band from the input audio signal and reproducing it using the directional speaker; means for extracting the audio signal in the omnidirectional frequency band from the input audio signal; means for adding, to the audio signal in the omnidirectional frequency band, an audio signal for masking the audio signal in the omnidirectional frequency band; and means for reproducing the result using the omnidirectional speaker.