JP6866764B2

JP6866764B2 - Speech processing system and speech processor

Info

Publication number: JP6866764B2
Application number: JP2017101256A
Authority: JP
Inventors: 本地 由和
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2017-05-22
Filing date: 2017-05-22
Publication date: 2021-04-28
Anticipated expiration: 2037-05-22
Also published as: JP2018197770A

Description

The present invention relates to a voice processing system and a voice processing device.

In recent years, it has become common to divide an indoor space such as an office with partitions to create meeting booths for meetings, business negotiations, and the like. Providing meeting booths with partitions in this way makes efficient use of the indoor space and adds liveliness to the space as a whole.

From the viewpoint of using the indoor space efficiently and keeping it lively, such meeting booths should be located close to the office work area or similar spaces. From the same viewpoint, a sufficient number of meeting booths is also desired, and each booth is therefore being made smaller.

However, when a meeting booth is made smaller, conversation inside the booth becomes more likely to leak outside it. For this reason, a technique has been put into practical use in which a speaker is provided on the partition and emits a masking sound so that conversation inside the meeting booth cannot be understood from outside (see Japanese Unexamined Patent Application Publication No. 2012-82585).

Japanese Unexamined Patent Application Publication No. 2012-82585

A masking sound such as the one described in the above publication is usually unrelated to the conversation in the meeting booth. Such a masking sound must conceal the content of the conversation in the booth while being kept at a volume that does not reduce the work efficiency of people outside the booth. However, the volume that people outside the booth find unobtrusive depends on the volume of the conversation leaking from the booth, the type of masking sound, and other factors, so it is difficult to control.

In view of this problem, an object of the present invention is to provide a voice processing system and a voice processing device that make the content of conversation inside a booth difficult to convey outside the booth while keeping people outside the booth from perceiving noise.

The present invention, made to solve the above problem, is a voice processing system used in an indoor space that includes a main space and one or more booths separated from the main space by partitions. The system includes a first indistinct sound generation mechanism that generates an indistinct sound from sound emitted in the one or more booths, and a sound emitting mechanism that emits, into the main space, an obscured sound based on the indistinct sound generated by the first indistinct sound generation mechanism.

The first indistinct sound generation mechanism preferably has means for reducing the high frequency component of the sound emitted in the booth and means for adding reverberation to the sound after the high frequency component has been reduced.

The system preferably further includes a second indistinct sound generation mechanism that generates an indistinct sound from sound emitted outside the booth, with the sound emitting mechanism emitting into the main space an obscured sound based on the indistinct sound generated by the second indistinct sound generation mechanism.

The system preferably further includes a mixing mechanism that mixes the sound obtained by the first indistinct sound generation mechanism with the sound obtained by the second indistinct sound generation mechanism, with the sound emitting mechanism emitting the obscured sound mixed by the mixing mechanism.

The present invention, made to solve the above problem, is also a voice processing device used in an indoor space that includes a main space and one or more booths separated from the main space by partitions. The device includes a microphone that collects sound in the one or more booths, an obscuring processing unit that applies obscuring processing to the sound collected by the microphone, and a speaker that emits, in the main space, an obscured sound based on the indistinct sound generated by the obscuring processing unit.

In the present invention, "indistinct sound" means a sound in which the original sound has been made hard to understand by applying audio processing to it. "High frequency component" means sound at or above a predetermined frequency, for example sound with a frequency of 500 Hz or higher. "Reverberation" means the group of reflected sounds that cannot be separated into individual reflections, that is, the reflected sounds that arrive after the early reflections.

The voice processing system according to the present invention generates an indistinct sound from sound emitted in one or more booths and emits an obscured sound based on this indistinct sound into the main space, so it can make the content of conversation in the booth difficult to convey outside the booth. Because the obscured sound is derived from the sound emitted in the booth, its sound pressure level is easy to match to that of the sound emitted in the booth. As a result, the attention of people outside the booth is less likely to be drawn to the obscured sound, and they are less likely to perceive it as noise.

The voice processing device according to the present invention is configured to emit into the main space an obscured sound based on the indistinct sound that the obscuring processing unit generates from the sound collected by the microphone, so it can make the content of conversation in the booth difficult to convey outside the booth. Because the obscured sound is derived from the sound emitted in the booth, its sound pressure level is easy to match to that of the sound emitted in the booth. As a result, the attention of people outside the booth is less likely to be drawn to the obscured sound, and they are less likely to perceive it as noise.

FIG. 1 is a schematic diagram showing the indoor space in which a voice processing system according to one embodiment of the present invention is used.
FIG. 2 is a schematic diagram showing the voice processing system in the indoor space of FIG. 1.
FIG. 3 is a diagram showing details of the indistinct sound generation mechanism of the voice processing system of FIG. 2.
FIG. 4 is a schematic diagram showing a voice processing system according to an embodiment different from that of FIG. 2.
FIG. 5 is a schematic diagram showing a voice processing system according to an embodiment different from those of FIGS. 2 and 4.
FIG. 6 is a schematic diagram showing a voice processing system according to an embodiment different from those of FIGS. 2, 4, and 5.

Embodiments of the present invention are described in detail below, with reference to the drawings as appropriate.

[First Embodiment]
<Voice processing system>
A voice processing system according to the first embodiment of the present invention is described with reference to FIGS. 1 to 3. As shown in FIG. 1, the voice processing system is used in an indoor space that includes a main space X1 and a plurality of booths Y1 separated from the main space X1 by partitions P1. The indoor space is not particularly limited; one example is an office space. When the indoor space is an office space, the main space X1 is typically the office work area and each booth Y1 is a meeting booth for meetings, business negotiations, and the like. In the present embodiment, three booths Y1 are provided at one end of the indoor space. The specific structure of the partition P1 is not particularly limited; it may be movable, or it may be fixed to a wall that defines the indoor space. However, because the voice processing system is suited to cases where conversation inside a booth Y1 can be overheard in the main space X1, a preferred partition P1 is, for example, of a type that leaves a gap between the partition and the ceiling.

In the present embodiment, each booth Y1 is provided with a table A1 and a plurality of chairs B1. The table A1 has, for example, a top plate that is substantially rectangular in plan view. A pair of chairs B1 is arranged near each of a pair of opposing side edges of the top plate. In the present embodiment, one party to a business negotiation, for example, sits on the pair of chairs B1 near one side edge of the top plate, and the other party sits on the pair of chairs B1 near the opposite side edge.

As shown in FIG. 2, the voice processing system includes a first indistinct sound generation mechanism 1 that generates an indistinct sound from sound emitted in each booth Y1, and a sound emitting mechanism 2 that emits, into the main space X1, an obscured sound based on the indistinct sound generated by the first indistinct sound generation mechanism 1. Although FIG. 2 shows only one booth Y1, the voice processing system provides the same first indistinct sound generation mechanism 1 and sound emitting mechanism 2 for each of the other booths Y1.

(Indistinct sound generation mechanism)
As shown in FIG. 3, the first indistinct sound generation mechanism 1 has a sound collecting means 3, a high frequency component reducing means 4, and a reverberation adding means 5. The first indistinct sound generation mechanism 1 may further include a signal dividing means 6 that divides the audio signal collected by the sound collecting means 3 into predetermined time units, a signal processing means 7 that processes the divided audio signal along the time axis, and a synchronization means 8 that synchronizes the timing at which the conversation in each booth Y1 leaks into the main space X1 with the timing at which the obscured sound is emitted into the main space X1. In the first indistinct sound generation mechanism 1, the high frequency component reducing means 4, the reverberation adding means 5, the signal dividing means 6, the signal processing means 7, and the synchronization means 8 together constitute an obscuring processing means that applies obscuring processing to the sound collected by the sound collecting means 3.

<Sound collecting means>
The sound collecting means 3 collects the sound in each booth Y1. Specifically, it collects conversation, such as business negotiations, taking place in each booth Y1. The sound collecting means 3 consists of microphones 11a and 11b. In the present embodiment, the microphones 11a and 11b are provided on the top plate of the table A1, one near the edge on the side where one party sits and the other near the edge on the side where the other party sits. With this arrangement, the sound collecting means 3 can collect the conversation easily and reliably, and earlier in time than the conversation in each booth Y1 leaks out of the booth Y1. The audio signal (analog signal) collected by the sound collecting means 3 may be converted into a digital signal, as needed, by an A/D conversion means (not shown) consisting of an A/D conversion unit or the like.

<Signal dividing means>
The signal dividing means 6 divides the audio signal collected by the sound collecting means 3 into short time units of, for example, about 1 second to 5 seconds. The signal dividing means 6 consists of a signal dividing unit that includes a CPU (Central Processing Unit), ROM (Read Only Memory), RAM (Random Access Memory), and the like. The signal dividing means 6 may divide the audio signal at equal time intervals or at random time intervals. When the audio signal is divided in these time units, the content of the first few seconds of conversation in each booth Y1 may leak to the outside. However, dividing the audio signal in these time units makes it easy to preserve the spectral characteristics of the conversational speech. As a result, the content of the conversation as a whole can be made more difficult to convey outside each booth Y1. In addition, with this configuration the attention of people in the main space X1 is less likely to be drawn to the obscured sound emitted by the sound emitting mechanism 2, so they are less likely to perceive it as noise.
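
The division step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `split_signal`, its parameters, and the representation of the signal as a plain list of samples are hypothetical and do not come from the patent, which specifies only the 1 to 5 second range and the choice between equal and random intervals.

```python
import random


def split_signal(samples, sample_rate, fixed_sec=2.0, random_range=None, seed=None):
    """Split a sample list into consecutive segments.

    If random_range is None, every segment lasts fixed_sec seconds.
    If random_range is (low_sec, high_sec), each segment length is drawn
    uniformly from that range. The final segment may be shorter.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    segments = []
    start = 0
    while start < len(samples):
        if random_range is None:
            duration = fixed_sec
        else:
            duration = rng.uniform(random_range[0], random_range[1])
        length = max(1, int(duration * sample_rate))
        segments.append(samples[start:start + length])
        start += length
    return segments
```

For example, `split_signal(samples, 16000, random_range=(1.0, 5.0))` would cut a 16 kHz recording into pieces of random length within the range given in the text.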

<Signal processing means>
The signal processing means 7 takes some or all of the audio signals divided by the signal dividing means 6, reverses them along the time axis or rotates them about the time axis, and then crossfades them. The signal processing means 7 consists of a signal processing unit that includes a CPU, ROM, RAM, and the like. The signal processing means 7 preferably applies pitch conversion to the divided audio signals before reversing them along the time axis or rotating them about the time axis.
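
As a rough illustration of the time-axis processing, the sketch below reverses each segment in time and joins neighbouring segments with a linear crossfade. It covers only the time-reversal variant; the rotation about the time axis and the pitch conversion are not shown. The function names, the linear fade shape, and the `fade_len` parameter are assumptions made for this example, not details given in the patent.

```python
def reverse_segments(segments):
    """Reverse each segment along the time axis."""
    return [list(reversed(seg)) for seg in segments]


def crossfade_join(segments, fade_len):
    """Join segments, overlapping neighbours by up to fade_len samples.

    Within the overlap the outgoing segment fades out linearly while the
    incoming segment fades in, so the two weights always sum to 1.
    """
    if not segments:
        return []
    out = list(segments[0])
    for seg in segments[1:]:
        n = min(fade_len, len(out), len(seg))
        base = len(out) - n  # index where the overlap starts
        for i in range(n):
            w = (i + 1) / (n + 1)  # fade-in weight of the incoming segment
            out[base + i] = out[base + i] * (1.0 - w) + seg[i] * w
        out.extend(seg[n:])
    return out
```

A typical use would be `crossfade_join(reverse_segments(segments), fade_len)`, with `fade_len` chosen much shorter than a segment.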

<High frequency component reducing means>
The high frequency component reducing means 4 reduces the high frequency component of the sound emitted in each booth Y1. It consists of, for example, a low-pass filter, and reduces high frequency components of, for example, 500 Hz or higher. The voice processing system does not necessarily need to have the signal dividing means 6 and the signal processing means 7 described above. When the system has neither, the high frequency component reducing means 4 reduces the high frequency component of the audio signal (analog signal) collected by the sound collecting means 3, or of that signal after conversion to a digital signal as needed. When the system has both, the high frequency component reducing means 4 reduces the high frequency component of the audio signal processed by the signal processing means 7.
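
The patent says only that a low-pass filter is used and gives 500 Hz as an example boundary; it does not specify the filter type. As one possible illustration, the sketch below uses a first-order (one-pole) digital low-pass filter. The function name, the choice of a one-pole design, and the default cutoff are assumptions for this example.

```python
import math


def one_pole_lowpass(samples, sample_rate, cutoff_hz=500.0):
    """Attenuate content above cutoff_hz with a first-order low-pass filter.

    Implements y[n] = y[n-1] + alpha * (x[n] - y[n-1]), where alpha is
    derived from the RC time constant matching cutoff_hz.
    """
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out
```

A first-order filter rolls off gently (about 6 dB per octave), so a practical system would probably use a steeper design; the one-pole form is chosen here only to keep the sketch short.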

<Reverberation adding means>
The reverberation adding means 5 adds reverberation to the sound whose high frequency component has been reduced by the high frequency component reducing means 4. It consists of a reverberation adding unit that includes a CPU, ROM, RAM, and the like. For example, the reverberation adding means 5 adds reverberation with a high early reflection density and a short reverberation time to the audio signal whose high frequency component has been reduced. Specifically, it applies a plurality of delays to that audio signal and adds the delayed signals together, thereby adding a plurality of reflected sounds. Adding reverberation with a high early reflection density and a short reverberation time makes it possible to control the tone and volume of the sound generated by the first indistinct sound generation mechanism 1 easily and reliably while controlling the overall energy of that sound. Here, "early reflection density" means the number of reflected sounds per unit time.
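
The delay-and-add operation described above can be sketched as a multi-tap delay. The function name, the `taps` format of `(delay_seconds, gain)` pairs, and any concrete tap values are assumptions for illustration; the patent does not give delay times or gains.

```python
def add_reflections(samples, sample_rate, taps):
    """Add delayed, scaled copies of the signal to itself.

    taps is a list of (delay_seconds, gain) pairs, one per reflection.
    The output is extended by the longest delay so no reflection is cut off.
    """
    offsets = [int(round(delay * sample_rate)) for delay, _ in taps]
    out = [float(x) for x in samples] + [0.0] * max(offsets, default=0)
    for offset, (_, gain) in zip(offsets, taps):
        for i, x in enumerate(samples):
            out[i + offset] += gain * x
    return out
```

Many closely spaced taps with quickly decaying gains, for instance `[(0.005 * k, 0.6 ** k) for k in range(1, 9)]`, would loosely correspond to a high reflection density with a short decay; these numbers are purely illustrative.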

<Synchronization means>
The synchronization means 8 adjusts the timing at which the audio signal is output from the first indistinct sound generation mechanism 1 to the sound emitting mechanism 2, so that the timing at which the obscured sound is emitted from the sound emitting mechanism 2 (described later) coincides with the timing at which the conversation in each booth Y1 leaks into the main space X1. The synchronization means 8 consists of a synchronization unit that includes a CPU, ROM, RAM, and the like. For example, it delays the output of the audio signal from the first indistinct sound generation mechanism 1 to the sound emitting mechanism 2 by an amount corresponding to the time the conversation in each booth Y1 takes to propagate to the main space X1, calculated from, among other things, the expected distance between a talker in each booth Y1 and the main space X1. The voice processing system does not necessarily need the synchronization means 8 if there is no need to align the timing at which the obscured sound is emitted with the timing at which the conversation leaks into the main space X1.
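
A minimal sketch of the delay calculation follows. The patent states only that the delay corresponds to the propagation time estimated from the talker-to-main-space distance; the speed-of-sound constant, the optional subtraction of processing latency, and all names below are assumptions added for this example.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value for air at about 20 degrees C


def propagation_delay_samples(distance_m, sample_rate, processing_latency_s=0.0):
    """Number of samples to hold back the obscured sound.

    The acoustic travel time over distance_m is reduced by any latency the
    processing chain has already introduced (an assumption of this sketch),
    and is never allowed to go negative.
    """
    acoustic_delay_s = distance_m / SPEED_OF_SOUND_M_PER_S
    remaining_s = max(0.0, acoustic_delay_s - processing_latency_s)
    return int(round(remaining_s * sample_rate))


def delay_signal(samples, delay_samples):
    """Prepend silence so the signal starts delay_samples later."""
    return [0.0] * delay_samples + list(samples)
```

For a talker about 3.43 m from the main space and a 48 kHz sample rate, the travel time is roughly 10 ms, which this sketch converts to a delay of 480 samples.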

Because the first indistinct sound generation mechanism 1 has the high frequency component reducing means 4 and the reverberation adding means 5, the voice processing system can generate an indistinct sound easily and reliably. In particular, with these two means the system can emit a meaningless sequence of words that differs from the conversation spoken in each booth Y1 while suppressing changes in sound pressure level, so a listener in the main space X1 who hears the obscured sound based on the indistinct sound generated by the first indistinct sound generation mechanism 1 is unlikely to perceive it as noise.

When the first indistinct sound generation mechanism 1 has the signal dividing means 6 and the signal processing means 7, the voice processing system can easily and reliably reduce the electrical-signal correlation between the conversation in each booth Y1 and the indistinct sound generated by the first indistinct sound generation mechanism 1, which makes it easier to suppress unpleasant sounds such as howling (acoustic feedback). This configuration also tends to enhance the impression of liveliness in the main space X1, which makes the attention of people in the main space X1 even less likely to be drawn to the conversation in each booth Y1.

When the first indistinct sound generation mechanism 1 has the synchronization means 8, the voice processing system can emit the obscured sound only when a talker speaks in a booth Y1, timed to that speech. This reliably suppresses leakage of the conversation in each booth Y1 into the main space X1 and more precisely suppresses any sense of noise caused by the obscured sound.

(Sound emitting mechanism)
The sound emitting mechanism 2 consists of a speaker. The speaker is attached, for example, to the main space X1 side of the partition P1 that separates each booth Y1 from the main space X1. The sound emitting mechanism 2 emits into the main space X1, as the obscured sound, the indistinct sound generated by the first indistinct sound generation mechanism 1 and converted into an analog signal, as needed, by a D/A conversion means (not shown) consisting of a D/A conversion unit.

<Voice processing device>
Next, the voice processing device that constitutes the hardware of the voice processing system is described. The voice processing device is used in an indoor space that includes a main space X1 and a plurality of booths Y1 separated from the main space X1 by partitions P1. The device includes microphones 11a and 11b that collect the sound in each booth Y1, an obscuring processing unit 12 that applies obscuring processing to the sound collected by the microphones 11a and 11b, and a speaker (the speaker constituting the sound emitting mechanism 2 described above) that emits, in the main space X1, an obscured sound based on the indistinct sound generated by the obscuring processing unit 12. The obscuring processing unit 12 has the low-pass filter and the reverberation adding unit described above. It may further have the signal dividing unit, the signal processing unit, and the synchronization unit described above.

<Advantages>
The voice processing system generates an indistinct sound from the sounds emitted in the plurality of booths Y1 and emits an obscured sound based on this indistinct sound into the main space X1, so it can make the content of conversation in a booth Y1 difficult to convey outside the booth Y1. Because the obscured sound is derived from the sound emitted in the booth Y1, its sound pressure level is easy to match to that of the sound emitted in the booth Y1. As a result, the attention of people outside the booth Y1 is less likely to be drawn to the obscured sound, and they are less likely to perceive it as noise.

Because the voice processing system generates the obscured sound from the speech of the conversation in each booth Y1, emitting the obscured sound is unlikely to seem unnatural. The system can therefore readily make the obscured sound serve simply as background sound that creates the atmosphere of the indoor space.

The voice processing device is configured to emit into the main space X1 an obscured sound based on the indistinct sound that the obscuring processing unit 12 generates from the sound collected by the microphones 11a and 11b, so it can make the content of conversation in each booth Y1 difficult to convey outside the booth Y1. Because the obscured sound is derived from the sound emitted in the booth Y1, its sound pressure level is easy to match to that of the sound emitted in the booth Y1. As a result, the attention of people outside the booth Y1 is less likely to be drawn to the obscured sound, and they are less likely to perceive it as noise.

[Second Embodiment]
Next, a voice processing system according to the second embodiment of the present invention is described with reference to FIG. 4. The voice processing system is used in an indoor space that includes a main space X2 and one or more booths Y2 separated from the main space X2 by a partition P2. The indoor space is not particularly limited; as in the first embodiment, one example is an office space. When the indoor space is an office space, the main space X2 is typically the office work area and the booth Y2 is a meeting booth for meetings, business negotiations, and the like. The voice processing system is configured as a remote conference system that enables a remote conference between the booth Y2 and another booth Y3 provided in a separate indoor space. (For convenience, the booth Y2 separated from the main space X2 by the partition P2 is referred to below as the "first booth Y2", and the other booth Y3 provided in a separate indoor space is referred to as the "second booth Y3".) The specific configuration of the partition P2 is not particularly limited, and a partition similar to the partition P1 in the voice processing system of FIGS. 1 to 3 can be used.

本実施形態において、第１ブースＹ２には、テーブルＡ２と、複数の椅子Ｂ２とが備えられている。また、第１ブースＹ２及び第２ブースＹ３には、遠隔会議端末２３ａ，２３ｂが配置されている。第１ブースＹ２において、遠隔会議端末２３ａは、テーブルＡ２上に載置されている。遠隔会議端末２３ａ，２３ｂは、スピーカ及びマイクを有しており、一方の遠隔会議端末２３ａ，２３ｂのマイクで集音した音を他方の遠隔会議端末２３ａ，２３ｂのスピーカから放音可能に構成されている。 In the present embodiment, the first booth Y2 is provided with a table A2 and a plurality of chairs B2. Remote conference terminals 23a and 23b are arranged in the first booth Y2 and the second booth Y3. In the first booth Y2, the remote conference terminal 23a is placed on the table A2. Each of the remote conference terminals 23a and 23b has a speaker and a microphone, and the terminals are configured so that sound collected by the microphone of one of them can be emitted from the speaker of the other.

当該音声処理システムは、第１ブースＹ２内で発せられる音から不明瞭音を生成する第１の不明瞭音生成機構２１と、第１ブースＹ２外で発せられる音から不明瞭音を生成する第２の不明瞭音生成機構２２と、第１の不明瞭音生成機構２１で生成された不明瞭音に基づく不明瞭化音声及び第２の不明瞭音生成機構２２で生成された不明瞭音に基づく不明瞭化音声をメイン空間Ｘ２で放音する放音機構２４とを備える。さらに、当該音声処理システムは、第１の不明瞭音生成機構２１で得られた音及び第２の不明瞭音生成機構２２で得られた音をミキシングするミキシング機構２５を備えており、放音機構２４がミキシング機構２５でミキシングされた不明瞭化音声を放音するよう構成されている。 The voice processing system includes a first unclear sound generation mechanism 21 that generates an unclear sound from sound emitted in the first booth Y2, a second unclear sound generation mechanism 22 that generates an unclear sound from sound emitted outside the first booth Y2, and a sound emitting mechanism 24 that emits, in the main space X2, an obscured sound based on the unclear sound generated by the first unclear sound generation mechanism 21 and an obscured sound based on the unclear sound generated by the second unclear sound generation mechanism 22. The voice processing system further includes a mixing mechanism 25 that mixes the sound obtained by the first unclear sound generation mechanism 21 and the sound obtained by the second unclear sound generation mechanism 22, and the sound emitting mechanism 24 is configured to emit the obscured sound mixed by the mixing mechanism 25.

（不明瞭化音生成機構） (Unclear sound generation mechanism)
第１の不明瞭音生成機構２１は、集音手段が遠隔会議端末２３ａのマイクによって構成される以外、図１乃至図３の音声処理システムの第１の不明瞭音生成機構１と同様の構成とすることができる。 The first unclear sound generation mechanism 21 can have the same configuration as the first unclear sound generation mechanism 1 of the voice processing system of FIGS. 1 to 3, except that the sound collecting means is constituted by the microphone of the remote conference terminal 23a.

第２の不明瞭音生成機構２２は、第２ブースＹ３内で発せられる音から不明瞭音を生成する。第２の不明瞭音生成機構２２は、集音手段が遠隔会議端末２３ｂのマイクによって構成される以外、図１乃至図３の音声処理システムの第１の不明瞭音生成機構１と同様の構成とすることができる。 The second unclear sound generation mechanism 22 generates an unclear sound from the sound emitted in the second booth Y3. The second unclear sound generation mechanism 22 can have the same configuration as the first unclear sound generation mechanism 1 of the voice processing system of FIGS. 1 to 3, except that the sound collecting means is constituted by the microphone of the remote conference terminal 23b.

（ミキシング機構） (Mixing mechanism)
ミキシング機構２５は、ＣＰＵ、ＲＯＭ、ＲＡＭ等を含むミキシング装置によって構成される。ミキシング機構２５による具体的なミキシング手法は特に限定されるものではなく、公知の手法を採用可能である。ミキシング機構２５は、第１の不明瞭音生成機構２１で得られた音及び第２の不明瞭音生成機構２２で得られた音をミキシングするよう構成される限り、例えば第１の不明瞭音生成機構２１のマイク及び第２の不明瞭音生成機構２２のマイクによって集音された音をミキシングしてもよく、第１の不明瞭音生成機構２１で生成された不明瞭音及び第２の不明瞭音生成機構２２で生成された不明瞭音をミキシングしてもよい。但し、ミキシング機構２５は、不明瞭化の程度及びミキシングレベルを独立して制御しやすい点から、第１の不明瞭音生成機構２１で生成された不明瞭音及び第２の不明瞭音生成機構２２で生成された不明瞭音をミキシングすることが好ましい。また、ミキシング機構２５が第１の不明瞭音生成機構２１のマイク及び第２の不明瞭音生成機構２２のマイクによって集音された音をミキシングする場合、ミキシング機構２５によってミキシングされた音に対して、前述の高周波成分低減手段４、残響付加手段５、信号分割手段６、信号処理手段７、同期手段８等によって不明瞭化処理を施せばよい。さらに、ミキシング機構２５が第１の不明瞭音生成機構２１のマイク及び第２の不明瞭音生成機構２２のマイクによって集音された音をミキシングする場合、第１の不明瞭音生成機構２１及び第２の不明瞭音生成機構２２が前述の不明瞭化処理手段をそれぞれ有する必要はなく、第１の不明瞭音生成機構２１及び第２の不明瞭音生成機構２２が共通の１つの不明瞭化処理手段を有していればよい。 The mixing mechanism 25 is constituted by a mixing device including a CPU, a ROM, a RAM, and the like. The specific mixing method used by the mixing mechanism 25 is not particularly limited, and a known method can be adopted. As long as the mixing mechanism 25 is configured to mix the sound obtained by the first unclear sound generation mechanism 21 and the sound obtained by the second unclear sound generation mechanism 22, it may, for example, mix the sounds collected by the microphone of the first unclear sound generation mechanism 21 and the microphone of the second unclear sound generation mechanism 22, or it may mix the unclear sound generated by the first unclear sound generation mechanism 21 and the unclear sound generated by the second unclear sound generation mechanism 22. However, because the degree of obscuring and the mixing level can then be easily controlled independently, the mixing mechanism 25 preferably mixes the unclear sound generated by the first unclear sound generation mechanism 21 and the unclear sound generated by the second unclear sound generation mechanism 22. When the mixing mechanism 25 mixes the sounds collected by the microphone of the first unclear sound generation mechanism 21 and the microphone of the second unclear sound generation mechanism 22, the mixed sound may be subjected to the obscuring process by the above-mentioned high frequency component reducing means 4, reverberation adding means 5, signal dividing means 6, signal processing means 7, synchronization means 8, and the like. Further, in that case, the first unclear sound generation mechanism 21 and the second unclear sound generation mechanism 22 need not each have the above-mentioned obscuring processing means; it suffices for them to share a single common obscuring processing means.
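The two mixing orders described above can be sketched as follows. This is a minimal illustration only, not the mixing device of the embodiment: the function names, the gain parameters, the clipping range, and the use of plain Python lists of samples are all assumptions made for this sketch, and the obscuring processes are passed in as placeholder callables.

```python
def mix(first, second, gain_first=1.0, gain_second=1.0):
    """Sum two sample streams with independent gains, clipped to [-1.0, 1.0]."""
    length = max(len(first), len(second))
    mixed = []
    for i in range(length):
        a = first[i] if i < len(first) else 0.0    # pad the shorter stream with silence
        b = second[i] if i < len(second) else 0.0
        total = gain_first * a + gain_second * b
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed


def obscure_then_mix(first, second, obscure_first, obscure_second,
                     gain_first=1.0, gain_second=1.0):
    """Order the text prefers: each stream is obscured on its own, then mixed."""
    return mix(obscure_first(first), obscure_second(second), gain_first, gain_second)


def mix_then_obscure(first, second, obscure_shared):
    """Alternative order: raw microphone signals are mixed, then one shared process is applied."""
    return obscure_shared(mix(first, second))
```

In the first order, the degree of obscuring (the two callables) and the mixing level (the two gains) are separate arguments for each stream, which is one way to read the stated advantage of independent control. In the second order a single shared obscuring process is enough, matching the remark that the two mechanisms may share one obscuring processing means.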

（放音機構） (Sound emitting mechanism)
放音機構２４は、スピーカによって構成される。このスピーカは、例えば第１ブースＹ２とメイン空間Ｘ２とを区分する間仕切りＰ２のメイン空間Ｘ２側の側面に取り付けられている。放音機構２４は、第１の不明瞭音生成機構２１及び第２の不明瞭音生成機構２２で不明瞭化処理され、かつミキシング機構２５でミキシングされた後、必要に応じてＤ／Ａ変換部から構成されるＤ／Ａ変換手段（不図示）によってアナログ信号に変換された不明瞭化音声をメイン空間Ｘ２に放音する。 The sound emitting mechanism 24 is constituted by a speaker. This speaker is attached, for example, to the main space X2 side surface of the partition P2 that separates the first booth Y2 from the main space X2. The sound emitting mechanism 24 emits into the main space X2 the obscured sound that has been subjected to the obscuring process by the first unclear sound generation mechanism 21 and the second unclear sound generation mechanism 22, mixed by the mixing mechanism 25, and then, as necessary, converted into an analog signal by D/A conversion means (not shown) constituted by a D/A conversion unit.

＜音声処理装置＞ <Voice processing device>
続いて、当該音声処理システムのハードウェアを構成する音声処理装置について説明する。当該音声処理装置は、メイン空間Ｘ２と、このメイン空間Ｘ２と間仕切りＰ２によって区分される１又は複数の第１ブースＹ２とを備える室内空間に用いられる。当該音声処理装置は、第１ブースＹ２と、この第１ブースＹ２とは別個の室内空間に設けられる第２ブースＹ３との間で遠隔会議が可能な遠隔会議装置として構成される。 Next, the voice processing device constituting the hardware of the voice processing system will be described. The voice processing device is used in an indoor space including a main space X2 and one or more first booths Y2 separated from the main space X2 by a partition P2. The voice processing device is configured as a remote conference device capable of conducting a remote conference between the first booth Y2 and the second booth Y3 provided in an indoor space separate from that of the first booth Y2.

当該音声処理装置は、第１ブースＹ２内の音を集音する第１のマイク（前述の遠隔会議端末２３ａに備えられるマイク）と、この第１のマイクによって集音される音を不明瞭化処理する第１の不明瞭化処理部と、第２ブースＹ３内の音を集音する第２のマイク（前述の遠隔会議端末２３ｂに備えられるマイク）と、第２のマイクによって集音される音を不明瞭化処理する第２の不明瞭化処理部と、第１のマイク及び第２のマイクによって集音された音、又は第１の不明瞭化処理部及び第２の不明瞭化処理部で生成された不明瞭音をミキシングするミキシング装置（前述のミキシング機構２５を構成するミキシング装置）と、このミキシング装置でミキシングされた不明瞭化音声を放音するスピーカ（前述の放音機構２４を構成するスピーカ）とを備える。第１の不明瞭化処理部及び第２の不明瞭化処理部の具体的構成は、第一実施形態における不明瞭化処理部１２と同様とすることができる。 The voice processing device includes a first microphone that collects sound in the first booth Y2 (the microphone provided in the above-mentioned remote conference terminal 23a), a first obscuring processing unit that obscures the sound collected by the first microphone, a second microphone that collects sound in the second booth Y3 (the microphone provided in the above-mentioned remote conference terminal 23b), a second obscuring processing unit that obscures the sound collected by the second microphone, a mixing device (the mixing device constituting the above-mentioned mixing mechanism 25) that mixes either the sounds collected by the first microphone and the second microphone or the unclear sounds generated by the first obscuring processing unit and the second obscuring processing unit, and a speaker (the speaker constituting the above-mentioned sound emitting mechanism 24) that emits the obscured sound mixed by the mixing device. The specific configurations of the first obscuring processing unit and the second obscuring processing unit can be the same as that of the obscuring processing unit 12 in the first embodiment.

＜利点＞ <Advantages>
当該音声処理システムは、第１ブースＹ２外で発せられる音から不明瞭音を生成する第２の不明瞭音生成機構２２を備え、放音機構２４が第１の不明瞭音生成機構２１で生成された不明瞭音に基づく不明瞭化音声に加え、第２の不明瞭音生成機構２２で生成された不明瞭音に基づく不明瞭化音声をメイン空間Ｘ２に放音するので、遠隔会議を行っている場合でも、場所を隔てて行われる会話に重ねて不明瞭化音声を放音することができる。そのため、第１ブースＹ２での発言内容に加え、遠隔会議端末２３ｂから放音される音声がメイン空間Ｘ２に漏洩することを的確に抑制することができる。 The voice processing system includes the second unclear sound generation mechanism 22 that generates an unclear sound from sound emitted outside the first booth Y2, and the sound emitting mechanism 24 emits into the main space X2 not only the obscured sound based on the unclear sound generated by the first unclear sound generation mechanism 21 but also the obscured sound based on the unclear sound generated by the second unclear sound generation mechanism 22. Therefore, even while a remote conference is in progress, the obscured sound can be emitted so as to overlap the conversation held between the separate locations. As a result, leakage into the main space X2 of the sound emitted from the remote conference terminal 23b, in addition to the content of remarks made in the first booth Y2, can be reliably suppressed.

また、当該音声処理システムは、第１の不明瞭音生成機構２１で得られた音及び第２の不明瞭音生成機構２２で得られた音をミキシングするミキシング機構２５を備え、放音機構２４がミキシング機構２５でミキシングされた不明瞭化音声を放音するので、メイン空間Ｘ２に存在する人に対する会話内容の隠蔽効果を高めることができる。つまり、例えば第１ブースＹ２に存在する１人の話者の発言のみに基づく不明瞭化音声をメイン空間Ｘ２に放音するよりも、複数の話者の発言に基づく不明瞭化音声をメイン空間Ｘ２に放音する方が、メイン空間Ｘ２に存在する人に会話内容をより伝わり難くすることができると共に、メイン空間Ｘ２に存在する人の意識をより会話内容に向き難くすることができる。 Further, the voice processing system includes the mixing mechanism 25 that mixes the sound obtained by the first unclear sound generation mechanism 21 and the sound obtained by the second unclear sound generation mechanism 22, and the sound emitting mechanism 24 emits the obscured sound mixed by the mixing mechanism 25, so the effect of concealing the conversation content from people in the main space X2 can be enhanced. That is, compared with emitting into the main space X2 an obscured sound based only on the remarks of, for example, a single speaker in the first booth Y2, emitting an obscured sound based on the remarks of a plurality of speakers makes the conversation content harder to convey to people in the main space X2 and makes their attention less likely to be drawn to it.

当該音声処理装置は、当該音声処理システムと同様、第１ブースＹ２での発言内容に加え、遠隔会議端末２３ａから放音される音声がメイン空間Ｘ２に漏洩することを的確に抑制することができる。また、当該音声処理装置は、当該音声処理システムと同様、メイン空間Ｘ２で放音される音声を室内空間の雰囲気を作るための音声として機能させやすい。 Like the voice processing system, the voice processing device can reliably suppress leakage into the main space X2 of the sound emitted from the remote conference terminal 23a, in addition to the content of remarks made in the first booth Y2. Also like the voice processing system, the voice processing device makes it easy for the sound emitted in the main space X2 to function as sound that creates the atmosphere of the indoor space.

［その他の実施形態］ [Other Embodiments]
前記実施形態は、本発明の構成を限定するものではない。従って、前記実施形態は、本明細書の記載及び技術常識に基づいて前記実施形態各部の構成要素の省略、置換又は追加が可能であり、それらは全て本発明の範囲に属するものと解釈されるべきである。 The above embodiments do not limit the configuration of the present invention. Accordingly, components of each part of the embodiments may be omitted, replaced, or added on the basis of the description in this specification and common technical knowledge, and all such modifications should be construed as falling within the scope of the present invention.

例えば当該音声処理システムは、メイン空間と、このメイン空間と間仕切りによって区分される１つのブースとを備える室内空間に用いられてもよい。また、当該音声処理システムは、複数のブースを備える室内空間において用いられる場合でも、必ずしも全てのブース内で発せられる音から不明瞭音を生成し、この不明瞭音に基づく不明瞭化音声を放音する必要はなく、１又は特定のブース内で発せられる音のみから不明瞭音を生成し、この不明瞭音に基づく不明瞭化音声を放音してもよい。さらに、当該音声処理システムは、複数のブースを備える室内空間において用いられる場合、隣接するブースに対しても不明瞭化音声を放音可能に構成されていてもよい。 For example, the voice processing system may be used in an indoor space including a main space and a single booth separated from the main space by a partition. Even when used in an indoor space having a plurality of booths, the voice processing system need not generate unclear sound from the sound emitted in every booth and emit obscured sound based on it; it may generate unclear sound only from the sound emitted in one booth or in specific booths and emit obscured sound based on that unclear sound. Further, when used in an indoor space having a plurality of booths, the voice processing system may be configured so that obscured sound can also be emitted toward adjacent booths.

前述の実施形態では、不明瞭音生成機構が、高周波成分低減手段及び残響付加手段を有する構成について説明したが、不明瞭音生成機構は、必ずしも高周波成分低減手段及び残響付加手段を備える必要はない。不明瞭音生成機構は、例えば前述の信号分割手段及び信号処理手段のみを有していてもよい。 In the above-described embodiments, a configuration in which the unclear sound generation mechanism has the high frequency component reducing means and the reverberation adding means was described, but the unclear sound generation mechanism need not necessarily include the high frequency component reducing means and the reverberation adding means. The unclear sound generation mechanism may, for example, have only the above-mentioned signal dividing means and signal processing means.
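As a rough illustration of the high frequency component reducing means and the reverberation adding means mentioned here, the sketch below applies a low-pass stage followed by a reverberation stage, in the order given in claim 2. The specific filters are assumptions chosen for brevity (a one-pole low-pass and a single feedback comb filter); this passage does not specify which filters the embodiments use, and the function names and parameter values are hypothetical.

```python
def reduce_high_frequencies(samples, alpha=0.2):
    """One-pole low-pass (assumed filter): y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def add_reverberation(samples, delay=3, feedback=0.5):
    """Feedback comb filter (assumed reverb): y[n] = x[n] + feedback * y[n - delay]."""
    out = []
    for n, x in enumerate(samples):
        echo = out[n - delay] if n >= delay else 0.0
        out.append(x + feedback * echo)
    return out


def obscure(samples, alpha=0.2, delay=3, feedback=0.5):
    """Reduce high frequencies first, then add reverberation to the filtered sound."""
    return add_reverberation(reduce_high_frequencies(samples, alpha), delay, feedback)
```

A smaller `alpha` removes more high-frequency content, and a larger `feedback` lengthens the decaying tail, so the two parameters give separate handles on how strongly the sound is obscured.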

当該音声処理システムがブース外で発せられる音から不明瞭音を生成する第２の不明瞭音生成機構を備える場合、この第２の不明瞭音生成機構は必ずしも遠隔会議端末が配置される他のブースで発せられる音から不明瞭音を生成する必要はない。例えば第２の不明瞭音生成機構は、メイン空間で発せられる音から不明瞭音を生成してもよい。 When the voice processing system includes a second unclear sound generation mechanism that generates unclear sound from sound emitted outside the booth, this second unclear sound generation mechanism need not generate the unclear sound from sound emitted in another booth in which a remote conference terminal is arranged. For example, the second unclear sound generation mechanism may generate unclear sound from sound emitted in the main space.

また、当該音声処理システムは、第２の不明瞭音生成機構が遠隔会議端末が配置された他のブースで発せられる音から不明瞭音を生成する場合であっても、この他のブースの具体的構成は限定されるものではない。 Further, even when the second unclear sound generation mechanism generates unclear sound from the sound emitted at another booth in which the remote conference terminal is arranged, the voice processing system is specific to the other booth. The composition is not limited.

当該音声処理システムは、第２の不明瞭音生成機構で生成された不明瞭音に基づく不明瞭化音声をメイン空間で放音する場合でも、必ずしも前述のミキシング機構を備えていなくてもよい。図５を参照して、前述のミキシング機構を備えない場合の一例について説明する。 The voice processing system does not necessarily have to include the above-mentioned mixing mechanism even when the obscured sound based on the unclear sound generated by the second unclear sound generation mechanism is emitted in the main space. An example in the case where the above-mentioned mixing mechanism is not provided will be described with reference to FIG.

図５の音声処理システムは、メイン空間Ｘ３と、このメイン空間Ｘ３と間仕切りＰ３によって区分されるブースＹ４（以下、「第１ブースＹ４」ともいう）とを備える室内空間に用いられる。当該音声処理システムは、第１ブースＹ４と、この第１ブースＹ４とは別個の室内空間に設けられる他のブースＹ５（以下、「第２ブースＹ５」ともいう）との間で遠隔会議が可能な遠隔会議システムとして構成されている。当該音声処理システムでは、第１ブースＹ４及び第２ブースＹ５に遠隔会議端末３３ａ，３３ｂが配置されている。遠隔会議端末３３ａ，３３ｂは、スピーカ及びマイクを有しており、一方の遠隔会議端末３３ａ，３３ｂのマイクで集音した音を他方の遠隔会議端末３３ａ，３３ｂのスピーカから放音可能に構成されている。 The voice processing system of FIG. 5 is used in an indoor space including a main space X3 and a booth Y4 (hereinafter also referred to as the "first booth Y4") separated from the main space X3 by a partition P3. The voice processing system is configured as a remote conference system capable of conducting a remote conference between the first booth Y4 and another booth Y5 (hereinafter also referred to as the "second booth Y5") provided in an indoor space separate from that of the first booth Y4. In the voice processing system, remote conference terminals 33a and 33b are arranged in the first booth Y4 and the second booth Y5. Each of the remote conference terminals 33a and 33b has a speaker and a microphone, and the terminals are configured so that sound collected by the microphone of one of them can be emitted from the speaker of the other.

当該音声処理システムは、第１ブースＹ４内で発せられる音から不明瞭音を生成する第１の不明瞭音生成機構３１と、第２ブースＹ５内で発せられる音から不明瞭音を生成する第２の不明瞭音生成機構３２と、第１の不明瞭音生成機構３１で生成された不明瞭音に基づく不明瞭化音声及び第２の不明瞭音生成機構３２で生成された不明瞭音に基づく不明瞭化音声をメイン空間Ｘ３で放音する放音機構３４とを備える。放音機構３４は、一対のスピーカ３４ａ，３４ｂによって構成されている。当該音声処理システムは、第１の不明瞭音生成機構３１で生成された不明瞭音に基づく不明瞭化音声を一方のスピーカ３４ａから放音し、第２の不明瞭音生成機構３２で生成された不明瞭音に基づく不明瞭化音声を他方のスピーカ３４ｂから放音可能に構成されている。つまり、当該音声処理システムは、第１の不明瞭音生成機構３１で生成された不明瞭音及び第２の不明瞭音生成機構３２で生成された不明瞭音を別系統で放音可能に構成されている。当該音声処理システムは、かかる構成によっても、場所を隔てて行われる会話に重ねて不明瞭化音声を放音することができる。そのため、第１ブースＹ４内での発言内容に加え、遠隔会議端末３３ａから放音される音声がメイン空間Ｘ３に漏洩することを的確に抑制することができる。 The voice processing system includes a first unclear sound generation mechanism 31 that generates an unclear sound from sound emitted in the first booth Y4, a second unclear sound generation mechanism 32 that generates an unclear sound from sound emitted in the second booth Y5, and a sound emitting mechanism 34 that emits, in the main space X3, an obscured sound based on the unclear sound generated by the first unclear sound generation mechanism 31 and an obscured sound based on the unclear sound generated by the second unclear sound generation mechanism 32. The sound emitting mechanism 34 is constituted by a pair of speakers 34a and 34b. The voice processing system is configured so that the obscured sound based on the unclear sound generated by the first unclear sound generation mechanism 31 can be emitted from one speaker 34a and the obscured sound based on the unclear sound generated by the second unclear sound generation mechanism 32 can be emitted from the other speaker 34b. In other words, the voice processing system is configured so that the unclear sound generated by the first unclear sound generation mechanism 31 and the unclear sound generated by the second unclear sound generation mechanism 32 can be emitted through separate channels. Even with this configuration, the voice processing system can emit the obscured sound so as to overlap a conversation held between separate locations. Therefore, leakage into the main space X3 of the sound emitted from the remote conference terminal 33a, in addition to the content of remarks made in the first booth Y4, can be reliably suppressed.

前述の実施形態では、発言内容の漏洩を抑制すべきブースとメイン空間とを区分する間仕切りにスピーカが取り付けられる構成について説明したが、スピーカの取り付け場所は前述の実施形態の構成に限定されるものではない。当該音声処理装置は、スピーカの配置を選択することで、雰囲気の共有度合いを調整することができる。当該音声処理装置は、例えば図６に示すように、複数のブースＹ６，Ｙ７が存在する場合、一方のブースＹ６に配置されるマイク４１ａで集音される音から生成された不明瞭音に基づく不明瞭化音声を放音するスピーカ４４ｂを他方のブースＹ７とメイン空間Ｘ４とを区分する間仕切りＰ５に取り付けてもよい。また、当該音声処理装置は、他方のブースＹ７に配置されるマイク４１ｂで集音される音から生成される不明瞭音に基づく不明瞭化音声を放音するスピーカ４４ａを一方のブースＹ６とメイン空間Ｘ４とを区分する間仕切りＰ４に取り付けてもよい。 In the above-described embodiments, a configuration was described in which the speaker is attached to the partition that separates the main space from the booth whose remarks should be kept from leaking, but the mounting location of the speaker is not limited to that configuration. By selecting the arrangement of the speakers, the voice processing device can adjust the degree to which the atmosphere is shared. For example, as shown in FIG. 6, when a plurality of booths Y6 and Y7 exist, the speaker 44b that emits the obscured sound based on the unclear sound generated from the sound collected by the microphone 41a arranged in one booth Y6 may be attached to the partition P5 that separates the other booth Y7 from the main space X4. Likewise, the speaker 44a that emits the obscured sound based on the unclear sound generated from the sound collected by the microphone 41b arranged in the other booth Y7 may be attached to the partition P4 that separates the one booth Y6 from the main space X4.
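The cross-routed arrangement of FIG. 6 can be written down as a small routing table. The reference signs (41a, 41b, 44a, 44b, P4, P5) come from the text above; the dictionary layout and the function name `speaker_feeds` are assumptions made for this sketch and are not part of the described device.

```python
# Microphone -> speaker assignment described for FIG. 6 (cross-routed between booths).
ROUTES = {"41a": "44b", "41b": "44a"}

# Partition on which each speaker is mounted, as stated in the text.
SPEAKER_PARTITION = {"44a": "P4", "44b": "P5"}


def speaker_feeds(obscured_by_mic, routes=ROUTES):
    """Return the obscured signal each speaker should emit, keyed by speaker."""
    feeds = {}
    for mic, samples in obscured_by_mic.items():
        feeds[routes[mic]] = samples
    return feeds
```

Changing only the `ROUTES` table (for example, routing 41a to 44a) is one way to picture how choosing the speaker arrangement adjusts where each booth's obscured sound is heard.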

当該音声処理システムは、音声を放射する放射パネルを必要に応じて室内空間に設置することで、メイン空間に存在する不明瞭化音声の聴者に対して音源の存在を感じ難くさせることが可能である。 By installing, as necessary, a radiation panel that radiates sound in the indoor space, the voice processing system can make it harder for listeners of the obscured sound in the main space to perceive the presence of a sound source.

当該音声処理システムは、事務スペースと打合せブースとが区分されるオフィス空間の他、例えば公共空間のロビー等に用いることも可能である。 The voice processing system can be used not only in an office space where an office space and a meeting booth are separated, but also in a lobby of a public space, for example.

以上説明したように、本発明の音声処理システム及び音声処理装置は、ブース外に存在する人に騒音感を与えることを抑えつつ、ブース内の会話内容をブース外に伝わり難くすることができるので、事務スペースと打合せブースとが間仕切りによって区分されるオフィス空間に好適に適用することができる。 As described above, the voice processing system and the voice processing device of the present invention can make it difficult for the conversation contents in the booth to be transmitted to the outside of the booth while suppressing giving a noise feeling to the person existing outside the booth. , Can be suitably applied to an office space in which an office space and a meeting booth are separated by a partition.

１，２１，３１第１の不明瞭音生成機構 1, 21, 31 First unclear sound generation mechanism
２２，３２第２の不明瞭音生成機構 22, 32 Second unclear sound generation mechanism
２，２４，３４放音機構 2, 24, 34 Sound emitting mechanism
３集音手段 3 Sound collecting means
４高周波成分低減手段 4 High frequency component reducing means
５残響付加手段 5 Reverberation adding means
６信号分割手段 6 Signal dividing means
７信号処理手段 7 Signal processing means
８同期手段 8 Synchronization means
１１ａ，１１ｂ，４１ａ，４１ｂマイク 11a, 11b, 41a, 41b Microphone
１２不明瞭化処理部 12 Obscuring processing unit
２３ａ，２３ｂ，３３ａ，３３ｂ遠隔会議端末 23a, 23b, 33a, 33b Remote conference terminal
２５ミキシング機構 25 Mixing mechanism
３４ａ，３４ｂ，４４ａ，４４ｂスピーカ 34a, 34b, 44a, 44b Speaker
Ａ１，Ａ２テーブル A1, A2 Table
Ｂ１，Ｂ２椅子 B1, B2 Chair
Ｐ１，Ｐ２，Ｐ３，Ｐ４，Ｐ５間仕切り P1, P2, P3, P4, P5 Partition
Ｘ１，Ｘ２，Ｘ３，Ｘ４メイン空間 X1, X2, X3, X4 Main space
Ｙ１，Ｙ２，Ｙ３，Ｙ４，Ｙ５，Ｙ６，Ｙ７ブース Y1, Y2, Y3, Y4, Y5, Y6, Y7 Booth

Claims

A voice processing system used in an indoor space comprising a main space and one or more booths separated from the main space by a partition, the voice processing system comprising:
a first unclear sound generation mechanism that generates an unclear sound from a first sound emitted in the one or more booths;
a second unclear sound generation mechanism that generates an unclear sound from a second sound emitted in another booth that holds a remote conference with the one or more booths; and
a sound emitting mechanism that emits, into the main space, an obscured sound based on the unclear sound generated by the first unclear sound generation mechanism and the unclear sound generated by the second unclear sound generation mechanism.

The voice processing system according to claim 1, wherein the first unclear sound generation mechanism has means for reducing a high frequency component of the first sound and means for adding reverberation to the sound after the reduction of the high frequency component.

The voice processing system according to claim 1 or 2, further comprising a mixing mechanism that mixes the sound obtained by the first unclear sound generation mechanism and the sound obtained by the second unclear sound generation mechanism,
wherein the sound emitting mechanism emits the obscured sound mixed by the mixing mechanism.

A voice processing device used in an indoor space comprising a main space and one or more booths separated from the main space by a partition, the voice processing device comprising:
a first microphone that collects a first sound in the one or more booths;
a second microphone that collects a second sound in another booth that holds a remote conference with the one or more booths;
a first obscuring processing unit that obscures the first sound collected by the first microphone;
a second obscuring processing unit that obscures the second sound collected by the second microphone; and
a speaker that emits, into the main space, an obscured sound based on the unclear sound generated by the first obscuring processing unit and the unclear sound generated by the second obscuring processing unit.