JP2011154138A

JP2011154138A - Masker sound generation apparatus, and program

Info

Publication number: JP2011154138A
Application number: JP2010014872A
Authority: JP
Inventors: Yasushi Shimizu; 寧清水; Mai Koike; 舞小池
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2010-01-26
Filing date: 2010-01-26
Publication date: 2011-08-11
Anticipated expiration: 2030-01-26
Also published as: JP5691180B2

Abstract

PROBLEM TO BE SOLVED: To obtain a high masking effect in a room without making a person in the room feel uncomfortable. SOLUTION: Using an FFT (Fast Fourier Transform) unit 33, the lowest and highest frequencies of the frequency band whose power is at or above a threshold in the average spectrum of a sound signal X are set as the cut-off frequencies fc_L and fc_H of LPFs 41 and 44, a BPF 43, and HPFs 42 and 45. The sound signal X is input to the LPF 41, the HPF 42, and the BPF 43. An arrangement order change signal X_B', obtained by changing the arrangement order of the output signal X_B of the BPF 43, is then added to noise signals Y_L' and Y_H' output from noise output units 35 and 36, and the sum is output from the speaker 94 of a room 92 as a masker sound signal M. COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to a technique for generating a masker sound to prevent sound from being overheard.

The masking effect is a phenomenon in which, when two kinds of sound signals propagate in the same space, a person in that space finds the signals harder to perceive, depending on the relationship between their acoustic characteristics (frequency components, time waveforms, etc.). Patent Document 1 discloses a technique that uses this masking effect to prevent speech from being overheard. The masking system disclosed in that document takes a sound signal representing speech picked up by a microphone in one of two adjacent rooms as the target sound signal to be masked. The system divides the target sound signal into segments of one syllable each, applies scramble processing that rearranges the segments, and emits the scrambled sound signal from a speaker in the other room as a masker sound signal. With this technique, a masker sound signal whose acoustic features are close to those of the target sound signal is emitted together with the target sound signal, so the masking effect makes it difficult for a person in the room where the masker sound is emitted to make out the target sound signal.

Patent Document 1: JP 2008-233671 A

However, this type of masking system uses, as the masker sound signal that serves to mask the speech, a sound signal whose features are close to those of the speech being masked. Consequently, when the masker sound signal is emitted into a quiet room with little conversation or background noise, it gives the people in that room an unpleasant impression.
The present invention was devised against this background, and its object is to obtain a high masking effect without making people in the room uncomfortable.

The present invention provides a masker sound generation device comprising: band dividing means for dividing a sound signal into a first band signal containing components of a first frequency band and a second band signal containing components of a second frequency band different from the first frequency band; arrangement order changing means for outputting an arrangement order change signal obtained by changing the arrangement order of the first band signal produced by the band dividing means; noise output means for outputting a noise signal containing noise components in the same frequency band as the second band signal produced by the band dividing means; and adding means for outputting a masker sound signal obtained by adding the arrangement order change signal output by the arrangement order changing means and the noise signal output by the noise output means.

In the present invention, taking a high-power frequency band in the power spectrum of the sound signal to be masked as the first frequency band makes it possible to generate a masker sound signal in which only the features of the high-power frequency components resemble those of the masking target. Emitting a masker sound signal with such features into a room yields a high masking effect without making people in the room uncomfortable.

The present invention also provides a program that causes a computer to implement: band dividing means for dividing a sound signal into a first band signal containing components of a first frequency band and a second band signal containing components of a second frequency band different from the first frequency band; arrangement order changing means for outputting an arrangement order change signal obtained by changing the arrangement order of the first band signal produced by the band dividing means; noise output means for outputting a noise signal containing noise components in the same frequency band as the second band signal produced by the band dividing means; and adding means for outputting a masker sound signal obtained by adding the arrangement order change signal output by the arrangement order changing means and the noise signal output by the noise output means.

FIG. 1 is a diagram showing the configuration of a masker sound generation device according to the first embodiment of the present invention.
FIG. 2 is a diagram showing the average spectrum calculated by the setting unit of the masker sound generation device.
FIGS. 3 to 10 are diagrams showing the processing executed by the arrangement order changing unit of the masker sound generation device.
FIG. 11 is a diagram showing the configuration of a masker sound generation device according to the second embodiment of the present invention.
FIG. 12 is a diagram showing the configuration of a masker sound generation device according to the third embodiment of the present invention.
FIG. 13 is a diagram showing the configuration of a masker sound generation device according to the fourth embodiment of the present invention.
FIGS. 14 and 15 are diagrams showing the average spectrum calculated by the setting unit of a masker sound generation device according to another embodiment of the present invention.
FIG. 16 is a diagram showing the processing executed by the arrangement order changing unit of a masker sound generation device according to another embodiment of the present invention.

Embodiments of the present invention are described below with reference to the drawings.
<First Embodiment>
FIG. 1 is a block diagram showing the configuration of a masking system that includes a masker sound generation device 10 according to the first embodiment of the present invention, a microphone 93, and a speaker 94. The masker sound generation device 10 in this system generates a masker sound signal M that makes the voice of a speaker in one room 91 (referred to as the target sound) of two rooms 91 and 92 partitioned by a wall 90 harder to hear, and outputs the signal to the other room 92.

The masker sound generation device 10 performs a process of storing the sound signal X picked up by the microphone 93 in the room 91 in the audio memory 19 as audio data S, and a process of generating and outputting the masker sound signal M using the audio data S in the audio memory 19 as source material. The audio data S used as material for the masker sound signal M may be the voice of the speaker in the room 91 or the voice of another person, but it is preferable to use audio data S of at least the same gender as the speaker in the room 91.

The operation unit 50 of the masker sound generation device 10 accepts operations such as an operation instructing data recording, an operation selecting one of the pieces of audio data S in the audio memory 19 as material for the masker sound signal M, an operation selecting the gender of the speaker in the room 91, and an operation instructing generation of the masker sound signal M.

An analog waveform signal of the sound picked up by the microphone 93 fixed in the room 91 is input to the A/D conversion unit 11. The A/D conversion unit 11 converts the analog waveform signal into a digital signal and outputs it as the sound signal X. When an operation instructing data recording and an operation selecting a gender are performed on the operation unit 50, the write control unit 15 takes the sound signal X output from the A/D conversion unit 11 during the utterance time length T1 from that moment (T1 is the time needed to speak one sentence at a normal speaking rate; for example, T1 = 30 seconds) as audio data S, attaches an identifier indicating the gender specified via the operation unit 50, and writes it to the audio memory 19. When an operation selecting the gender and the type of audio data S and an operation instructing generation of the masker sound signal M are performed on the operation unit 50, the data supply control unit 70 reads the selected type of audio data S from the audio memory 19 and supplies it to the control unit 12 as the sound signal X.

The control unit 12 applies signal processing to the sound signal X input from the data supply control unit 70 to generate a masker sound signal M of utterance time length T1, and writes the generated masker sound signal M to the buffer 17. This signal processing is described later. The sound generation control unit 18 repeatedly reads the masker sound signal M of utterance time length T1 that the control unit 12 wrote to the buffer 17 and outputs it to the D/A conversion unit 14. The D/A conversion unit 14 converts the masker sound signal M output from the sound generation control unit 18 into an analog waveform signal and outputs it to the speaker 94 fixed in the room 92. The masker sound signal M is emitted from the speaker 94 as a masker sound.

The control unit 12 includes a CPU 20, a RAM 21, and a ROM 22. The CPU 20 executes the sound generation program 23 stored in the ROM 22 while using the RAM 21 as a work area. The sound generation program 23 causes the CPU 20 to implement the functions of the noise generation unit 31, the band dividing unit 32, the noise output units 35 and 36, the arrangement order changing unit 37, and the adding unit 38.

The noise generation unit 31 starts generating a noise signal Y when an operation instructing generation of the masker sound signal M is performed on the operation unit 50. The noise signal Y is a sequence of white noise samples. The noise signal Y and the sound signal X output by the data supply control unit 70 are input to the band dividing unit 32. The band dividing unit 32 divides the sound signal X into three band signals X_B, X_L, and X_H, which contain the components of the first frequency band W1 and of the second frequency bands W2L and W2H on its low-frequency and high-frequency sides, respectively. It also divides the noise signal Y into two band signals Y_L and Y_H, which contain the components of the second frequency bands W2L and W2H, respectively.

More specifically, the band dividing unit 32 has five filters: an LPF (Low Pass Filter) 41, an HPF (High Pass Filter) 42, a BPF (Band Pass Filter) 43, an LPF 44, and an HPF 45. The BPF 43 outputs the part of the sound signal X in the frequency band between the cutoff frequency fc_L and the cutoff frequency fc_H (fc_H > fc_L), that is, frequency band W1, as the band signal X_B. The LPF 41 outputs the part of the sound signal X in the frequency band below the cutoff frequency fc_L (frequency band W2L) as the band signal X_L. The HPF 42 outputs the part of the sound signal X in the frequency band above the cutoff frequency fc_H (frequency band W2H) as the band signal X_H. The LPF 44 outputs the part of the noise signal Y in the frequency band below the cutoff frequency fc_L (frequency band W2L) as the band signal Y_L. The HPF 45 outputs the part of the noise signal Y in the frequency band above the cutoff frequency fc_H (frequency band W2H) as the band signal Y_H.
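The three-way split of the sound signal X can be illustrated with a minimal Python sketch. It assumes ideal brick-wall filters realized by masking bins of a naive DFT, which is a stand-in for illustration and not the filter design of the LPF 41, HPF 42, and BPF 43; the function names and the sampling-rate parameter `fs` are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform (adequate for short demo signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft_real(spec):
    """Inverse DFT, returning the real part of each reconstructed sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def split_bands(x, fs, fc_l, fc_h):
    """Split x into (x_l, x_b, x_h): below fc_l, from fc_l to fc_h, above fc_h."""
    n = len(x)
    spec = dft(x)
    low, band, high = [0j] * n, [0j] * n, [0j] * n
    for k, value in enumerate(spec):
        # Fold the bin index to its absolute frequency so that each
        # negative-frequency bin lands in the same band as its positive twin.
        f = min(k, n - k) * fs / n
        if f < fc_l:
            low[k] = value
        elif f <= fc_h:
            band[k] = value
        else:
            high[k] = value
    return idft_real(low), idft_real(band), idft_real(high)
```

Calling `split_bands` on the noise signal Y with the same cutoffs and keeping only the first and third outputs would correspond to the roles of the LPF 44 and the HPF 45.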

Of the cutoff frequencies fc_L and fc_H for female voices and the cutoff frequencies fc_L and fc_H for male voices, the setting unit 34 sets those for the gender selected via the operation unit 50 in the LPF 41, HPF 42, BPF 43, LPF 44, and HPF 45. The female and male cutoff frequencies are obtained as follows from the power spectrum of a standard female voice and the power spectrum of a standard male voice, respectively. FIG. 2(A) shows the power spectrum of a standard female voice, and FIG. 2(B) shows the power spectrum of a standard male voice. As FIGS. 2(A) and 2(B) show, the centroid of the power spectrum of the female voice lies at a higher frequency than that of the male voice. In this embodiment, the lower and upper limit frequencies of the band with power at or above a threshold Th in the standard female power spectrum of FIG. 2(A) are used as the cutoff frequencies fc_L and fc_H for female voices. Likewise, the lower and upper limit frequencies of the band with power at or above the threshold Th in the standard male power spectrum of FIG. 2(B) are used as the cutoff frequencies fc_L and fc_H for male voices.

In FIG. 1, the output signal X_B of the BPF 43 is input to the arrangement order changing unit 37, which outputs an arrangement order change signal X_B' obtained by changing the arrangement order of X_B. As shown in FIG. 3, the arrangement order changing unit 37 divides the band signal X_B of utterance time length T1 (T1 = 30 seconds), input from the data supply control unit 70 via the BPF 43, into six groups of utterance time length T2 each (for example, T2 = 5 seconds), and divides the series of sound samples making up each time length T2 into fixed-length frames k (k = 1 to N) of sound generation time length T3 (T3 is a time length corresponding to the pronunciation time of one syllable; for example, T3 = 100 milliseconds). For example, when T2 = 5 seconds and T3 = 100 milliseconds, the number of frames N within the time length T2 is 5/0.1 = 50. The arrangement order changing unit 37 then rearranges the N frames k (k = 1 to N) within the time length T2 and outputs the rearranged signal as the arrangement order change signal X_B'. The rearrangement is performed in one of the following seven modes a to g.

a. Within the time length T2, each of the frames k (k = 1 to N) is moved to a position different from its original position, and the rearrangement makes the frame before and the frame after each frame k different from the original ones (FIG. 4).
b. Within the time length T2, each of the frames k (k = 1 to N) is moved to a position different from its original position, and for some of the frames k, the rearrangement makes the frame before and the frame after each such frame different from the original ones (FIG. 5).
c. Within the time length T2, some of the frames k (k = 1 to N) are each moved to a position different from the original position, and for each moved frame k, the rearrangement makes the frame before and the frame after it different from the original ones (FIG. 6).
d. Within the time length T2, the first N-m (m < N) of the frames k (k = 1 to N) are each moved backward by m frames, and the remaining frames are moved to the original positions of those N-m frames (FIG. 7).
e. Within the time length T2, the last N-m (m < N) of the frames k (k = 1 to N) are each moved forward by m frames, and the remaining frames are moved to the original positions of those N-m frames (FIG. 8).
f. Within the time length T2, the frames k (k = 1 to N) are grouped into pairs from the beginning, and the order of the two frames within each pair is swapped (FIG. 9).
g. Within the time length T2, the arrangement order frame 1 → frame 2 ... frame N-1 → frame N is reversed to frame N → frame N-1 ... frame 2 → frame 1 (FIG. 10).
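The framing arithmetic and several of the modes above can be sketched in Python. This is a minimal illustration, not the device's implementation: the function names are hypothetical, modes d and e are read as rotations by m frames, and mode a is read as "no frame stays in its slot and no frame keeps its original successor", sampled by simple rejection.

```python
import random

def split_frames(samples, frame_len):
    """Cut one T2-long block into consecutive frames of frame_len samples."""
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]

def rotate_back(frames, m):
    """Mode d: the first N-m frames move m slots later; the last m wrap to the front."""
    n = len(frames)
    return frames[n - m:] + frames[:n - m]

def rotate_forward(frames, m):
    """Mode e: the last N-m frames move m slots earlier; the first m wrap to the end."""
    return frames[m:] + frames[:m]

def swap_pairs(frames):
    """Mode f: pair frames from the start and swap the order within each pair."""
    out = list(frames)
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return out

def reverse_order(frames):
    """Mode g: reverse the frame order."""
    return frames[::-1]

def full_shuffle(frames, rng, max_tries=10000):
    """Mode a (one reading): every frame leaves its slot and no frame is
    followed by its original successor. Uses rejection sampling."""
    n = len(frames)
    for _ in range(max_tries):
        order = list(range(n))
        rng.shuffle(order)
        moved = all(order[i] != i for i in range(n))
        new_neighbors = all(order[i + 1] != order[i] + 1 for i in range(n - 1))
        if moved and new_neighbors:
            return [frames[i] for i in order]
    raise RuntimeError("no valid ordering found")
```

With T2 = 5 seconds and T3 = 100 milliseconds, `split_frames` yields N = 50 frames per block at any sampling rate for which T3 is a whole number of samples; the reordered frames are then concatenated to form one T2 block of X_B'.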

In FIG. 1, the output signals X_L and Y_L of the LPFs 41 and 44 are input to the noise output unit 35. The noise output unit 35 amplifies the output signal Y_L of the LPF 44 so that it has the same sound energy as the output signal X_L of the LPF 41, and outputs the result as a noise signal Y_L'. More specifically, each time a signal X_L of time length T2 is input from the LPF 41, the noise output unit 35 computes the mean square EX_L of the amplitude of that signal (the time average, over the time length T2, of the energy of the sound represented by X_L). Likewise, each time a signal Y_L of time length T2 is input from the LPF 44, the noise output unit 35 computes the mean square EY_L of the amplitude of that signal (the time average, over the time length T2, of the energy of the sound represented by Y_L). The noise output unit 35 then multiplies the output signal Y_L of time length T2 from the LPF 44 by the gain G obtained by substituting EX_L and EY_L into the following equation (1), and outputs the product as the noise signal Y_L' of time length T2.
G = EX_L / EY_L ... (1)
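A minimal Python sketch of this gain computation follows. The function names and the `match_energy` flag are hypothetical. By default the sketch applies equation (1) as written; because the gain multiplies amplitudes while EX_L and EY_L are mean squares, the output's mean square is then G^2 · EY_L, so the optional flag uses the square root of the ratio, which is the amplitude factor that makes the mean squares strictly equal.

```python
def mean_square(samples):
    """Mean of squared amplitudes over one block (EX_L or EY_L in the text)."""
    return sum(s * s for s in samples) / len(samples)

def scale_noise(x_band, y_band, match_energy=False):
    """Scale y_band by a gain derived from the energies of x_band and y_band.

    match_energy=False: gain G = EX / EY, as in equation (1).
    match_energy=True:  gain sqrt(EX / EY), so the output's mean square equals EX
                        (an assumption added for illustration).
    """
    ex = mean_square(x_band)
    ey = mean_square(y_band)
    if ey == 0.0:
        raise ValueError("noise block has zero energy")
    ratio = ex / ey
    gain = ratio ** 0.5 if match_energy else ratio
    return [gain * y for y in y_band]
```

The same function would serve for the high band by passing X_H and Y_H.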

The output signals X_H and Y_H of the HPFs 42 and 45 are input to the noise output unit 36. The noise output unit 36 amplifies the output signal Y_H of the HPF 45 so that it has the same sound energy as the output signal X_H of the HPF 42, and outputs the result as a noise signal Y_H'. The specific procedure for generating the noise signal Y_H' in the noise output unit 36 is the same as that for generating the noise signal Y_L' in the noise output unit 35.

The adding unit 38 adds the signal X_B', output from the arrangement order changing unit 37 in blocks of time length T2, and the signals Y_L' and Y_H', output from the noise output units 35 and 36 in blocks of time length T2, and outputs the sum as the masker sound signal M. The masker sound signal M output by the adding unit 38 is written to the buffer 17. The masker sound signal M is then converted by the D/A conversion unit 14 and emitted from the speaker 94.
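The addition can be pictured as a sample-wise sum over one T2 block. The following Python sketch assumes the three blocks are equal-length lists of samples; the function name is hypothetical.

```python
def add_block(x_b_changed, y_l_scaled, y_h_scaled):
    """Sample-wise sum of one T2 block of X_B', Y_L', and Y_H'."""
    if not (len(x_b_changed) == len(y_l_scaled) == len(y_h_scaled)):
        raise ValueError("all three blocks must have the same length")
    return [b + l + h for b, l, h in zip(x_b_changed, y_l_scaled, y_h_scaled)]
```

Concatenating the six T2 blocks produced this way would give the T1-long masker sound signal M that is written to the buffer 17.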

According to the embodiment described above, it is possible to generate a masker sound signal M in which only the high-power frequency components have features resembling those of the voice of the speaker in the room 91. Emitting this masker sound signal M from the speaker 94 into the room 92 yields a high masking effect without making people in the room 92 uncomfortable.

<Second Embodiment>
FIG. 11 is a block diagram showing the configuration of a masking system that includes a masker sound generation device 10A according to the second embodiment of the present invention, a microphone 93, and a speaker 94. In FIG. 11, elements identical to those of the masker sound generation device 10 of the first embodiment are given the same reference numerals.

The masker sound generation device 10A sets the cutoff frequencies fc_L and fc_H of the LPF 41, HPF 42, BPF 43, LPF 44, and HPF 45 based on the power spectrum of the audio data S itself, which serves as the material for the masker sound signal M. In the masker sound generation device 10A, a delay unit 71 is inserted between the data supply control unit 70 and the BPF 43, LPF 41, and HPF 42 in the band dividing unit 32, and an FFT (Fast Fourier Transform) unit 33 is inserted between the data supply control unit 70 and the setting unit 34A. The delay unit 71 delays the output signal X of the data supply control unit 70 by the time length T1 and then outputs it to the BPF 43, LPF 41, and HPF 42. Each time the data supply control unit 70 outputs sound samples covering the time length T3 (T3 = 100 milliseconds), the FFT unit 33 applies an FFT to those samples and outputs the resulting power spectrum.
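The per-frame power spectrum can be sketched in Python as follows. For self-containment the sketch uses a naive DFT in place of an FFT (the result is the same, only slower), and the function names are hypothetical.

```python
import cmath
import math

def power_spectrum(frame):
    """Power |X[k]|^2 for bins 0..n//2 of a real-valued frame (naive DFT)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]

def frame_spectra(samples, frame_len):
    """Yield one power spectrum per complete frame of frame_len samples."""
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        yield power_spectrum(samples[start:start + frame_len])
```

With frames of length T3 = 100 milliseconds, adjacent bins are 1/T3 = 10 Hz apart, so bin k corresponds to a frequency of 10·k Hz.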

In the power spectrum output by the FFT unit 33, the setting unit 34A takes the frequency band in which power is at or above a threshold as the first frequency band W1, and sets the lower and upper limit frequencies of the first frequency band W1 as the cutoff frequencies fc_L and fc_H in the LPFs 41 and 44, the BPF 43, and the HPFs 42 and 45. More specifically, the setting unit 34A waits until the FFT unit 33 has output the sequence of power spectra covering the time length T1, which is the length of the audio data S, and computes the time average of those power spectra (hereinafter, the average spectrum). The setting unit 34A then scans the power of each frequency component in the average spectrum in order from the low-frequency side and takes the lowest frequency with power at or above the threshold Th as the cutoff frequency fc_L of the LPFs 41 and 44 and the BPF 43. The setting unit 34A also scans the power of each frequency component in the average spectrum in order from the high-frequency side and takes the highest frequency with power at or above the threshold Th as the cutoff frequency fc_H of the HPFs 42 and 45 and the BPF 43.
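The averaging and the two scans can be sketched in Python as follows. The function names and the `bin_hz` parameter (the frequency spacing between spectrum bins) are hypothetical; taking the first and last bins at or above the threshold is equivalent to scanning from the low side and from the high side.

```python
def average_spectrum(spectra):
    """Bin-wise time average of a list of equal-length power spectra."""
    count = len(spectra)
    return [sum(column) / count for column in zip(*spectra)]

def find_cutoffs(avg, bin_hz, threshold):
    """Return (fc_l, fc_h): the lowest and highest frequencies whose averaged
    power is at or above threshold, or None if no bin reaches it."""
    above = [k for k, power in enumerate(avg) if power >= threshold]
    if not above:
        return None
    return above[0] * bin_hz, above[-1] * bin_hz
```

For example, with two spectra `[0, 1, 6, 9, 2, 0]` and `[0, 3, 4, 7, 4, 0]`, a bin spacing of 10 Hz, and a threshold of 3, the averaged spectrum is `[0, 2, 5, 8, 3, 0]` and the cutoffs come out as 20 Hz and 40 Hz.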

According to the embodiment described above, a masking effect can be obtained without making people in the room 92 uncomfortable.

<Third Embodiment>
FIG. 12 is a block diagram showing the configuration of a masking system that includes a masker sound generation device 10B according to the third embodiment of the present invention, a microphone 93, and a speaker 94. In FIG. 12, elements identical to those of the masker sound generation devices 10 and 10A of the first and second embodiments are given the same reference numerals.

The masker sound generation device 10B generates, in real time, a masker sound signal M using the latest sound signal X picked up by the microphone 93 as material, and updates the cutoff frequencies fc_L and fc_H in real time using the power spectrum of that signal X. The masker sound generation device 10B has no write control unit 15, audio memory 19, or data supply control unit 70. In the masker sound generation device 10B, the output signal X of the A/D conversion unit 11 is input to the delay unit 71 and the FFT unit 33. The roles of the delay unit 71 and the FFT unit 33 are the same as in the second embodiment.

Each time a power spectrum is output from the FFT unit 33, the setting unit 34B computes a moving average of the power spectra over the time length T1, including the newly output one (hereinafter referred to as the moving average spectrum). Each time the moving average spectrum for the latest time length T1 is obtained, the setting unit 34B sets the lowest frequency of the band having power equal to or higher than the threshold Th in the moving average spectrum as the cutoff frequency fc_L in the LPFs 41 and 44 and the BPF 43, and sets the highest frequency of that band as the cutoff frequency fc_H in the HPFs 42 and 45 and the BPF 43.
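The real-time update can be sketched with a fixed-length window of recent power spectra. The class name, the `window` parameter (the number of spectra spanning the time length T1), and the choice to average over a partially filled window at start-up are assumptions made for illustration only.

```python
from collections import deque


class MovingAverageCutoffs:
    """Track cutoff frequencies from a moving average of recent power spectra."""

    def __init__(self, window, bin_hz, th):
        # The deque drops the oldest spectrum automatically once `window` is reached.
        self.frames = deque(maxlen=window)
        self.bin_hz = bin_hz
        self.th = th

    def update(self, spectrum):
        """Add the newest spectrum and return (fc_l, fc_h) in Hz, or None."""
        self.frames.append(list(spectrum))
        n = len(self.frames)
        # Per-bin average over the spectra currently held in the window.
        avg = [sum(col) / n for col in zip(*self.frames)]
        above = [b for b, p in enumerate(avg) if p >= self.th]
        if not above:
            return None
        return above[0] * self.bin_hz, above[-1] * self.bin_hz


tracker = MovingAverageCutoffs(window=2, bin_hz=100.0, th=1.0)
first = tracker.update([0.0, 2.0, 2.0, 0.0])
second = tracker.update([0.0, 0.0, 2.0, 2.0])
third = tracker.update([0.0, 0.0, 0.0, 4.0])
```

In this sketch the cutoffs follow the input as the dominant band drifts upward, which is the behavior the third embodiment relies on when different speakers use the room 91.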

According to the present embodiment described above, a high masking effect can be obtained without making a person in the room 92 uncomfortable, even when an unspecified number of people enter, leave, and use the room 91.

<Fourth Embodiment>
FIG. 13 is a block diagram showing the configuration of a masking system including a masker sound generation apparatus 10C according to the fourth embodiment of the present invention and a speaker 94. In FIG. 13, elements identical to those of the masker sound generation apparatuses 10, 10A, and 10B of the first, second, and third embodiments are denoted by the same reference numerals.

The masker sound generation apparatus 10C generates the masker sound signal M using standard female and male voices as material. The masker sound generation apparatus 10C does not include the A/D conversion unit 11 or the write control unit 15. The audio memory 19 stores in advance audio data SF representing a standard female voice of time length T1 and audio data SM representing a standard male voice of time length T1. In the present embodiment, when female is selected as the gender of the speaker in the room 91 by operating the operation unit 50, the setting unit 34 sets the cutoff frequencies fc_L and fc_H for female voice in the LPF 41, HPF 42, BPF 43, LPF 44, and HPF 45, and the data supply control unit 70 reads the audio data SF from the audio memory 19 and supplies it to the HPF 42, BPF 43, and LPF 44. When male is selected as the gender of the speaker in the room 91, the setting unit 34 sets the cutoff frequencies fc_L and fc_H for male voice in the LPF 41, HPF 42, BPF 43, LPF 44, and HPF 45, and the data supply control unit 70 reads the audio data SM from the audio memory 19 and supplies it to the HPF 42, BPF 43, and LPF 44.

In the present embodiment described above, when female is selected as the gender of the speaker in the room 91, a masker sound signal M suited to a female speaker is generated from the audio data SF prepared in advance, and when male is selected, a masker sound signal M suited to a male speaker is generated from the audio data SM prepared in advance. Therefore, even when the sound in the room 91 cannot be collected with the microphone 93, a high masking effect can be obtained without making a person in the room 92 uncomfortable.

Although one embodiment of the present invention has been described above, other embodiments of the invention are possible. Examples are as follows.
(1) In the first to fourth embodiments, the setting unit 34 may have operating elements such as volume controls and knobs, so that a speaker in the room 91 can manually set the cutoff frequencies fc_L and fc_H by operating them. According to this embodiment, the speaker in the room 91 can adjust the frequency characteristics of the masker sound signal M emitted into the room 92 to suit the acoustic environment of the room 92, for example by widening the bandwidth between the cutoff frequencies fc_L and fc_H when the background noise in the room 92 is loud and narrowing it when the background noise in the room 92 is quiet. In addition, the masker sound generation apparatus 10, 10A, or 10B of the first to third embodiments may include detection means for detecting the volume of background noise in the room 92, and the setting unit 34 may change the bandwidth between the cutoff frequencies fc_L and fc_H according to the volume detected by the detection means.

(2) In the first embodiment, a masker sound signal M1 resembling a female voice and a masker sound signal M2 resembling a male voice may be generated, and a masker sound signal M obtained by adding the masker sound signals M1 and M2 may be emitted from the speaker 94 of the room 92. According to this embodiment, even when a female speaker and a male speaker are talking at the same time in the room 91, a high masking effect can be obtained without making a person in the room 92 uncomfortable.

In this case, the two types of masker sound signals M1 and M2 may be generated as follows. First, the LPFs 41 and 44, BPF 43, and HPFs 42 and 45, together with LPFs 41' and 44', BPF 43', and HPFs 42' and 45', are provided in the band dividing unit 32 of the masker sound generation apparatus 10 shown in FIG. 1. Next, the cutoff frequencies fc_L and fc_H for female voice are set in the LPFs 41 and 44, BPF 43, and HPFs 42 and 45, and the cutoff frequencies fc_L and fc_H for male voice are set in the LPFs 41' and 44', BPF 43', and HPFs 42' and 45'. Then, the sound signal X of a female voice is supplied to the LPF 41, BPF 43, and HPF 42 in the band dividing unit 32 to generate the masker sound signal M1, the sound signal X of a male voice is supplied to the LPF 41', BPF 43', and HPF 42' in the same unit 32 to generate the masker sound signal M2, and the sum of these two signals M1 and M2 is emitted from the speaker 94 of the room 92 as the masker sound signal M.

Alternatively, the two types of masker sound signals M1 and M2 may be generated as follows. First, during a time length T4 (for example, T4 = 5/2 seconds), with the cutoff frequencies fc_L and fc_H for female voice set in the LPFs 41 and 44, BPF 43, and HPFs 42 and 45, the sound signal X of a female voice is supplied to the LPF 41, BPF 43, and HPF 42 to generate the masker sound signal M1. The masker sound signal M1 is stored in a buffer. During the next time length T4, with the cutoff frequencies fc_L and fc_H for male voice set in the LPFs 41 and 44, BPF 43, and HPFs 42 and 45, the sound signal X of a male voice is supplied to the LPF 41, BPF 43, and HPF 42 to generate the masker sound signal M2. The sum of this masker sound signal M2 and the masker sound signal M1 previously written to the buffer is then emitted from the speaker 94 of the room 92 as the masker sound signal M. This processing is repeated every time length 2 × T4.

(3) In the first embodiment, it may be determined whether the source of the target sound is female or male, and the cutoff frequencies fc_L and fc_H may be switched between those for female voice and those for male voice according to the determination result. This embodiment is realized as follows. First, when generation of the masker sound signal M is started, an FFT is applied to the sound signal X collected by the microphone 93, and the FFT result is analyzed by a predetermined algorithm to determine whether the speaker in the room 91 is female or male. When the speaker in the room 91 is female, the cutoff frequencies fc_L and fc_H for female voice are set in the LPFs 41 and 44, BPF 43, and HPFs 42 and 45, female audio data S is read from the audio memory 19, and the masker sound signal M is generated using this audio data S as material. When the speaker in the room 91 is male, the cutoff frequencies fc_L and fc_H for male voice are set in the LPFs 41 and 44, BPF 43, and HPFs 42 and 45, male audio data S is read from the audio memory 19, and the masker sound signal M is generated using this audio data S as material.

(4) In the third embodiment, each time a power spectrum is output from the FFT unit 33, the setting unit 34B may update the cutoff frequencies fc_L and fc_H using only the latest output power spectrum. The setting unit 34B may also change the cutoff frequencies fc_L and fc_H gradually, over a predetermined length of time, from the previously obtained values to the newly obtained values.
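One way to realize the gradual change is to step each cutoff frequency toward its new value in equal increments. The linear ramp and the `steps` parameter below are assumptions chosen for illustration; the text does not specify the shape of the transition.

```python
def ramp_cutoff(previous_hz, target_hz, steps):
    """Return `steps` intermediate cutoff values, ending exactly at target_hz.

    A linear ramp is an assumption; any smooth monotonic curve would also
    satisfy "gradually, over a predetermined length of time".
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    delta = target_hz - previous_hz
    return [previous_hz + delta * i / steps for i in range(1, steps + 1)]


# Move fc_L from 200 Hz to 300 Hz over four update intervals.
schedule = ramp_cutoff(200.0, 300.0, 4)
```

Applying one value from the schedule per update interval avoids an abrupt jump in the filter characteristics, which could otherwise be audible in the room 92.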

(5) In the third embodiment, the cutoff frequencies fc_L and fc_H of the LPFs 41 and 44, BPF 43, and HPFs 42 and 45 may be set based on the average spectrum for the first time length T1 after the operation instructing generation of the masker sound signal M is performed, and the masker sound signal M may thereafter be generated without switching those cutoff frequencies fc_L and fc_H.

(6) In the fourth embodiment, the female and male audio data SF and SM are stored in the audio memory 19. However, three or more types of audio data S may be stored in the audio memory 19. This embodiment is realized as follows. Audio data S of various voices differing in gender and language (Japanese, English, Chinese, and so on) are stored in the audio memory 19, each in association with the power spectrum of the voice waveform that it represents. When the masker sound signal M is generated, an FFT is applied to the sound signal X of the target sound picked up by the microphone 93, the audio data S associated with the stored power spectrum closest to the power spectrum obtained by the FFT is read from the audio memory 19, and the masker sound signal M is generated using this audio data S as material. Alternatively, plural types of audio data S in different languages may be stored in the audio memory 19, and the masker sound signal M may be generated using, as material, the audio data S of the language selected by operating the operation unit 50.
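The selection of the closest stored voice can be sketched as a nearest-neighbor lookup over power spectra. The text does not say how "closest" is measured, so the sum of squared differences used here is an assumption, as are the function name and the dictionary layout standing in for the audio memory 19.

```python
def closest_voice(target_spectrum, library):
    """Return the key in `library` whose stored spectrum is nearest the target.

    library maps a label for each stored audio data S to its power spectrum.
    Squared Euclidean distance is an assumed similarity measure.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(library, key=lambda name: distance(target_spectrum, library[name]))


# Hypothetical stored spectra, one per voice type.
library = {
    "female_ja": [0.0, 1.0, 4.0, 2.0],
    "male_ja": [1.0, 4.0, 2.0, 0.0],
    "male_en": [2.0, 3.0, 1.0, 0.0],
}
selected = closest_voice([1.0, 3.0, 2.0, 0.0], library)
```

In practice the spectra would probably need normalizing for overall level before comparison, so that a quiet speaker is not matched on loudness rather than spectral shape.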

(7) In the third embodiment, the arrangement order changing unit 37 may randomly select one of the aspects a to g described above every time length T1 or time length T2, and change the arrangement order of the frames k (k = 1 to N) in the selected aspect. The aspect of the arrangement order change may also be switched at a period different from the time length T1 or the time length T2, or the timing of the switching itself may be determined at random.

(8) In the first to fourth embodiments, the arrangement order changing unit 37 may determine the destination of each frame k according to the power spectrum of each frame k obtained by dividing the sound signal X, or according to other analysis results. This embodiment is realized as follows. First, the arrangement order changing unit 37 divides the sound signal X into frames k (k = 1 to N), analyzes each of the frames k (k = 1 to N), and classifies them into frames k of sections corresponding to vowels and frames k of sections corresponding to consonants. Then, among the frames k (k = 1 to N), the positions of the frames k in the sections corresponding to vowels are randomly exchanged with one another, and the positions of the frames k in the sections corresponding to consonants are randomly exchanged with one another.
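The class-preserving rearrangement can be sketched as below. The vowel/consonant analysis itself is outside the sketch: the `labels` argument is assumed to come from some prior classification step, and the function name and the use of Python's `random.Random` are illustrative choices.

```python
import random


def shuffle_within_classes(frames, labels, rng=None):
    """Shuffle frames only among positions that share the same label.

    frames: list of frames (any objects); labels: one label per frame,
    e.g. "V" for vowel and "C" for consonant (assumed to be given).
    """
    rng = rng or random.Random()
    out = list(frames)
    for cls in sorted(set(labels)):
        positions = [i for i, lab in enumerate(labels) if lab == cls]
        moved = [frames[i] for i in positions]
        rng.shuffle(moved)
        # Put the shuffled frames back into the positions of the same class.
        for pos, frame in zip(positions, moved):
            out[pos] = frame
    return out


frames = ["a1", "k1", "a2", "k2", "a3"]
labels = ["V", "C", "V", "C", "V"]
scrambled = shuffle_within_classes(frames, labels, random.Random(0))
```

Because each position keeps its class, the vowel/consonant rhythm of the original signal is preserved while the content is scrambled.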

(9) In the first to fourth embodiments, the present invention is applied to preventing voices from being overheard between the two rooms 91 and 92 partitioned by the wall 90. However, the present invention may also be applied to making a sound generated in one region A (or B) of two regions A and B, with no wall 90 or the like between them, difficult to hear in the other region B (or A). The present invention may also be applied to making each speaker's voice difficult for people nearby to hear when using a communication device (for example, a mobile phone, an IP phone, or an intercom) that enables conversation between people in different spaces. This embodiment can be realized, for example, by building the masker sound generation apparatus 10, 10A, 10B, or 10C of the first to fourth embodiments into the communication device and emitting around the speaker the masker sound signal M that the apparatus generates from the speaker's voice. In this case, it is preferable to have the speaker wear earphones or to control the directivity of the loudspeaker of the communication device, so that the masker sound signal M is not transmitted to the other party and does not disrupt the conversation.

(10) In the second and third embodiments, the setting units 34A and 34B take, as the cutoff frequencies fc_L and fc_H, the lowest and highest frequencies of the frequency band in which the power is equal to or higher than the threshold Th in the average spectrum (or moving average spectrum) of the sound signal X. However, the threshold Th used to determine the cutoff frequencies fc_L and fc_H may differ between the high-frequency side and the low-frequency side. This embodiment is realized as follows. As shown in FIG. 14, the setting unit 34A (or 34B) uses a threshold Th'1 to determine the cutoff frequency fc_L and a threshold Th'2 (Th'2 ≠ Th'1) to determine the cutoff frequency fc_H. The setting unit 34A (or 34B) scans the power of each frequency component in the average spectrum (or moving average spectrum) of the sound signal X in order from the low-frequency side, and sets the lowest frequency having power equal to or higher than the threshold Th'1 as the cutoff frequency fc_L in the LPFs 41 and 44 and the BPF 43. The setting unit 34A (or 34B) also scans the power of each frequency component in the average spectrum (or moving average spectrum) in order from the high-frequency side, and sets the highest frequency having power equal to or higher than the threshold Th'2 as the cutoff frequency fc_H in the HPFs 42 and 45 and the BPF 43. In this case, the thresholds Th'1 and Th'2 are preferably optimized in consideration of human auditory characteristics (loudness characteristics (sensitivity along the frequency axis) and critical bands (resolution along the frequency axis)).
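The two-threshold variant differs from the single-threshold scan only in which threshold each direction uses. The sketch below assumes a spectrum given as a list of per-bin powers and a hypothetical bin spacing `bin_hz`; the parameter names `th1` and `th2` stand for Th'1 and Th'2.

```python
def find_cutoffs_two_thresholds(avg, bin_hz, th1, th2):
    """Return (fc_l, fc_h) in Hz using th1 for the low side and th2 for the high side.

    Either element is None when no bin reaches the corresponding threshold.
    """
    # Low side: first bin, scanning upward, whose power reaches th1.
    fc_l = next((b * bin_hz for b, p in enumerate(avg) if p >= th1), None)
    # High side: first bin, scanning downward, whose power reaches th2.
    fc_h = next(
        (b * bin_hz for b in range(len(avg) - 1, -1, -1) if avg[b] >= th2), None
    )
    return fc_l, fc_h


avg = [0.0, 0.5, 2.0, 4.0, 2.0, 0.5, 0.0]
cutoffs = find_cutoffs_two_thresholds(avg, bin_hz=100.0, th1=1.0, th2=0.4)
```

With a lower threshold on the high side, the weak component at 500 Hz is kept inside the first frequency band, whereas the equally weak component at 100 Hz is excluded.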

(11) In the first to fourth embodiments, the band dividing unit 32 divides the sound signal X into a band signal X_B including components of the first frequency band W1 and two types of band signals X_L and X_H including components of the second frequency bands W2L and W2H on the low-frequency side and high-frequency side of the first frequency band W1. However, the sound signal X may be divided into two or more types of band signals for each of the first frequency band W1 and the second frequency bands W2L and W2H. In this embodiment, for example, when the waveform of the average spectrum (or moving average spectrum) of the sound signal X repeatedly rises above and falls below a threshold Th" as shown in FIG. 15, the band dividing unit 32 takes the plural bands (three in the example of FIG. 15) having power equal to or higher than the threshold Th" as first frequency bands W1-i (i = 1 to 3), and takes the four bands on the low-frequency side of the band W1-1, between the bands W1-1 and W1-2, between the bands W1-2 and W1-3, and on the high-frequency side of the band W1-3 as second frequency bands W2-j (j = 1 to 4). The band dividing unit 32 divides the sound signal X into seven types of band signals XW1-i (i = 1 to 3) and XW2-j (j = 1 to 4), which include the components of the bands W1-i (i = 1 to 3) and W2-j (j = 1 to 4), respectively. A plurality of arrangement order changing units 37 provided downstream of the band dividing unit 32 generate arrangement-order-changed signals XW1'-i (i = 1 to 3) by changing the arrangement order of the band signals XW1-i (i = 1 to 3), and pluralities of noise output units 35 and 36 provided downstream of the same unit 32 generate noise signals XW2'-j (j = 1 to 4) including the same frequency components as the band signals XW2-j (j = 1 to 4). The sum of these signals XW1'-i (i = 1 to 3) and XW2'-j (j = 1 to 4) is used as the masker sound signal M.
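Finding the multiple first and second frequency bands amounts to splitting the spectrum into runs of bins that lie at or above the threshold and runs that lie below it. The sketch below is one possible way to do this; the function name and the representation of a band as an inclusive pair of bin indices are assumptions.

```python
def split_bands(avg, th):
    """Split a spectrum into runs of bins at/above th and runs below th.

    Returns (first_bands, second_bands), each a list of inclusive
    (start_bin, end_bin) pairs in ascending frequency order.
    """
    first, second = [], []
    start = 0
    for b in range(1, len(avg) + 1):
        # Close the current run at the end of the data or when the
        # above/below-threshold state changes.
        if b == len(avg) or (avg[b] >= th) != (avg[start] >= th):
            (first if avg[start] >= th else second).append((start, b - 1))
            start = b
    return first, second


# A spectrum that crosses the threshold three times, like the FIG. 15 example.
avg = [0.0, 2.0, 2.0, 0.0, 3.0, 0.0, 0.0, 2.0, 0.0]
w1, w2 = split_bands(avg, th=1.0)
```

For the sample spectrum this yields three bands for W1-i and four bands for W2-j, matching the seven band signals of the example.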

(12) In the first to fourth embodiments, the noise output units 35 and 36 may output the output signals Y_L and Y_H of the LPF 44 and the HPF 45 as the noise signals Y_L and Y_H without amplification, and the adding unit 38 may add these two types of noise signals Y_L and Y_H to the output signal X_B' of the arrangement order changing unit 37.

(13) In the first to fourth embodiments, the arrangement order changing unit 37 divides the band signal X_B into frames k (k = 1, 2, ..., N) and changes the arrangement order by rearranging the frames k (k = 1, 2, ..., N). However, instead of rearranging the frames k (k = 1, 2, ..., N), the arrangement order of the series of sound samples making up each frame k may itself be changed. The arrangement order changing unit 37 may also change the arrangement order in such a way that, as shown in FIG. 16, some of the frames k (k = 1 to 6) of the signal X_B before rearrangement appear more than once in the signal X_B' after rearrangement.
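Both variations can be sketched on a signal held as a list of samples. Reversing the samples inside each frame is only one example of changing the sample order, and drawing frames with replacement is only one way to let a frame appear more than once; both choices, and all names below, are assumptions for illustration.

```python
import random


def split_frames(samples, frame_len):
    """Cut a sample sequence into consecutive frames of frame_len samples."""
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]


def reverse_within_frames(frames):
    """Change the order of samples inside each frame (here: time reversal)."""
    return [frame[::-1] for frame in frames]


def resample_frames(frames, rng):
    """Draw len(frames) frames with replacement, so some may repeat."""
    return [rng.choice(frames) for _ in frames]


frames = split_frames(list(range(6)), frame_len=2)
reversed_frames = reverse_within_frames(frames)
repeated = resample_frames(frames, random.Random(1))
```

Drawing with replacement means some original frames may be absent from the output while others repeat, which is consistent with the FIG. 16 description of frames appearing more than once.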

(14) In the first to fourth embodiments, a masker sound signal M of a predetermined time length may be generated and stored in a memory in the masker sound generation apparatus 10, 10A, 10B, or 10C, and when the masker sound generation apparatus 10, 10A, 10B, or 10C is later started, the masker sound signal M stored in the memory may be read and emitted into the room 92.

DESCRIPTION OF SYMBOLS: 10, 10A, 10B, 10C ... masker sound generation apparatus, 11 ... A/D conversion unit, 12 ... control unit, 14 ... D/A conversion unit, 17 ... buffer, 15 ... write control unit, 18 ... sound emission control unit, 19 ... audio memory, 20 ... CPU, 21 ... RAM, 22 ... ROM, 23 ... sound generation program, 32 ... band dividing unit, 33 ... FFT unit, 34 ... setting unit, 35, 36 ... noise output unit, 37 ... arrangement order changing unit, 38 ... adding unit, 41, 44 ... LPF, 42, 45 ... HPF, 43 ... BPF, 50 ... operation unit, 70 ... data supply control unit, 71 ... delay unit, 90 ... wall, 91, 92 ... room, 93 ... microphone, 94 ... speaker.

Claims

A masker sound generation apparatus comprising:
band dividing means for dividing a sound signal into a first band signal including components of a first frequency band and a second band signal including components of a second frequency band different from the first frequency band;
arrangement order changing means for outputting an arrangement-order-changed signal obtained by changing the arrangement order of the first band signal divided by the band dividing means;
noise output means for outputting a noise signal including noise components in the same frequency band as the second band signal divided by the band dividing means; and
adding means for outputting a masker sound signal obtained by adding the arrangement-order-changed signal output from the arrangement order changing means and the noise signal output from the noise output means.

The masker sound generation apparatus according to claim 1, wherein
the band dividing means takes the high-frequency side and the low-frequency side of the first frequency band as the second frequency band, and separates, as the second band signal, two types of band signals each including components of one of these frequency bands, and
the noise output means outputs two types of noise signals each including noise components in the same frequency band as a respective one of the two types of band signals separated by the band dividing means as the second band signal.

The masker sound generation apparatus described, comprising:
spectrum calculating means for calculating a power spectrum of the sound signal; and
setting means for setting, as the first frequency band, one or a plurality of frequency bands having power equal to or higher than a threshold in the power spectrum calculated by the spectrum calculating means.

A program for causing a computer to realize:
band dividing means for dividing a sound signal into a first band signal including components of a first frequency band and a second band signal including components of a second frequency band different from the first frequency band;
arrangement order changing means for outputting an arrangement-order-changed signal obtained by changing the arrangement order of the first band signal divided by the band dividing means;
noise output means for outputting a noise signal including noise components in the same frequency band as the second band signal divided by the band dividing means; and
adding means for outputting a masker sound signal obtained by adding the arrangement-order-changed signal output by the arrangement order changing means and the noise signal output by the noise output means.