JP2011002704A

JP2011002704A - Sound signal transmitting device, sound signal receiving device, sound signal transmitting method and program therefor

Info

Publication number: JP2011002704A
Application number: JP2009146503A
Authority: JP
Inventors: Masahide Mizushima; 昌英水島; Akira Nakagawa; 朗中川; Hirosuke Hioka; 裕輔日岡; Kenichi Furuya; 賢一古家
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2009-06-19
Filing date: 2009-06-19
Publication date: 2011-01-06
Anticipated expiration: 2029-06-19
Also published as: JP5149872B2

Abstract

PROBLEM TO BE SOLVED: To make speech unintelligible, the cut-off frequency must be lowered until the formant structure of the voice is no longer transmitted, which blocks sound in the high frequency band; as a result, information about high-frequency sound sources other than voice (the sound of breaking glass, etc.) is not transmitted either.
SOLUTION: In the sound signal transmitting technology, a sound signal collected by a microphone is divided into frames of a prescribed duration, the sound signal of each frame is divided into a plurality of frequency-band sound signals, and the divided sound signal of each frame is obscured. In the obscuring processing, the frequency-smoothed power of the input signal is determined, the variation in power of the input signal is compressed, and the input signal is smoothed in time over a period longer than the frame length.

Description

The present invention relates to an acoustic signal transmission device, an acoustic signal reception device, an acoustic signal transmission method, and a program therefor, which are used when watching over an elderly person or the like at a remote location by means of communication.

In an always-connected communication system for watching over elderly people and the like, consideration for privacy is indispensable when an acoustic signal collected by a microphone in the home of the person being watched over (the watched side) is transmitted to the home of the person watching over them (the watching side). To grasp the state of the watched side while respecting privacy, the acoustic signal must be processed and made unclear (the intelligibility of speech removed) before it is transmitted. Patent Document 1 is known as prior art. Patent Document 1 describes a method of reducing the intelligibility of uttered speech by passing it through a low-pass filter.

Patent Document 1: JP 2006-238110 A

However, to eliminate the intelligibility of speech, the cut-off frequency must be lowered until the formant structure of the speech is no longer conveyed, so that sound in the high frequency band does not get through. The problem then remains that information about high-frequency sound sources other than speech (such as the sound of glass breaking) is not conveyed.

To solve the above problem, the acoustic signal transmission technique according to the present invention divides an acoustic signal collected by a microphone into frames of a predetermined duration, divides the acoustic signal of each frame into a plurality of frequency-band acoustic signals, and applies an obscuring process to the divided acoustic signal of each frame. In the obscuring process, the frequency-smoothed power of the input signal is obtained, fluctuations in the power of the input signal are compressed, and the input signal is smoothed in time over a period longer than the frame length.

The present invention has the effect that the obscuring unit removes the intelligibility of speech while still allowing the temporal variation in the power and timbre of non-speech signals to be transmitted.

FIG. 1 is a diagram showing a configuration example of the acoustic signal transmission device 100.
FIG. 2 is a diagram showing an example of the processing flow of the acoustic signal transmission device 100.
FIG. 3 is a diagram showing an example of a filter bank design (identical weights).
FIG. 4 is a diagram showing an example of a filter bank design (weights that vary with frequency).
FIG. 5 is a diagram showing an example of a compression characteristic.
FIG. 6 is a diagram showing a configuration example of the acoustic signal reception device 200.
FIG. 7 is a diagram showing an example of the processing flow of the acoustic signal reception device 200.
FIG. 8 is a diagram showing a configuration example of the acoustic signal transmission device 300.
FIG. 9 is a diagram showing a configuration example of the acoustic signal reception device 400.
FIG. 10 is a diagram showing a configuration example of the acoustic signal transmission device 500.
FIG. 11 is a diagram showing an example of the processing flow of the acoustic signal transmission device 500.
FIG. 12 is a diagram showing an example of the arrangement of two microphone arrays.
FIG. 13 is a diagram showing a configuration example of the gain coefficient calculation unit 530.
FIG. 14 is a diagram showing a configuration example of the acoustic signal reception device 600.
FIG. 15 is a diagram showing a configuration example of the acoustic signal transmission device 700.
FIG. 16 is a diagram showing an example of the processing flow of the acoustic signal transmission device 700.
FIG. 17 is a diagram showing an example of the arrangement of the microphone 3 that inputs a signal to the acoustic signal transmission device.
FIG. 18 is a diagram showing an example of the processing flow of the determination unit 750.
FIG. 19 is a diagram showing a configuration example of the acoustic signal transmission device 800.
FIG. 20 is a diagram showing an example of the arrangement of the microphones 3_R1, 3_L1, 3_R2, and 3_L2 that input signals to the acoustic signal transmission device.

Embodiments of the present invention are described in detail below.

[Acoustic signal transmission device 100]
The acoustic signal transmission device 100 according to the first embodiment is described with reference to FIGS. 1 and 2. FIG. 1 shows a configuration example of the acoustic signal transmission device 100, and FIG. 2 shows an example of its processing flow.
The acoustic signal transmission device 100 includes a storage unit 103, a control unit 105, an ADC 107, a frame division unit 109, a division unit 111, an obscuring unit 120, and a transmission unit 130. The obscuring unit 120 in turn includes a frequency direction smoothing unit 121, a dynamic range compression unit 123, and a time direction smoothing unit 125. Each unit is described below.

<Storage unit 103 and control unit 105>
The storage unit 103 stores and reads out, as needed, the input and output data and the intermediate data of each computation, which allows each computation to proceed. The data need not always be stored in the storage unit 103, however; data may be passed directly between units.
The control unit 105 controls each process.

<ADC 107>
The ADC (analog-to-digital converter) 107 converts the analog acoustic signal x(t) collected by the microphone 3 into a digital acoustic signal x(u) (s107), where t denotes continuous time and u denotes discrete time. When a microphone with a built-in ADC is used, the acoustic signal transmission device 100 need not include the ADC. An acoustic signal here means a signal containing both speech information and non-speech information.

<Frame division unit 109>
The frame division unit 109 divides the digital acoustic signal x(u) collected by the microphone 3 into frames n of a predetermined duration (s109) and outputs the acoustic signal X(n), where n denotes the frame number. The predetermined duration is, for example, 100 msec or less. Keeping the duration reasonably short coarsens the frequency resolution of the discrete Fourier transform performed by the division unit 111 described later, which reduces the amount of processing needed to obscure the acoustic signal.
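The frame division step can be sketched as follows. This is a minimal illustration only: the function name `split_into_frames`, the 50 msec default, the non-overlapping framing, and the dropping of a trailing partial frame are assumptions, not details given in the description.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=50.0):
    """Split x(u) into consecutive, non-overlapping frames X(n).

    frame_ms is assumed to be 100 ms or less, as the text suggests.
    A trailing partial frame is dropped (an assumption of this sketch).
    """
    frame_len = int(sample_rate_hz * frame_ms / 1000.0)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(samples) // frame_len
    return [list(samples[i * frame_len:(i + 1) * frame_len])
            for i in range(n_frames)]


# 1000 samples at 8 kHz with 50 ms frames: each frame holds 400 samples.
frames = split_into_frames(list(range(1000)), sample_rate_hz=8000, frame_ms=50.0)
```

With a frame of N samples at sampling rate fs, the DFT bin spacing is fs/N, so a shorter frame gives coarser frequency resolution, which is the trade-off the paragraph above refers to.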

<Division unit 111>
The division unit 111 divides the acoustic signal X(n) of each frame n into a plurality of frequency-band acoustic signals X(ω, n) (s111), where ω denotes frequency. This is realized by, for example, a discrete Fourier transform.
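As an illustration of the division into frequency bands, the sketch below computes a naive discrete Fourier transform of one frame and the per-bin power |X(ω, n)|². The function names are hypothetical, and a real implementation would normally use an FFT; the O(N²) form is used here only to keep the example self-contained.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of one frame (N complex bins)."""
    n = len(frame)
    return [
        sum(frame[u] * cmath.exp(-2j * cmath.pi * k * u / n) for u in range(n))
        for k in range(n)
    ]


def power_spectrum(frame):
    """Power |X(w, n)|^2 for the non-negative frequency bins 0..N/2 of a real frame."""
    bins = dft(frame)
    return [abs(x) ** 2 for x in bins[: len(frame) // 2 + 1]]


# A cosine with exactly 2 cycles in a 16-sample frame puts its power in bin 2.
frame = [math.cos(2 * math.pi * 2 * u / 16) for u in range(16)]
spectrum = power_spectrum(frame)
peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
```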

<Obscuring unit 120>
The obscuring unit 120 applies the obscuring process to the acoustic signal divided for each frame n (s120). The obscuring unit 120 includes a frequency direction smoothing unit 121, a dynamic range compression unit 123, and a time direction smoothing unit 125, which are connected in cascade. In this embodiment they are assumed to be connected in the order frequency direction smoothing unit 121, dynamic range compression unit 123, time direction smoothing unit 125.

"Frequency direction smoothing unit 121"
The frequency direction smoothing unit 121 obtains the frequency-smoothed power |F(m, n)|² of the input signal X (s121). The frequency direction smoothing unit 121 is, for example, a filter bank. Here m is the filter bank number, with m = 1 to M, where M is the number of bands into which the filter bank divides the spectrum. |F(m, n)|² is obtained by, for example, computing the power |X(ω, n)|² of each frequency-band acoustic signal X(ω, n), multiplying it by the weight of filter bank m at frequency ω, and summing the results. FIGS. 3 and 4 show examples of filter bank designs. The value on the vertical axis is the weight, and the frequency on the horizontal axis is on a logarithmic scale. Widening the filter band at higher frequencies in this way is preferable because it comes closer to auditory perception. This is the first step: smoothing in the frequency direction.
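The weighted sum described above, |F(m, n)|² = Σ_ω W_m(ω) × |X(ω, n)|², can be sketched as follows. The octave-wide rectangular bands built by `octave_band_weights` are only one hypothetical way of making bands wider at higher frequencies; the actual designs are those of FIGS. 3 and 4, which are not reproduced here.

```python
def octave_band_weights(n_bins):
    """Rectangular weights for bands whose width doubles with frequency.

    Band m covers bins [2**m, 2**(m+1)); bin 0 (DC) is left out.
    Returns M rows of weights, each of length n_bins.
    """
    weights = []
    lo = 1
    while lo < n_bins:
        hi = min(2 * lo, n_bins)
        weights.append([1.0 if lo <= k < hi else 0.0 for k in range(n_bins)])
        lo = hi
    return weights


def smooth_in_frequency(power, weights):
    """Return the band powers |F(m, n)|^2 as weighted sums of |X(w, n)|^2."""
    return [sum(w * p for w, p in zip(row, power)) for row in weights]


# Toy power spectrum with 16 bins; the 4 bands cover bins 1, 2-3, 4-7 and 8-15.
power = [float(k) for k in range(16)]
band_power = smooth_in_frequency(power, octave_band_weights(16))
```

Using fewer, wider bands smooths more strongly in frequency, at the cost of conveying less of the original spectral shape.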

"Dynamic range compression unit 123"
The dynamic range compression unit 123 compresses fluctuations in the power of the input signal (s123). FIG. 5 shows an example of the compression characteristic.
Let Lv(m, n) be the decibel value of |F(m, n)|². The compressed value Lv′(m, n) (written |F′(m, n)|² as a linear value) is then given by:
Lv′(m, n) = −∞ (Lv(m, n) < Np1)
Lv′(m, n) = r × Lv(m, n) + Gmin (Np1 ≤ Lv(m, n) ≤ Np2)
Lv′(m, n) = r × Np2 + Gmin (= Gmax) (Lv(m, n) > Np2)
Here r is the compression ratio (0 < r ≤ 1); to compress the level by half, for example, r is set to 0.5. Np1 is the point at or below which the output is set to 0. Setting it slightly above the level of the background noise, for example, means that information is transmitted only when the sound changes, such as when some noise is made. The background noise level can be obtained automatically by, for example, averaging Lv(m, n) over a long time, so Np1 may be set, for example, 3 dB above that value. Np2 is the point above which the output is held at a predetermined value (Gmax), mainly so that excessively loud sounds are not reproduced as they are. Gmin is the lower limit of the compressed value (when the level is below Np1, however, the compressed value is −∞), and Gmax is the upper limit of the compressed value. The values of r, Gmin, Gmax, Np1, and Np2 may be varied with m, that is, for each frequency band. For example, compressing the band that best represents phonemic features (500 Hz to 3000 Hz) more strongly makes speech even less clear.
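The piecewise characteristic above can be sketched as follows. The default values chosen for r, Np1, Np2, and Gmin are placeholders for illustration only, not values taken from the description, and the function names are hypothetical.

```python
import math


def compress_level(lv_db, r, np1_db, np2_db, gmin_db):
    """Map a level Lv (dB) to the compressed level Lv' (dB)."""
    if lv_db < np1_db:
        return -math.inf                       # below Np1: output is zero
    # Between Np1 and Np2 the slope is r; above Np2 the output stays at Gmax.
    return r * min(lv_db, np2_db) + gmin_db


def compress_power(power, r=0.5, np1_db=-50.0, np2_db=-10.0, gmin_db=0.0):
    """Apply the compression to a linear power |F|^2 and return |F'|^2."""
    if power <= 0.0:
        return 0.0
    lv_db = 10.0 * math.log10(power)
    lv_out = compress_level(lv_db, r, np1_db, np2_db, gmin_db)
    return 0.0 if lv_out == -math.inf else 10.0 ** (lv_out / 10.0)


quiet = compress_level(-60.0, 0.5, -50.0, -10.0, 0.0)   # below Np1
mid = compress_level(-30.0, 0.5, -50.0, -10.0, 0.0)     # linear region
loud = compress_level(0.0, 0.5, -50.0, -10.0, 0.0)      # clipped at Gmax
```

Passing different parameter values for each band m would give the per-band compression mentioned at the end of the paragraph above.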

"Time direction smoothing unit 125"
The time direction smoothing unit 125 smooths the input signal in time over a period longer than the frame length (s125). For example, |F′(m, n)|² is smoothed in the time direction to obtain |F″(m, n)|². Methods for smoothing in the time direction include, for example, multiplying by a forgetting factor, or storing the past time series and taking a moving average.
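One common way to smooth with a forgetting factor is a first-order recursive average, sketched below for a single band across frames. The recursion form, the zero initial state, and the default factor of 0.9 are assumptions of this sketch; the description names the forgetting factor but does not fix the formula.

```python
def smooth_in_time(band_powers, forgetting=0.9):
    """Recursive smoothing of one band over frames.

    s[n] = forgetting * s[n-1] + (1 - forgetting) * p[n], with s[-1] = 0.
    A forgetting factor closer to 1 gives slower, smoother variation.
    """
    smoothed = []
    state = 0.0
    for p in band_powers:
        state = forgetting * state + (1.0 - forgetting) * p
        smoothed.append(state)
    return smoothed


# A step input rises gradually instead of jumping.
step_response = smooth_in_time([1.0, 1.0], forgetting=0.5)
```

For a frame period T and forgetting factor a, the time constant of this recursion is about T / (1 − a), so a factor well above 0 is needed for the smoothing to span more than one frame.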

The processes of the frequency direction smoothing unit 121, the dynamic range compression unit 123, and the time direction smoothing unit 125 are all processes for making the collected acoustic signal unclear. If, for example, one tries to obscure the signal by smoothing in the frequency direction alone, the number of filter banks must be reduced, and the frequency characteristics of the original signal become hard to convey. Similarly, if one tries to obscure the signal by smoothing in the time direction alone, the signal must be converted into slow fluctuations with a larger time constant, and the temporal fluctuation characteristics of the original signal become hard to convey. If the power fluctuation is compressed too strongly, loud and quiet sounds become hard to tell apart. In other words, using the three processes together makes it possible to obscure the signal without making any one of them too strong. The same effect is obtained even if their order is changed. For example, dynamic range compression and time direction smoothing can be performed before frequency direction smoothing.

<Transmission unit 130>
The transmission unit 130 transmits the signal |F″(m, n)|² obscured by the obscuring unit 120 to the acoustic signal reception device (s130). The transmission unit 130 is, for example, a LAN adapter.
[Acoustic signal reception device 200]
FIG. 6 shows a configuration example of the acoustic signal reception device 200, and FIG. 7 shows an example of its processing flow. The acoustic signal reception device 200 includes, for example, a reception unit 201, a storage unit 203, a control unit 205, a gain calculation unit 207, a carrier signal generation unit 208, a frame division unit 209, a division unit 211, a multiplication unit 213, a time domain conversion unit 215, a DAC 217, and an amplifier 219.

<Reception unit 201>
The reception unit 201 receives the signal |F″(m, n)|² obscured by the obscuring unit 120 of the acoustic signal transmission device 100 (s201). The reception unit 201 is, for example, a LAN adapter.
<Storage unit 203 and control unit 205>
The storage unit 203 and the control unit 205 have the same configuration as the storage unit 103 and the control unit 105 of the acoustic signal transmission device 100, respectively.

<Gain calculation unit 207>
The gain calculation unit 207 uses the obscured acoustic signal |F″(m, n)|² to calculate a gain g(ω, n) for each frequency ω (s207).
For example, in the case of a uniform filter bank as in FIG. 4, the gain calculation unit 207 may set the power of every frequency ω contained in the m-th bank equal to |F″(m, n)|². In the case of a filter bank with overlap as in FIG. 3, when a frequency ω is contained in both filter banks m1 and m2 and the weights of m1 and m2 are W(m1) and W(m2) respectively, the power may be set to
|F″(ω, n)|² = W(m1) × |F″(m1, n)|² + W(m2) × |F″(m2, n)|²
The normalized gain g(ω, n) is then obtained by multiplying by a suitably small value, for example the reciprocal of the maximum of the power values of all frequency components.
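The gain calculation can be sketched as follows: the band powers are spread back onto frequency bins with the filter bank weights and then normalized by the largest per-bin value. The function name and the weight layout `weights[m][k]` are hypothetical, and the sketch follows the text literally in using power values as the gain without taking a square root.

```python
def gains_from_bands(band_power, weights):
    """Compute a normalized per-bin gain g(w, n) from band powers |F''(m, n)|^2.

    weights[m][k] is the weight W(m) of band m at frequency bin k.
    """
    n_bins = len(weights[0])
    per_bin = [
        sum(weights[m][k] * band_power[m] for m in range(len(band_power)))
        for k in range(n_bins)
    ]
    peak = max(per_bin)
    if peak <= 0.0:
        return [0.0] * n_bins                  # silent frame: all gains zero
    return [p / peak for p in per_bin]         # multiply by 1 / max power


# Two overlapping bands; the middle bin belongs to both with weight 0.5 each.
weights = [[1.0, 0.5, 0.0],
           [0.0, 0.5, 1.0]]
gains = gains_from_bands([2.0, 4.0], weights)
```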

<Carrier signal generation unit 208>
The carrier signal generation unit 208 generates a stationary carrier signal z(u) (s208). The carrier signal z(u) may be any sound as long as it is stationary and has a wider range of frequency components than the obscured signal F″(ω, n). For example, a sound such as the murmur of running water, which is not unpleasant even when played continuously on the watching side, is suitable. A carrier signal z(u) stored in advance in the storage unit 203 or the like may also be used, in which case the acoustic signal reception device 200 need not include the carrier signal generation unit 208.

<Frame division unit 209 and division unit 211>
The frame division unit 209 divides the predetermined stationary carrier signal z(u) into frames of a predetermined duration (s209) and outputs Z(n). The division unit 211 divides the carrier signal of each frame n into signals Z(ω, n) of a plurality of frequency bands (s211). The frame division unit 209 and the division unit 211 have the same configuration as the frame division unit 109 and the division unit 111 of the acoustic signal transmission device 100, respectively, and frames are divided at the same time interval as in the frame division unit 109.

<Multiplication unit 213>
The multiplication unit 213 multiplies the gain g(ω, n) by the frequency-band signal Z(ω, n) (s213) to obtain Y(ω, n).
<Time domain conversion unit 215>
The time domain conversion unit 215 converts the value Y(ω, n) obtained by the multiplication unit 213 into a time-domain signal y(n) (s215), for example by an inverse discrete Fourier transform. It then combines the frames into a continuous digital signal y(u).
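The multiplication and the return to the time domain can be sketched for a single frame as follows. The function names are hypothetical, the inverse DFT is the naive O(N²) form, and the gains are assumed to be supplied for all N bins of the carrier spectrum (symmetric for a real signal); the real part is taken to discard rounding residue.

```python
import cmath


def idft(bins):
    """Naive inverse DFT; returns the real part of each time-domain sample."""
    n = len(bins)
    return [
        sum(bins[k] * cmath.exp(2j * cmath.pi * k * u / n) for k in range(n)).real / n
        for u in range(n)
    ]


def shape_carrier_frame(carrier_bins, gains):
    """Y(w, n) = g(w, n) * Z(w, n), then conversion to a time-domain frame y(n)."""
    shaped = [g * z for g, z in zip(gains, carrier_bins)]
    return idft(shaped)


# The spectrum of a unit impulse is flat; a uniform gain of 0.5 halves the impulse.
carrier_bins = [1.0 + 0.0j] * 4
y_frame = shape_carrier_frame(carrier_bins, [0.5, 0.5, 0.5, 0.5])
```

Concatenating the resulting frames in order is the simplest way to form the continuous signal y(u); whether frames overlap is not stated in the description.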

<DAC 217 and amplifier 219>
The DAC (digital-to-analog converter) 217 converts the digital signal y(u) input from the time domain conversion unit 215 into an analog signal y(t) (s217). The amplifier 219 amplifies the analog signal y(t) and outputs it to the speaker 221 (s219). When the signal is output to a speaker with a built-in DAC or amplifier, the acoustic signal reception device 200 need not include the DAC 217 or the amplifier 219.

<Effect>
Configuring the acoustic signal transmission device 100 and the acoustic signal reception device 200 in this way has the effect that the intelligibility of speech is removed while the temporal variation in the power and timbre of non-speech signals can still be transmitted.
That is, with this configuration, the pitch information of the acoustic signal on the acoustic signal transmission device 100 side, that is, the watched side, is erased. If the signal were transmitted without smoothing, the temporal change in the outline of the frequency characteristic (which corresponds to the formant structure in speech) would remain, so intelligibility would remain (what is being said could be understood); applying the obscuring process removes the intelligibility. As a result, changes in the power and timbre of the sound are imposed on the carrier signal while the intelligibility of speech is eliminated, so the acoustic signal reception device 200 side, that is, the watching side, can more easily grasp the situation of the watched side. By adjusting the strength of the obscuring process as appropriate, one can choose between stronger privacy protection and conveying more information.

<Others>
The process of obtaining the gain g(ω, n) from |F″(m, n)|² may be performed in the acoustic signal transmission device 100 before transmission. However, privacy is not protected unless the smoothing process, that is, the obscuring process, is performed on the transmission device 100 side (the watched side).
The processes performed by the gain calculation unit 207, the carrier signal generation unit 208, the frame division unit 209, the division unit 211, the multiplication unit 213, and the time domain conversion unit 215 of the acoustic signal reception device 200 may also be performed inside the acoustic signal transmission device 100, with the digital signal y(u) transmitted to the acoustic signal reception device 200. In that case, however, the amount of data to be transmitted is larger than when |F″(m, n)|² is transmitted.

[Modification 1]
[Acoustic signal transmission device 300]
An acoustic signal transmission device 300 according to Modification 1 of the first embodiment is described with reference to FIG. 8. FIG. 8 shows a configuration example of the acoustic signal transmission device 300.
The acoustic signal transmission device 300 includes a storage unit 103, a control unit 105, an ADC 107, a frame division unit 109, an obscuring unit 320, a gain calculation unit 327, and a transmission unit 130. The parts that differ from the first embodiment are described. In this modification, M band-limiting filters (filter bank 321) are used instead of the division unit 111, which uses a discrete Fourier transform.

<Obscuring unit 320>
The obscuring unit 320 includes a filter bank 321, a dynamic range compression unit 123, and a time direction smoothing unit 125.
"Filter bank 321"
The acoustic signal X(n) of each frame n is input to the filter bank 321. The filter bank 321 serves as both the division unit 111 and the frequency direction smoothing unit 121 of the first embodiment, and obtains, for each frequency-band acoustic signal, the signal |F(m, n)|² smoothed in the frequency direction. For example, the acoustic signal X(n) is passed through the filter bank 321 to obtain a signal for each frequency band, and the amplitudes are squared and summed to obtain the signal |F(m, n)|² smoothed in the frequency direction. This signal |F(m, n)|² is output to the dynamic range compression unit 123 or the time direction smoothing unit 125; in this modification it is output to the dynamic range compression unit 123. The processing of the dynamic range compression unit 123 and the time direction smoothing unit 125 is the same as in the first embodiment.
<Gain calculation unit 327>
The gain calculation unit 327 has the same configuration as the gain calculation unit 207 of the acoustic signal reception device 200 of the first embodiment.

[Acoustic signal reception device 400]
FIG. 9 shows a configuration example of the acoustic signal reception device 400. The acoustic signal reception device 400 includes, for example, a reception unit 201, a storage unit 203, a control unit 205, a carrier signal generation unit 208, a frame division unit 209, a filter bank 421, a multiplication unit 413, a frame synthesis unit 415, a DAC 217, and an amplifier 219.
<Filter bank 421>
The carrier signal Z(n) of each frame n is input to the filter bank 421. The filter bank 421, for example, passes the carrier signal Z(n) through its filters to obtain a signal Z(ω, n) for each frequency band, and outputs it to the multiplication unit 413.

<Multiplication unit 413>
The multiplication unit 413 receives the gain g(ω, n) received by the reception unit 201 and the carrier signal Z(ω, n) for each frequency, multiplies them to obtain Y(ω, n), and outputs the result to the frame synthesis unit 415.
<Frame synthesis unit 415>
The frame synthesis unit 415 combines the signals of each frame and frequency to obtain the digital signal y(u) and outputs it to the DAC 217. The processing of the DAC 217 and the amplifier 219 is the same as in the first embodiment.
Configuring the acoustic signal transmission device 300 and the acoustic signal reception device 400 in this way provides the same effect as the first embodiment.

Beyond continuously conveying obscured acoustic information, allowing a voice call when the watched side wants one is useful in an emergency and also improves convenience. Patent Document 1 describes a method in which, for example, the obscuring process is cancelled when the elderly person or other person being watched turns on a call switch. However, the operation may be bothersome, and there is a concern that the person forgets to switch back after the call, so that unobscured speech reaches the other party and privacy is infringed. Such a design may also make users uneasy about sound being picked up by a microphone at all times. An apparatus is therefore provided that cancels the speech obscuring process automatically so that clear speech can be transmitted. An embodiment is described in which speech uttered at a specific location is transmitted clearly while all other sound is obscured. In this way the user (the watched side) needs no special operation and only has to remember the specific location, and the risk of conveying unintended speech to the other party is reduced. This requires determining whether a sound source is present at the specific location. As a method for doing so, an embodiment applying Japanese Patent Application Laid-Open No. 2009-25490 (hereinafter referred to as "Reference Document 1") is described.

［音響信号送信装置５００］
図１０及び図１１を用いて実施例２に係る音響信号送信装置５００を説明する。図１０は音響信号送信装置５００の構成例を、図１１は音響信号送信装置５００の処理フロー例を示す。実施例１と異なる部分についてのみ説明する。二つのマイクロホンアレー３Ｒ，３Ｌは図１２のように配置する。
音響信号送信装置５００は、特定の位置から音声が発生しているか否かを判定する判定部５５０を有する。そして、特定の位置から音声が発生していない場合には、不明瞭化部１２０において、不明瞭化した音響信号を生成する。 [Acoustic signal transmission device 500]
An acoustic signal transmission device 500 according to the second embodiment is described with reference to FIGS. 10 and 11. FIG. 10 shows a configuration example of the acoustic signal transmission device 500, and FIG. 11 shows an example of its processing flow. Only the parts that differ from the first embodiment are described. The two microphone arrays 3R and 3L are arranged as shown in FIG. 12.
The acoustic signal transmission device 500 includes a determination unit 550 that determines whether voice is being generated from a specific position. When no voice is being generated from the specific position, the obscuring unit 120 generates an obscured acoustic signal.

音響信号送信装置５００は、収音部４とフレーム分割部１０９と分割部１１１と処理対象信号生成部５４０とパワースペクトル推定部５０７と利得係数算出部５３０と不明瞭化部１２０と乗算部５０９と判定部５５０と送信部５４０を有する。なお、収音部４、処理対象信号生成部５４０、パワースペクトル推定部５０７、利得係数算出部５３０及び乗算部５０９の処理内容は参考文献１に詳しく記載されている。本実施例では概要を説明する。 The acoustic signal transmission device 500 includes a sound collection unit 4, a frame dividing unit 109, a dividing unit 111, a processing target signal generation unit 540, a power spectrum estimation unit 507, a gain coefficient calculation unit 530, an obscuring unit 120, a multiplication unit 509, a determination unit 550, and a transmission unit 540. The processing performed by the sound collection unit 4, the processing target signal generation unit 540, the power spectrum estimation unit 507, the gain coefficient calculation unit 530, and the multiplication unit 509 is described in detail in Reference 1. Only an outline is given in this embodiment.

＜収音部４＞
６つの収音部４−１,４−２,４−３，４−４，４−５，４−６は、複数のマイクロホンを搭載して構成されるマイクロホンアレー３Ｒ，３Ｌの出力信号を利用して、それぞれ異なる領域の音を収音する（ｓ４）。それぞれデジタル音響信号ｘ_ＳＬ（ｕ）、ｘ_ＳＲ（ｕ）、ｘ_ＮＬ（ｕ）、ｘ_ＮＲ（ｕ）、ｘ_ＳＣ（ｕ）、ｘ_ＮＣ（ｕ）を出力する。なお、収音部は６以上であってもよい。 <Sound collecting unit 4>
The six sound collection units 4-1, 4-2, 4-3, 4-4, 4-5, and 4-6 each pick up sound from a different region (s4), using the output signals of the microphone arrays 3R and 3L, each of which is made up of a plurality of microphones. They output the digital acoustic signals x_SL(u), x_SR(u), x_NL(u), x_NR(u), x_SC(u), and x_NC(u), respectively. There may be six or more sound collection units.

＜フレーム分割部１０９及び分割部１１１＞
フレーム分割部１０９及び分割部１１１は、実施例１と同様の構成である。フレーム分割部１０９は、デジタル音響信号ｘ_ＳＬ（ｕ）、ｘ_ＳＲ（ｕ）、ｘ_ＮＬ（ｕ）、ｘ_ＮＲ（ｕ）、ｘ_ＳＣ（ｕ）、ｘ_ＮＣ（ｕ）を所定時間ごとのフレームｎに分割し（ｓ１０９）、音響信号Ｘ_ＳＬ（ｎ）、Ｘ_ＳＲ（ｎ）、Ｘ_ＮＬ（ｎ）、Ｘ_ＮＲ（ｎ）、Ｘ_ＳＣ（ｎ）、Ｘ_ＮＣ（ｎ）を出力する。分割部１１１は、各フレームｎごとの音響信号を複数の周波数帯域音響信号Ｘ_ＳＬ（ω，ｎ）、Ｘ_ＳＲ（ω，ｎ）、Ｘ_ＮＬ（ω，ｎ）、Ｘ_ＮＲ（ω，ｎ）、Ｘ_ＳＣ（ω，ｎ）、Ｘ_ＮＣ（ω，ｎ）に分割し（ｓ１１１）、出力する。なお、参考文献１記載の周波数領域変換部５は、フレーム分割部１０９及び分割部１１１の機能を備える。 <Frame Dividing Unit 109 and Dividing Unit 111>
The frame dividing unit 109 and the dividing unit 111 have the same configuration as in the first embodiment. The frame dividing unit 109 divides the digital acoustic signals x_SL(u), x_SR(u), x_NL(u), x_NR(u), x_SC(u), and x_NC(u) into frames n of a predetermined duration (s109) and outputs the acoustic signals X_SL(n), X_SR(n), X_NL(n), X_NR(n), X_SC(n), and X_NC(n). The dividing unit 111 divides the acoustic signal of each frame n into a plurality of frequency band acoustic signals X_SL(ω, n), X_SR(ω, n), X_NL(ω, n), X_NR(ω, n), X_SC(ω, n), and X_NC(ω, n) (s111) and outputs them. Note that the frequency domain conversion unit 5 described in Reference 1 provides the functions of both the frame dividing unit 109 and the dividing unit 111.
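The two steps above (s109, s111) can be sketched as follows. This is only an illustrative sketch: the description does not say how the frequency bands are obtained, so a plain discrete Fourier transform is assumed here, and the function names, `frame_len`, `hop`, and the test tone are hypothetical.

```python
import cmath
import math


def split_frames(x, frame_len, hop):
    """Frame dividing unit 109 (sketch): cut x(u) into frames X(n)."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def to_bands(frame):
    """Dividing unit 111 (sketch): naive DFT giving X(omega, n) for one frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


# Illustrative input: a 1 kHz tone sampled at 8 kHz (values are assumptions).
fs = 8000
x = [math.sin(2 * math.pi * 1000 * u / fs) for u in range(64)]
frames = split_frames(x, frame_len=16, hop=8)
spectra = [to_bands(f) for f in frames]
# Strongest band of the first frame; the band spacing is fs / 16 = 500 Hz.
peak_bin = max(range(len(spectra[0])), key=lambda k: abs(spectra[0][k]))
```

With these assumed values the tone falls in band index 2 (1 kHz at 500 Hz spacing). A practical implementation would normally apply a window and use an FFT.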

＜処理対象信号生成部５４０＞
処理対象信号生成部５４０は、あらかじめ定めた１つ以上のマイクロホンまたは収音部からの信号から、処理対象信号Ｙ_ｓ（ω，ｎ）を生成する（ｓ５４０）。
＜パワースペクトル推定部５０７＞
パワースペクトル推定部５０７は、各収音部４−１,４−２,４−３，４−４，４−５，４−６で得られた各収音響信号の信号量から、所望音源の信号量と、その他の音源の信号量とを周波数ごとに推定し（ｓ５０７）、推定信号パワーベクトルＸ_ｏｐｔ（ω，ｎ）を出力する。 <Processing Target Signal Generation Unit 540>
The processing target signal generation unit 540 generates a processing target signal Y_S(ω, n) from the signals of one or more predetermined microphones or sound collection units (s540).
<Power Spectrum Estimator 507>
From the signal amounts of the collected sound signals obtained by the sound collection units 4-1, 4-2, 4-3, 4-4, 4-5, and 4-6, the power spectrum estimation unit 507 estimates, for each frequency, the signal amount of the desired sound source and the signal amounts of the other sound sources (s507), and outputs an estimated signal power vector X_opt(ω, n).

＜利得係数算出部５３０＞
図１３は利得係数算出部５３０の構成例を示す。利得係数算出部５３０は、所望音源の信号量｜Ｓ（ω，ｎ）｜^２、所望音源の信号量を含む全ての音源の信号量｜Ｓ（ω，ｎ）｜^２＋｜Ｎ_ＬＬ（ω，ｎ）｜^２＋｜Ｎ_Ｌ（ω，ｎ）｜^２＋｜Ｎ_Ｃ（ω，ｎ）｜^２＋｜Ｎ_Ｒ（ω，ｎ）｜^２＋｜Ｎ_ＲＲ（ω，ｎ）｜^２、および処理対象信号Ｙ_ｓ（ω，ｎ）から周波数ごとに利得係数を求める（ｓ５３０）。利得係数算出部５３０は、ベクトル要素抽出部８１と第１ゲイン算出部５３１と第２ゲイン算出部５３２とゲイン乗算部５３３を有する。 <Gain coefficient calculation unit 530>
FIG. 13 shows a configuration example of the gain coefficient calculation unit 530. The gain coefficient calculation unit 530 obtains a gain coefficient for each frequency (s530) from the signal amount of the desired sound source |S(ω, n)|², the signal amount of all sound sources including the desired sound source |S(ω, n)|² + |N_LL(ω, n)|² + |N_L(ω, n)|² + |N_C(ω, n)|² + |N_R(ω, n)|² + |N_RR(ω, n)|², and the processing target signal Y_S(ω, n). The gain coefficient calculation unit 530 includes a vector element extraction unit 81, a first gain calculation unit 531, a second gain calculation unit 532, and a gain multiplication unit 533.

『ベクトル要素抽出部８１』
ベクトル要素抽出部８１は、入力された推定信号パワーベクトルＸ_ｏｐｔ（ω，ｎ）を、推定信号パワー｜Ｓ（ω，ｎ）｜^２、推定左側方雑音パワー｜Ｎ_ＬＬ（ω，ｎ）｜^２、推定左方向雑音パワー｜Ｎ_Ｌ（ω，ｎ）｜^２、推定正面方向雑音パワー｜Ｎ_Ｃ（ω，ｎ）｜^２、推定右方向雑音パワー｜Ｎ_Ｒ（ω，ｎ）｜^２、推定右側方雑音パワー｜Ｎ_ＲＲ（ω，ｎ）｜^２としてそれぞれ出力する（ｓ８１）。
『第１ゲイン算出部５３１』
第１ゲイン算出部５３１は、推定信号パワー｜Ｓ（ω，ｎ）｜^２と処理対象信号Ｙ_Ｓ（ω，ｎ）から、第１ゲイン係数Ｇ_Ｓ（ω，ｎ）を次式のように計算し、出力する（ｓ５３１）。 “Vector Element Extraction Unit 81”
The vector element extraction unit 81 separates the input estimated signal power vector X_opt(ω, n) into its elements and outputs them as the estimated signal power |S(ω, n)|², the estimated left-side noise power |N_LL(ω, n)|², the estimated left-direction noise power |N_L(ω, n)|², the estimated front-direction noise power |N_C(ω, n)|², the estimated right-direction noise power |N_R(ω, n)|², and the estimated right-side noise power |N_RR(ω, n)|² (s81).
“First Gain Calculation Unit 531”
From the estimated signal power |S(ω, n)|² and the processing target signal Y_S(ω, n), the first gain calculation unit 531 calculates the first gain coefficient G_S(ω, n) according to the following equation and outputs it (s531).

『第２ゲイン算出部５３２』
第２ゲイン算出部５３２は、所望音源の信号量を含む全ての音源の信号量に対する所望音源の信号量の割合（以下、「第２ゲイン係数」という）を求める（ｓ５３２）。例えば、第２ゲイン算出部５３２は、推定信号パワー｜Ｓ（ω，ｎ）｜^２、推定左側方雑音パワー｜Ｎ_ＬＬ（ω，ｎ）｜^２、推定左方向雑音パワー｜Ｎ_Ｌ（ω，ｎ）｜^２、推定正面方向雑音パワー｜Ｎ_Ｃ（ω，ｎ）｜^２、推定右方向雑音パワー｜Ｎ_Ｒ（ω，ｎ）｜^２、推定右側方雑音パワー｜Ｎ_ＲＲ（ω，ｎ）｜^２から、第２ゲイン係数Ｇ_ＳＮＲ（ω，ｎ）を次式のように計算し、出力する。 “Second Gain Calculation Unit 532”
The second gain calculation unit 532 obtains the ratio of the signal amount of the desired sound source to the signal amount of all sound sources including the desired sound source (hereinafter referred to as the "second gain coefficient") (s532). For example, from the estimated signal power |S(ω, n)|², the estimated left-side noise power |N_LL(ω, n)|², the estimated left-direction noise power |N_L(ω, n)|², the estimated front-direction noise power |N_C(ω, n)|², the estimated right-direction noise power |N_R(ω, n)|², and the estimated right-side noise power |N_RR(ω, n)|², the second gain calculation unit 532 calculates the second gain coefficient G_SNR(ω, n) according to the following equation and outputs it.

第２ゲイン算出部５３２は、第２ゲイン係数をゲイン乗算部５３３に出力するとともに、判定部５５０に出力する。 The second gain calculation unit 532 outputs the second gain coefficient to the gain multiplication unit 533 and also outputs it to the determination unit 550.

『ゲイン乗算部５３３』
ゲイン乗算部５３３は、次式のように第１ゲイン係数Ｇ_Ｓ（ω，ｎ）と第２ゲイン係数Ｇ_ＳＮＲ（ω，ｎ）との積を利得係数Ｒ（ω，ｎ）として出力する（ｓ５３３）。
Ｒ（ω，ｎ）＝Ｇ_Ｓ（ω，ｎ）・Ｇ_ＳＮＲ（ω，ｎ） “Gain multiplier 533”
The gain multiplication unit 533 outputs the product of the first gain coefficient G_S(ω, n) and the second gain coefficient G_SNR(ω, n) as the gain coefficient R(ω, n), as in the following equation (s533).
R(ω, n) = G_S(ω, n) · G_SNR(ω, n)
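As a rough illustration of the second gain coefficient and the gain product for a single frequency band, consider the sketch below. The form of G_SNR is taken from its definition in the text as the ratio of the desired-source power to the power of all sources; the equation for G_S is not reproduced in this text, so G_S is simply passed in as a value. The function names and the small constant `eps` (added to avoid division by zero) are assumptions.

```python
def second_gain(s_pow, noise_pows, eps=1e-12):
    """G_SNR: desired-source power divided by the power of all sources."""
    total = s_pow + sum(noise_pows)
    return s_pow / (total + eps)


def gain_coefficient(g_s, g_snr):
    """R = G_S * G_SNR for one frequency band."""
    return g_s * g_snr


# Illustrative powers for |S|^2 and the five noise terms (assumed values).
g_snr = second_gain(4.0, [1.0, 1.0, 0.5, 1.0, 0.5])
r = gain_coefficient(0.8, g_snr)
```

Here the desired source carries half of the total power, so G_SNR is about 0.5 and R about 0.4. When the target area is silent, G_SNR falls toward 0, which is the property the determination unit 550 relies on.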

＜判定部５５０＞
第２ゲイン係数Ｇ_ＳＮＲ（ω，ｎ）を用いて、所定の領域から音が発生しているか否か判定する（ｓ５５０）。そして、特定の位置から音声が発生していない場合には、不明瞭化部１２０において、不明瞭化した音響信号を生成する。図１２のターゲットエリアＳに音がなければ、式（１）のＧ_ＳＮＲ（ω，ｎ）の値は全周波数成分で０に近い値になる。一方、ターゲットエリアＳに音があれば、そこにある音源の周波数成分のＧ_ＳＮＲ（ω，ｎ）のみが１に近い値になる。
そこで、例えば、以下の方法により、判定部５５０は、所定の領域（ターゲットエリアＳ）から音が発生しているか否か判定する。 <Determining unit 550>
The determination unit 550 uses the second gain coefficient G_SNR(ω, n) to determine whether sound is being generated from a predetermined region (s550). When no voice is being generated from the specific position, the obscuring unit 120 generates an obscured acoustic signal. If there is no sound in the target area S of FIG. 12, G_SNR(ω, n) of Equation (1) is close to 0 at all frequency components. If there is sound in the target area S, G_SNR(ω, n) is close to 1 only at the frequency components of the sound source located there.
The determination unit 550 therefore determines whether sound is being generated from the predetermined region (target area S) by, for example, the following method.

ターゲットエリアＳ内にある音源が音声であると限定し、音声帯域の周波数（例えば１００Ｈｚから３ｋＨｚ程度）でＧ_ＳＮＲ（ω，ｎ）を周波数方向に加算し、その値があらかじめ決めておいたしきい値を超えたときに、ターゲットエリアＳに音声があると判定する。また加算するのではなく、Ｇ_ＳＮＲ（ω，ｎ）が、あらかじめ決めておいたしきい値を超えた成分の数を合計し、それが、あらかじめ決めておいた別のしきい値を超えたときに、ターゲットエリアＳに音声があると判定してもよい。なお、しきい値は、マイクロホンアレイの特性等や部屋の音響状態で変動するため、実験的に求める。さらに、上記の値（Ｇ_ＳＮＲ（ω，ｎ）の加算値、または成分数の加算値）は、フレーム分割時間単位で刻一刻と変化するため、判定誤差等により明瞭化不明瞭化が高速に切り替わることもある。それを防ぐために、得られた値を、時間方向に移動平均するか、忘却係数により、平滑化すると良い。 The sound source in the target area S is assumed to be voice. G_SNR(ω, n) is summed in the frequency direction over the voice band (for example, about 100 Hz to 3 kHz), and when the sum exceeds a predetermined threshold, it is determined that there is voice in the target area S. Alternatively, instead of summing, the number of components whose G_SNR(ω, n) exceeds a predetermined threshold may be counted, and voice may be determined to be present in the target area S when that count exceeds another predetermined threshold. The thresholds are determined experimentally because they vary with the characteristics of the microphone array and the acoustic conditions of the room. Furthermore, because the above value (the sum of G_SNR(ω, n) or the count of components) changes from frame to frame, determination errors and the like may cause rapid switching between the clear and obscured states. To prevent this, the obtained value should be smoothed in the time direction, either by a moving average or with a forgetting factor.
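The determination described above might be sketched as follows. This is an assumption-laden illustration, not the patented implementation: the function and class names, the recursive form used for the forgetting factor, and all numeric values are hypothetical, and in practice the thresholds would be found experimentally as the text notes.

```python
def band_score(g_snr, freqs, lo=100.0, hi=3000.0):
    """Sum G_SNR over the bands whose centre frequency lies in the voice band."""
    return sum(g for g, f in zip(g_snr, freqs) if lo <= f <= hi)


def band_count(g_snr, freqs, bin_threshold, lo=100.0, hi=3000.0):
    """Alternative: count voice-band components whose G_SNR exceeds a threshold."""
    return sum(1 for g, f in zip(g_snr, freqs) if lo <= f <= hi and g > bin_threshold)


class SmoothedDetector:
    """Smooth the per-frame score with a forgetting factor, then threshold it."""

    def __init__(self, threshold, forget=0.9):
        self.threshold = threshold
        self.forget = forget
        self.state = 0.0

    def update(self, score):
        # Assumed smoothing form: state <- forget * state + (1 - forget) * score
        self.state = self.forget * self.state + (1.0 - self.forget) * score
        return self.state > self.threshold


# Illustrative frame: eight bands spaced 500 Hz apart (assumed values).
freqs = [0.0, 500.0, 1000.0, 1500.0, 2000.0, 2500.0, 3000.0, 3500.0]
g_snr = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.9]
score = band_score(g_snr, freqs)
count = band_count(g_snr, freqs, bin_threshold=0.5)
```

Feeding `score` (or `count`) into `SmoothedDetector.update` once per frame delays the switch by a few frames, which is the intended trade-off: fewer spurious toggles between clear and obscured output at the cost of a slightly slower response.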

上記方法により判定部５５０は、特定の位置から音声が発生している場合には、処理対象信号生成部５４０の生成した処理対象信号Ｙ_Ｓ（ω，ｎ）を乗算部５０９へ出力するように制御し、特定の位置から音声が発生していない場合には、処理対象信号Ｙ_Ｓ（ω，ｎ）を不明瞭化部１２０へ出力するように制御する。また、送信部５４０へ付加信号または制御信号を送信する。 Using the above method, when voice is being generated from the specific position, the determination unit 550 performs control so that the processing target signal Y_S(ω, n) generated by the processing target signal generation unit 540 is output to the multiplication unit 509. When no voice is being generated from the specific position, it performs control so that Y_S(ω, n) is output to the obscuring unit 120. The determination unit 550 also sends an additional signal or a control signal to the transmission unit 540.

＜不明瞭化部１２０＞
不明瞭化部１２０は、実施例１と同様の構成である。よって、不明瞭化部１２０において、処理対象信号Ｙ_Ｓ（ω，ｎ）を用いて、不明瞭化した音響信号｜Ｆ”（ｍ，ｎ）｜を生成する（ｓ１２０）。 <Obscuring part 120>
The obscuring unit 120 has the same configuration as in the first embodiment. The obscuring unit 120 thus generates the obscured acoustic signal |F″(m, n)| from the processing target signal Y_S(ω, n) (s120).

＜乗算部５０９＞
乗算部５０９は、処理対象信号生成部５４０から与えられる所望音源の信号を主成分とする信号Ｙｓ（ω，ｎ）に各周波数領域毎に利得係数Ｒ（ω，ｎ）乗算することにより（ｓ５０９）、所望音源１の信号を主成分とする信号に含まれる背景雑音成分を抑制することができる。乗算部５０９は信号Ｙｓ（ω，ｎ）と利得係数Ｒ（ω，ｎ）を入力され、背景雑音成分を抑圧した信号Ｙ（ω，ｎ）を出力する。 <Multiplier 509>
The multiplication unit 509 multiplies the signal Y_S(ω, n), which is supplied from the processing target signal generation unit 540 and whose main component is the signal of the desired sound source, by the gain coefficient R(ω, n) in each frequency band (s509). This suppresses the background noise component contained in that signal. The multiplication unit 509 receives the signal Y_S(ω, n) and the gain coefficient R(ω, n) and outputs a signal Y(ω, n) in which the background noise component is suppressed.

＜送信部５４０＞
送信部５４０は、判定部５５０から付加信号または制御信号を、不明瞭化部１２０から不明瞭化した音響信号｜Ｆ”（ｍ，ｎ）｜または乗算部５０９から背景雑音成分を抑圧した信号Ｙ（ω，ｎ）を入力され、｜Ｆ”（ｍ，ｎ）｜またはＹ（ω，ｎ）を音響信号受信装置６００へ送信する（ｓ５４０）。例えば、付加信号は、送信する信号が｜Ｆ”（ｍ，ｎ）｜またはＹ（ω，ｎ）を識別する信号であり、送信部５４０は、各信号に受け取った付加信号を付加して送信する。また例えば、制御信号は、｜Ｆ”（ｍ，ｎ）｜及びＹ（ω，ｎ）がそれぞれ送信用のチャネルを有する場合に、何れのチャネルにより送信するかを制御する信号である。 <Transmitter 540>
The transmission unit 540 receives the additional signal or control signal from the determination unit 550, and either the obscured acoustic signal |F″(m, n)| from the obscuring unit 120 or the noise-suppressed signal Y(ω, n) from the multiplication unit 509, and transmits |F″(m, n)| or Y(ω, n) to the acoustic signal receiving device 600 (s540). For example, the additional signal identifies whether the transmitted signal is |F″(m, n)| or Y(ω, n), and the transmission unit 540 attaches the received additional signal to each signal before transmitting it. Alternatively, when |F″(m, n)| and Y(ω, n) each have their own transmission channel, the control signal controls which channel is used for transmission.

［音響信号受信装置６００］
図１４は、音響信号受信装置６００の構成例を示す。音響信号受信装置６００の受信部６０１は、付加信号または受信チャネルにより受信した信号が｜Ｆ”（ｍ，ｎ）｜またはＹ（ω，ｎ）であるかを判別する。受信した信号が｜Ｆ”（ｍ，ｎ）｜の場合には利得算出部２０７に送信する。一方、受信した信号がＹ（ω，ｎ）の場合には時間領域変換部２１５に送信する。 [Acoustic signal receiving apparatus 600]
FIG. 14 shows a configuration example of the acoustic signal receiving device 600. The receiving unit 601 of the acoustic signal receiving device 600 determines, from the additional signal or from the channel on which the signal was received, whether the received signal is |F″(m, n)| or Y(ω, n). If the received signal is |F″(m, n)|, it is passed to the gain calculation unit 207. If the received signal is Y(ω, n), it is passed to the time domain conversion unit 215.
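The additional-signal variant of this hand-off can be pictured with the minimal sketch below. The tag values, the tuple used as a packet, and the string names returned for the downstream units are all hypothetical; the description does not specify a packet format.

```python
from enum import Enum


class Kind(Enum):
    OBSCURED = 0  # payload is |F''(m, n)|
    CLEAR = 1     # payload is Y(omega, n)


def select_for_transmission(voice_in_target, obscured, clear):
    """Transmitter side: choose the payload and tag it with the additional signal."""
    if voice_in_target:
        return (Kind.CLEAR, clear)
    return (Kind.OBSCURED, obscured)


def route(packet):
    """Receiver side: name the unit that should process the payload next."""
    kind, payload = packet
    if kind is Kind.OBSCURED:
        return ("gain_calculation_unit_207", payload)
    return ("time_domain_conversion_unit_215", payload)


packet = select_for_transmission(False, obscured=[0.2, 0.1], clear=[1 + 0j, 0j])
destination, payload = route(packet)
```

In the separate-channel variant, the tag would be replaced by the choice of channel, and the receiver would route by channel instead of by tag.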

＜効果＞
音響信号送信装置５００及び音響信号受信装置６００をこのように構成することによって、実施例１と同様の効果を得ることができる。さらに、音声の不明瞭化処理と、その処理の解除を、音源の位置の判定により自動で行うことで、切り替え忘れの防止や、スイッチ操作等の煩雑さをなくし、プライバシーに配慮しつつ、より自然なテレコミュニケーションを実現できるという効果を奏する。
なお、上記、判定部５５０による自動判定に加えて、不明瞭化処理を解除するスイッチも具備すれば、利用者の手間は増えるが確実性は増すこともできる。 <Effect>
Configuring the acoustic signal transmission device 500 and the acoustic signal receiving device 600 in this way provides the same effects as the first embodiment. Furthermore, because the voice obscuring process and its cancellation are performed automatically based on the determined position of the sound source, forgetting to switch back is prevented and the bother of switch operation is eliminated, so that more natural telecommunication can be realized while respecting privacy.
If a switch for canceling the obscuring process is provided in addition to the automatic determination by the determination unit 550, the user has more to do, but reliability can be increased.

また、処理対象信号Ｙｓ（ω，ｎ）に対し、強調処理を行っているが、強調処理を行わずに、そのまま送信してもよい。この場合、乗算部５０９を設けなくとも良く、利得算出部５３０内の第１ゲイン算出部５３１及びゲイン乗算部５３３を設けなくとも良い。なお、本実施例の場合には、エリアの外の音を完全に消す（不明瞭にする）ことは出来ていないが、話者は、自身が通話をしている状態であれば、従来の電話と同じであり、多少の周囲の音が相手に伝わったとしても不自然な通話とはならない。 Although enhancement processing is applied to the processing target signal Y_S(ω, n) here, the signal may instead be transmitted as is, without enhancement. In that case, the multiplication unit 509 can be omitted, as can the first gain calculation unit 531 and the gain multiplication unit 533 in the gain coefficient calculation unit 530. Note that this embodiment does not completely eliminate (obscure) sound from outside the area. However, while the speaker is engaged in a call, the situation is the same as with a conventional telephone, and the call does not become unnatural even if some ambient sound reaches the other party.

＜その他＞
さらに、この方法では、ターゲットエリアＳにある音声と同様な周波数帯域を持つ音源でも反応するため、これに、音声の特徴量を用いた判定方法を加えても良い。それには、例えば音源の周期性の有無を用いる方法がある。この場合、マイクロホンアレー３Ｒ，３Ｌの出力信号を判定部５５０の入力信号とする。例えば、伊藤憲三、"音声と非音声の識別処理に基づく定常雑音抑圧方式"、日本音響学会誌、Voｎ.６１,No８,P４３１-４４０,２００５（以下「参考文献２」という）のように、判定部５５０は、入力信号を線形予測分析した残差信号に対して自己相関値を算出、音声の基本周波数（例えば５０Ｈｚから３００Ｈｚ程度）の範囲内で、その最大値を探索し、その値があらかじめ決めておいたしきい値（概ね０.３以上）を超えた場合に、周期性があると判定する。それに加えて、判定部５５０は、残差信号のパワーがあらかじめ決めておいたしきい値を超えることで音声と判定する。なお、このしきい値は、例えば残差パワーの長時間平均を計測し続け、そこから例えば６ｄＢ程度大きい値に自動的に設定してもよい。 <Others>
Furthermore, because this method also responds to sound sources in the target area S that occupy a frequency band similar to that of voice, a determination based on speech features may be added to it. One such method uses the presence or absence of periodicity in the sound source. In this case, the output signals of the microphone arrays 3R and 3L are used as input signals to the determination unit 550. For example, as in Kenzo Ito, "Steady noise suppression method based on speech and non-speech discrimination processing," Journal of the Acoustical Society of Japan, Vol. 61, No. 8, pp. 431-440, 2005 (hereinafter referred to as "Reference 2"), the determination unit 550 calculates the autocorrelation of the residual signal obtained by linear prediction analysis of the input signal, searches for its maximum within the range corresponding to the fundamental frequency of speech (for example, about 50 Hz to 300 Hz), and determines that periodicity is present when that maximum exceeds a predetermined threshold (roughly 0.3 or more). In addition, the determination unit 550 determines that the sound is voice when the power of the residual signal exceeds a predetermined threshold. This power threshold may be set automatically, for example by continuously measuring the long-term average of the residual power and setting the threshold about 6 dB above it.
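The periodicity and power checks can be sketched as below. This is a simplified illustration rather than the method of Reference 2: the linear prediction residual is assumed to be available already, the autocorrelation is normalized by its zero-lag value, and the function names and sample values are hypothetical.

```python
def periodicity(residual, fs, f_lo=50.0, f_hi=300.0):
    """Peak normalized autocorrelation over lags matching f_lo..f_hi."""
    r0 = sum(v * v for v in residual)
    if r0 == 0.0:
        return 0.0
    lag_min = int(fs / f_hi)
    lag_max = min(int(fs / f_lo), len(residual) - 1)
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(residual[t] * residual[t + lag] for t in range(len(residual) - lag))
        best = max(best, r / r0)
    return best


def adaptive_threshold(long_term_avg_power, margin_db=6.0):
    """Power threshold set margin_db above the long-term average power."""
    return long_term_avg_power * 10 ** (margin_db / 10.0)


def is_voice(residual, fs, power_threshold, periodicity_threshold=0.3):
    """Voice if the residual is both strong enough and periodic enough."""
    power = sum(v * v for v in residual) / len(residual)
    return power > power_threshold and periodicity(residual, fs) > periodicity_threshold


# Illustrative residuals at fs = 8000 Hz: a 100 Hz pulse train and a single click.
fs = 8000
pulses = [1.0 if t % 80 == 0 else 0.0 for t in range(800)]
click = [1.0] + [0.0] * 799
```

The pulse train repeats every 80 samples, so its autocorrelation peaks inside the searched lag range, whereas the isolated click has no such peak and is rejected even though its power exceeds a small threshold.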

不明瞭化または明瞭化を行う際の自動判定の方法として、特許第３６７０５６２号公報（以下、「参考文献３」という）に基づく二つのマイクロホンを使った実施例を説明する。
[音響信号送信装置７００]
図１５は音響信号送信装置７００の構成例を、図１６は音響信号送信装置７００の処理フロー例を示す。音響信号送信装置７００は、ステレオ信号入力部７０１、フレーム分割部１０９、分割部１１１、類似度計算部７０４、減衰係数計算部７０５、乗算部７１６、加算部７１７、７６０、不明瞭化部１２０、判定部７５０及び送信部５４０を有する。 An embodiment using two microphones based on Japanese Patent No. 3670562 (hereinafter referred to as “Reference 3”) will be described as a method of automatic determination when obscuring or clarifying.
[Acoustic signal transmitter 700]
FIG. 15 shows a configuration example of the acoustic signal transmission device 700, and FIG. 16 shows a processing flow example of the acoustic signal transmission device 700. The acoustic signal transmission apparatus 700 includes a stereo signal input unit 701, a frame division unit 109, a division unit 111, a similarity calculation unit 704, an attenuation coefficient calculation unit 705, a multiplication unit 716, addition units 717 and 760, an obscuring unit 120, A determination unit 750 and a transmission unit 540 are included.

図１７は音響信号送信装置に信号を入力するマイクロホン３の配置例を示す。２つのマイクロホン３を明瞭化したい音源（話者）位置と対称になる位置に配置する。
なお、ステレオ信号入力部７０１、類似度計算部７０４、減衰係数計算部７０５、乗算部７１６、加算部７１７の処理内容は参考文献３に詳しく記載されている。またフレーム分割部１０９、分割部１１１及び不明瞭化部１２０は実施例１と、送信部５４０及び音響信号受信装置６００は実施例２と同様の構成である。これらについて本実施例では概要を説明する。 FIG. 17 shows an arrangement example of the microphones 3 for inputting signals to the acoustic signal transmission device. The two microphones 3 are arranged at positions symmetrical to the sound source (speaker) position to be clarified.
The processing contents of the stereo signal input unit 701, the similarity calculation unit 704, the attenuation coefficient calculation unit 705, the multiplication unit 716, and the addition unit 717 are described in detail in Reference Document 3. The frame dividing unit 109, the dividing unit 111, and the obscuring unit 120 have the same configurations as those in the first embodiment, and the transmission unit 540 and the acoustic signal receiving device 600 have the same configurations as those in the second embodiment. In this embodiment, the outline will be described.

＜ステレオ信号入力部７０１＞
音響信号送信装置７００は、ステレオ信号入力部７０１を介して、ステレオ信号を入力される（ｓ７０１）。このステレオ信号は、左右のチャネルごとに処理される。
＜フレーム分割部１０９及び分割部１１１＞
フレーム分割部１０９及び分割部１１１は、実施例１と同様の構成である。フレーム分割部１０９は、デジタル音響信号ｘ_R（ｕ）、ｘ_L（ｕ）を所定時間ごとのフレームｎに分割し（ｓ１０９）、音響信号Ｘ_R（ｎ）、Ｘ_L（ｎ）を出力する。分割部１１１は、各フレームｎごとの音響信号を複数の周波数帯域音響信号Ｘ_R（ω，ｎ）、Ｘ_L（ω，ｎ）に分割し（ｓ１１１）、出力する。なお、参考文献３記載の左チャネル周波数帯域分割部１０３及び右ちぇネル周波数帯域分割部１０４は、フレーム分割部１０９及び分割部１１１の機能を備える。 <Stereo signal input unit 701>
The acoustic signal transmitting apparatus 700 receives a stereo signal via the stereo signal input unit 701 (s701). This stereo signal is processed for each of the left and right channels.
<Frame Dividing Unit 109 and Dividing Unit 111>
The frame dividing unit 109 and the dividing unit 111 have the same configuration as in the first embodiment. The frame dividing unit 109 divides the digital acoustic signals x_R(u) and x_L(u) into frames n of a predetermined duration (s109) and outputs the acoustic signals X_R(n) and X_L(n). The dividing unit 111 divides the acoustic signal of each frame n into a plurality of frequency band acoustic signals X_R(ω, n) and X_L(ω, n) (s111) and outputs them. Note that the left channel frequency band division unit 103 and the right channel frequency band division unit 104 described in Reference 3 provide the functions of the frame dividing unit 109 and the dividing unit 111.

＜類似度計算部７０４＞
類似度計算部７０４において、Ｘ_R（ω，ｎ）、Ｘ_L（ω，ｎ）は、同じ周波数帯域ごとに類似度a(ω，ｎ)が計算される（ｓ７０４）。例えば、ａ（ω，ｎ）は、ａｉ（ω，ｎ）とａｐ（ω，ｎ）からなり、ａｉ（ω，ｎ）及びａｐ（ω，ｎ）は、以下の式により求める。 <Similarity calculation unit 704>
The similarity calculation unit 704 calculates the similarity a(ω, n) between X_R(ω, n) and X_L(ω, n) for each frequency band (s704). For example, a(ω, n) consists of ai(ω, n) and ap(ω, n), which are obtained by the following equations.

＜減衰係数計算部７０５＞
減衰係数計算部７０５は、各周波数帯域ごとに計算された類似度a(ω，ｎ)に基づき各周波数帯域ごとに減衰係数g(ω，ｎ)を算出する（ｓ７０５）。
＜乗算部７１６＞
乗算部１１６は、減衰係数ｇ（ω，ｎ）を各チャネル各周波数帯域のＸ_R（ω，ｎ）、Ｘ_L（ω，ｎ）に乗じ（ｓ７１６）、ターゲットエリアＳ以外から発生する響信号を抑圧した信号Ｘ’_R（ω，ｎ）、Ｘ’_L（ω，ｎ）を出力する。 <Attenuation coefficient calculation unit 705>
The attenuation coefficient calculation unit 705 calculates an attenuation coefficient g(ω, n) for each frequency band based on the similarity a(ω, n) calculated for that band (s705).
<Multiplier 716>
The multiplication unit 716 multiplies X_R(ω, n) and X_L(ω, n) of each channel and each frequency band by the attenuation coefficient g(ω, n) (s716), and outputs the signals X′_R(ω, n) and X′_L(ω, n), in which acoustic signals originating outside the target area S are suppressed.

＜加算部７１７及び７６０＞
加算部７１７は、左右のチャネル信号Ｘ’_R（ω，ｎ）、Ｘ’_L（ω，ｎ）を加算し（ｓ７１７）、モノラル化し、Ｙ（ω，ｎ）を算出する。
一方、加算部７６０は、左右のチャネル信号Ｘ_R（ω，ｎ）、Ｘ_L（ω，ｎ）を加算し（ｓ７６０）、モノラル化し、Ｘ（ω，ｎ）を算出する。
＜不明瞭化部１２０＞
不明瞭化部１２０は、実施例１と同様の構成である。よって、不明瞭化部１２０において、Ｘ（ω，ｎ）を用いて、不明瞭化した音響信号｜Ｆ”（ｍ，ｎ）｜を生成する（ｓ１２０）。 <Adding units 717 and 760>
The adding unit 717 adds the left and right channel signals X′_R(ω, n) and X′_L(ω, n) (s717) to form a monaural signal Y(ω, n).
The adding unit 760, on the other hand, adds the left and right channel signals X_R(ω, n) and X_L(ω, n) (s760) to form a monaural signal X(ω, n).
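The multiplication and the two additions can be sketched for one frame as follows. The attenuation coefficients are simply given as inputs here, since their calculation from the similarity is defined in Reference 3 and not reproduced in this text; the function name and the sample values are hypothetical.

```python
def suppress_and_downmix(x_r, x_l, g):
    """Return (Y, X) for one frame of per-band complex spectra.

    Y: sum of the attenuated channels (units 716 and 717).
    X: sum of the unprocessed channels (unit 760).
    """
    xp_r = [gk * r for gk, r in zip(g, x_r)]   # X'_R = g * X_R
    xp_l = [gk * l for gk, l in zip(g, x_l)]   # X'_L = g * X_L
    y = [a + b for a, b in zip(xp_r, xp_l)]    # Y = X'_R + X'_L
    x = [a + b for a, b in zip(x_r, x_l)]      # X = X_R + X_L
    return y, x


# Two bands: the first is kept (g = 1.0), the second is attenuated (g = 0.5).
y, x = suppress_and_downmix([1 + 1j, 2 + 0j], [1 - 1j, 0 + 2j], [1.0, 0.5])
```

Y is the candidate for clear transmission, while X, which still contains sound from outside the target area, is what the obscuring unit 120 processes.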
<Obscuring part 120>
The obscuring unit 120 has the same configuration as in the first embodiment. The obscuring unit 120 thus generates the obscured acoustic signal |F″(m, n)| from X(ω, n) (s120).

＜判定部７５０＞
判定部７５０は、類似度a(ω，ｎ)を用いて、特定の領域から音が発生しているか否か判定する（ｓ７５０）。図１８に判定部７５０の処理フロー例を示す。
左右２つのマイクロホンと対称の位置、即ちターゲットエリアＳにある音響信号は、２つのマイクロホン３に同位相かつ同パワーで入力される。まず初期設定を行い（ｓ７５１）、式（２）によって求めるａｉ（ω，ｎ）の値が１に近いあらかじめ１に近いあらかじめ決めておいたしきい値の範囲内（例えば、１−ｋ１≦ａｉ（ω，ｎ）≦１＋ｋ１、例えば、ｋ１＝０．０５とする）であり（ｓ７５２）、かつ、式（３）によって求めるａｐ（ω，ｎ）の値が１に近いあらかじめ決めておいたしきい値の範囲内（例えば、１−ｋ２≦ａｐ（ω，ｎ）≦１＋ｋ２、例えば、ｋ２＝０．０５とする）であった周波数成分の信号を（ｓ７５３）、中央付近にある音源の周波数成分と判定する。全ての周波数について上記判定を行う（ｓ７５５、ｓ７５６）。 <Determining unit 750>
The determination unit 750 uses the similarity a(ω, n) to determine whether sound is being generated from a specific region (s750). FIG. 18 shows an example of the processing flow of the determination unit 750.
An acoustic signal from a position symmetric with respect to the two left and right microphones, that is, from the target area S, arrives at the two microphones 3 with the same phase and the same power. Initial settings are made first (s751). A frequency component is then judged to belong to a sound source near the center when the value of ai(ω, n) obtained by Equation (2) lies within a predetermined range close to 1 (for example, 1−k1 ≤ ai(ω, n) ≤ 1+k1 with k1 = 0.05) (s752) and the value of ap(ω, n) obtained by Equation (3) lies within a predetermined range close to 1 (for example, 1−k2 ≤ ap(ω, n) ≤ 1+k2 with k2 = 0.05) (s753). This judgment is made for all frequencies (s755, s756).

そして、その判定を受けた周波数成分の数をカウントし（ｓ７５４）、その数ｃｎｔがあらかじめ決めておいたしきい値ｋ３（例えば全周波数成分の３０％）以上であった場合に、特定の領域から音が発生していると判定する（ｓ７５７、ｓ７５８）。ｋ３未満であった場合には、特定の領域から音が発生していないと判定する（ｓ７５９）。判定後の処理は判定部５５０と同様である。
＜送信部５４０＞
送信部５４０は、判定部５５０から付加信号または制御信号を、不明瞭化部１２０から出力された不明瞭化した音響信号｜Ｆ”（ｍ，ｎ）｜または乗算部７１７から出力された信号Ｙ（ω，ｎ）を音響信号受信装置６００へ送信する（ｓ７６１、ｓ７６２）。 The frequency components so judged are then counted (s754), and when the count cnt is equal to or greater than a predetermined threshold k3 (for example, 30% of all frequency components), it is determined that sound is being generated from the specific region (s757, s758). When cnt is less than k3, it is determined that no sound is being generated from the specific region (s759). The processing after the determination is the same as for the determination unit 550.
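The decision flow of FIG. 18 as described above can be sketched like this. The similarity values ai and ap are taken as inputs because Equations (2) and (3) are not reproduced in this text; the function names and sample values are hypothetical, while k1, k2, and the 30% ratio follow the examples given in the description.

```python
def count_center_components(ai, ap, k1=0.05, k2=0.05):
    """Count components whose ai and ap both lie within the ranges around 1."""
    cnt = 0
    for a_i, a_p in zip(ai, ap):
        if 1 - k1 <= a_i <= 1 + k1 and 1 - k2 <= a_p <= 1 + k2:
            cnt += 1
    return cnt


def sound_in_target_area(ai, ap, k1=0.05, k2=0.05, k3_ratio=0.3):
    """True when at least k3_ratio of all components are judged central."""
    k3 = k3_ratio * len(ai)
    return count_center_components(ai, ap, k1, k2) >= k3


# Ten frequency components (assumed values).
ai = [1.0, 1.02, 0.98, 1.03, 0.5, 0.6, 1.4, 0.2, 0.7, 1.5]
ap_hit = [1.0, 0.97, 1.01, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
ap_miss = [1.0, 0.97, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```

With `ap_hit`, four of the ten components pass both range checks and the target area is judged active; with `ap_miss`, only two pass and it is judged silent.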
<Transmitter 540>
The transmission unit 540 receives the additional signal or control signal from the determination unit 750 and transmits either the obscured acoustic signal |F″(m, n)| output from the obscuring unit 120 or the signal Y(ω, n) output from the adding unit 717 to the acoustic signal receiving device 600 (s761, s762).

＜効果＞
このような構成とすることによって、実施例１及び実施例２と同様の効果を得ることができる。この実施例３の場合は、マイクロホンの数が２個で済み、演算量が少ないという利点はある。しかし、原理的に、実施例２のように、ピンポイントで音源の位置を判定することは出来ない。つまり、２つのマイクロホンから対称の位置にある音源は、マイクロホンからの距離に関わらず明瞭に伝送される音として判定される。ただし、マイクロホンから離れた位置にある音源信号には、壁等からの反射音がより加わり、左右の位相差の乱れが大きくなることと、距離減衰により音のパワーが徐々に小さくなるため、実用上は、図１７にあるような楕円の位置が明瞭化される範囲となる。 <Effect>
Such a configuration provides the same effects as the first and second embodiments. The third embodiment has the advantages that only two microphones are needed and the amount of computation is small. In principle, however, it cannot pinpoint the position of the sound source as the second embodiment can. That is, any sound source located symmetrically with respect to the two microphones is judged to be a sound to be transmitted clearly, regardless of its distance from the microphones. In practice, however, a source far from the microphones picks up more reflections from walls and the like, which increasingly disturbs the left-right phase difference, and its power gradually falls with distance. The region that is actually transmitted clearly is therefore the elliptical area shown in FIG. 17.

＜その他＞
なお、参考文献２記載の音声の特徴量を判定に加えてもよいことは、同様である。また、特定の領域以外から発生する音を抑圧せずに音響信号受信装置６００に送信してもよい。その場合、乗算部７１６及び減衰係数計算部７０５を設けなくともよい。また、判定部７５０が、分割部１１１から出力される信号Ｘ_Ｒ（ω，ｎ）、Ｘ_Ｌ（ω，ｎ）を乗算部７１６及び加算部７６０の何れか一方に出力するように制御してもよい。 <Others>
As before, the speech features described in Reference 2 may be added to the determination. Sound originating outside the specific region may also be transmitted to the acoustic signal receiving device 600 without being suppressed. In that case, the multiplication unit 716 and the attenuation coefficient calculation unit 705 can be omitted. The determination unit 750 may also perform control so that the signals X_R(ω, n) and X_L(ω, n) output from the dividing unit 111 are output to only one of the multiplication unit 716 and the adding unit 760.

２つのマイクロホン対、つまり４つのマイクロホンと、実施例３で説明した類似度計算部を２つ用いて、実施例３に比べより狭い特定領域において音が発生するか否かを判定する実施例について説明する。 An embodiment is described that uses two microphone pairs, that is, four microphones, together with two of the similarity calculation units described in the third embodiment, to determine whether sound is being generated in a specific region narrower than that of the third embodiment.

［音響信号送信装置８００］
図１９は音響信号送信装置８００の構成例を示す。音響信号送信装置８００は、第１類似度計算手段８１０、第２類似度計算手段８１１、加算部８１７、８６０、判定部８５０、不明瞭化部１２０、送信部５４０を有する。なお、不明瞭化部１２０は実施例１と、送信部５４０及び音響信号受信装置６００は実施例２と同様の構成である。
図２０は音響信号送信装置に信号を入力するマイクロホン３_Ｒ１、３_Ｌ１、３_Ｒ２、３_Ｌ２の配置例を示す。各マイクロホン対３_Ｒ１、３_Ｌ１と３_Ｒ２、３_Ｌ２の中央の重なり合う部分をターゲットエリアＳとする。 [Acoustic signal transmission device 800]
FIG. 19 shows a configuration example of the acoustic signal transmission device 800. The acoustic signal transmission device 800 includes first similarity calculation means 810, second similarity calculation means 811, adding units 817 and 860, a determination unit 850, an obscuring unit 120, and a transmission unit 540. The obscuring unit 120 has the same configuration as in the first embodiment, and the transmission unit 540 and the acoustic signal receiving device 600 have the same configurations as in the second embodiment.
FIG. 20 shows an arrangement example of the microphones 3_R1, 3_L1, 3_R2, and 3_L2 that input signals to the acoustic signal transmission device. The region where the center areas of the microphone pairs (3_R1, 3_L1) and (3_R2, 3_L2) overlap is the target area S.

＜第１類似度計算手段８１０及び第２類似度計算手段８１１＞
第１類似度計算手段８１０は、ステレオ信号入力部７０１、フレーム分割部１０９、分割部１１１及び第１類似度計算部８０４を備える。第２類似度計算手段８１１は、第１類似度計算手段８１０と同様の構成を有し、第１類似度計算部８０４に代えて、図示しない第２類似度計算部を備える。なお、フレーム分割部１０９及び分割部１１１は実施例１と、送信部５４０及び音響信号受信装置６００は実施例２と同様の構成である。第１類似度計算部８０４及び第２類似度計算部は実施例３の類似度計算部７０４と同様の構成である。 <First Similarity Calculation Means 810 and Second Similarity Calculation Means 811>
The first similarity calculation means 810 includes a stereo signal input unit 701, a frame dividing unit 109, a dividing unit 111, and a first similarity calculation unit 804. The second similarity calculation means 811 has the same configuration as the first similarity calculation means 810, except that it includes a second similarity calculation unit (not shown) in place of the first similarity calculation unit 804. The frame dividing unit 109 and the dividing unit 111 have the same configurations as in the first embodiment, and the transmission unit 540 and the acoustic signal receiving device 600 have the same configurations as in the second embodiment. The first similarity calculation unit 804 and the second similarity calculation unit have the same configuration as the similarity calculation unit 704 of the third embodiment.

『第１類似度計算部８０４』
第１類似度計算部８０４は、マイクロホン３_Ｒ１及び３_Ｌ１の出力信号を用いて得られる周波数帯域音響信号Ｘ_R１（ω，ｎ）、Ｘ_L１（ω，ｎ）を入力され、第１類似度ａ１（ω，ｎ）を求める。同様に第２類似度計算部は、マイクロホン３_Ｒ２及び３_Ｌ２の出力信号を用いて得られる周波数帯域音響信号Ｘ_R２（ω，ｎ）、Ｘ_L２（ω，ｎ）を入力され、第２類似度ａ２（ω，ｎ）を求める。 “First similarity calculation unit 804”
The first similarity calculation unit 804 receives the frequency band acoustic signals X_R1(ω, n) and X_L1(ω, n) obtained from the output signals of the microphones 3_R1 and 3_L1, and obtains the first similarity a1(ω, n). Similarly, the second similarity calculation unit receives the frequency band acoustic signals X_R2(ω, n) and X_L2(ω, n) obtained from the output signals of the microphones 3_R2 and 3_L2, and obtains the second similarity a2(ω, n).

＜判定部８５０＞
判定部は、第１類似度ａ１（ω，ｎ）及び第２類似度ａ２（ω，ｎ）を用いて、特定の領域から音が発生しているか否か判定する。具体的には、各類似度ａ１，ａ２を用いて、図１８に記載される判定処理を行い、何れの類似度に対しても所定の領域から音が発生していると判定された場合のみ（ｓ７５７）、図２０のターゲットエリアＳから音が発生していると判定する。判定後の処理は判定部５５０と同様である。 <Determining unit 850>
The determination unit 850 uses the first similarity a1(ω, n) and the second similarity a2(ω, n) to determine whether sound is being generated from the specific region. Specifically, the determination process shown in FIG. 18 is performed for each of the similarities a1 and a2, and only when sound is determined to be generated from the predetermined region for both similarities (s757) is it determined that sound is being generated from the target area S of FIG. 20. The processing after the determination is the same as for the determination unit 550.

<Effect>
With this configuration, the same effects as in the first, second, and third embodiments can be obtained. Compared with the second embodiment, this configuration has the advantage that the computation is lighter and only four microphones are needed. Compared with the third embodiment, it has the advantage that the specific region can be made narrower.
<Adding unit 817 and adding unit 860>
The adding unit 817 and the adding unit 860 each receive the frequency band acoustic signals X_R1(ω, n), X_L1(ω, n), X_R2(ω, n), and X_L2(ω, n), and output their sum as Y(ω, n) and X(ω, n), respectively. In this case, Y(ω, n) and X(ω, n) are identical.
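Written as an equation, the sum of the four frequency band acoustic signals reads as follows. This only restates the sentence above in the text's own symbols; no weighting or normalization is assumed.

```latex
Y(\omega, n) = X(\omega, n)
  = X_{R1}(\omega, n) + X_{L1}(\omega, n) + X_{R2}(\omega, n) + X_{L2}(\omega, n)
```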

<Others>
As in the third embodiment, an attenuation coefficient calculation unit may be provided. In that case, two attenuation coefficient calculation units are provided so that an attenuation coefficient is obtained from each of the similarity calculation means 810 and 811, and the output signal of each similarity calculation means is multiplied by the corresponding attenuation coefficient.
The acoustic signal transmitting device and the acoustic signal receiving device described above can also be implemented by a computer. In that case, a program for causing the computer to function as the target device (a device having the functional configuration shown in the figures of the embodiments), or a program for causing the computer to execute each step of the processing procedure (as shown in each embodiment), is loaded into the computer from a recording medium such as a CD-ROM, a magnetic disk, or a semiconductor storage device, or downloaded via a communication line, and is then executed.

100, 300, 500, 700, 800 Acoustic signal transmitting device
200, 400, 600 Acoustic signal receiving device
103, 203 Storage unit
105, 205 Control unit
107 ADC
109, 209 Frame dividing unit
111, 211 Dividing unit
120, 320 Obscuring unit
321, 421 Filter bank
121 Frequency direction smoothing unit
123 Dynamic range compression unit
125 Time direction smoothing unit
130, 540 Transmission unit
201, 601 Reception unit
207, 327 Gain calculation unit
213 Multiplication unit
215 Time domain conversion unit
217 DAC
219 Amplifier
415 Frame synthesis unit
4-1 First sound collection unit
4-2 Second sound collection unit
4-3 Third sound collection unit
4-4 Fourth sound collection unit
4-5 Fifth sound collection unit
4-6 Sixth sound collection unit
540 Processing target signal generation unit
507 Power spectrum estimation unit
530 Gain coefficient calculation unit
509 Multiplication unit
550, 750, 850 Determination unit
704 Similarity calculation unit
804 First similarity calculation unit
810 First similarity calculation means
811 Second similarity calculation means

Claims

An acoustic signal transmitting device comprising:
a frame dividing unit that divides an acoustic signal collected by a microphone into frames at predetermined time intervals;
a dividing unit that divides the acoustic signal of each frame into a plurality of frequency band acoustic signals; and
an obscuring unit that obscures the acoustic signal divided for each frame,
wherein the obscuring unit comprises, connected in cascade:
a frequency direction smoothing unit that obtains a frequency-smoothed power of an input signal;
a dynamic range compression unit that compresses fluctuations in the power of the input signal; and
a time direction smoothing unit that smooths the input signal over a time longer than the frame length.

An acoustic signal transmitting device comprising:
a frame dividing unit that divides an acoustic signal collected by a microphone into frames at predetermined time intervals; and
an obscuring unit that obscures the acoustic signal divided for each frame,
wherein the obscuring unit comprises:
a filter bank that divides the acoustic signal of each frame into a plurality of frequency band acoustic signals and obtains a frequency-smoothed power;
a dynamic range compression unit that compresses fluctuations in the power of the input signal; and
a time direction smoothing unit that smooths the input signal over a time longer than the frame length.

The acoustic signal transmitting device according to claim 1 or 2, further comprising
a determination unit that determines whether sound is generated from a specific position,
wherein the obscuring unit generates an obscured acoustic signal when no sound is generated from the specific position.

The acoustic signal transmitting device according to claim 3, further comprising:
six or more sound collection units that collect sound from mutually different areas using output signals of a microphone array composed of a plurality of microphones;
a processing target signal generation unit that generates a processing target signal from the signals of one or more predetermined microphones or sound collection units;
a power spectrum estimation unit that estimates, for each frequency, the signal amount of a desired sound source and the signal amount of the other sound sources from the signal amount of each collected sound signal obtained by the sound collection units; and
a second gain calculation unit that obtains the ratio of the signal amount of the desired sound source to the signal amount of all sound sources including the desired sound source (hereinafter referred to as the "second gain coefficient"),
wherein the determination unit uses the second gain coefficient to determine whether sound is generated from a predetermined area.

The acoustic signal transmitting device according to claim 3, further comprising
a similarity calculation unit that calculates, for each frequency band, the similarity between channels using acoustic signals collected by a two-channel microphone,
wherein the determination unit uses the similarity to determine whether sound is generated from a specific region.

The acoustic signal transmitting device according to claim 3, further comprising:
a first similarity calculation unit that calculates, for each frequency band, the similarity between channels using acoustic signals collected by a two-channel microphone; and
a second similarity calculation unit that calculates, for each frequency band, the similarity between channels using acoustic signals collected by a two-channel microphone different from the two-channel microphone of the first similarity calculation unit,
wherein the determination unit uses the similarities obtained from the first similarity calculation unit and the second similarity calculation unit to determine whether sound is generated from a specific region.

An acoustic signal receiving device that receives an obscured acoustic signal, comprising:
a gain calculation unit that calculates a gain for each frequency using the obscured acoustic signal;
a frame dividing unit that divides a predetermined stationary carrier signal into frames at predetermined time intervals;
a dividing unit that divides the carrier signal of each frame into signals of a plurality of frequency bands;
a multiplication unit that multiplies the signals of the frequency bands by the gain; and
a time domain conversion unit that converts the values obtained by the multiplication unit into a time domain signal.

An acoustic signal transmission method comprising:
a frame dividing step of dividing an acoustic signal collected by a microphone into frames at predetermined time intervals;
a dividing step of dividing the acoustic signal of each frame into a plurality of frequency band acoustic signals; and
an obscuring step of obscuring the acoustic signal divided for each frame,
wherein the obscuring step comprises, performed in cascade:
a frequency direction smoothing step of obtaining a frequency-smoothed power of an input signal;
a dynamic range compression step of compressing fluctuations in the power of the input signal; and
a time direction smoothing step of smoothing the input signal over a time longer than the frame length.

The acoustic signal transmission method according to claim 8, further comprising
a determination step of determining whether sound is generated from a specific position,
wherein the obscuring step generates an obscured acoustic signal when no sound is generated from the specific position.

A program for causing a computer to function as the acoustic signal transmitting device or the acoustic signal receiving device according to any one of claims 1 to 7.