JP2011075694A - Sound processing device and program

Info

Publication number: JP2011075694A
Application number: JP2009225287A
Authority: JP
Inventors: Hiromi Aoyanagi; 弘美青柳
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2009-09-29
Filing date: 2009-09-29
Publication date: 2011-04-14
Anticipated expiration: 2029-09-29
Also published as: JP5359744B2

Abstract

PROBLEM TO BE SOLVED: To provide a sound processing device capable of improving the intelligibility of a sound signal by improving the signal-to-noise ratio while maintaining natural audibility.

SOLUTION: The invention relates to a device that improves the intelligibility of a sound signal by improving the signal-to-noise ratio through suppression of noise components and emphasis of sound components. The device holds a noise suppression lower limit value for each frequency band and a sound emphasis upper limit value for each frequency band, both determined on the basis of human auditory characteristics, and discriminates, for each frequency band, whether the band is a sound component band or a noise component band. For a frequency band determined to be a noise component, the gain of the sound signal being processed is controlled so as not to fall below the noise suppression lower limit value of that band; for a frequency band determined to be a sound component, the gain is controlled so as not to exceed the sound emphasis upper limit value of that band.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a sound processing device and program, and can be applied, for example, to improving the intelligibility of captured speech or of speech output for playback in a telephone terminal.

Speech on which noise is superimposed suffers from reduced intelligibility, making the content of an utterance difficult to hear. To address this, techniques known as noise cancellers and noise suppressors have long existed, as have speech enhancement techniques such as the one disclosed in Patent Document 1. The former improve intelligibility by removing or suppressing the noise portion of noisy speech, thereby raising the speech-to-noise ratio (SN ratio). The latter improve intelligibility by emphasizing the speech portion, thereby raising the speech-to-noise ratio.

Patent Document 1: JP-A-6-208395

However, the former approach gives no particular consideration to the frequency characteristics of the noise that remains after removal or suppression; although the noise level drops, unnatural sounds can arise and degrade perceived quality instead. The latter approach can excessively alter the frequency characteristics of the original speech and increase the sense of distortion.

The present invention has been made in view of the above problems, and aims to provide a sound processing device and program that improve the signal-to-noise ratio while maintaining natural audibility, thereby improving the intelligibility of a sound signal (a speech signal or an acoustic signal).

A first aspect of the present invention is a sound processing device that improves the intelligibility of a sound signal by improving the signal-to-noise ratio through suppression of noise components, the device comprising: (1) suppression lower limit holding means that holds a noise suppression lower limit value for each frequency band, determined on the basis of human auditory characteristics; (2) signal/noise determination means that determines, for each frequency band, whether the band is a noise component band; and (3) gain control means that, for a frequency band determined to be a noise component, controls the gain of the sound signal being processed so that it does not fall below the noise suppression lower limit value of that frequency band.

A second aspect of the present invention is a sound processing device that improves the intelligibility of a sound signal by improving the signal-to-noise ratio through emphasis of sound components, the device comprising: (1) emphasis upper limit holding means that holds a sound emphasis upper limit value for each frequency band, determined on the basis of human auditory characteristics; (2) signal/noise determination means that determines, for each frequency band, whether the band is a sound component band; and (3) gain control means that, for a frequency band determined to be a sound component, controls the gain of the sound signal being processed so that it does not exceed the sound emphasis upper limit value of that frequency band.

A third aspect of the present invention is a sound processing device that improves the intelligibility of a sound signal by improving the signal-to-noise ratio through suppression of noise components and emphasis of sound components, the device comprising: (1) suppression lower limit holding means that holds a noise suppression lower limit value for each frequency band, determined on the basis of human auditory characteristics; (2) emphasis upper limit holding means that holds a sound emphasis upper limit value for each frequency band, determined on the basis of human auditory characteristics; (3) signal/noise determination means that determines, for each frequency band, whether the band is a sound component band or a noise component band; and (4) gain control means that, for a frequency band determined to be a noise component, controls the gain of the sound signal being processed so that it does not fall below the noise suppression lower limit value of that frequency band and, for a frequency band determined to be a sound component, controls the gain of the sound signal being processed so that it does not exceed the sound emphasis upper limit value of that frequency band.

A sound processing program according to a fourth aspect of the present invention causes a computer to function as: (1) suppression lower limit holding means that holds a noise suppression lower limit value for each frequency band, determined on the basis of human auditory characteristics; (2) signal/noise determination means that determines, for each frequency band, whether the band is a noise component band; and (3) gain control means that, for a frequency band determined to be a noise component, controls the gain of the sound signal being processed so that it does not fall below the noise suppression lower limit value of that frequency band; (4) thereby improving the intelligibility of the sound signal by improving the signal-to-noise ratio through suppression of noise components.

A sound processing program according to a fifth aspect of the present invention causes a computer to function as: (1) emphasis upper limit holding means that holds a sound emphasis upper limit value for each frequency band, determined on the basis of human auditory characteristics; (2) signal/noise determination means that determines, for each frequency band, whether the band is a sound component band; and (3) gain control means that, for a frequency band determined to be a sound component, controls the gain of the sound signal being processed so that it does not exceed the sound emphasis upper limit value of that frequency band; (4) thereby improving the intelligibility of the sound signal by improving the signal-to-noise ratio through emphasis of sound components.

A sound processing program according to a sixth aspect of the present invention causes a computer to function as: (1) suppression lower limit holding means that holds a noise suppression lower limit value for each frequency band, determined on the basis of human auditory characteristics; (2) emphasis upper limit holding means that holds a sound emphasis upper limit value for each frequency band, determined on the basis of human auditory characteristics; (3) signal/noise determination means that determines, for each frequency band, whether the band is a sound component band or a noise component band; and (4) gain control means that, for a frequency band determined to be a noise component, controls the gain of the sound signal being processed so that it does not fall below the noise suppression lower limit value of that frequency band and, for a frequency band determined to be a sound component, controls the gain so that it does not exceed the sound emphasis upper limit value of that frequency band; (5) thereby improving the intelligibility of the sound signal by improving the signal-to-noise ratio through suppression of noise components and emphasis of sound components.

According to the present invention, the signal-to-noise ratio can be improved while maintaining natural audibility, and the intelligibility of a sound signal can be improved.

FIG. 1 is a block diagram showing the functional configuration of the speech processing device 100 of the first embodiment.
FIG. 2 is a block diagram showing the detailed configuration of the gain changing circuit in FIG. 1.
FIG. 3 is an explanatory diagram showing the frequency characteristics of the input and output signals of the gain changing circuit group in FIG. 1.
FIG. 4 is a block diagram showing the functional configuration of the speech processing device 100 of the second embodiment.
FIG. 5 is a block diagram showing the detailed configuration of the gain changing circuit in FIG. 4.
FIG. 6 is an explanatory diagram of changing the set of speech enhancement upper limit and noise suppression lower limit values in the gain changing circuit in FIG. 4.

(A) First Embodiment
A first embodiment, in which the sound processing device and program according to the present invention are applied to speech processing, is described in detail below with reference to the drawings. The sound processing device (speech processing device) of the first embodiment is placed, for example, at a position where it processes the near-end speech signal captured by the microphone of a telephone terminal and converted into a digital signal, or at a position where it processes the far-end speech signal, a digital signal to be supplied to the speaker.

(A-1) Configuration of the First Embodiment
FIG. 1 is a block diagram showing the functional configuration of the speech processing device 100 of the first embodiment. For example, a softphone is built by installing a softphone application on a computer; the sound processing program (speech processing program) of the first embodiment may be applied as a part of that softphone application to build the speech processing device 100 of the first embodiment. Even in this case, the functional configuration can be represented by FIG. 1.

In FIG. 1, the speech processing device 100 of the first embodiment includes a voiced/silent detection circuit 101, a frequency band division circuit 102, gain changing circuits 103-1 to 103-M, and a frequency band synthesis circuit 104.

A speech signal s(n), a digital signal on which noise is superimposed (where n is a parameter representing time), is input to the voiced/silent detection circuit 101 and the frequency band division circuit 102.

The voiced/silent detection circuit 101 determines whether the input speech signal s(n) is voiced (a voiced section) or silent (a silent section). Any existing determination method may be used. For example, the circuit holds a signal power threshold for the determination and outputs to the gain changing circuits 103-1 to 103-M the result "voiced" (V = 1) when the power of the input speech signal s(n) exceeds the threshold, and "silent" (V = 0) otherwise.
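The power-threshold example above can be sketched as follows. This is a minimal illustration, not the patent's implementation: frame-based processing, the mean-square definition of power, and the names `frame_power` and `detect_voice` are all assumptions introduced here.

```python
def frame_power(frame):
    """Mean-square power of one frame of samples (assumed definition of power)."""
    return sum(x * x for x in frame) / len(frame)


def detect_voice(frame, power_threshold):
    """Return V = 1 (voiced) if the frame power exceeds the threshold, else V = 0."""
    return 1 if frame_power(frame) > power_threshold else 0


# Hypothetical frames: one clearly above and one clearly below the threshold.
loud = [0.5, -0.4, 0.6, -0.5]
quiet = [0.01, -0.02, 0.01, 0.0]
print(detect_voice(loud, 0.01), detect_voice(quiet, 0.01))
```

The threshold value itself would have to be tuned for the signal scale in use; the text does not specify one.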

The frequency band division circuit 102 divides the input speech signal s(n) into M frequency bands and outputs the band-divided signals s1(n) to sM(n) to the corresponding gain changing circuits 103-1 to 103-M.

Each gain changing circuit 103-m (m = 1 to M) controls the gain of the signal component or noise component of its input band-divided signal sm(n) to raise speech intelligibility. In the first embodiment, the gain control limits of the gain changing circuit 103-m are determined according to human auditory characteristics (loudness curves) such as ISO (International Organization for Standardization) 226.

The frequency band synthesis circuit 104 combines the output signals x1(n) to xM(n) from the gain changing circuits 103-1 to 103-M and outputs the combined speech signal x(n) as the clarified speech signal.
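The text does not say how bands are divided and recombined, so the following is only one possible sketch: it assumes a block-wise discrete Fourier transform whose bins are partitioned into M groups, so that summing the band signals restores the input exactly. The function names `split_bands` and `merge_bands` are hypothetical, and a practical device would more likely use a filter bank or FFT with overlapping windows.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def split_bands(x, m):
    """Split block x into m band signals whose sample-wise sum equals x."""
    n = len(x)
    spectrum = dft(x)
    half = n // 2
    bands = []
    for b in range(m):
        lo = b * (half + 1) // m
        hi = (b + 1) * (half + 1) // m
        masked = [0j] * n
        for k in range(n):
            f = min(k, n - k)  # fold negative frequencies onto 0..half
            if lo <= f < hi:
                masked[k] = spectrum[k]
        bands.append(idft_real(masked))
    return bands


def merge_bands(bands):
    """Band synthesis: add the band signals sample by sample."""
    return [sum(samples) for samples in zip(*bands)]


x = [0.0, 1.0, 0.5, -0.3, 0.2, -1.0, 0.7, 0.1]
bands = split_bands(x, 3)
y = merge_bands(bands)
```

Because every DFT bin is assigned to exactly one band, `merge_bands(split_bands(x, m))` reproduces `x` up to floating-point error when no gain change is applied in between.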

FIG. 2 is a block diagram showing the detailed configuration of the gain changing circuit 103-m. The gain changing circuits 103-1 to 103-M share the same configuration but differ in the speech enhancement upper limit value and noise suppression lower limit value stored in the limit value storage unit 203, described later.

In FIG. 2, the gain changing circuit 103-m includes an SN calculation unit 200, an SN threshold supply unit 201, a voice/noise determination unit 202, a limit value storage unit 203, and a gain changing unit 204.

The SN calculation unit 200 calculates the speech-to-noise ratio (SN ratio) snm of the input band-divided signal sm(n). Any existing method may be used to calculate the SN ratio snm; one example is as follows.

The power of the band-divided signal sm(n) is calculated and taken as the signal power smp. During silence (V = 0), the noise estimate smnp is updated, for example, according to equation (1), where 0 < α < 1. During voiced periods (V = 1), the noise estimate smnp is not updated. The SN ratio snm is then calculated according to equation (2).

$$\mathrm{smnp} = \alpha \times \mathrm{smnp} + (1 - \alpha) \times \mathrm{smp} \qquad (1)$$

$$\mathrm{snm} = 10 \times \log_{10}\left(\mathrm{smp} / \mathrm{smnp}\right) \qquad (2)$$

The SN threshold supply unit 201 supplies the voice/noise determination unit 202 with the SN threshold against which the SN ratio snm is compared. The SN threshold supply unit 201 may calculate the average power of the band-divided signal sm(n) during silence (V = 0) and supply that average plus a predetermined offset as the SN threshold. Alternatively, speech signals from many speakers may be processed in advance to calculate a boundary SN ratio that separates the case where the band handled by the gain changing circuit 103-m is a speech band from the case where it is a noise band; this boundary value is preset in the SN threshold supply unit 201, which then supplies it to the voice/noise determination unit 202 as the SN threshold.
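Equations (1) and (2) can be sketched in a few lines. The function names, the default `alpha = 0.9`, and the requirement that both powers be strictly positive are assumptions made for this illustration; the text only requires 0 < α < 1.

```python
import math


def update_noise_estimate(smnp, smp, v, alpha=0.9):
    """Equation (1): smooth the noise estimate during silence (V = 0) only.

    During voiced periods (V = 1) the previous estimate is kept unchanged.
    """
    if v == 0:
        return alpha * smnp + (1.0 - alpha) * smp
    return smnp


def snr_db(smp, smnp):
    """Equation (2): band SN ratio in dB. Assumes smp > 0 and smnp > 0."""
    return 10.0 * math.log10(smp / smnp)


# Hypothetical values: a silent frame updates the estimate, a voiced frame does not.
estimate = update_noise_estimate(1.0, 2.0, v=0, alpha=0.5)   # 0.5*1.0 + 0.5*2.0
unchanged = update_noise_estimate(1.0, 2.0, v=1, alpha=0.5)
ratio = snr_db(100.0, 1.0)
```

A larger `alpha` makes the noise estimate follow changes in background noise more slowly, and a smaller one makes it react faster at the cost of more fluctuation.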

The voice/noise determination unit 202 compares the SN ratio snm calculated by the SN calculation unit 200 with the SN threshold supplied by the SN threshold supply unit 201 to determine whether, given the characteristics of the target speaker's speech, the band handled by the gain changing circuit 103-m is a speech band or a noise band. When the SN ratio snm exceeds the SN threshold, the unit determines that the band is a speech band; otherwise it determines that the band is a noise band.

The limit value storage unit 203 stores the upper limit gain for speech enhancement (the speech enhancement upper limit value), used when the band handled by the gain changing circuit 103-m is a speech band, and the lower limit gain for noise suppression (the noise suppression lower limit value), used when the band is a noise band. Alternatively, the limit value storage unit 203 may store only one of the two values and obtain the other by subtracting a predetermined amount from, or adding it to, the stored value.

The speech enhancement upper limit curve C11 in FIG. 3, described later, connects the speech enhancement upper limit values of the gain changing circuits 103-1 to 103-M, and the noise suppression lower limit curve C12 in FIG. 3 connects their noise suppression lower limit values. The curves C11 and C12 are shaped according to human auditory characteristics (loudness curves) such as ISO (International Organization for Standardization) 226.

The gain changing unit 204 switches its processing according to whether the voice/noise determination unit 202 has determined the band handled by the gain changing circuit 103-m to be a speech band or a noise band.

When the band handled by the gain changing circuit 103-m is a speech band, the gain changing unit 204 determines whether the gain of the input band-divided signal sm(n) is below the speech enhancement upper limit value stored in the limit value storage unit 203. Here, the gain of the band-divided signal sm(n) means its power. If the gain is below the speech enhancement upper limit value, the gain changing unit 204 changes (amplifies) the input band-divided signal sm(n) so that its gain equals the speech enhancement upper limit value and outputs the result; if the gain is equal to or greater than the speech enhancement upper limit value, it passes the input band-divided signal sm(n) through unchanged.

When the band handled by the gain changing circuit 103-m is a noise band, the gain changing unit 204 determines whether the gain of the input band-divided signal sm(n) is above the noise suppression lower limit value stored in the limit value storage unit 203. If the gain is above the noise suppression lower limit value, the gain changing unit 204 changes (attenuates) the input band-divided signal sm(n) so that its gain equals the noise suppression lower limit value and outputs the result; if the gain is equal to or less than the noise suppression lower limit value, it passes the input band-divided signal sm(n) through unchanged.
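The two branches of the gain changing unit 204 can be sketched as below. This is a hedged illustration: it assumes the limits are expressed as linear power in the same units as the frame's mean-square power, and the names `classify_band` and `change_gain` are hypothetical. Scaling the samples by the square root of the power ratio is the step that makes the output power equal the target limit.

```python
import math


def frame_power(frame):
    """Mean-square power of a frame (the text equates 'gain' with power)."""
    return sum(x * x for x in frame) / len(frame)


def classify_band(snm, sn_threshold):
    """Speech band if the SN ratio exceeds the threshold, otherwise noise band."""
    return snm > sn_threshold


def change_gain(frame, is_speech_band, enhance_upper, suppress_lower):
    """Amplify a weak speech band up to the upper limit, or attenuate a
    strong noise band down to the lower limit; otherwise pass through."""
    power = frame_power(frame)
    if power == 0.0:
        return list(frame)  # an all-zero frame cannot be rescaled
    if is_speech_band and power < enhance_upper:
        target = enhance_upper
    elif not is_speech_band and power > suppress_lower:
        target = suppress_lower
    else:
        return list(frame)  # already at or beyond the limit: unchanged
    scale = math.sqrt(target / power)  # power scales with amplitude squared
    return [x * scale for x in frame]


frame = [1.0, -1.0]  # power 1.0, hypothetical band signal
amplified = change_gain(frame, True, enhance_upper=4.0, suppress_lower=0.25)
attenuated = change_gain(frame, False, enhance_upper=4.0, suppress_lower=0.25)
```

With these hypothetical limits, the speech-band frame is scaled by 2 (power 4.0) and the noise-band frame by 0.5 (power 0.25).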

(A-2) Operation of the First Embodiment
Next, the operation of the speech processing device 100 of the first embodiment is described, with reference also to the frequency characteristic diagram in FIG. 3.

The speech signal s(n), a digital signal on which noise is superimposed, is input to the voiced/silent detection circuit 101 and the frequency band division circuit 102.

The frequency band division circuit 102 divides the input speech signal s(n) into M frequency bands, and the resulting band-divided signals s1(n) to sM(n) are supplied to the corresponding gain changing circuits 103-1 to 103-M. The voiced/silent detection circuit 101 determines whether the input speech signal s(n) is voiced or silent, and the result is also supplied to the gain changing circuits 103-1 to 103-M.

In each gain changing circuit 103-m (m = 1 to M), the SN calculation unit 200 calculates the SN ratio snm of the input band-divided signal sm(n). The voice/noise determination unit 202 compares this SN ratio snm with the SN threshold supplied by the SN threshold supply unit 201 and determines whether the band handled by the gain changing circuit 103-m is a speech band or a noise band.

When the band handled by the gain changing circuit 103-m is determined to be a speech band, the gain changing unit 204 determines whether the gain of the input band-divided signal sm(n) is below the speech enhancement upper limit value stored in the limit value storage unit 203. If the gain is below the speech enhancement upper limit value, the input band-divided signal sm(n) is changed (amplified) so that its gain equals the speech enhancement upper limit value and is then output; if the gain is equal to or greater than the speech enhancement upper limit value, the input band-divided signal sm(n) is output unchanged.

When, on the other hand, the band handled by the gain changing circuit 103-m is a noise band, the gain changing unit 204 determines whether the gain of the input band-divided signal sm(n) is above the noise suppression lower limit value stored in the limit value storage unit 203. If the gain is above the noise suppression lower limit value, the input band-divided signal sm(n) is changed (attenuated) so that its gain equals the noise suppression lower limit value and is then output; if the gain is equal to or less than the noise suppression lower limit value, the input band-divided signal sm(n) is output unchanged.

The frequency band synthesis circuit 104 combines the output signals x1(n) to xM(n) from the gain changing circuits 103-1 to 103-M, and the combined speech signal x(n) is output from the speech processing device 100 as the clarified speech signal.

FIG. 3(A) shows the gains of the band-divided signals s1(n) to sM(n) input to the gain changing circuits 103-1 to 103-M, and FIG. 3(B) shows the gains of the band-divided signals x1(n) to xM(n) output from them. In FIG. 3, bands whose gain is drawn as a solid black vertical bar are those determined to be speech bands by the internal voice/noise determination units 202 (202-1 to 202-M), and bands whose gain is drawn as a hatched vertical bar are those determined to be noise bands.

As shown in FIG. 3(A), the first band is determined to be a noise band and the gain of the band-divided signal s1(n) is above the noise suppression lower limit value. Therefore, as shown in FIG. 3(B), the band-divided signal s1(n) is changed (attenuated) so that its gain equals the noise suppression lower limit value, and the changed band-divided signal x1(n) is output.

As shown in FIG. 3(A), the fifth band is determined to be a noise band, but the gain of the band-divided signal s5(n) is equal to or less than the noise suppression lower limit value. Therefore, as shown in FIG. 3(B), the band-divided signal s5(n) becomes the output band-divided signal x5(n) unchanged.

As shown in FIG. 3(A), the second band is determined to be a speech band and the gain of the band-divided signal s2(n) is below the speech enhancement upper limit value. Therefore, as shown in FIG. 3(B), the band-divided signal s2(n) is changed (amplified) so that its gain equals the speech enhancement upper limit value, and the changed band-divided signal x2(n) is output.

As shown in FIG. 3(A), the third band is determined to be a speech band, but the gain of the band-divided signal s3(n) is equal to or greater than the speech enhancement upper limit value. Therefore, as shown in FIG. 3(B), the band-divided signal s3(n) becomes the output band-divided signal x3(n) unchanged.
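The four cases described for bands 1, 2, 3 and 5 reduce to a `max` for speech bands and a `min` for noise bands. The sketch below works through them with made-up numbers: the gain values, the limits, and the use of the same limits in every band are hypothetical simplifications, since FIG. 3 gives no numeric values and the limits actually vary by band along curves C11 and C12.

```python
def output_gain(gain, is_speech_band, enhance_upper, suppress_lower):
    """Output gain of one band under the rules described for FIG. 3."""
    if is_speech_band:
        # Raised to the upper limit if below it, otherwise unchanged.
        return max(gain, enhance_upper)
    # Lowered to the lower limit if above it, otherwise unchanged.
    return min(gain, suppress_lower)


# (band number, is_speech_band, input gain, enhance_upper, suppress_lower)
# All numeric values are hypothetical.
bands = [
    (1, False, 8.0, 12.0, 5.0),   # noise band above the lower limit: attenuated
    (2, True, 9.0, 12.0, 5.0),    # speech band below the upper limit: amplified
    (3, True, 14.0, 12.0, 5.0),   # speech band at or above the upper limit: unchanged
    (5, False, 3.0, 12.0, 5.0),   # noise band at or below the lower limit: unchanged
]
results = {number: output_gain(gain, speech, upper, lower)
           for number, speech, gain, upper, lower in bands}
print(results)
```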

(A-3) Effects of the First Embodiment
According to the first embodiment, a speech enhancement upper limit value and a noise suppression lower limit value based on human auditory characteristics are provided, and speech enhancement or noise suppression is performed within the upper limit for enhancement and the lower limit for suppression. The speech-to-noise ratio can therefore be improved while maintaining natural audibility, and speech intelligibility can be improved.

(B) Second Embodiment
Next, a second embodiment, in which the sound processing device and program according to the present invention are applied to speech processing, is described in detail with reference to the drawings.

FIG. 4 is a block diagram showing the functional configuration of the speech processing device 100A of the second embodiment; parts identical or corresponding to those in FIG. 1 of the first embodiment bear identical or corresponding reference numerals. FIG. 5 is a block diagram showing the detailed configuration of the gain changing circuit 103A-m of the second embodiment; parts identical or corresponding to those in FIG. 2 of the first embodiment bear identical or corresponding reference numerals.

The speech processing apparatus 100A of the second embodiment has a sound/silence detection circuit 101, a frequency band dividing circuit 102, gain changing circuits 103A-1 to 103A-M, and a frequency band synthesizing circuit 104, all similar to those of the first embodiment, and additionally has a limit value change command circuit 105. The internal configuration of each gain changing circuit 103A-m (m is 1 to M) also differs from the first embodiment. As in the first embodiment, the gain changing circuit 103A-m has an SN calculation unit 200, an SN threshold supply unit 201, a speech/noise determination unit 202, a limit value storage unit 203A, and a gain changing unit 204, but the limit value storage unit 203A differs from that of the first embodiment.

The limit value storage unit 203A in the second embodiment stores a plurality of sets, each consisting of speech enhancement upper limit values and noise suppression lower limit values. To simplify the description, it is assumed below that two such sets are stored in the limit value storage unit 203A. In one set, the speech enhancement upper limit values and noise suppression lower limit values take the auditory characteristics of elderly listeners into account; in the other set, they take the auditory characteristics of listeners other than the elderly into account.

Note that the limit value storage unit 203A in the second embodiment may instead store only one set of speech enhancement upper limit values and noise suppression lower limit values, and obtain the other sets by applying a predetermined conversion function to the stored values.
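Deriving a second set from a single stored set could look roughly like the sketch below. The document does not specify the conversion function, so everything here is an assumption: the name `derive_elderly_limits`, the choice to raise both limits by a constant `raise_db` from band index `first_raised_band` upward, and the dB representation.

```python
def derive_elderly_limits(upper_db, lower_db, first_raised_band, raise_db):
    """Derive a second limit set from a stored one (hypothetical conversion).

    Bands with index >= first_raised_band (the high-frequency bands) get
    both limits raised by raise_db; lower bands are copied unchanged.
    """
    if len(upper_db) != len(lower_db):
        raise ValueError("upper and lower limits must cover the same bands")

    def convert(values):
        return [v + raise_db if band >= first_raised_band else v
                for band, v in enumerate(values)]

    return convert(upper_db), convert(lower_db)


# Hypothetical stored set for four bands, lowest frequency first.
stored_upper = [60.0, 60.0, 55.0, 50.0]
stored_lower = [20.0, 20.0, 18.0, 15.0]
derived_upper, derived_lower = derive_elderly_limits(
    stored_upper, stored_lower, first_raised_band=2, raise_db=5.0)
```

Storing one set and a conversion rule trades a little computation for less memory, and it keeps the two sets consistent when the base set is retuned.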

The limit value change command circuit 105, newly provided in the second embodiment, instructs the gain changing circuits 103A-1 to 103A-M which set of speech enhancement upper limit values and noise suppression lower limit values the gain changing units 204 (204-1 to 204-M) are to use. The gain changing units 204-1 to 204-M all change the set they use together, in response to a change command from the limit value change command circuit 105.

For example, the limit value change command circuit 105 may be built around a push switch or a specific key on a keyboard, and may instruct switching to the other set each time the push switch or the specific key is operated. Alternatively, the limit value change command circuit 105 may be built around a DIP switch, or around two keys that each correspond to one set, and may instruct switching to the other set when the DIP switch is operated or when the key corresponding to the set not currently selected is operated.
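The two switching styles described above, toggling on each operation versus selecting a set directly, can be sketched as a small selector shared by all bands. The class name `LimitValueSelector` and its methods are hypothetical and only illustrate the control flow; the hardware details of the switch or keys are outside the sketch.

```python
class LimitValueSelector:
    """Holds several limit sets and tracks which one is currently in use."""

    def __init__(self, limit_sets):
        if not limit_sets:
            raise ValueError("at least one limit set is required")
        self._sets = list(limit_sets)
        self._index = 0  # the first set is selected initially

    def toggle(self):
        """Push-switch style: advance to the next set on every operation."""
        self._index = (self._index + 1) % len(self._sets)

    def select(self, index):
        """Per-set key style: choose a set directly.

        Selecting the set that is already in use changes nothing.
        """
        if not 0 <= index < len(self._sets):
            raise IndexError("no such limit set")
        self._index = index

    @property
    def current(self):
        """The limit set that every gain changing unit should use now."""
        return self._sets[self._index]
```

Because every band reads `current` from the same selector, all bands change sets together, which mirrors the linked switching of the gain changing units 204-1 to 204-M.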

For example, when the speech processing apparatus 100A of the second embodiment is incorporated in a softphone and placed in the path that processes the speech signal from the microphone, the near-end speaker, on recognizing that the far-end speaker is elderly, can use the limit value change command circuit 105 to switch to the set of speech enhancement upper limit values and noise suppression lower limit values that takes the auditory characteristics of the elderly into account. Likewise, when the speech processing apparatus 100A of the second embodiment is incorporated in a softphone and placed in the path that processes the speech signal sent to the loudspeaker, a near-end speaker who is elderly can use the limit value change command circuit 105 to switch to that same set.

The limit value change command circuit 105 may also apply an existing technique that automatically determines, from the input speech signal s(n), whether the speech is that of an elderly person.

FIG. 6(A) shows the state in which the set of speech enhancement upper limit values and noise suppression lower limit values that takes into account the auditory characteristics of listeners other than the elderly is selected. When, in this state, a command is issued to change to the set that takes the auditory characteristics of the elderly into account, the gain changing units 204-1 to 204-M switch to using the set of values defined by the speech enhancement upper limit curve C21 and the noise suppression lower limit curve C22, drawn with solid lines in FIG. 6(B).

Here, the speech enhancement upper limit curve C21 and the noise suppression lower limit curve C22, which take the auditory characteristics of the elderly into account, are determined on the basis of ISO 7029 and similar references, allowing for age-related differences in hearing. Because human hearing becomes less sensitive to high frequencies with age, the curves C21 and C22 set the speech enhancement upper limit values and noise suppression lower limit values in the high-frequency region higher than those of the speech enhancement upper limit curve C11 and the noise suppression lower limit curve C12 of the other set.

In the example of FIG. 6, compared with the case where the speech enhancement upper limit curve C11 and the noise suppression lower limit curve C12 are in effect, when the curves C21 and C22 that take the auditory characteristics of the elderly into account are in effect, the M-th noise band, for example, is attenuated less, and the (M-3)-th speech band changes from a band that is not amplified to one that is amplified.
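The effect of switching curve sets on the high bands can be reproduced with made-up numbers. In the sketch below, every value is hypothetical: the four entries stand for bands (M-3) to M, levels and limits are expressed in dB, and `boost_db` and `cut_db` are illustrative amounts, none of which come from FIG. 6.

```python
def apply_limits(levels_db, is_speech, upper_db, lower_db,
                 boost_db=6.0, cut_db=30.0):
    """Apply limited enhancement or suppression to each band level (dB)."""
    out = []
    for level, speech, upper, lower in zip(levels_db, is_speech,
                                           upper_db, lower_db):
        if speech:
            # Boost speech, but never beyond the upper limit.
            out.append(level if level >= upper
                       else min(level + boost_db, upper))
        else:
            # Attenuate noise, but never beyond the lower limit.
            out.append(level if level <= lower
                       else max(level - cut_db, lower))
    return out


# Bands (M-3), (M-2), (M-1), M; only band (M-3) is judged to be speech.
levels = [52.0, 40.0, 40.0, 40.0]
speech = [True, False, False, False]

# Hypothetical stand-ins for curves C11/C12 and the raised curves C21/C22.
c11_upper, c12_lower = [50.0] * 4, [20.0] * 4
c21_upper, c22_lower = [58.0] * 4, [30.0] * 4

general_out = apply_limits(levels, speech, c11_upper, c12_lower)
elderly_out = apply_limits(levels, speech, c21_upper, c22_lower)
```

Under these assumed values, band (M-3) is passed through at 52 dB with the first set but raised to 58 dB with the second, and band M is attenuated by 20 dB with the first set but by only 10 dB with the second, which matches the qualitative change described for FIG. 6.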

(B-3) Effects of the Second Embodiment
In the second embodiment as well, a speech enhancement upper limit value and a noise suppression lower limit value based on human auditory characteristics are provided, and speech enhancement and noise suppression are performed within those limits. As a result, the speech-to-noise ratio, and with it the intelligibility of the speech, can be improved while a natural listening impression is maintained.

Furthermore, the second embodiment takes into account the way human hearing changes with age: a plurality of sets of speech enhancement upper limit values and noise suppression lower limit values are provided, and one set can be selected for use. Compared with the first embodiment, speech intelligibility can therefore be improved in a way that is better matched to the listener.

(C) Other Embodiments
The embodiments above describe a speech processing apparatus that improves the speech-to-noise ratio (SN ratio) by combining suppression of the noise component with enhancement of the speech component. The present invention can also be applied to a speech processing apparatus that improves the speech-to-noise ratio by noise suppression alone, and to one that improves it by speech enhancement alone.

The description of the embodiments above uses a telephone terminal as an example application of the speech processing apparatus, but the uses of the present invention are not limited to this.

In addition, although each of the embodiments above applies the present invention to a speech processing apparatus that processes speech signals, the present invention can also be applied to an acoustic processing apparatus that processes acoustic signals (audio signals). The term "sound signal" in the claims denotes either a speech signal or an acoustic signal.

Description of reference numerals: 100, 100A: speech processing apparatus; 101: sound/silence detection circuit; 102: frequency band dividing circuit; 103-1 to 103-M, 103A-1 to 103A-M: gain changing circuits; 104: frequency band synthesizing circuit; 105: limit value change command circuit; 200: SN calculation unit; 201: SN threshold supply unit; 202: speech/noise determination unit; 203, 203A: limit value storage unit; 204: gain changing unit.

Claims

1. A sound processing device that improves the intelligibility of a sound signal by improving the signal-to-noise ratio through suppression of a noise component, the device comprising:
suppression lower limit holding means for holding a noise suppression lower limit value for each frequency band, determined on the basis of human auditory characteristics;
signal/noise determination means for determining, for each frequency band, whether the band is a noise component band; and
gain control means for controlling, for a frequency band determined to be a noise component, the gain of the sound signal to be processed so that it does not fall below the noise suppression lower limit value of that frequency band.

2. The sound processing device according to claim 1, wherein the suppression lower limit holding means holds, as the noise suppression lower limit value of each frequency band, a plurality of values according to the age-related change characteristics of human hearing, and selects the value to be applied from among the plurality of values.

3. A sound processing device that improves the intelligibility of a sound signal by improving the signal-to-noise ratio through enhancement of a sound component, the device comprising:
enhancement upper limit holding means for holding a sound enhancement upper limit value for each frequency band, determined on the basis of human auditory characteristics;
signal/noise determination means for determining, for each frequency band, whether the band is a sound component band; and
gain control means for controlling, for a frequency band determined to be a sound component, the gain of the sound signal to be processed so that it does not exceed the sound enhancement upper limit value of that frequency band.

4. The sound processing device according to claim 3, wherein the enhancement upper limit holding means holds, as the sound enhancement upper limit value of each frequency band, a plurality of values according to the age-related change characteristics of human hearing, and selects the value to be applied from among the plurality of values.

5. A sound processing device that improves the intelligibility of a sound signal by improving the signal-to-noise ratio through suppression of a noise component and enhancement of a sound component, the device comprising:
suppression lower limit holding means for holding a noise suppression lower limit value for each frequency band, determined on the basis of human auditory characteristics;
enhancement upper limit holding means for holding a sound enhancement upper limit value for each frequency band, determined on the basis of human auditory characteristics;
signal/noise determination means for determining, for each frequency band, whether the band is a sound component band or a noise component band; and
gain control means for controlling, for a frequency band determined to be a noise component, the gain of the sound signal to be processed so that it does not fall below the noise suppression lower limit value of that frequency band, and for controlling, for a frequency band determined to be a sound component, the gain of the sound signal to be processed so that it does not exceed the sound enhancement upper limit value of that frequency band.

6. The sound processing device according to claim 5, wherein the suppression lower limit holding means holds, as the noise suppression lower limit value of each frequency band, a plurality of values according to the age-related change characteristics of human hearing, and selects the value to be applied from among the plurality of values, and
the enhancement upper limit holding means holds, as the sound enhancement upper limit value of each frequency band, a plurality of values according to the age-related change characteristics of human hearing, and selects the value to be applied from among the plurality of values.

7. A sound processing program that causes a computer to function as:
suppression lower limit holding means for holding a noise suppression lower limit value for each frequency band, determined on the basis of human auditory characteristics;
signal/noise determination means for determining, for each frequency band, whether the band is a noise component band; and
gain control means for controlling, for a frequency band determined to be a noise component, the gain of the sound signal to be processed so that it does not fall below the noise suppression lower limit value of that frequency band, the program thereby improving the intelligibility of the sound signal by improving the signal-to-noise ratio through suppression of the noise component.

8. A sound processing program that causes a computer to function as:
enhancement upper limit holding means for holding a sound enhancement upper limit value for each frequency band, determined on the basis of human auditory characteristics;
signal/noise determination means for determining, for each frequency band, whether the band is a sound component band; and
gain control means for controlling, for a frequency band determined to be a sound component, the gain of the sound signal to be processed so that it does not exceed the sound enhancement upper limit value of that frequency band, the program thereby improving the intelligibility of the sound signal by improving the signal-to-noise ratio through enhancement of the sound component.

9. A sound processing program that causes a computer to function as:
suppression lower limit holding means for holding a noise suppression lower limit value for each frequency band, determined on the basis of human auditory characteristics;
enhancement upper limit holding means for holding a sound enhancement upper limit value for each frequency band, determined on the basis of human auditory characteristics;
signal/noise determination means for determining, for each frequency band, whether the band is a sound component band or a noise component band; and
gain control means for controlling, for a frequency band determined to be a noise component, the gain of the sound signal to be processed so that it does not fall below the noise suppression lower limit value of that frequency band, and for controlling, for a frequency band determined to be a sound component, the gain of the sound signal to be processed so that it does not exceed the sound enhancement upper limit value of that frequency band, the program thereby improving the intelligibility of the sound signal by improving the signal-to-noise ratio through suppression of the noise component and enhancement of the sound component.