JP4006770B2

JP4006770B2 - Noise estimation device, noise reduction device, noise estimation method, and noise reduction method

Info

Publication number: JP4006770B2
Application number: JP31032496A
Authority: JP
Inventors: 利幸森井
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1996-11-21
Filing date: 1996-11-21
Publication date: 2007-11-14
Anticipated expiration: 2016-11-21
Also published as: JPH10149198A

Abstract

PROBLEM TO BE SOLVED: To provide a noise reduction device, used in a speech encoder/decoder or a speech input/output device to remove the noise component from input speech, that can estimate the noise spectrum, reduce the perception of abnormal sounds, and suppress degradation of sound quality. SOLUTION: A noise estimation unit 24 estimates the noise spectrum and stores it in a noise spectrum storage unit 25. A noise reduction/spectrum compensation unit 18 subtracts the noise spectrum from the input spectrum and compensates the spectrum at frequencies where too much was subtracted. A spectrum stabilization unit 19 stabilizes the compensated spectrum and adjusts the phase of the complex spectrum. The result is a noise reduction device that can estimate the noise spectrum both inside and outside speech sections and that suppresses degradation of sound quality.

Description

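[0001]
[Technical field of the invention]
The present invention relates to a noise estimation device and noise estimation method that estimate the background noise component of input speech, and to a noise reduction device and noise reduction method that remove that background noise component, for use in building the speech encoding/decoding devices (speech codecs) and speech input/output devices needed for digital cellular phones, multimedia communication, and the like.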
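[0002]
[Prior art]
In digital mobile communications such as cellular phones, low-bit-rate speech compression coding is needed to cope with growing subscriber numbers, and research institutions are actively developing such methods. In Japan, VSELP (11.2 kbps), developed by Motorola, and PSI-CELP (5.6 kbps), developed by NTT Mobile Communications Network, Inc., have been adopted as standard coding schemes for digital cellular phones, and phones using them are already on sale domestically. Internationally, ITU-T has standardized 16 kbps (LD-CELP) and 8 kbps (CS-ACELP) schemes, which are now at the product development stage.
[0003]
All of these schemes are refinements of CELP (Code Excited Linear Prediction; described in M. R. Schroeder, "High Quality Speech at Low Bit Rates", Proc. ICASSP '85, pp. 937-940). CELP separates speech into excitation (sound source) information and vocal tract information. The excitation is encoded as indices into multiple excitation samples stored in a codebook, and the vocal tract information is encoded as LPC (linear prediction coefficients). Its distinguishing feature is analysis by synthesis (A-b-S): when the excitation is encoded, candidates are compared with the input speech after the vocal tract information has been taken into account.
[0004]
These techniques made it possible to transmit speech signals at very low bit rates, but they also exposed a major problem. Because the compression is based on a model of speech production, it cannot handle acoustic signals other than speech. When the speech signal contains background noise or equipment noise, encoding becomes inefficient and abnormal sounds are produced at synthesis (decoding).
[0005]
To address this, methods for reducing noise in the input speech signal have long been studied. The PSI-CELP standard mentioned above reduces noise with a noise canceller before encoding. That noise canceller is based on a Kalman filter; it detects the presence or absence of speech and reduces noise through adaptive control. It can remove background noise to some degree, but it does not perform well on high-level noise or on noise that overlaps speech.
[0006]
A more powerful noise reduction method is spectral subtraction (described in S. F. Boll, "Suppression of Acoustic Noise in Speech Using Spectral Subtraction", IEEE Trans. ASSP, Vol. 27, No. 2, pp. 113-120, 1979). The input speech signal is converted to a spectrum by a discrete Fourier transform, and the noise is then subtracted in the spectral domain. The method has mainly been applied to the input stage of speech recognition devices.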
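[0007]
An example of applying this method to noise reduction in a speech signal is described with reference to FIG. 2. The noise spectrum is estimated as follows. First, a noise-only signal 31 that contains no speech is input and converted to a digital signal by A/D conversion unit 32. Next, Fourier transform unit 33 applies a discrete Fourier transform to a fixed-length sequence of input samples (called a frame) to obtain the noise spectrum. Noise analysis unit 34 then computes the average noise spectrum from the spectra of multiple frames and stores it in noise spectrum storage unit 35. Noise reduction then proceeds as follows.
[0008]
A speech signal 36 containing noise is input and converted to a digital signal by A/D conversion unit 37. Fourier transform unit 38 applies a discrete Fourier transform as above to obtain the spectrum of the noisy speech. Noise reduction unit 39 subtracts the noise spectrum stored in noise spectrum storage unit 35 from the speech spectrum. Inverse Fourier transform unit 40 applies an inverse Fourier transform to the result to produce output signal 41.
[0009]
The spectrum used in this algorithm is generally the amplitude spectrum (the norm of the complex value in the complex plane, obtained by squaring the real and imaginary parts, adding them, and taking the square root). When the subtraction is done on the amplitude spectrum, one option for the phase component in the inverse Fourier transform is to reuse the phase of the input signal unchanged.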
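The conventional procedure of paragraphs [0007] to [0009] can be sketched as follows. This is a minimal illustration rather than the implementation of FIG. 2: the function names are hypothetical, a naive DFT stands in for an FFT, and flooring negative amplitudes at zero is an assumption, since the text does not say how the conventional method handles over-subtraction.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform (an FFT would be used in practice)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def average_noise_spectrum(noise_frames):
    """Average amplitude spectrum over several noise-only frames."""
    spectra = [[abs(c) for c in dft(frame)] for frame in noise_frames]
    return [sum(s[k] for s in spectra) / len(spectra) for k in range(len(spectra[0]))]

def spectral_subtraction(frame, noise_amp):
    """Subtract the noise amplitude per frequency and reuse the input phase."""
    out = []
    for c, n_amp in zip(dft(frame), noise_amp):
        amp = max(abs(c) - n_amp, 0.0)        # assumption: floor at zero
        out.append(cmath.rect(amp, cmath.phase(c)))  # input phase unchanged
    return idft(out)
```

With an all-zero noise spectrum the frame is reconstructed unchanged, which is a convenient sanity check for the transform pair.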
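[0010]
[Problems to be solved by the invention]
Although spectral subtraction is the more powerful noise reduction method, noise estimation is difficult, so it has rarely been used in real-time speech processing devices.
[0011]
Specifically, applying spectral subtraction to a real-time speech processing device raises the following problems.
<1> It is not known when speech is present in the data, so the noise spectrum is hard to estimate.
<2> At high noise levels the spectrum is heavily distorted and sound quality degrades.
<3> Abnormal sounds are perceived in silent sections (noise-only sections without speech).
[0012]
The object of the present invention is to provide a noise estimation device and noise estimation method capable of noise spectrum estimation, and a noise reduction device and noise reduction method that can estimate the noise spectrum, reduce the perception of abnormal sounds, and suppress degradation of sound quality.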
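[0013]
[Means for solving the problems]
To solve the above problems, the present invention is configured as follows. A noise estimation unit estimates the noise spectrum by comparing the input spectrum obtained by a Fourier transform unit with the noise spectrum held in a noise spectrum storage unit, and stores the resulting noise spectrum in the noise spectrum storage unit. A noise reduction/spectrum compensation unit subtracts the stored noise spectrum from the input spectrum, based on a coefficient obtained by a noise reduction coefficient adjustment unit, then examines the result and compensates the spectrum at frequencies where too much was subtracted. A spectrum stabilization unit stabilizes the spectrum obtained by the noise reduction/spectrum compensation unit and, within the phase of the complex spectrum obtained by the Fourier transform unit, adjusts the phase of the frequencies that were compensated.
[0014]
This yields an excellent noise reduction device that can estimate the noise spectrum both inside and outside speech sections and can suppress degradation of sound quality.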
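[0015]
[Embodiments of the invention]
The present invention is a noise estimation device that estimates two types of noise spectra so that noise reduction can be performed using them. It comprises: an A/D conversion unit that converts an input speech signal into a digital signal; a Fourier transform unit that applies a discrete Fourier transform to a fixed-length segment (one frame) of the digital signal from the A/D conversion unit to obtain an input spectrum and a complex spectrum; a noise spectrum storage unit that stores the two types of noise spectra, namely an average noise spectrum used for noise reduction and a compensation noise spectrum used to compensate frequencies whose spectrum was over-subtracted during noise reduction; and a noise estimation unit that newly estimates the two types of noise spectra, namely a new compensation noise spectrum obtained by comparing the input spectrum from the Fourier transform unit with the compensation noise spectrum held in the noise spectrum storage unit, and a new average noise spectrum obtained by applying a learning formula to the input spectrum, and stores both in the noise spectrum storage unit. In particular, the noise estimation unit first judges whether the current frame is a noise section. If it is, the unit compares the input spectrum with the compensation noise spectrum at each frequency and, where the input spectrum is smaller, takes the input spectrum as the new compensation noise spectrum at that frequency. Separately, it estimates the new average noise spectrum with the learning formula, which accumulates the input spectrum at a fixed ratio. It then stores both new spectra in the noise spectrum storage unit. This allows the noise spectrum to be estimated from two directions, the average and the minimum, and a noise reduction device built on this estimate can perform more accurate reduction.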
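The present invention is also a noise reduction device comprising: an A/D conversion unit that converts an input speech signal into a digital signal; a noise reduction coefficient adjustment unit that adjusts the coefficient determining the amount of reduction; a Fourier transform unit that applies a discrete Fourier transform to a fixed-length segment (one frame) of the digital signal from the A/D conversion unit to obtain an input spectrum and a complex spectrum; a noise spectrum storage unit that stores two types of noise spectra, namely an average noise spectrum used for noise reduction and a compensation noise spectrum used to compensate frequencies whose spectrum was over-subtracted during noise reduction; a noise estimation unit that newly estimates the two types of noise spectra, namely a new compensation noise spectrum obtained by comparing the input spectrum with the stored compensation noise spectrum and a new average noise spectrum obtained by applying a learning formula to the input spectrum, and stores both in the noise spectrum storage unit; and a noise reduction/spectrum compensation unit that subtracts the stored average noise spectrum from the input spectrum, based on the coefficient from the noise reduction coefficient adjustment unit, then examines the resulting spectrum and compensates over-subtracted frequencies with the compensation noise spectrum.
The device may further comprise a spectrum stabilization unit that smooths those spectra from the noise reduction/spectrum compensation unit that are judged to be noise sections and, within the phase of the complex spectrum from the Fourier transform unit, adjusts the phase of the frequencies compensated by the noise reduction/spectrum compensation unit. It may further comprise an inverse Fourier transform unit that performs an inverse Fourier transform based on the smoothed spectrum and the adjusted phase spectrum from the spectrum stabilization unit, a spectrum enhancement unit that applies spectrum enhancement to the signal from the inverse Fourier transform unit, and a waveform matching unit that matches the signal from the spectrum enhancement unit to the signal of the previous frame.
Alternatively, the device may further comprise an LPC analysis unit that performs linear prediction analysis (LPC analysis) on the fixed-length digital signal from the A/D conversion unit, an inverse Fourier transform unit that applies an inverse discrete Fourier transform to the noise-reduced input spectrum and the complex spectrum, and a spectrum enhancement unit that applies spectrum enhancement, using the parameters from the LPC analysis unit, to the signal from the inverse Fourier transform unit. This makes it possible to estimate the noise spectrum both inside and outside speech sections and to emphasize the features of the input spectral envelope with the linear prediction coefficients.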
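[0017]
The noise reduction device may include a noise reduction/spectrum compensation unit that multiplies the average noise spectrum in the noise spectrum storage unit by the noise reduction coefficient from the noise reduction coefficient adjustment unit, subtracts the product from the input spectrum from the Fourier transform unit, and, at frequencies where the result is negative, compensates with the compensation noise spectrum in the noise spectrum storage unit. Using the average noise spectrum for subtraction removes more of the noise spectrum, and estimating the compensation spectrum separately allows more accurate compensation.
[0018]
The noise reduction device may include a spectrum stabilization unit that examines the full-band power of the spectrum after noise reduction and spectrum compensation together with the power of a perceptually important sub-band (mid-band power), determines whether the input signal is a silent section (a noise-only signal without speech), and, if so, applies stabilization and power reduction to the full-band and mid-band power. This smooths the spectrum of noise-only sections that contain no speech and prevents noise reduction from causing extreme spectral fluctuation in those sections.
[0019]
The noise reduction device may include a spectrum stabilization unit that applies random phase rotation to the complex spectrum from the Fourier transform unit, based on information about whether each frequency received spectrum compensation in the noise reduction/spectrum compensation unit. This randomizes the phase of the compensated frequency components and converts the residual noise that could not be removed into noise that sounds less abnormal.
[0020]
The noise reduction device may include a spectrum enhancement unit for which multiple sets of weighting coefficients for spectrum enhancement are prepared in advance. During noise reduction it selects a set according to the state of the input signal and performs spectrum enhancement with the selected coefficients. This gives perceptually more appropriate weighting in speech sections and suppresses the abnormal sounds that perceptual weighting would otherwise cause in silent sections and unvoiced consonant sections.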
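[0021]
An embodiment of the present invention is described below with reference to FIG. 1.
(Embodiment)
FIG. 1 is a functional block diagram of the main part of the noise reduction device of this embodiment. In FIG. 1, 11 is the input signal, 12 is the A/D conversion unit, 13 is the noise reduction coefficient storage unit, 14 is the noise reduction coefficient adjustment unit, 15 is the input waveform setting unit, 16 is the LPC analysis unit, 17 is the Fourier transform unit, 18 is the noise reduction/spectrum compensation unit, 19 is the spectrum stabilization unit, 20 is the inverse Fourier transform unit, 21 is the spectrum enhancement unit, 22 is the waveform matching unit, 23 is the output signal, 24 is the noise estimation unit, 25 is the noise spectrum storage unit, 26 is the previous spectrum storage unit, 27 is the random phase storage unit, 28 is the previous waveform storage unit, and 29 is the maximum power storage unit.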
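[0022]
First, the initial settings are described. Table 1 lists the names of the fixed parameters and example settings.
[0023]
[Table 1]

[0024]
Random phase storage unit 27 holds phase data for adjusting the phase. Spectrum stabilization unit 19 uses this data to rotate the phase. Table 2 shows an example with eight kinds of phase data.
[0025]
[Table 2]

[0026]
A counter for stepping through the phase data (the random phase counter) is also held in random phase storage unit 27. It is initialized to 0 beforehand.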
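[0027]
Next, the static RAM areas are set up: noise reduction coefficient storage unit 13, noise spectrum storage unit 25, previous spectrum storage unit 26, previous waveform storage unit 28, and maximum power storage unit 29 are cleared. Each storage unit and its example initial values are described below.
[0028]
Noise reduction coefficient storage unit 13 is the area that holds the noise reduction coefficient; its initial value is 20.0. Noise spectrum storage unit 25 is the area that holds the average noise power and, for each frequency, the average noise spectrum, the first-candidate and second-candidate compensation noise spectra, and for each candidate the number of frames since its spectral value at that frequency last changed (the persistence count). As initial values, the average noise power is set to a sufficiently large value, the average noise spectrum to the specified minimum power, and the compensation noise spectra and persistence counts to sufficiently large numbers.
[0029]
Previous spectrum storage unit 26 is the area that holds the compensation noise power, the power of the previous frame (full band and mid band; the previous frame power), the smoothed power of the previous frame (full band and mid band; the previous frame smoothed power), and the noise continuation count. As initial values, the compensation noise power is set to a sufficiently large value, the previous frame power and previous frame smoothed power are both set to 0.0, and the noise continuation count is set to the noise reference continuation count.
[0030]
Previous waveform storage unit 28 is the area that holds the last look-ahead-length samples of the previous frame's output signal, used for matching the output signal; all are initialized to 0. Spectrum enhancement unit 21 performs ARMA filtering and high-frequency emphasis filtering, and the states of both filters are cleared to 0. Maximum power storage unit 29 is the area that holds the maximum power of the input signal; the maximum power is initialized to 0.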
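[0031]
Next, the noise reduction algorithm is described block by block with reference to FIG. 1.
First, the analog input signal 11 containing speech is A/D-converted by A/D conversion unit 12, and one frame length plus the look-ahead data length is read in (160 + 80 = 240 points in the example settings above). Noise reduction coefficient adjustment unit 14 computes the noise reduction coefficient and the compensation coefficient by (Equation 1) from the noise reduction coefficient held in noise reduction coefficient storage unit 13, the specified noise reduction coefficient, the noise reduction coefficient learning coefficient, and the compensation power increase coefficient. It stores the resulting noise reduction coefficient in noise reduction coefficient storage unit 13, passes the input signal from A/D conversion unit 12 to input waveform setting unit 15, and passes the compensation coefficient and the noise reduction coefficient to noise estimation unit 24 and noise reduction/spectrum compensation unit 18.
[0032]
[Equation 1]

[0033]
Here the noise reduction coefficient is the ratio by which noise is subtracted; the specified noise reduction coefficient is a fixed reduction coefficient set in advance; the noise reduction coefficient learning coefficient is the rate at which the noise reduction coefficient approaches the specified noise reduction coefficient; the compensation coefficient adjusts the compensation power in spectrum compensation; and the compensation power increase coefficient adjusts the compensation coefficient.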
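[0034]
Input waveform setting unit 15 writes the input signal from A/D conversion unit 12 into a memory array whose length is a power of two, packed toward the end of the array, so that an FFT (fast Fourier transform) can be applied. The leading part is filled with zeros. In the example settings, zeros are written to indices 0 to 15 of a 256-element array and the input signal to indices 16 to 255. This array serves as the real part for the order-8 (256-point) FFT. An array of the same length is prepared for the imaginary part and filled with zeros.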
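The buffer layout described in paragraph [0034] can be sketched as below. The constant and function names are hypothetical; the sizes (160-sample frame, 80-sample look-ahead, 256-point FFT) are the example settings quoted in the text.

```python
FRAME_LEN = 160       # one frame, from the example settings
LOOKAHEAD_LEN = 80    # look-ahead data length, from the example settings
FFT_LEN = 256         # 2**8, the order-8 FFT length

def set_input_waveform(samples, fft_len=FFT_LEN):
    """Pack the samples toward the end of a power-of-two buffer, zeros in front."""
    if len(samples) > fft_len:
        raise ValueError("input is longer than the FFT buffer")
    pad = fft_len - len(samples)
    real = [0.0] * pad + [float(s) for s in samples]
    imag = [0.0] * fft_len   # imaginary part is all zeros
    return real, imag
```

For 240 input samples this leaves indices 0 to 15 at zero and places the signal in indices 16 to 255.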
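[0035]
LPC analysis unit 16 applies a Hamming window to the real-part area set up by input waveform setting unit 15, performs autocorrelation analysis on the windowed waveform to obtain the autocorrelation coefficients, and performs LPC analysis by the autocorrelation method to obtain the linear prediction coefficients. It sends the linear prediction coefficients to spectrum enhancement unit 21.
[0036]
Fourier transform unit 17 performs a discrete Fourier transform by FFT on the real and imaginary arrays from input waveform setting unit 15. It computes the sum of the absolute values of the real and imaginary parts of the resulting complex spectrum to obtain a pseudo-amplitude spectrum of the input signal (hereafter, the input spectrum). It also computes the sum of the input spectrum values over all frequencies (hereafter, the input power) and sends it to noise estimation unit 24. The complex spectrum itself is sent to spectrum stabilization unit 19.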
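A short sketch of the pseudo-amplitude spectrum and input power of paragraph [0036] follows; the function names are hypothetical. The quantity |Re| + |Im| avoids the square root of the true amplitude and always lies between the true amplitude and √2 times it.

```python
def pseudo_amplitude_spectrum(complex_spectrum):
    """Input spectrum: |Re| + |Im| at each frequency (no square root needed)."""
    return [abs(c.real) + abs(c.imag) for c in complex_spectrum]

def input_power(input_spectrum):
    """Input power: sum of the input spectrum over all frequencies."""
    return sum(input_spectrum)
```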
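[0037]
Next, the processing in noise estimation unit 24 is described.
Noise estimation unit 24 compares the input power from Fourier transform unit 17 with the maximum power held in maximum power storage unit 29. If the maximum power is smaller, it sets the maximum power to the input power and stores that value in maximum power storage unit 29. It then performs noise estimation if at least one of the following three conditions holds, and skips it if none holds.
(1) The input power is smaller than the maximum power multiplied by the silence detection coefficient.
(2) The noise reduction coefficient is larger than the specified noise reduction coefficient plus 0.2.
(3) The input power is smaller than the average noise power from noise spectrum storage unit 25 multiplied by 1.6.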
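The gating logic of paragraph [0037] might be written as follows. Function and parameter names are hypothetical, and the silence detection coefficient is taken as an argument because its value belongs to Table 1, which is not reproduced here.

```python
def update_max_power(input_power, max_power):
    """Track the largest input power seen so far."""
    return input_power if max_power < input_power else max_power

def should_estimate_noise(input_power, max_power, noise_reduction_coef,
                          specified_coef, avg_noise_power, silence_detect_coef):
    """True if at least one of conditions (1) to (3) holds."""
    cond1 = input_power < max_power * silence_detect_coef
    cond2 = noise_reduction_coef > specified_coef + 0.2
    cond3 = input_power < avg_noise_power * 1.6
    return cond1 or cond2 or cond3
```

Condition (2) keeps estimation running while the noise reduction coefficient is still converging from its large initial value toward the specified value.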
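[0038]
The noise estimation algorithm in noise estimation unit 24 is as follows.
First, the persistence counts of the first and second candidates held in noise spectrum storage unit 25 are updated (incremented by 1) at every frequency. Then the persistence count of the first candidate is checked at each frequency. If it exceeds the preset noise spectrum reference persistence count, the second candidate's compensation spectrum and persistence count become the first candidate, and the third candidate's compensation spectrum becomes the second candidate's compensation spectrum with its persistence count set to 0. Memory can be saved in this replacement of the second candidate by not storing a third candidate and substituting a slightly enlarged second candidate instead. In this embodiment, the second candidate's compensation spectrum multiplied by 1.4 is used as the substitute.
[0039]
After the persistence counts are updated, the compensation noise spectrum is compared with the input spectrum at each frequency. First, the input spectrum at each frequency is compared with the first-candidate compensation noise spectrum. If the input spectrum is smaller, the first candidate's compensation noise spectrum and persistence count become the second candidate, the input spectrum becomes the first candidate's compensation spectrum, and the first candidate's persistence count is set to 0. Otherwise, the input spectrum is compared with the second-candidate compensation noise spectrum; if the input spectrum is smaller, it becomes the second candidate's compensation spectrum and the second candidate's persistence count is set to 0. The resulting first and second candidates and their persistence counts are stored in noise spectrum storage unit 25. At the same time, the average noise spectrum is updated according to (Equation 2).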
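The candidate bookkeeping of paragraphs [0038] and [0039] can be sketched for a single frequency as below. The class and function names are hypothetical; the factor 1.4 is the substitute for the third candidate given in the text. The average noise spectrum update of (Equation 2) is deliberately left out because the equation is not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    spectrum: float      # compensation noise spectrum value at one frequency
    persistence: int     # frames since this value last changed

def age_candidates(first, second, ref_persistence, substitute_gain=1.4):
    """Increment both persistence counts and retire a stale first candidate."""
    first.persistence += 1
    second.persistence += 1
    if first.persistence > ref_persistence:
        # Promote the second candidate to first.
        first.spectrum, first.persistence = second.spectrum, second.persistence
        # No third candidate is stored: use 1.4 x the old second candidate.
        second.spectrum *= substitute_gain
        second.persistence = 0

def compare_with_input(first, second, x):
    """Keep the two smallest recent values of the input spectrum x."""
    if x < first.spectrum:
        second.spectrum, second.persistence = first.spectrum, first.persistence
        first.spectrum, first.persistence = x, 0
    elif x < second.spectrum:
        second.spectrum, second.persistence = x, 0
```

Calling `age_candidates` and then `compare_with_input` once per frame and per frequency tracks a running minimum that can still rise again when the noise floor increases, because stale minima expire.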
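[0040]
[Equation 2]

[0041]
The average noise spectrum is a pseudo-average of the noise spectrum, and the coefficient g in (Equation 2) adjusts how fast it is learned. When the input power is small compared with the noise power, the frame is likely to be a noise-only section, so g raises the learning speed; otherwise the frame may lie within a speech section, so g lowers the learning speed.
[0042]
The values of the average noise spectrum are then summed over all frequencies to give the average noise power. The compensation noise spectrum, average noise spectrum, and average noise power are stored in noise spectrum storage unit 25.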
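[0043]
In the noise estimation above, the RAM needed for noise spectrum storage unit 25 can be reduced by letting the noise spectrum at one frequency correspond to the input spectrum at several frequencies. As an example, consider the 256-point FFT of this embodiment with the noise spectrum at one frequency estimated from the input spectrum at four frequencies. Because the (pseudo-)amplitude spectrum is symmetric on the frequency axis, estimating at every frequency requires storing spectra and persistence counts for 128 frequencies: 128 (frequencies) × 2 (spectrum and persistence count) × 3 (first and second compensation candidates, and the average), for a total of 768 words of RAM.
[0044]
By contrast, when the noise spectrum at one frequency corresponds to the input spectrum at four frequencies, 32 (frequencies) × 2 (spectrum and persistence count) × 3 (first and second compensation candidates, and the average) = 192 words of RAM suffice. The frequency resolution of the noise spectrum is lower, but experiments confirmed almost no loss of performance for this 1-to-4 mapping. Because the noise spectrum is then not estimated from the spectrum at a single frequency, the scheme also helps prevent a long stationary sound (a sine wave, a vowel, and so on) from being mistaken for noise.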
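The RAM arithmetic of paragraphs [0043] and [0044] can be checked with a few lines. The function names are hypothetical, and mapping each group of four adjacent input bins to one noise bin by integer division is an assumption: the text states the 1-to-4 ratio but not which bins are grouped.

```python
def noise_ram_words(num_noise_bins, values_per_entry=2, entries_per_bin=3):
    """Words of RAM: bins x (spectrum, persistence) x (1st, 2nd, average)."""
    return num_noise_bins * values_per_entry * entries_per_bin

def noise_bin_index(input_bin, group=4):
    """Assumed mapping from an input-spectrum bin to its shared noise bin."""
    return input_bin // group
```

With these definitions `noise_ram_words(128)` gives 768 and `noise_ram_words(32)` gives 192, matching the figures in the text.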
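[0045]
Next, the processing in noise reduction/spectrum compensation unit 18 is described.
The average noise spectrum held in noise spectrum storage unit 25, multiplied by the noise reduction coefficient from noise reduction coefficient adjustment unit 14, is subtracted from the input spectrum (the result is hereafter called the difference spectrum). If the RAM-saving scheme described for noise estimation unit 24 is used, the average noise spectrum at the frequency corresponding to each input spectrum frequency is multiplied by the noise reduction coefficient and subtracted. Where the difference spectrum is negative, it is compensated by substituting the first-candidate compensation noise spectrum from noise spectrum storage unit 25 multiplied by the compensation coefficient from noise reduction coefficient adjustment unit 14. This is done for every frequency. Flag data is created per frequency so that compensated frequencies can be identified: for example, one entry per frequency, set to 0 when not compensated and 1 when compensated. The flag data is sent with the difference spectrum to spectrum stabilization unit 19. The total number of compensated frequencies (the compensation count) is obtained from the flag data and is also sent to spectrum stabilization unit 19.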
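A minimal sketch of the subtraction and compensation step of paragraph [0045], assuming the full-resolution case (one noise value per input frequency). The function and argument names are hypothetical.

```python
def subtract_and_compensate(input_spec, avg_noise, comp_noise_first,
                            reduction_coef, comp_coef):
    """Return (difference spectrum, per-frequency flags, compensation count)."""
    diff, flags = [], []
    for x, n_avg, n_comp in zip(input_spec, avg_noise, comp_noise_first):
        d = x - reduction_coef * n_avg
        if d < 0.0:
            # Over-subtracted: substitute the scaled first-candidate noise.
            diff.append(comp_coef * n_comp)
            flags.append(1)
        else:
            diff.append(d)
            flags.append(0)
    return diff, flags, sum(flags)
```

The flags are what later allows the phase to be randomized only at the compensated frequencies.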
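[0046]
Next, the processing in spectrum stabilization unit 19 is described. This processing mainly serves to reduce abnormal sounds in sections that contain no speech.
[0047]
First, the difference spectrum from noise reduction/spectrum compensation unit 18 is summed over frequency to obtain the current frame power. Two values are computed: the full-band power, over all frequencies (0 to 128 in this embodiment), and the mid-band power, over the perceptually important middle band (16 to 79 in this embodiment).
[0048]
Likewise, the first-candidate compensation noise spectrum held in noise spectrum storage unit 25 is summed to give the current frame noise power (full band and mid band). The compensation count from noise reduction/spectrum compensation unit 18 is then checked. If it is sufficiently large and at least one of the following three conditions holds, the current frame is judged to be a noise-only section and spectrum stabilization is performed.
(1) The input power is smaller than the maximum power multiplied by the silence detection coefficient.
(2) The current frame power (mid band) is smaller than the current frame noise power (mid band) multiplied by 5.0.
(3) The input power is smaller than the noise reference power.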
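The noise-only decision of paragraphs [0047] and [0048] could look like this. All names are hypothetical. The threshold `min_comp_count` is an assumption standing in for "sufficiently large", which the text does not quantify, and the silence detection coefficient and noise reference power are Table 1 parameters passed in by the caller.

```python
FULL_BAND = (0, 128)   # inclusive bin range, from the embodiment
MID_BAND = (16, 79)    # inclusive bin range, from the embodiment

def band_power(spectrum, lo, hi):
    """Sum of the spectrum over bins lo..hi inclusive."""
    return sum(spectrum[lo:hi + 1])

def is_noise_only_frame(comp_count, min_comp_count, input_power, max_power,
                        silence_detect_coef, frame_power_mid, noise_power_mid,
                        noise_ref_power):
    """Compensation count large enough AND at least one of (1) to (3)."""
    if comp_count < min_comp_count:   # assumed reading of "sufficiently large"
        return False
    cond1 = input_power < max_power * silence_detect_coef
    cond2 = frame_power_mid < noise_power_mid * 5.0
    cond3 = input_power < noise_ref_power
    return cond1 or cond2 or cond3
```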
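[0049]
When stabilization is not performed, the noise continuation count held in previous spectrum storage unit 26 is decremented by 1 if it is positive, the current frame noise power (full band and mid band) is taken as the previous frame power (full band and mid band), each value is stored in previous spectrum storage unit 26, and processing proceeds to the phase diffusion processing.
[0050]
The spectrum stabilization processing is as follows. Its purpose is to stabilize the spectrum and reduce the power in silent sections (noise-only sections without speech). There are two kinds of processing: (Process 1) is performed when the noise continuation count is smaller than the noise reference continuation count, and (Process 2) otherwise.
(Process 1) The noise continuation count held in previous spectrum storage unit 26 is incremented by 1, the current frame noise power (full band and mid band) is taken as the previous frame power (full band and mid band), each value is stored in previous spectrum storage unit 26, and processing proceeds to phase adjustment.
(Process 2) The previous frame power and previous frame smoothed power held in previous spectrum storage unit 26, together with the fixed silence power reduction coefficient, are referenced, and each is updated according to (Equation 3).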
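[0051]
[Equation 3]

[0052]
Next, these powers are reflected in the difference spectrum. For this purpose two coefficients are computed: a coefficient applied to the mid band (hereafter, coefficient 1) and a coefficient applied to the full band (hereafter, coefficient 2). Coefficient 1 is computed first, by (Equation 4).
[0053]
[Equation 4]

[0054]
Coefficient 2 is affected by coefficient 1, so obtaining it is somewhat more involved. The procedure is as follows.
(1) If the previous frame smoothed power (full band) is smaller than the previous frame power (mid band), or the current frame noise power (full band) is smaller than the current frame noise power (mid band), go to (2). Otherwise go to (3).
(2) Set coefficient 2 to 0.0, set the previous frame power (full band) to the previous frame power (mid band), and go to (6).
(3) If the current frame noise power (full band) equals the current frame noise power (mid band), go to (4). If they differ, go to (5).
(4) Set coefficient 2 to 1.0 and go to (6).
(5) Obtain coefficient 2 by (Equation 5) and go to (6).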
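[0055]
[Equation 5]

[0056]
(6) End of the coefficient 2 calculation.
Coefficients 1 and 2 obtained by the above algorithm are both clipped to an upper limit of 1.0 and a lower limit equal to the silence power reduction coefficient. The difference spectrum at the mid-band frequencies (16 to 79 in this example) is multiplied by coefficient 1, and the difference spectrum at the remaining full-band frequencies (0 to 15 and 80 to 128 in this example) is multiplied by coefficient 2; the results become the new difference spectrum. Accordingly, the previous frame power (full band and mid band) is converted by (Equation 6).
[0057]
[Equation 6]

[0058]
The power data obtained in this way are all stored in previous spectrum storage unit 26, and (Process 2) ends.
[0059]
Spectrum stabilization in spectrum stabilization unit 19 is performed as described above.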
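[0060]
Next, the phase adjustment processing is described. Conventional spectral subtraction leaves the phase unchanged in principle, but this embodiment randomly changes the phase at frequencies whose spectrum was compensated during subtraction. This makes the residual noise more random, so it is less likely to leave a bad perceptual impression.
[0061]
First, the random phase counter held in random phase storage unit 27 is read. Then the flag data (indicating whether compensation occurred) is consulted for every frequency, and where compensation occurred, the phase of the complex spectrum from Fourier transform unit 17 is rotated according to (Equation 7).
[0062]
[Equation 7]

[0063]
(Equation 7) uses the random phase data in pairs. Each time the above processing is performed, the random phase counter is therefore increased by 2, and it is reset to 0 when it reaches the upper limit (16 in this embodiment). The random phase counter is stored back in random phase storage unit 27, and the resulting complex spectrum is sent to inverse Fourier transform unit 20. The sum of the difference spectrum (hereafter, the difference spectrum power) is also computed and sent to spectrum enhancement unit 21.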
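The counter handling of paragraph [0063] can be sketched as below. Because (Equation 7) and Table 2 are not reproduced here, the rotation itself is an assumption: each pair of phase values (a, b) is treated as a complex factor a + jb that multiplies the spectrum value. The function name and the table passed in are hypothetical.

```python
def rotate_compensated_phases(complex_spectrum, flags, phase_table, counter):
    """Rotate flagged bins using consecutive pairs of phase data.

    The counter advances by 2 per rotation and wraps to 0 at the table length.
    """
    limit = len(phase_table)          # 16 in the embodiment
    out = list(complex_spectrum)
    for k, flagged in enumerate(flags):
        if flagged:
            a, b = phase_table[counter], phase_table[counter + 1]
            out[k] = complex_spectrum[k] * complex(a, b)  # assumed rotation form
            counter += 2
            if counter >= limit:
                counter = 0
    return out, counter
```

If each pair lies on the unit circle (a² + b² = 1), the multiplication changes only the phase and leaves the magnitude untouched.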
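[0064]
Inverse Fourier transform unit 20 constructs a new complex spectrum from the amplitude of the difference spectrum obtained by spectrum stabilization unit 19 and the phase of the complex spectrum, and performs an inverse Fourier transform using an FFT. (The resulting signal is called the first-order output signal.) It sends the first-order output signal to spectrum enhancement unit 21.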
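[0065]
Next, the processing in spectrum enhancement unit 21 is described.
First, the MA enhancement coefficient and the AR enhancement coefficient are selected by referring to the average noise power held in noise spectrum storage unit 25, the difference spectrum power from spectrum stabilization unit 19, and the constant noise reference power. The selection evaluates the following two conditions.
(Condition 1) The difference spectrum power is larger than the average noise power held in noise spectrum storage unit 25 multiplied by 0.6, and the average noise power is larger than the noise reference power.
(Condition 2) The difference spectrum power is larger than the average noise power.
[0066]
If (Condition 1) holds, the frame is treated as a "voiced section": the MA enhancement coefficient is set to MA enhancement coefficient 1-1, the AR enhancement coefficient to AR enhancement coefficient 1-1, and the high-frequency emphasis coefficient to high-frequency emphasis coefficient 1. If (Condition 1) does not hold but (Condition 2) does, the frame is treated as an "unvoiced consonant section": the MA enhancement coefficient is set to MA enhancement coefficient 1-0, the AR enhancement coefficient to AR enhancement coefficient 1-0, and the high-frequency emphasis coefficient to 0. If neither condition holds, the frame is treated as a "silent section, noise-only section": the MA enhancement coefficient is set to MA enhancement coefficient 0, the AR enhancement coefficient to AR enhancement coefficient 0, and the high-frequency emphasis coefficient to high-frequency emphasis coefficient 0.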
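The three-way selection of paragraphs [0065] and [0066] can be sketched as a small classifier. The function name and string labels are hypothetical, and the coefficient sets are referred to only by the names used in the text, since their values belong to Table 1.

```python
def classify_frame(diff_spectrum_power, avg_noise_power, noise_ref_power):
    """Return the section type that decides which coefficient set is used."""
    cond1 = (diff_spectrum_power > avg_noise_power * 0.6
             and avg_noise_power > noise_ref_power)
    cond2 = diff_spectrum_power > avg_noise_power
    if cond1:
        return "voiced"
    if cond2:
        return "unvoiced_consonant"
    return "silence"

# Names of the coefficient sets as given in the text (values not shown here).
COEFFICIENT_SETS = {
    "voiced": ("MA enhancement coefficient 1-1", "AR enhancement coefficient 1-1"),
    "unvoiced_consonant": ("MA enhancement coefficient 1-0", "AR enhancement coefficient 1-0"),
    "silence": ("MA enhancement coefficient 0", "AR enhancement coefficient 0"),
}
```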
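[0067]
Then, from the linear prediction coefficients from LPC analysis unit 16 and the MA and AR enhancement coefficients above, the MA coefficients and AR coefficients of the pole enhancement filter are computed by (Equation 8).
[0068]
[Equation 8]

[0069]
The pole enhancement filter with these MA and AR coefficients is applied to the first-order output signal from inverse Fourier transform unit 20. The transfer function of this filter is shown in (Equation 9).
[0070]
[Equation 9]

[0071]
To emphasize high-frequency components, a high-frequency emphasis filter using the high-frequency emphasis coefficient above is then applied. The transfer function of this filter is shown in (Equation 10).
[0072]
[Equation 10]

[0073]
The signal obtained by the above processing is called the second-order output signal. The filter states are kept inside spectrum enhancement unit 21.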
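[0074]
Finally, waveform matching unit 22 overlaps the second-order output signal from spectrum enhancement unit 21 with the signal held in previous waveform storage unit 28, using a triangular window, to obtain the output signal. The last look-ahead-length samples of this output signal are then stored in previous waveform storage unit 28. The matching method is shown in (Equation 11).
[0075]
[Equation 11]

[0076]
Note that although look-ahead length plus frame length samples are output, only the section of frame length from the start of the data can be treated as the signal, because the trailing look-ahead-length samples are rewritten when the next output signal is produced. Continuity is nevertheless guaranteed over the whole output signal, so the full span can be used for frequency analysis such as LPC analysis and filter analysis.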
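The overlap of paragraph [0074] can be illustrated with a linear crossfade. (Equation 11) is not reproduced here, so the exact window weights below are an assumption: the stored tail fades out linearly while the new signal fades in over the look-ahead length. The function name is hypothetical.

```python
def match_waveform(second_output, prev_tail):
    """Crossfade the stored tail into the start of the new output.

    Returns (output signal, new tail to store for the next frame).
    """
    n = len(prev_tail)                      # look-ahead data length
    out = list(second_output)
    for j in range(n):
        fade_in = j / n                     # assumed triangular weights
        out[j] = second_output[j] * fade_in + prev_tail[j] * (1.0 - fade_in)
    return out, out[-n:]
```

Only the leading frame-length samples of the returned signal are final; the stored tail is blended into the start of the next call, which matches the rewriting described in paragraph [0076].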
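[0077]
[Effects of the invention]
As described above, the present invention can estimate the noise spectrum both inside and outside speech sections, so the noise spectrum can be estimated even when it is not known when speech is present in the data. The features of the input spectral envelope can also be emphasized with the linear prediction coefficients, which prevents degradation of sound quality even at high noise levels.
[0078]
The noise spectrum can be estimated from two directions, the average and the minimum, which allows more accurate reduction.
[0079]
Using the average noise spectrum for subtraction removes more of the noise spectrum, and estimating the compensation spectrum separately allows more accurate compensation.
[0080]
The spectrum of noise-only sections that contain no speech can be smoothed, which prevents the abnormal sounds that extreme spectral fluctuation caused by noise reduction would produce in those sections.
[0081]
The phase of the compensated frequency components can be randomized, which converts the residual noise that could not be removed into noise that sounds less abnormal.
[0082]
In speech sections, perceptually more appropriate weighting becomes possible, and in silent sections and unvoiced consonant sections the abnormal sounds caused by perceptual weighting can be suppressed.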
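[Brief description of the drawings]
[FIG. 1] Functional block diagram of the main part of a noise reduction device according to an embodiment of the present invention
[FIG. 2] Functional block diagram of a conventional noise reduction device using spectral subtraction
[Explanation of reference signs]
11 Input signal
12 A/D conversion unit
13 Noise reduction coefficient storage unit
14 Noise reduction coefficient adjustment unit
15 Input waveform setting unit
16 LPC analysis unit
17 Fourier transform unit
18 Noise reduction/spectrum compensation unit
19 Spectrum stabilization unit
20 Inverse Fourier transform unit
21 Spectrum enhancement unit
22 Waveform matching unit
23 Output speech
24 Noise estimation unit
25 Noise spectrum storage unit
26 Previous spectrum storage unit
27 Random phase storage unit
28 Previous waveform storage unit

[0001]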
BACKGROUND OF THE INVENTION
The present invention relates to a noise estimation device and noise estimation method that estimate the background noise component of input speech, and to a noise reduction device and noise reduction method that remove that background noise component, for use in building the speech encoding/decoding devices (speech codecs) and speech input/output devices needed for digital cellular phones, multimedia communication, and the like.
[0002]
[Prior art]
In the field of digital mobile communications such as cellular phones, a low bit rate speech compression coding method is required to cope with the increase in subscribers, and research and development is progressing at research institutions. In Japan, VSELP (11.2 kbps) developed by Motorola and PSI-CELP (5.6 kbps) developed by NTT Mobile Communications Network, Inc. have been adopted as standard coding schemes for digital mobile phones, and phones equipped with these schemes are already on the market in Japan. Internationally, ITU-T has standardized 16 kbps (LD-CELP) and 8 kbps (CS-ACELP) schemes, which are currently at the product development stage.
[0003]
Each of these schemes is an improvement on CELP (Code Excited Linear Prediction; described in M. R. Schroeder, "High Quality Speech at Low Bit Rates", Proc. ICASSP '85, pp. 937-940). CELP separates speech into sound source (excitation) information and vocal tract information. The sound source information is encoded as indices into a plurality of sound source samples stored in a codebook, and the vocal tract information is encoded as LPC (linear prediction coefficients). Its distinguishing feature is a method called analysis by synthesis (A-b-S), in which the vocal tract information is taken into account when candidates are compared with the input speech during sound source encoding.
[0004]
Although the above-mentioned technique has made it possible to transmit an audio signal at a very low bit rate, a major problem has been revealed. That is, since information compression is performed based on a voice utterance model, it cannot cope with an acoustic signal other than a voice signal. For this reason, if background noise or device noise is included in the audio signal, efficient encoding cannot be performed, and abnormal noise is generated during synthesis (decoding).
[0005]
In order to solve this problem, methods for reducing noise from an input audio signal have been studied conventionally. In the standardized PSI-CELP, noise is reduced by a noise canceller before encoding. The noise canceller is developed on the basis of a Kalman filter, and reduces noise by detecting the presence or absence of speech and performing adaptive control. This noise canceller can reduce a certain amount of background noise. However, not so good performance has been obtained for noise with a high noise level, noise in speech, and the like.
[0006]
A more powerful noise reduction method is spectral subtraction (described in S. F. Boll, "Suppression of Acoustic Noise in Speech Using Spectral Subtraction", IEEE Trans. ASSP, Vol. 27, No. 2, pp. 113-120, 1979). The input speech signal is converted to a spectrum by a discrete Fourier transform, and the noise is then subtracted in the spectral domain. The method has mainly been applied to the input unit of speech recognition apparatuses.
[0007]
An example in which this method is applied to noise reduction in an audio signal will be described with reference to FIG. That is, the noise spectrum is estimated by the following procedure. First, a noise-only signal 31 that does not include sound is input and converted into a digital signal by an A / D converter 32. Next, the Fourier transform unit 33 performs discrete Fourier transform on an input signal sequence (referred to as a frame) having a fixed time length to obtain a noise spectrum. Then, the noise analysis unit 34 obtains an average spectrum of noise from the noise spectra obtained for a plurality of frames, and stores this in the noise spectrum storage unit 35. The noise is reduced by the following procedure.
[0008]
An audio signal 36 including noise is input and converted into a digital signal by an A / D converter 37. Next, in the same manner as described above, the Fourier transform unit 38 performs discrete Fourier transform to obtain a spectrum of speech including noise. Then, the noise reduction unit 39 subtracts the noise spectrum stored in the noise spectrum storage unit 35 from the voice spectrum. The inverse Fourier transform unit 40 performs inverse Fourier transform on the spectrum obtained as a result, and an output signal 41 is obtained.
[0009]
The spectrum used in this algorithm is generally the amplitude spectrum (the norm of the complex value in the complex plane, obtained by squaring the real part and the imaginary part, adding them, and taking the square root). When the subtraction is done on the amplitude spectrum, one option for the phase component in the inverse Fourier transform is to use the phase component of the input signal as it is.
[0010]
[Problems to be solved by the invention]
Although the above spectral subtraction method is a more powerful noise reduction method, since noise estimation is difficult, there have been few examples that have been used in real-time speech processing apparatuses so far.
[0011]
That is, in order to apply the spectral subtraction method to a real-time speech processing apparatus, the following problems have been encountered.
<1> Since it is not clear at what timing the voice is present in the data, it is difficult to estimate the noise spectrum.
<2> When the noise level is high, the spectrum is greatly distorted and the sound quality is deteriorated.
<3> An unusual sound is produced in a silent section (a section of only noise without sound).
[0012]
An object of the present invention is to provide a noise estimation device and noise estimation method capable of noise spectrum estimation, and a noise reduction device and noise reduction method that can estimate the noise spectrum, reduce the perception of abnormal sounds, and suppress deterioration of sound quality.
[0013]
[Means for Solving the Problems]
To solve the above problems, the present invention is configured as follows. A noise estimation unit estimates the noise spectrum by comparing the input spectrum obtained by a Fourier transform unit with the noise spectrum stored in a noise spectrum storage unit, and stores the resulting noise spectrum in the noise spectrum storage unit. A noise reduction/spectrum compensation unit subtracts the stored noise spectrum from the input spectrum, based on a coefficient obtained by a noise reduction coefficient adjustment unit, then examines the result and compensates the spectrum at frequencies where too much was subtracted. A spectrum stabilization unit stabilizes the spectrum obtained by the noise reduction/spectrum compensation unit and, within the phase of the complex spectrum obtained by the Fourier transform unit, adjusts the phase of the frequencies that were compensated.
[0014]
Thereby, noise spectrum estimation can be performed both inside and outside the speech section, and an excellent noise reduction device capable of suppressing deterioration of sound quality can be obtained.
[0015]
DETAILED DESCRIPTION OF THE INVENTION
The present invention is a noise estimation device that estimates two types of noise spectra so that noise reduction can be performed using them. It comprises: an A/D conversion unit that converts an input speech signal into a digital signal; a Fourier transform unit that applies a discrete Fourier transform to a fixed-length segment (one frame) of the digital signal from the A/D conversion unit to obtain an input spectrum and a complex spectrum; a noise spectrum storage unit that stores the two types of noise spectra, namely an average noise spectrum used for noise reduction and a compensation noise spectrum used to compensate frequencies whose spectrum was over-subtracted during noise reduction; and a noise estimation unit that newly estimates the two types of noise spectra, namely a new compensation noise spectrum obtained by comparing the input spectrum from the Fourier transform unit with the compensation noise spectrum stored in the noise spectrum storage unit, and a new average noise spectrum obtained by applying a learning formula to the input spectrum, and stores both in the noise spectrum storage unit. In particular, the noise estimation unit first judges whether the current frame is a noise section. If it is, the unit compares the input spectrum with the compensation noise spectrum at each frequency and, where the input spectrum is smaller, takes the input spectrum as the new compensation noise spectrum at that frequency. Separately, it estimates the new average noise spectrum with the learning formula, which accumulates the input spectrum at a fixed ratio. It then stores both new spectra in the noise spectrum storage unit. This allows the noise spectrum to be estimated from two directions, the average and the minimum, and a noise reduction device built on this estimate can perform more accurate reduction.
The present invention is also a noise reduction device comprising: an A/D conversion unit that converts an input speech signal into a digital signal; a noise reduction coefficient adjustment unit that adjusts the coefficient determining the amount of reduction; a Fourier transform unit that applies a discrete Fourier transform to a fixed-length segment (one frame) of the digital signal from the A/D conversion unit to obtain an input spectrum and a complex spectrum; a noise spectrum storage unit that stores two types of noise spectra, namely an average noise spectrum used for noise reduction and a compensation noise spectrum used to compensate frequencies whose spectrum was over-subtracted during noise reduction; a noise estimation unit that newly estimates the two types of noise spectra, namely a new compensation noise spectrum obtained by comparing the input spectrum with the stored compensation noise spectrum and a new average noise spectrum obtained by applying a learning formula to the input spectrum, and stores both in the noise spectrum storage unit; and a noise reduction/spectrum compensation unit that subtracts the stored average noise spectrum from the input spectrum, based on the coefficient from the noise reduction coefficient adjustment unit, then examines the resulting spectrum and compensates over-subtracted frequencies with the compensation noise spectrum.
The device may further comprise a spectrum stabilization unit that smooths those spectra from the noise reduction/spectrum compensation unit that are judged to be noise sections and, within the phase of the complex spectrum from the Fourier transform unit, adjusts the phase of the frequencies compensated by the noise reduction/spectrum compensation unit. It may further comprise an inverse Fourier transform unit that performs an inverse Fourier transform based on the smoothed spectrum and the adjusted phase spectrum from the spectrum stabilization unit, a spectrum enhancement unit that applies spectrum enhancement to the signal from the inverse Fourier transform unit, and a waveform matching unit that matches the signal from the spectrum enhancement unit to the signal of the previous frame.
Alternatively, the device may further comprise an LPC analysis unit that performs linear prediction analysis (LPC analysis) on the fixed-length digital signal from the A/D conversion unit, an inverse Fourier transform unit that applies an inverse discrete Fourier transform to the noise-reduced input spectrum and the complex spectrum, and a spectrum enhancement unit that applies spectrum enhancement, using the parameters from the LPC analysis unit, to the signal from the inverse Fourier transform unit. This makes it possible to estimate the noise spectrum both inside and outside speech sections and to emphasize the features of the input spectral envelope with the linear prediction coefficients.
[0017]
Also, The noise reduction coefficient obtained by the noise reduction coefficient adjustment unit is multiplied by the average noise spectrum stored in the noise spectrum storage unit and subtracted from the input spectrum obtained by the Fourier transform unit, resulting in a negative spectral value. A noise reduction / spectrum compensation unit is provided that compensates for a certain frequency using a compensation noise spectrum stored in the noise spectrum storage unit. The Noise reduction device By doing By using the average spectrum of noise for the reduction, the noise spectrum can be reduced more greatly, and the compensation spectrum is estimated separately, so that more accurate compensation can be performed.
[0018]
Also, The noise reduction / spectrum compensation unit examines the total power of the spectrum that has been subjected to noise reduction and spectrum compensation and the power of a part of the audibly important band (middle power), and the input signal is a silent section (voice And a spectrum stabilization unit that performs stabilization processing and power reduction processing for all power and mid-range power when it is determined that it is a silent section. The Noise reduction device By doing In addition, it is possible to smooth the spectrum of the noise-only section that does not contain speech, and to prevent the spectrum of the same section from causing extreme spectrum fluctuations for noise reduction.
[0019]
Also, Equipped with a spectrum stabilization unit that performs phase rotation using random numbers based on information on whether or not the spectrum obtained by the noise reduction / spectrum compensation unit has been subjected to the complex spectrum obtained by the Fourier transform unit The Noise reduction device By doing The phase of the compensated frequency component is made random, and the noise that cannot be reduced can be audibly converted into noise with a less unusual sound.
[0020]
Also, the noise reduction device may include a spectrum enhancement unit for which a plurality of weight coefficient sets used for spectrum enhancement are prepared in advance and which, at the time of noise reduction, selects a weight coefficient set according to the state of the input signal and performs spectrum enhancement using the selected set. In voiced sections, perceptually appropriate weighting can then be applied, and in silent sections and unvoiced consonant sections, the abnormal-sound impression caused by perceptual weighting can be suppressed.
[0021]
Hereinafter, an embodiment of the present invention will be described with reference to FIG.
(Embodiment)
FIG. 1 is a functional block diagram of the main part of the noise reduction apparatus according to the present embodiment. In FIG. 1, 11 is an input signal, 12 is an A / D conversion unit, 13 is a noise reduction coefficient storage unit, 14 is a noise reduction coefficient adjustment unit, 15 is an input waveform setting unit, 16 is an LPC analysis unit, 17 is a Fourier transform unit, 18 is a noise reduction / spectrum compensation unit, 19 is a spectrum stabilization unit, 20 is an inverse Fourier transform unit, 21 is a spectrum enhancement unit, 22 is a waveform matching unit, 23 is an output signal, 24 is a noise estimation unit, 25 is a noise spectrum storage unit, 26 is a previous spectrum storage unit, 27 is a random number phase storage unit, 28 is a previous waveform storage unit, and 29 is a maximum power storage unit.
[0022]
First, the initial setting will be described. Table 1 shows fixed parameter names and setting examples.
[0023]
[Table 1]

[0024]
The random number phase storage unit 27 stores phase data for adjusting the phase. These are used in the spectrum stabilization unit 19 to rotate the phase. An example in the case of eight types of phase data is shown in (Table 2).
[0025]
[Table 2]

[0026]
Further, a counter (random number phase counter) for using the phase data is also stored in the random number phase storage unit 27. This value is initialized to 0 in advance and stored.
[0027]
Next, a static RAM area is set. That is, the noise reduction coefficient storage unit 13, the noise spectrum storage unit 25, the previous spectrum storage unit 26, the previous waveform storage unit 28, and the maximum power storage unit 29 are cleared. Below, explanation of each storage unit and setting examples will be described.
[0028]
The noise reduction coefficient storage unit 13 is an area for storing the noise reduction coefficient, and stores 20.0 as its initial value. The noise spectrum storage unit 25 is an area that stores the average noise power, the average noise spectrum, the compensation noise spectrum of the first candidate, the compensation noise spectrum of the second candidate, and, for each frequency, the number of frames (persistence number) indicating how many frames ago the spectrum value of that frequency last changed. As initial values, it stores a sufficiently large value for the average noise power, the specified minimum power for the average noise spectrum, and sufficiently large values for the compensation noise spectra and the persistence numbers.
[0029]
The previous spectrum storage unit 26 is an area for storing the compensation noise power, the power of the previous frame (whole area, middle band) (previous frame power), the smoothed power of the previous frame (whole area, middle band) (previous frame smoothing power), and the number of continuous noises. As initial values, it stores a sufficiently large value for the compensation noise power, 0.0 for both the previous frame power and the previous frame smoothing power, and the reference number of continuous noises for the number of continuous noises.
[0030]
The previous waveform storage unit 28 is an area for storing the last part of the output signal of the previous frame, of length equal to the read-ahead data length, which is used for matching the output signal; 0 is stored as the initial value of every element. The spectrum enhancement unit 21 performs ARMA filtering and high-frequency enhancement filtering, and the states of these filters are all cleared to zero. The maximum power storage unit 29 is an area for storing the maximum power of the input signal, and stores 0 as the maximum power.
[0031]
Next, the noise reduction algorithm will be described for each block with reference to FIG.
First, the analog input signal 11 including voice is A / D converted by the A / D conversion unit 12 and input in units of one frame length + prefetch data length (160 + 80 = 240 points in the above setting example). The noise reduction coefficient adjustment unit 14 calculates the noise reduction coefficient and the compensation coefficient by (Equation 1), based on the noise reduction coefficient stored in the noise reduction coefficient storage unit 13, the specified noise reduction coefficient, the noise reduction coefficient learning coefficient, and the compensation power increase coefficient. Then, the obtained noise reduction coefficient is stored in the noise reduction coefficient storage unit 13, the input signal obtained by the A / D conversion unit 12 is sent to the input waveform setting unit 15, and the compensation coefficient and the noise reduction coefficient are sent to the noise estimation unit 24 and the noise reduction / spectrum compensation unit 18.
[0032]
[Expression 1]

[0033]
The noise reduction coefficient is a coefficient indicating the ratio by which noise is reduced; the specified noise reduction coefficient is a fixed reduction coefficient specified in advance; the noise reduction coefficient learning coefficient is a coefficient indicating the rate at which the noise reduction coefficient approaches the specified noise reduction coefficient; the compensation coefficient is a coefficient for adjusting the compensation power in spectrum compensation; and the compensation power increase coefficient is a coefficient for adjusting the compensation coefficient.
[0034]
In the input waveform setting unit 15, the input signal from the A / D conversion unit 12 is written, aligned to the end, into a memory array whose length is a power of 2 so that it can be subjected to FFT (fast Fourier transform). The leading part is padded with zeros. In the above setting example, 0 is written to indices 0 to 15 and the input signal to indices 16 to 255 of an array of length 256. This array is used as the real part in the 8th-order (2^8 = 256-point) FFT. For the imaginary part, an array of the same length as the real part is prepared, and 0 is written to all elements.
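As a minimal sketch of the buffer layout just described, assuming the setting example of 240 input points and a 256-point FFT (the function name `set_input_waveform` is hypothetical and not taken from the specification):

```python
def set_input_waveform(samples, fft_len=256):
    """Write the input samples at the end of a zero-padded real array.

    Returns (real, imag): the real part holds leading zeros followed by
    the input samples, and the imaginary part is all zeros.
    """
    if len(samples) > fft_len:
        raise ValueError("input is longer than the FFT length")
    pad = fft_len - len(samples)          # 16 zeros in the setting example
    real = [0.0] * pad + [float(s) for s in samples]
    imag = [0.0] * fft_len
    return real, imag


# 240 samples (one frame of 160 plus 80 prefetch points) land at indices 16..255.
real, imag = set_input_waveform(range(1, 241))
```

With 240 samples, the first 16 elements of `real` stay zero and the signal occupies the remaining 240 positions.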
[0035]
The LPC analysis unit 16 applies a Hamming window to the real part area set by the input waveform setting unit 15, performs autocorrelation analysis on the windowed waveform to obtain the autocorrelation coefficients, and performs LPC analysis based on the autocorrelation method to obtain the linear prediction coefficients. The obtained linear prediction coefficients are then sent to the spectrum enhancement unit 21.
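The autocorrelation method is commonly implemented with the Levinson-Durbin recursion. The following is a sketch under that assumption; the specification does not state which solver, which LPC order, or which sign convention is used, so the function names, the `order` parameter, and the convention A(z) = 1 + Σ a_i z^-i are all illustrative choices.

```python
import math


def hamming(n):
    """Standard Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the (already windowed) signal x."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations; returns (a, err) with a[0] == 1.0.

    Assumed convention: A(z) = 1 + sum_{i>=1} a[i] * z^-i.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:                    # silent or degenerate frame
            break
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                    # reflection coefficient
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return a, err


def lpc_from_frame(real_part, order):
    """Window, autocorrelate, and solve for the prediction coefficients."""
    w = hamming(len(real_part))
    windowed = [s * wi for s, wi in zip(real_part, w)]
    return levinson_durbin(autocorrelation(windowed, order), order)
```

For example, the autocorrelation sequence `[1.0, 0.5, 0.25]` of a first-order process yields `a = [1.0, -0.5, 0.0]` under this convention.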
[0036]
The Fourier transform unit 17 performs a discrete Fourier transform by FFT using the real-part and imaginary-part memory arrays obtained by the input waveform setting unit 15. By calculating the sum of the absolute values of the real part and the imaginary part of the obtained complex spectrum, a pseudo amplitude spectrum of the input signal (hereinafter referred to as the input spectrum) is obtained. In addition, the sum of the input spectrum values over all frequencies (hereinafter referred to as the input power) is obtained and sent to the noise estimation unit 24. The complex spectrum itself is sent to the spectrum stabilization unit 19.
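The pseudo amplitude spectrum and the input power described above can be sketched as follows. The complex spectrum is taken as an already computed list of Python complex numbers, and the function names are hypothetical.

```python
def pseudo_amplitude_spectrum(complex_spectrum):
    """|Re| + |Im| per frequency, used in place of the true magnitude."""
    return [abs(c.real) + abs(c.imag) for c in complex_spectrum]


def input_power(input_spectrum):
    """Sum of the input spectrum values over all frequencies."""
    return sum(input_spectrum)


spectrum = pseudo_amplitude_spectrum([3 - 4j, 0j, -1 + 2j])
power = input_power(spectrum)
```

Using |Re| + |Im| avoids the square root of the true magnitude, at the cost of an amplitude that depends slightly on phase.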
[0037]
Next, processing in the noise estimation unit 24 will be described.
The noise estimation unit 24 compares the input power obtained by the Fourier transform unit 17 with the maximum power stored in the maximum power storage unit 29. If the stored maximum power is smaller, the input power is stored in the maximum power storage unit 29 as the new maximum power. Noise estimation is then performed when at least one of the following three conditions is satisfied, and is not performed when none of them is satisfied.
(1) The input power is smaller than a value obtained by multiplying the maximum power by the silence detection coefficient.
(2) The noise reduction coefficient is larger than the designated noise reduction coefficient plus 0.2.
(3) The input power is smaller than the average noise power obtained from the noise spectrum storage unit 25 multiplied by 1.6.
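The maximum-power update and the three gating conditions above can be sketched as follows. The value of the silence detection coefficient comes from Table 1, which is not reproduced here, so it is passed in as a parameter; the function and parameter names are hypothetical.

```python
def update_max_power(input_power, max_power):
    """Keep the larger of the stored maximum power and the current input power."""
    return max(input_power, max_power)


def should_estimate_noise(input_power, max_power, silence_coef,
                          nr_coef, specified_nr_coef, avg_noise_power):
    """Return True when at least one of conditions (1) to (3) holds."""
    cond1 = input_power < max_power * silence_coef      # condition (1)
    cond2 = nr_coef > specified_nr_coef + 0.2           # condition (2)
    cond3 = input_power < avg_noise_power * 1.6         # condition (3)
    return cond1 or cond2 or cond3
```

The constants 0.2 and 1.6 are the ones stated in conditions (2) and (3); everything else is supplied by the caller.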
[0038]
Here, a noise estimation algorithm in the noise estimation unit 24 will be described.
First, the persistence numbers of all frequencies of the first candidate and the second candidate stored in the noise spectrum storage unit 25 are updated (incremented by 1). Then, the persistence number of each frequency of the first candidate is examined. If it is larger than the preset noise spectrum reference persistence number, the compensation spectrum and persistence number of the second candidate become those of the first candidate, the compensation spectrum of the third candidate becomes that of the second candidate, and the persistence number of the second candidate is set to 0. However, the third candidate need not actually be stored: memory can be saved by substituting a slightly enlarged second candidate when replacing the compensation spectrum of the second candidate. In this embodiment, the compensation spectrum of the second candidate multiplied by 1.4 is substituted.
[0039]
After the persistence numbers have been updated, the compensation noise spectrum and the input spectrum are compared for each frequency. First, the input spectrum of each frequency is compared with the compensation noise spectrum of the first candidate. If the input spectrum is smaller, the compensation noise spectrum and persistence number of the first candidate become those of the second candidate, the input spectrum becomes the compensation spectrum of the first candidate, and the persistence number of the first candidate is set to 0. Otherwise, the input spectrum is compared with the compensation noise spectrum of the second candidate. If the input spectrum is smaller, the input spectrum becomes the compensation spectrum of the second candidate, and its persistence number is set to 0. Then, the obtained compensation spectra and persistence numbers of the first and second candidates are stored in the noise spectrum storage unit 25. At the same time, the average noise spectrum is also updated according to the following (Equation 2).
[0040]
[Expression 2]

[0041]
The average noise spectrum is an average noise spectrum obtained in a pseudo manner, and the coefficient g in (Equation 2) is a coefficient for adjusting the learning speed of the average noise spectrum. That is, the coefficient has the effect of speeding up learning when the input power is small compared with the noise power, since the section is then likely to contain only noise, and of slowing it down otherwise, since the frame may lie within a speech section.
[0042]
Then, the sum of the values of the average noise spectrum over all frequencies is obtained, and this is used as the average noise power. The compensation noise spectrum, the average noise spectrum, and the average noise power are stored in the noise spectrum storage unit 25.
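The two-candidate update of the compensation noise spectrum described above can be sketched as follows. This is one reading of the text, covering the persistence update, the promotion of the second candidate, and the per-frequency comparison; the average noise spectrum update of (Equation 2) is not included because the equation is not reproduced here. The function name, the in-place list layout, and the parameter `ref_count` (standing for the noise spectrum reference persistence number) are assumptions.

```python
def update_candidates(inp, first, first_cnt, second, second_cnt,
                      ref_count, enlarge=1.4):
    """Update first/second compensation candidates in place, per frequency."""
    for k in range(len(inp)):
        # Every frame, both persistence numbers grow by one.
        first_cnt[k] += 1
        second_cnt[k] += 1

        # A first candidate that has persisted too long is replaced by the
        # second candidate; an enlarged copy of the old second candidate
        # stands in for the third candidate, which is not stored.
        if first_cnt[k] > ref_count:
            first[k], first_cnt[k] = second[k], second_cnt[k]
            second[k], second_cnt[k] = second[k] * enlarge, 0

        # Track the smallest and second-smallest recent input values.
        if inp[k] < first[k]:
            second[k], second_cnt[k] = first[k], first_cnt[k]
            first[k], first_cnt[k] = inp[k], 0
        elif inp[k] < second[k]:
            second[k], second_cnt[k] = inp[k], 0
```

Because a stale minimum is eventually discarded, the first candidate follows the recent spectral floor rather than the all-time minimum.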
[0043]
Further, in the noise estimation process, RAM capacity for the noise spectrum storage unit 25 can be saved by associating the noise spectrum of one frequency with the input spectra of a plurality of frequencies. As an example, consider the 256-point FFT of this embodiment and the RAM capacity of the noise spectrum storage unit 25 when the noise spectrum of one frequency is estimated from the input spectra of four frequencies. Since the (pseudo) amplitude spectrum is symmetrical on the frequency axis, estimating at all frequencies requires storing the spectrum and the persistence number for 128 frequencies, so 128 (frequencies) × 2 (spectrum and persistence number) × 3 (first and second candidates for compensation, and average) = 768 words (W) of RAM are required in total.
[0044]
On the other hand, when the noise spectrum of one frequency is made to correspond to the input spectra of four frequencies, 32 (frequencies) × 2 (spectrum and persistence number) × 3 (first and second candidates for compensation, and average) = 192 words (W) of RAM are sufficient in total. In this case the frequency resolution of the noise spectrum is lowered, but it has been confirmed by experiment that there is almost no deterioration in performance at the above ratio of 1:4. In addition, because the noise spectrum is then not estimated from the spectrum of a single frequency, this also has the effect of preventing a stationary sound (a sine wave, a vowel, etc.) that continues for a long time from being erroneously estimated as a noise spectrum.
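The RAM figures above follow directly from the product of the three factors, as the short sketch below shows. The mapping from an input bin to a noise bin by grouping four consecutive bins is an assumption for illustration; the text states only the 1:4 ratio, not how the frequencies are grouped.

```python
def noise_ram_words(n_noise_bins, values_per_bin=2, kinds=3):
    """Words needed: bins x (spectrum, persistence number) x (1st, 2nd, average)."""
    return n_noise_bins * values_per_bin * kinds


def noise_index(input_bin, group=4):
    """Hypothetical mapping: consecutive groups of input bins share one noise bin."""
    return input_bin // group


full = noise_ram_words(128)      # one noise bin per input frequency
reduced = noise_ram_words(32)    # one noise bin per four input frequencies
```

The reduced layout uses one quarter of the memory of the full layout.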
[0045]
Next, processing in the noise reduction / spectrum compensation unit 18 will be described.
The average noise spectrum stored in the noise spectrum storage unit 25 is multiplied by the noise reduction coefficient obtained by the noise reduction coefficient adjustment unit 14, and the product is subtracted from the input spectrum (the result is hereinafter referred to as the difference spectrum). When the RAM capacity of the noise spectrum storage unit 25 is saved as described for the noise estimation unit 24, the average noise spectrum of the frequency corresponding to each input spectrum frequency is multiplied by the noise reduction coefficient. When the difference spectrum becomes negative, it is compensated by substituting the first candidate of the compensation noise spectrum stored in the noise spectrum storage unit 25 multiplied by the compensation coefficient obtained by the noise reduction coefficient adjustment unit 14. This is done for all frequencies. In addition, flag data is created for each frequency so that the frequencies at which the difference spectrum was compensated can be identified. For example, one area is provided for each frequency, and 0 is written when the frequency is not compensated and 1 when it is compensated. This flag data is sent to the spectrum stabilization unit 19 together with the difference spectrum. The total number of compensated frequencies (the compensation number) is also obtained by examining the flag data, and is likewise sent to the spectrum stabilization unit 19.
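A minimal sketch of the subtraction and compensation step described above, assuming one noise value per input frequency (no RAM-saving grouping); the function and variable names are hypothetical.

```python
def reduce_and_compensate(inp, avg_noise, comp_first, nr_coef, comp_coef):
    """Return (difference spectrum, per-frequency flags, compensation number)."""
    diff, flags = [], []
    for x, n, c in zip(inp, avg_noise, comp_first):
        d = x - nr_coef * n               # subtract scaled average noise
        if d < 0.0:
            diff.append(comp_coef * c)    # over-reduced: substitute the floor
            flags.append(1)
        else:
            diff.append(d)
            flags.append(0)
    return diff, flags, sum(flags)


diff, flags, comp_count = reduce_and_compensate(
    [10.0, 1.0, 5.0], [2.0, 2.0, 2.0], [0.5, 0.5, 0.5],
    nr_coef=1.5, comp_coef=2.0)
```

In this example only the middle frequency goes negative after subtraction, so it alone is flagged and replaced.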
[0046]
Next, processing in the spectrum stabilization unit 19 will be described. This process mainly serves to reduce the abnormal-sound impression in sections that do not include voice.
[0047]
First, the sum of the difference spectrum over frequencies obtained from the noise reduction / spectrum compensation unit 18 is calculated to obtain the current frame power. Two kinds of current frame power are obtained: whole area and middle band. The whole-area power is obtained over all frequencies (referred to as the whole area, 0 to 128 in the present embodiment), and the middle-band power is obtained over the perceptually important middle band (referred to as the middle band, 16 to 79 in the present embodiment).
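The two band sums can be sketched as follows, assuming the difference spectrum is indexed 0 to 128 as in the embodiment; the function name and default band limits are illustrative.

```python
def frame_powers(diff, mid_lo=16, mid_hi=79):
    """Return (whole-area power, middle-band power) of a difference spectrum."""
    whole = sum(diff)
    mid = sum(diff[mid_lo:mid_hi + 1])    # band limits are inclusive
    return whole, mid


whole, mid = frame_powers([1.0] * 129)
```

The same function can be applied to the first candidate of the compensation noise spectrum to obtain the noise powers for the same two bands.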
[0048]
Similarly, the sum is obtained for the first candidate of the compensation noise spectrum stored in the noise spectrum storage unit 25, and this is used as the current frame noise power (whole area, middle band). Here, the compensation number obtained from the noise reduction / spectrum compensation unit 18 is examined, and if it is sufficiently large and at least one of the following three conditions is satisfied, the current frame is determined to be a noise-only section and the spectrum stabilization process is performed.
(1) The input power is smaller than a value obtained by multiplying the maximum power by the silence detection coefficient.
(2) The current frame power (middle range) is smaller than a value obtained by multiplying the current frame noise power (middle range) by 5.0.
(3) The input power is smaller than the noise reference power.
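The noise-only decision above can be sketched as follows. The text says only that the compensation number must be "sufficiently large", so the threshold `min_comp_count` is a hypothetical parameter, as are the function and argument names.

```python
def is_noise_only(comp_count, min_comp_count, input_power, max_power,
                  silence_coef, frame_power_mid, noise_power_mid,
                  noise_ref_power):
    """True when the frame should go through spectrum stabilization."""
    if comp_count < min_comp_count:       # not enough compensated frequencies
        return False
    cond1 = input_power < max_power * silence_coef       # condition (1)
    cond2 = frame_power_mid < noise_power_mid * 5.0      # condition (2)
    cond3 = input_power < noise_ref_power                # condition (3)
    return cond1 or cond2 or cond3
```

The compensation number acts as a precondition, and the three power tests are then combined with a logical OR.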
[0049]
When the stabilization process is not performed, the number of continuous noises stored in the previous spectrum storage unit 26 is decremented by 1 if it is positive, the current frame noise power (whole area, middle band) is set as the previous frame power (whole area, middle band), each value is stored in the previous spectrum storage unit 26, and the process proceeds to the phase spreading process.
[0050]
Here, the spectrum stabilization process will be described. The purpose of this process is to stabilize the spectrum and reduce its power in silent sections (sections of only noise without speech). There are two types of processing. When the number of continuous noises is smaller than the reference number of continuous noises, (Process 1) is performed; otherwise, (Process 2) is performed. The two processes are shown below.
(Process 1) 1 is added to the number of continuous noises stored in the previous spectrum storage unit 26, and the current frame noise power (whole area, middle band) is set as the previous frame power (whole area, middle band). The data is stored in the unit 26 and the process proceeds to the phase adjustment process.
(Process 2) With reference to the previous frame power, the previous frame smoothing power, and the silent power reduction coefficient which is a fixed coefficient stored in the previous spectrum storage unit 26, each is changed according to (Equation 3).
[0051]
[Equation 3]

[0052]
Next, these powers are reflected in the difference spectrum. For this purpose, two coefficients are calculated: a coefficient to be multiplied by the middle range (hereinafter, coefficient 1) and a coefficient to be multiplied by the entire area (hereinafter, coefficient 2). First, the coefficient 1 is calculated by the following equation (Equation 4).
[0053]
[Expression 4]

[0054]
Since the coefficient 2 is affected by the coefficient 1, the procedure for obtaining it is somewhat complicated. The procedure is shown below.
(1) If the previous frame smoothing power (whole area) is smaller than the previous frame power (middle band), or if the current frame noise power (whole area) is smaller than the current frame noise power (middle band), go to (2). Otherwise, go to (3).
(2) Coefficient 2 is set to 0.0, the previous frame power (entire area) is set to the previous frame power (middle area), and the process proceeds to (6).
(3) If the current frame noise power (whole area) is equal to the current frame noise power (middle area), go to (4). If not, go to (5).
(4) Set coefficient 2 to 1.0 and go to (6).
(5) The coefficient 2 is obtained by the following (Equation 5), and the process proceeds to (6).
[0055]
[Equation 5]

[0056]
(6) The coefficient 2 calculation process ends.
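The branching in steps (1) to (6) can be sketched as follows. (Equation 5) is not reproduced in this text, so it is represented by a caller-supplied function `equation_5`; that placeholder, the function name, and the return layout (coefficient 2 together with the possibly updated whole-area previous frame power) are assumptions.

```python
def coefficient_2(prev_smooth_whole, prev_power_whole, prev_power_mid,
                  noise_whole, noise_mid, equation_5):
    """Return (coefficient 2, previous frame power for the whole area)."""
    # Step (1): decide between step (2) and step (3).
    if prev_smooth_whole < prev_power_mid or noise_whole < noise_mid:
        # Step (2): coefficient 2 is 0.0 and the whole-area previous frame
        # power is replaced by the middle-band previous frame power.
        return 0.0, prev_power_mid
    # Step (3): equal noise powers lead to step (4), otherwise step (5).
    if noise_whole == noise_mid:
        return 1.0, prev_power_whole          # step (4)
    return equation_5(), prev_power_whole     # step (5)
```

A caller would pass a closure that evaluates (Equation 5) from the stored powers, for example `coefficient_2(..., equation_5=lambda: value)`.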
The coefficients 1 and 2 obtained by the above algorithm are both clipped to an upper limit of 1.0 and a lower limit equal to the silent power reduction coefficient. Then, the difference spectrum of the middle-band frequencies (16 to 79 in this example) is multiplied by the coefficient 1, and the difference spectrum of the frequencies that remain after excluding the middle band from the whole area (0 to 15 and 80 to 128 in this example) is multiplied by the coefficient 2; the results are used as the new difference spectrum. Accordingly, the previous frame power (whole area, middle band) is converted by the following (Equation 6).
[0057]
[Formula 6]

[0058]
All the various power data and the like thus obtained are stored in the previous spectrum storage unit 26, and (Process 2) is completed.
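The clipping of the two coefficients and their application to the difference spectrum in (Process 2) can be sketched as follows. The power updates of (Equation 3) and (Equation 6) are not included because those equations are not reproduced here; the function names are hypothetical.

```python
def clip(value, lower, upper=1.0):
    """Limit value to the range [lower, upper]."""
    return max(lower, min(upper, value))


def apply_coefficients(diff, coef1, coef2, silent_reduction,
                       mid_lo=16, mid_hi=79):
    """Scale the middle band by coefficient 1 and the rest by coefficient 2."""
    c1 = clip(coef1, silent_reduction)
    c2 = clip(coef2, silent_reduction)
    return [d * (c1 if mid_lo <= k <= mid_hi else c2)
            for k, d in enumerate(diff)]


out = apply_coefficients([2.0] * 129, coef1=1.5, coef2=0.1,
                         silent_reduction=0.25)
```

Here coefficient 1 is clipped down to 1.0 and coefficient 2 is raised to the lower limit of 0.25 before scaling.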
[0059]
In this way, the spectrum is stabilized in the spectrum stabilization unit 19.
[0060]
Next, the phase adjustment process will be described. In conventional spectral subtraction the phase is, in principle, not changed, but in the present embodiment, when the spectrum of a frequency has been compensated at the time of reduction, the phase of that frequency is changed at random. This processing strengthens the randomness of the remaining noise, which makes it less likely to leave a perceptually unpleasant impression.
[0061]
First, the random number phase counter stored in the random number phase storage unit 27 is obtained. Then, the flag data (data indicating the presence or absence of compensation) of every frequency is referred to, and for each frequency that has been compensated, the phase of the complex spectrum obtained by the Fourier transform unit 17 is rotated according to the following (Equation 7).
[0062]
[Expression 7]

[0063]
In (Expression 7), two random number phase data are used in pairs. Therefore, each time the above processing is performed once, the random number phase counter is incremented by 2 and is set to 0 when the upper limit (16 in the present embodiment) is reached. The random number phase counter is stored in the random number phase storage unit 27, and the obtained complex spectrum is sent to the inverse Fourier transform unit 20. Further, the sum of the difference spectra is obtained (hereinafter referred to as difference spectrum power), and this is sent to the spectrum enhancement unit 21.
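The counter handling described above can be sketched as follows. (Equation 7) and Table 2 are not reproduced in this text, so two things here are assumptions rather than the patent's definition: the phase data is taken to be a flat list of 16 numbers read as 8 pairs, and the rotation itself is modelled as multiplication of the complex value by the pair read as (real, imaginary). The function name is hypothetical.

```python
def rotate_compensated(spectrum, flags, phase_data, counter):
    """Rotate compensated bins; return (new spectrum, updated counter)."""
    limit = len(phase_data)               # 16 in the assumed layout
    out = list(spectrum)
    for k, flag in enumerate(flags):
        if flag:
            # Assumed form of the rotation: complex multiplication by the pair.
            rot = complex(phase_data[counter], phase_data[counter + 1])
            out[k] = out[k] * rot
            counter += 2                  # two phase values are used per bin
            if counter >= limit:
                counter = 0               # wrap when the upper limit is reached
    return out, counter


# Every pair is (0, 1), i.e. a 90 degree rotation, for illustration only.
out, counter = rotate_compensated([1 + 0j, 2 + 0j, 3 + 0j], [1, 0, 1],
                                  [0.0, 1.0] * 8, counter=14)
```

Only flagged bins are rotated, and the counter persists across frames so that successive frames do not reuse the same phase values.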
[0064]
In the inverse Fourier transform unit 20, a new complex spectrum is constructed based on the amplitude of the difference spectrum and the phase of the complex spectrum obtained by the spectrum stabilization unit 19, and an inverse Fourier transform is performed using FFT. (The obtained signal is referred to as the primary output signal.) The obtained primary output signal is then sent to the spectrum enhancement unit 21.
[0065]
Next, processing in the spectrum emphasizing unit 21 will be described.
First, the MA enhancement coefficient and the AR enhancement coefficient are selected with reference to the average noise power stored in the noise spectrum storage unit 25, the difference spectrum power obtained by the spectrum stabilization unit 19, and the noise reference power, which is a constant. The selection is performed by evaluating the following two conditions.
(Condition 1) The difference spectrum power is larger than the value obtained by multiplying the average noise power stored in the noise spectrum storage unit 25 by 0.6, and the average noise power is larger than the noise reference power.
(Condition 2) The difference spectrum power is larger than the average noise power.
[0066]
When (Condition 1) is satisfied, this is set as “voiced interval”, the MA enhancement coefficient is set as the MA enhancement coefficient 1-1, the AR enhancement coefficient is set as the AR enhancement coefficient 1-1, and the high frequency enhancement coefficient is set as the high frequency enhancement coefficient. Set to 1. When (Condition 1) is not satisfied but (Condition 2) is satisfied, this is set as “unvoiced consonant section”, the MA enhancement coefficient is set as MA enhancement coefficient 1-0, and the AR enhancement coefficient is set as AR enhancement coefficient 1-0. And the high frequency emphasis coefficient is 0. When (Condition 1) is not satisfied and (Condition 2) is not satisfied, this is set as “silent section, noise only section”, the MA enhancement coefficient is set to MA enhancement coefficient 0, and the AR enhancement coefficient is set to AR enhancement coefficient 0. And the high frequency emphasis coefficient is 0.
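The three-way selection above can be sketched as follows. Only the classification is shown; the actual coefficient values belong to Table 1, which is not reproduced here, so the function returns a label that a caller would map to a coefficient set. The function name and the label strings are hypothetical.

```python
def select_enhancement(diff_power, avg_noise_power, noise_ref_power):
    """Classify the frame to choose the MA/AR/high-frequency coefficient set."""
    cond1 = (diff_power > 0.6 * avg_noise_power
             and avg_noise_power > noise_ref_power)      # (Condition 1)
    cond2 = diff_power > avg_noise_power                 # (Condition 2)
    if cond1:
        return "voiced"
    if cond2:
        return "unvoiced_consonant"
    return "silent_or_noise_only"
```

Condition 1 is tested first, so a frame that satisfies both conditions is treated as a voiced section.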
[0067]
Then, using the linear prediction coefficients obtained from the LPC analysis unit 16, the MA enhancement coefficient, and the AR enhancement coefficient, the MA coefficients and AR coefficients of the pole enhancement filter are calculated based on the following (Equation 8).
[0068]
[Equation 8]

[0069]
Then, the primary output signal obtained in the inverse Fourier transform unit 20 is subjected to a pole enhancement filter using the MA coefficient and the AR coefficient. The transfer function of this filter is shown in the following (Equation 9).
[0070]
[Equation 9]

[0071]
Further, in order to emphasize the high frequency component, a high frequency enhancement filter is applied using the high frequency enhancement coefficient. The transfer function of this filter is shown in the following (Equation 10).
[0072]
[Expression 10]

[0073]
The signal obtained by the above processing is called a secondary output signal. Note that the state of the filter is stored inside the spectrum enhancement unit 21.
[0074]
Finally, in the waveform matching unit 22, the secondary output signal obtained by the spectrum enhancement unit 21 and the signal stored in the previous waveform storage unit 28 are overlapped by a triangular window to obtain an output signal. Further, the data corresponding to the last read-ahead data length of this output signal is stored in the previous waveform storage unit 28. The matching method at this time is shown in the following (Equation 11).
[0075]
[Expression 11]

[0076]
It should be noted here that although data of length pre-read data length + frame length is output as the output signal, only the section of one frame length from the beginning of the data can be treated as a finished signal. This is because the subsequent data of pre-read data length is rewritten when the next output signal is output. However, since continuity is ensured over the entire section of the output signal, the whole section can be used for frequency analysis such as LPC analysis and filter analysis.
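The overlap with a triangular window can be illustrated by a linear cross-fade over the pre-read data length. (Equation 11) is not reproduced in this text, so the weights below are one common form of triangular overlap and may differ from the patent's formula; the function name is hypothetical, and the secondary output signal is assumed to have been trimmed to 240 samples (frame length 160 plus pre-read length 80).

```python
def match_waveform(secondary, prev_tail, frame_len=160, lookahead=80):
    """Cross-fade the stored tail into the new frame; return (output, new tail)."""
    out = list(secondary)
    for j in range(lookahead):
        w = j / lookahead                 # ramps from 0 towards 1
        out[j] = w * secondary[j] + (1.0 - w) * prev_tail[j]
    # The last `lookahead` samples are kept for matching with the next frame.
    new_tail = out[frame_len:frame_len + lookahead]
    return out, new_tail


out, tail = match_waveform([1.0] * 240, [0.0] * 80)
```

The first 160 samples of `out` are final, while the stored tail is provisional and will be blended with the start of the next frame.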
[0077]
[Effect of the Invention]
As described above, according to the present invention, the noise spectrum can be estimated both inside and outside the speech section, and the noise spectrum can be estimated even when it is not clear at what timing speech is present in the data. In addition, the features of the input spectral envelope can be enhanced using the linear prediction coefficients, and deterioration of sound quality can be prevented even when the noise level is high.
[0078]
In addition, the noise spectrum can be estimated from two directions, the average and the minimum, so that more accurate reduction processing can be performed.
[0079]
Further, the noise spectrum can be reduced more greatly by using the average spectrum of noise, and more accurate compensation can be performed by estimating the compensation spectrum separately.
[0080]
Further, the spectrum of noise-only sections that contain no speech can be smoothed, which prevents the abnormal-sound impression caused by extreme spectral fluctuations resulting from noise reduction in those sections.
[0081]
Further, randomness can be given to the phase of the compensated frequency components, so that residual noise that could not be removed is converted into noise that sounds less unnatural.
[0082]
Finally, in voiced sections, perceptually more appropriate weighting can be applied, and in silent sections and unvoiced consonant sections, the abnormal-sound impression caused by perceptual weighting can be suppressed.
[Brief description of the drawings]
FIG. 1 is a functional block diagram of a main part of a noise reduction apparatus according to an embodiment of the present invention.
FIG. 2 is a functional block diagram of a conventional noise reduction apparatus using spectral subtraction.
[Explanation of symbols]
11 Input signal
12 A / D converter
13 Noise reduction coefficient storage
14 Noise reduction coefficient adjustment unit
15 Input waveform setting section
16 LPC analyzer
17 Fourier transform unit
18 Noise reduction / spectrum compensation section
19 Spectral stabilization section
20 Inverse Fourier transform unit
21 Spectrum enhancement section
22 Waveform matching section
23 Output audio
24 Noise estimation unit
25 Noise spectrum storage
26 Front spectrum storage
27 Random number phase storage
28 Previous waveform storage

Claims

A noise estimation device for estimating the two types of noise spectrum in order to perform noise reduction using the two types of noise spectrum,
An A / D converter for converting an input audio signal into a digital signal;
A Fourier transform unit that obtains an input spectrum and a complex spectrum by performing a discrete Fourier transform on a digital signal having a predetermined time length obtained by the A / D conversion unit;
A noise spectrum storage unit that stores the two types of noise spectra of an average noise spectrum used for noise reduction processing and a compensation noise spectrum used to compensate for a frequency spectrum that has been excessively reduced in the noise reduction processing ;
A new compensation noise spectrum obtained by comparing the input spectrum obtained by the Fourier transform unit with the compensation noise spectrum stored in the noise spectrum storage unit, and the input spectrum obtained by the Fourier transform unit A new average noise spectrum obtained by using as a learning calculation formula is newly estimated as the two types of noise spectrum, and the new two types of noise spectrum obtained are stored in the noise spectrum storage unit. An estimation unit;
A noise estimation apparatus comprising:

The noise estimation apparatus according to claim 1, wherein the noise estimation unit determines in advance whether the current section is a noise section and, if it is determined to be noise, compares the input spectrum obtained by the Fourier transform unit with the compensation noise spectrum for each frequency and estimates the new compensation noise spectrum by taking the input spectrum as the new compensation noise spectrum at each frequency where the input spectrum is smaller than the compensation noise spectrum; separately estimates the new average noise spectrum by a learning calculation formula that adds in the input spectrum at a fixed rate; and stores the resulting new compensation noise spectrum and new average noise spectrum in the noise spectrum storage unit.

An A / D converter for converting an input audio signal into a digital signal;
A noise reduction coefficient adjustment unit for adjusting a coefficient for determining a reduction amount;
A Fourier transform unit that obtains an input spectrum and a complex spectrum by performing a discrete Fourier transform on a digital signal having a predetermined time length obtained by the A / D conversion unit;
A noise spectrum storage unit that stores two types of noise spectra, an average noise spectrum used for noise reduction processing and a compensation noise spectrum used to compensate for a frequency spectrum that has been excessively reduced in the noise reduction processing ;
A new compensation noise spectrum obtained by comparing the input spectrum obtained by the Fourier transform unit with the compensation noise spectrum stored in the noise spectrum storage unit, and the input spectrum obtained by the Fourier transform unit A new average noise spectrum obtained by using as a learning calculation formula is newly estimated as the two types of noise spectrum, and the new two types of noise spectrum obtained are stored in the noise spectrum storage unit. An estimation unit;
A noise reduction / spectrum compensation unit that subtracts, from the input spectrum obtained by the Fourier transform unit, the average noise spectrum stored in the noise spectrum storage unit, based on the coefficient obtained by the noise reduction coefficient adjustment unit, then examines the resulting spectrum and compensates the spectrum of any excessively reduced frequency with the compensation noise spectrum;
A noise reduction device comprising:

In addition,
A spectrum stabilization unit that smooths the spectrum of sections determined to be noise sections by the noise reduction / spectrum compensation unit, and that adjusts, among the phases of the complex spectrum obtained by the Fourier transform unit, the phase of each frequency compensated in the noise reduction / spectrum compensation unit;
The noise reduction device according to claim 3, further comprising:

The noise reduction device according to claim 4, wherein the noise reduction / spectrum compensation unit examines the full-band power of the spectrum that has undergone noise reduction and spectrum compensation and the power of a perceptually important partial band, and identifies whether the input signal is a silent interval, and
the spectrum stabilization unit performs smoothing processing and power reduction processing on the full-band power and the mid-band power when the input signal is identified as a silent interval.

The noise reduction device, wherein the spectrum stabilization unit performs phase rotation by random numbers on the complex spectrum obtained by the Fourier transform unit, based on information on whether or not spectrum compensation has been performed by the noise reduction / spectrum compensation unit.
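
The random phase rotation recited in this claim can be illustrated with a minimal Python sketch. The claim only says that the rotation is based on whether spectrum compensation was performed; the sketch assumes that compensated bins alone are rotated and that the angle is drawn uniformly from [-pi, pi]. The function name `randomize_phase`, the `compensated` flag list, and the `rng` parameter are hypothetical names introduced for illustration.

```python
import cmath
import math
import random


def randomize_phase(complex_spec, compensated, rng=None):
    """Rotate the phase of compensated bins by a random angle.

    complex_spec: list of complex DFT bins.
    compensated:  list of bools, True where spectrum compensation was applied.
    Assumption: only compensated bins are rotated, with a uniform random angle.
    """
    if rng is None:
        rng = random.Random()
    rotated = []
    for value, was_compensated in zip(complex_spec, compensated):
        if was_compensated:
            theta = rng.uniform(-math.pi, math.pi)
            # Multiplying by exp(j*theta) changes the phase, not the magnitude.
            rotated.append(value * cmath.exp(1j * theta))
        else:
            rotated.append(value)
    return rotated
```

Because the rotation is a multiplication by a unit-magnitude complex number, the amplitude of each bin is left unchanged and only its phase is randomized.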

The noise reduction device according to claim 3, wherein the noise reduction / spectrum compensation unit multiplies the average noise spectrum stored in the noise spectrum storage unit by the noise reduction coefficient obtained by the noise reduction coefficient adjustment unit, subtracts the result from the input spectrum obtained by the Fourier transform unit, and, for frequencies at which the resulting spectrum value is negative, compensates with the compensation noise spectrum stored in the noise spectrum storage unit.
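
The subtraction and compensation recited in this claim can be sketched per frequency bin as follows. This is an illustrative Python sketch, not the patented implementation: the function name `reduce_and_compensate` and its parameter names are hypothetical, the spectra are assumed to be plain lists of amplitude values of equal length, and "compensates with the compensation noise spectrum" is read here as substituting the stored compensation value for that bin.

```python
def reduce_and_compensate(input_spec, avg_noise, comp_noise, coeff):
    """Subtract scaled average noise; fill negative bins from comp_noise.

    Returns (output spectrum, per-bin flags marking compensated bins).
    Assumption: a compensated bin takes the stored compensation value as is.
    """
    output = []
    compensated = []
    for x, n_avg, n_comp in zip(input_spec, avg_noise, comp_noise):
        reduced = x - coeff * n_avg
        if reduced < 0.0:
            # Over-subtracted bin: replace with the compensation noise value.
            output.append(n_comp)
            compensated.append(True)
        else:
            output.append(reduced)
            compensated.append(False)
    return output, compensated
```

The returned flags record which bins were compensated, which is the kind of information a later stage could use when deciding which phases to adjust.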

In addition,
An LPC analysis unit that performs linear prediction analysis on the digital signal having a predetermined time length obtained by the A / D conversion unit;
An inverse Fourier transform unit that performs an inverse discrete Fourier transform on the input spectrum and the complex spectrum subjected to noise reduction processing;
A spectrum enhancement unit that performs spectrum enhancement using the parameters obtained by the LPC analysis unit on the signal obtained by the inverse Fourier transform unit;
The noise reduction device according to claim 3, further comprising:

In addition,
A spectrum stabilization unit that smooths the spectrum of a section determined to be a noise interval by the noise reduction / spectrum compensation unit and that adjusts, among the phases of the complex spectrum obtained by the Fourier transform unit, the phase of each frequency compensated by the noise reduction / spectrum compensation unit;
An inverse Fourier transform unit that performs an inverse Fourier transform based on the spectrum smoothed in the spectrum stabilization unit and the adjusted phase spectrum;
A spectrum enhancement unit for performing spectrum enhancement on the signal obtained by the inverse Fourier transform unit;
A waveform matching unit that matches the signal obtained by the spectrum enhancement unit with the signal of the previous frame;
The noise reduction device according to claim 3, further comprising:

A noise estimation method for estimating two types of noise spectra in order to perform noise reduction using the two types of noise spectra,
An A / D conversion step for converting an input audio signal into a digital signal;
A Fourier transform step of performing a discrete Fourier transform on the digital signal having a fixed time length obtained by the A / D conversion step to obtain an input spectrum and a complex spectrum;
A noise spectrum storage step of storing the two types of noise spectra of the average noise spectrum used for noise reduction processing and the compensation noise spectrum used for compensating the frequency spectrum excessively reduced in the noise reduction processing in a noise spectrum storage unit;
A noise estimation step of newly estimating the two types of noise spectra, namely a new compensation noise spectrum obtained by comparing the input spectrum obtained by the Fourier transform step with the compensation noise spectrum stored in the noise spectrum storage unit, and a new average noise spectrum obtained by applying the input spectrum obtained by the Fourier transform step to a learning formula, and storing the two newly obtained noise spectra in the noise spectrum storage unit;
A noise estimation method comprising:
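
One possible reading of the noise estimation step is sketched below in Python. The claim does not state the comparison rule or the learning formula, so both are assumptions here: the compensation noise spectrum is updated by keeping the smaller of the input and stored values in each bin, and the average noise spectrum is updated by an exponential moving average with a hypothetical learning coefficient `g`. The function name `update_noise_spectra` is likewise hypothetical.

```python
def update_noise_spectra(input_spec, avg_noise, comp_noise, g=0.9):
    """Return (new average noise spectrum, new compensation noise spectrum).

    Assumptions, not taken from the claim text:
      - compensation spectrum: per-bin minimum of input and stored value;
      - average spectrum: exponential moving average with coefficient g.
    """
    new_comp = [min(x, c) for x, c in zip(input_spec, comp_noise)]
    new_avg = [g * a + (1.0 - g) * x for x, a in zip(input_spec, avg_noise)]
    return new_avg, new_comp
```

With `g` close to 1 the average spectrum follows the input slowly, which makes it less sensitive to short speech bursts; a smaller `g` adapts faster to changing background noise.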

An A / D conversion step for converting an input audio signal into a digital signal;
A noise reduction coefficient adjustment step for adjusting a coefficient for determining a reduction amount;
A Fourier transform step of performing a discrete Fourier transform on the digital signal having a fixed time length obtained by the A / D conversion step to obtain an input spectrum and a complex spectrum;
A noise spectrum storage step of storing two types of noise spectra, an average noise spectrum used for noise reduction processing and a compensation noise spectrum used to compensate for a frequency spectrum excessively reduced in the noise reduction processing;
A noise estimation step of newly estimating the two types of noise spectra, namely a new compensation noise spectrum obtained by comparing the input spectrum obtained by the Fourier transform step with the compensation noise spectrum stored in the noise spectrum storage unit, and a new average noise spectrum obtained by applying the input spectrum obtained by the Fourier transform step to a learning formula, and storing the two newly obtained noise spectra in the noise spectrum storage unit;
A noise reduction / spectrum compensation step of subtracting the average noise spectrum stored in the noise spectrum storage unit from the input spectrum obtained by the Fourier transform step, based on the coefficient obtained in the noise reduction coefficient adjustment step, then examining the resulting spectrum and compensating the spectrum at excessively reduced frequencies with the compensation noise spectrum;
A noise reduction method characterized by comprising:

In addition,
A spectrum stabilization step of smoothing the spectrum of a section determined to be a noise interval in the noise reduction / spectrum compensation step and adjusting, among the phases of the complex spectrum obtained by the Fourier transform step, the phase of each frequency compensated in the noise reduction / spectrum compensation step;
The noise reduction method according to claim 11, further comprising:

In addition,
An LPC analysis step for performing a linear prediction analysis on the digital signal having the predetermined time length obtained by the A / D conversion step;
An inverse Fourier transform step for performing an inverse discrete Fourier transform on the input spectrum and the complex spectrum subjected to noise reduction processing;
A spectral enhancement step of performing spectral enhancement on the signal obtained by the inverse Fourier transform step using the parameters obtained in the LPC analysis step;
The noise reduction method according to claim 11, further comprising: