JP4162860B2 - Unnecessary sound signal removal device

Info

Publication number: JP4162860B2
Application number: JP2001004845A
Authority: JP
Inventors: 孝一中田; 真吾木内; 利明浅野; 徹丸本; 望斉藤
Original assignee: Alpine Electronics Inc
Current assignee: Alpine Electronics Inc
Priority date: 2001-01-12
Filing date: 2001-01-12
Publication date: 2008-10-08
Anticipated expiration: 2021-01-12
Also published as: JP2002207500A

Abstract

PROBLEM TO BE SOLVED: To cancel the audio signal immediately and carry out speech recognition even right after the state changes from hands-free calling to audio playback. SOLUTION: A microphone 71 detects the sound emitted into the vehicle interior by the loudspeakers 52 and by the talker. In a first state (audio state), an unwanted sound signal eliminating section 73 generates, by adaptive signal processing, a signal corresponding to the reproduced audio sound, removes it from the detected signal, and feeds the result to a speech recognizing device 74. In a second state (calling state), the section 73 generates, by adaptive signal processing, a signal corresponding to the sound output from the received-speech loudspeaker 52FR, removes it from the detected signal, and feeds the result to a transmitter 75. When the state changes from the first state to the second state, the adaptive filter parameters updated by the adaptive signal processing in the first state are stored in a parameter storage section 83; when the state changes from the second state back to the first state, the stored parameters are set in the adaptive filter and adaptive signal processing resumes from them.
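The save-and-restore behaviour summarized in the abstract can be sketched roughly as follows. This is only an illustrative sketch: the class name `SharedCanceller`, the state labels, and the choice to start both stored coefficient sets at zero are assumptions made for the example and do not come from the patent text.

```python
class SharedCanceller:
    """Hypothetical sketch: one adaptive filter shared by two states,
    with a coefficient store standing in for storage section 83."""

    AUDIO = "audio"   # first state (audio state)
    CALL = "call"     # second state (calling state)

    def __init__(self, n_taps):
        self.state = self.AUDIO
        self.w = [0.0] * n_taps  # live adaptive filter coefficients
        # Assumption: both stored sets start at zero before first use.
        self.stored = {self.AUDIO: [0.0] * n_taps,
                       self.CALL: [0.0] * n_taps}

    def switch_to(self, new_state):
        """Save the outgoing state's coefficients, restore the incoming ones."""
        if new_state == self.state:
            return
        self.stored[self.state] = list(self.w)
        self.w = list(self.stored[new_state])
        self.state = new_state


canceller = SharedCanceller(n_taps=2)
canceller.w = [0.4, 0.1]                   # pretend adaptation ran in the audio state
canceller.switch_to(SharedCanceller.CALL)  # call starts: audio coefficients are saved
canceller.w = [0.7, 0.2]                   # pretend adaptation ran during the call
canceller.switch_to(SharedCanceller.AUDIO)
print(canceller.w)                         # audio-state coefficients are back
```

Because adaptation restarts from the last coefficients learned in the same state rather than from zero, the filter should begin close to the current transfer characteristic and need only a short re-convergence.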

Description

【０００１】
【発明の属する技術分野】
本発明は不要音信号除去装置に係わり、特に、スピーカ及び話者から車室内に出力する音を検出し、検出信号よりスピーカから出力する音に応じた不要音信号を除去する不要音信号除去装置に関する。
【０００２】
【従来の技術】
近年ナビゲーションシステムなど、車載機器のヒューマンインタフェース（入出力手段）として音声認識装置が一般化してきている。しかし、音声認識装置は、認識の対象とする音声以外の外乱（車両走行ノイズやオーディオ再生音）の影響で認識性能が低下するという問題があり、さまざまな騒音対策が実現されている。その一例としてマイク受信信号から車室内のオーディオ再生音を除去（抑圧）するオーディオ音キャンセル装置が提案されている。
また、近年移動電話機の爆発的な普及により、運転しながらの通話による事故が増大し、安全上の観点から、走行中での携帯電話等による通話は法律によって禁止されている。このため、送受話器を外したりしなくても通話ができるハンズフリーテレフォン装置（ＨＦＴ）が普及してきている。ＨＦＴでは、音声をオーディオシステムのスピーカより出力するが、このスピーカから再生出力された通話相手音声がマイクにより検出されて通信相手に送信されるという所謂エコー現象が発生する。このため、かかるエコーを除去するためのエコーキャンセラが実用化されている。
【０００３】
・オーディオ音キャンセル装置
図８は従来のオーディオ音キャンセル装置の構成図であり、１１はオーディオソース、１２はオーディオ信号を入力され、オーディオ音を音響空間に放射するスピーカ、１３は音響空間に放射されたオーディオ音及び話者からの話者音声を検出するマイクロホン、１４はエラー信号ｅのパワーが最小となるように適応信号処理を行い、適応フィルタの係数Ｗを更新する適応信号処理部、１５はマイクロホンによる検出信号と適応フィルタ出力の差を演算してエラー信号ｅを出力する演算部である。１６はオーディオ音がキャンセルされた話者音声より音声認識する音声認識装置である。
適応信号処理部１４は、オーディオ信号ｘ(n)を参照信号として入力されると共に、前記演算部１５から出力されるエラ−信号ｅ(n)を入力され、該エラ−信号のパワーが最小となるように適応信号処理を行って信号ｙ(n)を出力する。適応信号処理部１４は適応信号演算部１４ａと、FIR型のデジタルフィルタ構成の適応フィルタ１４ｂを有している。
【０００４】
適応信号演算部１４ａは聴取位置におけるエラー信号ｅ(n)と参照信号としてのオーディオ信号ｘ(n)が入力され、これらの信号を用いて聴取位置におけるオーディオ信号がキャンセルされるように適応信号演算を行って適応フィルタ１４ｂの係数を決定する。例えば、適応信号演算部１４ａは周知のLMS(Least Mean Square)適応アルゴリズムに従って、エラー信号ｅ(n)のパワーが最小となるように適応フィルタ１４ｂの係数を決定する。適応フィルタ１４ｂは適応信号演算部１４ａにより決定された係数に従ってオーディオ信号ｘ(n)にデジタルフィルタ処理を施して信号ｙ(n)を出力する。従って、適応信号処理によりエラー信号ｅ(n)のパワーが最小となるように適応フィルタ１４ｂの係数が収束すれば、適応フィルタはスピーカからマイクロホンまでの伝達特性を模擬することになりその出力信号ｙ(n)はマイクロホン１３から出力するオーディオ信号成分と等しくなり、演算部１５から話者音声のみが音声認識装置１６に入力される。
【０００５】
適応フィルタ１４ｂは図９に示すように、ＮタップのFIR型デジタルフィルタで構成され、例えば、入力信号を順次１サンプリング時間遅延する(N-1)個の遅延要素DL₁，DL₂・・・DL_N-1と、各遅延要素出力に係数ｗ₀(n)，ｗ₁(n)，ｗ₂(n)・・・ｗ_N-1(n)を乗算するN個の乗算部ML₀，ML₁，・・・ML_N-1と、各乗算部出力を順次加算する加算部AD₀，AD₁・・・AD_N-1で実現される。すなわち、現時刻ｎ・Ｔsにおける参照信号をｘ(n)、その時の各乗算器の係数をｗ₀(n)，ｗ₁(n)，ｗ₂(n)・・・ｗ_N-1(n)、出力信号をｙ(n)とすれば、適応フィルタ１４ｂは次式ｙ(n)＝Σ_iｗi(n)・ｘ(n-i) (i=0〜N-1) ・・・(1)
の演算を実行して信号ｙ(n)を出力する。ただし、(n)は現サンプリング時刻の値、(n-1)は１サンプリング時刻前の値、(n-2)は２サンプリング時刻前の値、・・・である。
【０００６】
適応信号演算部１４ａは、現時刻から１サンプリング時刻Ｔs後の次の時刻 (ｎ+1)・Ｔsにおける適応フィルタ１４ｂの係数ｗ₀(n+1)，ｗ₁(n+1)，ｗ₂(n+1)・・・ｗ_N-1(n+1)を、現時刻ｎ・Ｔｓにおける係数ｗ₀(n)，ｗ₁(n)，ｗ₂(n)・・・ｗ_N-1(n)とエラー信号ｅ（n）と入力信号ｘ(n)を用いて次の係数更新式
ｗ_j(n+1)＝ｗ_j(n)＋α・ｅ(n)・ｘ(n-j) (j=0〜N-1) ・・・(2)
により決定する。ただし、(n+1)は１サンプリング時刻後の値、αは適応フィルタ係数を更新するステップを決める定数(ステップサイズパラメータ)であり、１以下の適当な値に設定される。LMS適応アルゴリズムによる処理においては、上記演算を１サンプリング時間内に行って、信号ｙ(n)を出力する。
【０００７】
以上のように、適応信号処理部１４は、オーディオ信号ｘ(n)を参照信号、マイクロホン１３から出力するオーディオ再生音に応じた信号を目標信号とし、エラー信号ｅのパワーが最小となるように適応フィルタ１４ｂの係数ｗ₀(n)，ｗ₁(n)，ｗ₂(n)・・・ｗ_N-1(n)を更新する。すなわち、適応信号処理により、適応フィルタ１４ｂはスピーカ１２からマイクロホン１３までの伝達特性Ｃを模擬するようになり、演算部１５の出力からオーディオ再生音に応じた信号が除去される。従って、話者が音声でナビゲーションシステム等に動作を指示する際、話者音声のみが音声認識装置１６に入力し、音声認識装置１６は正しく音声認識することができる。
【０００８】
・４スピーカのオーディオ音キャンセル装置
以上はキャンセルすべきオーディオ信号が１つのスピーカより出力する場合であるが、車室内では通常はステレオ再生であるため左右２個以上のスピーカよりオーディオ信号が出力する。図１０はスピーカが４つの場合において各スピーカから出力するオーディオ信号をキャンセルするオーディオ音キャンセル装置の構成図であり、図８と同一部分には同一符号を付している。
図１０において、２１〜２２は右側前後のスピーカであり、オーディオソースよりＲチャンネルのオーディオ信号ｘ₁(n)が入力される。２３〜２４は左側前後のスピーカであり、オーディオソースよりＬチャンネルのオーディオ信号ｘ₂(n)が入力される。２５、２６はそれぞれＲチャンネルオーディオ信号、Ｌチャンネルオーディオ信号をキャンセルするための適応信号処理部、２７は適応信号処理部２５，２６から出力する信号ｙ₁(n),ｙ₂(n)を加算する加算器である。
【０００９】
適応信号処理部２５は図８の適応信号処理部１４と同一の適応信号処理を行なう。すなわち、適応信号処理部２５はＲチャンネルオーディオ信号ｘ₁(n)を参照信号、マイクロホン１３により検出されるＲチャンネル信号成分を目標信号とし、エラー信号ｅ(n)のパワーが最小となるように適応フィルタ（図示せず）の係数ｗ₁₀(n)，ｗ₁₁(n)，ｗ₁₂(n)・・・ｗ_1N-1(n)を更新する。この係数の更新を繰り返すことにより、適応フィルタはスピーカ２１，２２からマイクロホン１３までの伝達特性Ｃ_Rを模擬するようになり、エラー信号ｅ(n)よりＲチャンネルオーディオ音に応じた信号がキャンセルされる。
同様に、適応信号処理部２６は図８の適応信号処理部１４と同一の適応信号処理を行なう。すなわち、適応信号処理部２６はＬチャンネルオーディオ信号ｘ₂(n)を参照信号、マイクロホン１３により検出されるＬチャンネル信号成分を目標信号とし、エラー信号ｅ(n)のパワーが最小となるように適応フィルタ（図示せず）の係数ｗ₂₀(n)，ｗ₂₁(n)，ｗ₂₂(n)・・・ｗ_2N-1(n)を更新する。この係数の更新を繰り返すことにより、適応フィルタはスピーカ２３，２４からマイクロホン１３までの伝達特性Ｃ_Lを模擬するようになり、エラー信号ｅ(n)よりＬチャンネルオーディオ音に応じた信号がキャンセルされる。
【００１０】
図１０ではＲチャンネルフロントスピーカ、Ｒチャンネルリアスピーカ共通に１つの適応信号処理部を設けた場合であるが、Ｒチャンネルフロントスピーカより出力するＲチャンネルオーディオ信号、Ｒチャンネルリアスピーカより出力するＲチャンネルオーディオ信号をそれぞれキャンセルするために別個に適応信号処理部を設けることもできる。同様に、Ｌチャンネルフロントスピーカより出力するＬチャンネルオーディオ信号、Ｌチャンネルリアスピーカより出力するＬチャンネルオーディオ信号をそれぞれキャンセルするために別個に適応信号処理部を設けることもできる。かかる場合には４つの適応信号処理部の各適応フィルタはスピーカ２１〜２４からマイクロホン１３までの伝達特性Ｃ_FR，Ｃ_RR，Ｃ_FL，Ｃ_RLをそれぞれ模擬するようになり、エラー信号ｅ(n)からオーディオ音が正しくキャンセルされる。
【００１１】
・エコーキャンセラ
図１１はハンズフリーテレフォン装置においてエコーをキャンセルするエコーキャンセラの構成図であり、３１はハンズフリーテレフォン装置（ＨＦＴ）であり、受話音声出力部３１ａ、話者音声送信部３１ｂを備えている。３２は受話音声出力部より受話音声信号を入力され、受話音声を音響空間に放射するスピーカ(受話音声用スピーカ)、３３は音響空間に放射された受話音声及び話者（ドライバ）の通話音声を検出するマイクロホン、３４はエラー信号ｅ(n)のパワーが最小となるように適応信号処理を行い、適応フィルタ（図示せず）の係数Ｗを更新する適応信号処理部、３５はマイクロホンによる検出信号と適応フィルタ出力の差を演算してエラー信号ｅ(n)を出力する演算部である。適応信号処理部３４は、図８のオーディオ音キャンセルの場合と同様に適応信号処理を行ってエラー信号ｅから受話音声を除去する。すなわち、適応信号処理部３４は、受話音声出力部から入力する受話音声信号ｘ′(n)を参照信号、マイクロホン３３により検出された受話音声信号を目標信号とし、エラー信号ｅ(n)のパワーが最小となるように適応フィルタ（図示せず）の係数ｗ₀(n)，ｗ₁(n)，ｗ₂(n)・・・ｗ_N-1(n)を更新する。この係数の更新を繰り返すことにより、適応フィルタはスピーカ３２からマイクロホン３３までの伝達特性Ｃ′を模擬するようになり、エラー信号ｅ(n)より受話音声に応じた信号がキャンセルされる。従って、ドライバが通話する際、ドライバの音声のみがハンズフリーテレフォン装置３１の話者音声送信部３１ｂに入力し、エコーをキャンセルすることができる。
【００１２】
【発明が解決しようとする課題】
音声認識のためのオーディオキャンセル装置とＨＦＴ用のエコーキャンセラの仕組みは原理的に同じであるが、これらを統合したシステムが実現されていないのが現状である。このため、現状では、オーディオキャンセル装置、エコーキャンセラをそれぞれ別個に設けており、システムが大規模化すると共にコストアップの原因になり好ましくなかった。以下に統合できない理由を説明する。
【００１３】
オーディオキャンセル装置は、音声認識のために必要なものであり、音声認識処理を行っていないときは必ずしも必要ではない。しかし、オーディオキャンセル装置でオーディオ音をキャンセルするには適応信号処理により適応フィルタの係数を更新し、適応フィルタ特性を時々刻々と変動するマイクとスピーカ間の音響的な伝達特性に近似あるいは一致させる必要がある（伝達特性の同定）。適応信号処理において、適応フィルタ係数が収束して上記伝達特性と同等の特性を示すようになるまでには時間を要する。すなわち、オーディオキャンセル装置を起動してからオーディオ音をキャンセルして正しい音声認識が可能となるまでに相当の時間を要し、その間音声認識処理を停止する必要があり、音声入力ができない問題が生じる。そこで、音声認識処理を行っていないときでもオーディオキャンセル装置に適応信号処理動作を継続させ、これにより適応フィルタ特性をマイクとスピーカ間の音響的な伝達特性と同等の特性にしておき、音声認識が必要になったとき、直ちに音声認識処理を行えるようにしている。
【００１４】
オーディオキャンセル装置とエコーキャンセラを統合して適応信号処理を共通化すると、統合システムをエコーキャンセラとして使用する場合があり、システムを常時、オーディオキャンセル装置として動作させることができなくなる。オーディオキャンセルに求められるフィルタ特性（フィルタ係数）とエコーキャンセルに求められるフィルタ特性は同一であれば問題はないが、異なる。すなわち、ＨＦＴにおける相手受話音声は、通常運転席近傍のスピーカのみから出力されるが、オーディオ音は２つ以上（通常は４つ）のスピーカから出力され、オーディオキャンセルに求められるフィルタ特性とエコーキャンセルに求められるフィルタ特性は異なる。以上より、統合システムにおいて、システムをエコーキャンセラからオーディオキャンセラとして動作させる際、フィルタ特性の収束に時間を要し、直ちに音声認識処理が行えない問題が発生する。又、逆に、統合システムをオーディオキャンセラからエコーキャンセラとして動作させる際にも、フィルタ特性の収束に時間を要し、通話開始時に十分にエコーキャンセルできない問題が発生する。以上が統合できない理由である
【００１５】
以上より、本発明の目的はオーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合してもオーディオ音を直ちにキャンセルして音声認識を行え、又、受話音声を直ちにキャンセルして通話開始時にエコーキャンセルを行えるようにすることである。
本発明の別の目的は、ハンズフリー通話中、全スピーカからのオーディオ再生音の出力を停止し、運転席近傍のスピーカからのみ相手受話音声を出力する場合であっても、オーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合できるようにすることである。
本発明の別の目的は、ハンズフリー通話中、助手席側のスピーカからオーディオ再生音を出力する場合であっても、オーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合できるようにすることである。
本発明の別の目的は、ハンズフリー通話中、助手席側のスピーカからオーディオ再生音を出力し、オーディオ再生音がマイクで受信されてしまう場合であっても、通話音声のＳＮ比を低下させることなく良好なエコーキャンセルができるようにすることである。
【００１６】
【課題を解決するための手段】
本発明は、スピーカ及び話者から出力する音を検出し、検出信号よりスピーカから出力する音に応じた信号を除去する不要音信号除去装置であり、(1) スピーカ及び話者から出力された音を検出して検出信号を出力する音検出部、(2) 第１の状態において第１のスピーカ群より出力された音に応じた第１の信号を適応信号処理により発生し、該第１の信号を前記検出信号より除去し、又、第２の状態において少なくとも１個の第２のスピーカ群より出力された音に応じた第２の信号を適応信号処理により発生し、該第２の信号を前記検出信号より除去する不要音信号除去部、(3) 第１の状態から第２の状態に切り替わる際、第１の状態における適応信号処理により更新されている第１のパラメータ(適応フィルタ係数)を保存する保存部を備え、(4) 不要音信号除去部は、第２の状態から第１の状態に切り替わる際、前記保存してある第１のパラメータを用いて適応信号処理を行なう。又、不要音信号除去部は、第２の状態から第１の状態に切り替わる際、第２の状態における適応信号処理により更新されている第２のパラメータを前記保存部に保存し、第１の状態から第２の状態に切り替わる際、保存してある第２のパラメータを用いて適応信号処理を行なう。
【００１７】
具体的には、第２の状態はハンズフリー電話機による通話状態であり、第１の状態は第２状態以外の状態（オーディオ状態）である。第１状態における音声認識時に前記検出信号より第１の信号を除去した信号を音声認識装置に入力し、第２状態のハンズフリー通話時に前記検出信号より第２の信号を除去した信号をハンズフリー電話機に入力する。
以上のようにすれば、オーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合することができ、しかも、状態変更時であっても、オーディオ音を直ちにキャンセルして音声認識を行え、あるいは、受話音声を直ちにキャンセルして通話開始時にエコーキャンセルを行うことができる。
又、ハンズフリー電話機による通話状態（第２状態）において、助手席側のスピーカからオーディオ再生音を出力する場合であっても、オーディオ信号及び受話音声信号を検出信号より直ちにキャンセルでき、通話音声のＳＮ比を低下させることなく良好なエコーキャンセルができる。
【００１８】
【発明の実施の形態】
（Ａ）第１実施例
（ａ）第１、第２状態
図１は第１実施例の状態説明図であり、自動車５１の車室内には４つのスピーカ５２_FR，５２_FL，５２_RR，５２_RLが設けられている。スピーカ５２_FR，５２_RRはＲチャンネルオーディオ信号が入力される車室内右側の前後のスピーカであり、スピーカ５２_FRは運転席５３の近傍に設けられている。この運転席近傍に設けられたスピーカ５２_FRにはハンズフリー電話機による通話時、受話音声信号が入力される。スピーカ５２_FL，５２_RLはＬチャンネルオーディオ信号が入力される車室内左側の前後のスピーカであり、スピーカ５２_FLは助手席５４の近傍に設けられている。５５は後部座席である。
【００１９】
本発明では、ハンズフリー電話機による通話状態以外の状態を第１状態（オーディオ状態）、ハンズフリー電話機による通話状態を第２状態と定義する。ドライバは、ハンズフリー電話機で電話中にナビゲーション装置などに音声で動作を指示することはありえない。すなわち、音声認識は第１状態において必要になり、第２状態においては必要でない。
第１状態において、図１（Ａ）に示すように左側スピーカ５２_FL，５２_RLにＬチャンネルオーディオ信号が入力され、これらスピーカよりＬチャンネルオーディオ再生音が車室内に放射される。又、右側スピーカ５２_FR，５２_RRにはＲチャンネルオーディオ信号が入力され、これらスピーカよりＲチャンネルオーディオ再生音が車室内に放射される。この第１状態において音声認識モードになれば、不要音信号除去部（後述）は車室内に設けた音声検出部により検出された信号よりオーディオ再生音に応じた信号を除去して音声認識装置に入力する。
第２状態においては、図１（Ｂ）に示すようにいずれのスピーカにもオーディオ信号を入力せず、運転席近傍のスピーカ５２_FRのみにハンズフリー電話機から受話音声信号を入力し、該スピーカより受話音声を車室内に放射する。
【００２０】
（ｂ）第１実施例の不要音信号除去装置
図２は第１実施例の不要音信号除去装置の構成図であり、第１状態の音声認識時、車室内に設けた音声検出部の検出信号よりオーディオ再生音に応じた信号を除去して音声認識装置に入力し、第２状態のハンズフリー通話時、検出信号より受話音声に応じた信号を除去してハンズフリー電話機に入力する。
オーディオソース６１はＬチャンネルオーディオ信号AS_L、Ｒチャンネルオーディオ信号AS_Rを出力し、ハンズフリー電話機６２は通話時に通信相手からの音声信号（受話音声信号）RSを出力する。又、ハンズフリー電話機６２は発信時及び着信時に通話状態になったとき通話信号ＳＰを制御部６３に入力すると共に、終話状態になったとき終話信号SEを制御部６３に入力する。制御部６３は、通話信号SPと終話信号SEに基づいて、現状態が第１状態（通話状態以外の状態）であるか、第２状態（通話状態）であるか監視すると共に、音声認識開始／終了信号SRSEに基づいて現状態が音声認識状態であるか否かを監視する。又、制御部６３は現在の状態に基づいてスイッチ６４〜６６及び切替器６７を制御する。
【００２１】
スイッチ（ＳＷ１〜ＳＷ３）６４〜６６は第１状態（オーディオ状態）において図１（Ｃ）に示すように動作し、オーディオ信号を各スピーカに入力する。すなわち、第１状態においてスイッチ（ＳＷ１，ＳＷ３）６４、６６はオンし、スイッチ（ＳＷ２）６５は端子Ａ側に切り替わる。この結果、第１状態において、Ｌチャンネルオーディオ信号が可変利得アンプ６８_FL，６８_RLを介して左側スピーカ５２_FL，５２_RLに入力し、Ｒチャンネルオーディオ信号が可変利得アンプ６８_FR，６８_RRを介して右側スピーカ５２_FR，５２_RRに入力する。又、第２状態においてスイッチ６４、６６はオフし、スイッチ６５は端子Ｂ側に切り替わる。この結果、第２状態（ＨＦＴ通話状態）において、全スピーカ５２_FL，５２_RL，５２_FR，５２_RRにオーディオ信号は入力せず、運転席近傍のスピーカ５２_FRのみにハンズフリー電話機６２から受話音声信号RSが可変利得アンプ６９，６８_FRを介して入力する。従って、第１状態（オーディオ状態）において、左側スピーカ５２_FL，５２_RLからＬチャンネルオーディオ再生音が、右側スピーカ５２_FR，５２_RRからＲチャンネルオーディオ再生音が車室内に放射される。又、第２状態（通話状態）において運転席近傍のスピーカ（受話音声用スピーカ）５２_FRから受話音声が車室内に放射される。
【００２２】
ドライバの口元近傍に設けられたマイクロホン７１は車室内の音、すなわち、各スピーカから放射される音声及びドライバが発する音声を検出して音声検出信号ｄ(n)を出力し、遅延部７２は後述する適応信号処理に要する時間、例えばフィルタ９１のタップ数の半分程度に相当するサンプリング時間分音声検出信号ｄ(n)を遅延する。
不要音信号除去部７３は第１の状態（オーディオ状態）において車室内スピーカ５２_FL，５２_RL，５２_FR，５２_RRより車室内に放射されたオーディオ再生音に応じた信号（推定オーディオ信号という）ｙ(n)を適応信号処理により発生し、該推定オーディオ信号ｙ(n)を検出信号ｄ(n)より除去する。又、不要音信号除去部７３は、第２の状態（通話状態）において受話音声用スピーカ５２_FRより車室内に放射された受話音声に応じた信号（推定受話音声信号という）ｙ(n)を適応信号処理により発生し、該推定受話音声信号を検出信号ｄ(n)より除去する。
切替器６７は、制御部６３からの制御信号に基づいて不要音信号除去部７３から出力する信号を、第１状態の音声認識時に音声認識装置７４に入力し、第２状態の通話状態時にハンズフリー電話機の送話部７５に入力する。
【００２３】
（ｃ）不要音信号除去部
不要音信号除去部７３において、誤差信号発生部８１は、第１状態の適応信号処理により発生する推定オーディオ信号ｙ(n)と検出信号ｄ(n)との誤差信号ｅ(n)を発生し、第２状態の適応信号処理により発生する推定受話音声信号ｙ(n)と検出信号ｄ(n)との誤差信号ｅ(n)を発生する。
適応信号処理部８２は第１状態の適応信号処理によりオーディオ信号を推定して推定オーディオ信号ｙ(n)を出力し、第２状態の適応信号処理により受話音声信号を推定して推定受話音声信号ｙ(n)を出力する。
フィルタ係数保存部８３は、第１状態から第２状態に切り替わる際、第１状態の適応信号処理により更新されている適応フィルタ係数を保存すると共に、第２状態から第１状態に切り替わる際、第２状態の適応信号処理により更新されている適応フィルタ係数を保存する。
【００２４】
（ｄ）適応信号処理部
適応信号処理部８２は、第１、第２の適応信号処理部９１，９２と加算器９３で構成されている。加算器９３は、第１状態において第１、第２の適応信号処理部９１，９２から出力する信号を加算して推定オーディオ信号ｙ(n)を出力し、第２状態において第１、第２の適応信号処理部９１，９２から出力する信号を加算して推定受話音声信号を出力する。
第１の適応信号処理部９１は図３（Ａ）に示すように適応信号演算部９１ａと適応フィルタ９１ｂを備え、第２の適応信号処理部９２は図３（Ｂ）に示すように適応信号演算部９２ａと適応フィルタ９２ｂを備えており、共に図８で説明した適応信号処理を実行する。
【００２５】
すなわち、第１状態（オーディオ状態）において、適応信号演算部９１ａは、Ｒチャンネルオーディオ信号AS_R(=ｘ₁(n))と誤差信号ｅ(n)を用いて適応信号演算を行って適応フィルタ９１ｂの係数値を更新し、適応フィルタ９１ｂはＲチャンネルの推定オーディオ信号ｙ₁(n)を出力する。又、適応信号演算部９２ａは、Ｌチャンネルオーディオ信号AS_L(=ｘ₂(n))と誤差信号ｅ(n)を用いて適応信号演算を行って適応フィルタ９２ｂの係数値を更新し、適応フィルタ９２ｂはＬチャンネルの推定オーディオ信号ｙ₂(n)を出力する。加算器９３は、第１状態において第１、第２の適応信号処理部９１，９２から出力する信号ｙ₁(n)，ｙ₂(n)を加算して推定オーディオ信号ｙ(n)を出力する。
【００２６】
又、第２状態（通話状態）において、適応信号演算部９１ａは、受話音声信号RS(=ｘ₁(n))と誤差信号ｅ(n)を用いて適応信号演算を行って適応フィルタ９１ｂの係数値を更新し、適応フィルタ９１ｂは推定受話音声信号ｙ₁(n)を出力する。又、適応信号演算部９２ａは、Ｌチャンネルオーディオ信号AS_L(=ｘ₂(n)＝０)と誤差信号ｅ(n)を用いて適応信号演算を行って適応フィルタ９２ｂの係数値を更新し、適応フィルタ９２ｂはＬチャンネルの推定オーディオ信号ｙ₂(n)＝０を出力する。加算器９３は、第２状態において第１、第２の適応信号処理部９１，９２から出力する信号ｙ₁(n)，ｙ₂(n)を加算して推定受話音声信号ｙ(n)を出力する。
【００２７】
（ｅ）全体の動作
第１状態（オーディオ状態）において、Ｒチャンネルオーディオ信号AS_Rは右側スピーカ５２_FR，５２_RRに入力し、これらスピーカよりＲチャンネルオーディオ再生音が車室内に放射される。又、Ｌチャンネルオーディオ信号AS_Lは左側スピーカ５２_FL，５２_RLに入力し、これらスピーカよりＬチャンネルオーディオ再生音が車室内に放射される。ドライバの口元近傍に設けられたマイクロホン７１は車室内の音、すなわち、各スピーカから放射されるオーディオ再生音を検出して検出信号ｄ(n)を出力し、遅延部７２は因果性を満たすための時間分、音声検出信号ｄ(n)を遅延する。
【００２８】
適応信号処理部９１の適応信号演算部９１ａ（図３（Ａ））は、Ｒチャンネルオーディオ信号AS_R(=ｘ₁(n))と誤差信号ｅ(n)を用いて適応信号処理を行って適応フィルタ９１ｂの係数値を更新し、適応フィルタ９１ｂはＲチャンネルの推定オーディオ信号ｙ₁(n)を出力する。又、適応信号処理部９２の適応信号演算部９２ａは、Ｌチャンネルオーディオ信号AS_L(=ｘ₂(n))と誤差信号ｅ(n)を用いて適応信号演算を行って適応フィルタ９２ｂの係数値を更新し、適応フィルタ９２ｂはＬチャンネルの推定オーディオ信号ｙ₂(n)を出力する。加算器９３は、第１、第２の適応信号処理部９１，９２、すなわち、適応フィルタ９１ｂ，９２ｂから出力するＲチャンネルおよびＬチャンネルの推定オーディオ信号ｙ₁(n)，ｙ₂(n)を加算して推定オーディオ信号ｙ(n)を出力する。
誤差信号発生部８１は、適応信号処理により発生する推定オーディオ信号ｙ(n)と検出信号ｄ(n)との誤差信号ｅ(n)を発生し、第１、第２の適応信号処理部９１，９２に入力する。
【００２９】
以後、第１状態において上記適応信号処理を繰り返すことにより適応フィルタ９１ｂの特性は右側スピーカ５２_FR，５２_RRからマイクロホン７１までの伝達特性と同等の特性になる。そして、乗員の姿勢変化などにより伝達特性が変化しても常時適応信号処理を行っているため適応フィルタ９１ｂは該伝達特性を模擬する。同様に、第１状態において適応信号処理を繰り返すことにより適応フィルタ９２ｂの特性は左側スピーカ５２_FL，５２_RLからマイクロホン７１までの伝達特性と同等の特性になる。そして、乗員の姿勢変化などにより伝達特性が変化しても常時適応信号処理を行っているため適応フィルタ９２ｂは該伝達特性を模擬する。この結果、誤差信号発生部８１の出力である誤差信号ｅ(n)は略零になって、オーディオ信号キャンセル状態になっている。
【００３０】
かかる状態において、ナビゲーション装置などに音声で指示する必要が生じればドライバは所定のスイッチを操作して音声認識モードにする。これにより制御部６３は切替器６７を制御し、不要音信号除去部７３から出力する不要音除去信号（誤差信号）ｅ(n)を音声認識装置７４に入力する。この音声認識モードにおいて、ドライバが音声でナビゲーション装置に指示を出すと、該ドライバ音声とオーディオ再生音の合成音がマイクロホン７１により検出され、検出信号ｄ(n)が誤差信号発生部８１に入力する。又、適応信号処理部８２から推定オーディオ信号ｙ(n)が誤差信号発生部８１に入力する。誤差信号発生部８１は検出信号ｄ(n)から推定オーディオ信号ｙ(n)を減算し、得られた信号を切替器６７を介して音声認識装置７４に入力する。前述のように推定オーディオ信号は実際のオーディオ信号とほぼ等しいため、音声認識装置７４にはドライバの音声信号が入力し、音声認識装置７４は正しく音声認識ができる。
【００３１】
次に、状態変化時の動作を図４の処理フローに従って説明する。
第１の状態（オーディオ状態）において、制御部６３はハンズフリー電話機６２に着信があり、あるいはハンズフリー電話機より発信して通話状態になったか否かを監視する（ステップ１０１）。ハンズフリー電話機が通話状態になれば、制御部は第１の信号処理部９１に適応フィルタ９１ｂのフィルタ係数Ｗ1(n)を係数保存部８３にＷtemp(n)として保存するよう指示すると共に、係数保存部８３に保存されているフィルタ係数Ｗ'(n)を読み出して適応フィルタ９１ｂにフィルタ係数Ｗ1(n)としてセットするよう指示する。この指示に従って、第１の信号処理部９１はフィルタ係数の保存および保存されている係数の読み出し／設定を実行する（ステップ１０２〜１０３）。尚、フィルタ係数Ｗ'(n)は前回の第２状態の適応信号処理により更新された最新の適応フィルタ係数である。このため、乗員の姿勢変化などにより受話音声用スピーカ５２_FRからマイクロホン７１までの伝達特性が変動していても、この係数をセットされた適応フィルタの特性は該伝達特性に近似している。
【００３２】
ついで、制御部６３はスイッチ６４，６６をオフ、スイッチ６５をＢ接点側に切り替える（ステップ１０４）。これにより、全スピーカにオーディオ信号が入力しなくなり、運転席近傍の受話音声用スピーカ５２_FRのみにハンズフリー電話機６２から受話音声信号RSが可変利得アンプ６９，６８_FRを介して入力し、受話音声用スピーカ５２_FRから受話音声が車室内に放射される。又、制御部６３は切替器６７を制御し、不要音信号除去部７３から出力する誤差信号ｅ(n)をハンズフリー電話機の送話器７５に入力する。
【００３３】
ドライバの口元近傍に設けられたマイクロホン７１は車室内の音、すなわち、受話音声用スピーカ５２_FRから放射されるドライバの通話音声を検出して検出信号ｄ(n)を出力し、遅延部７２は因果性を満たすための時間分、音声検出信号ｄ(n)を遅延する。一方、適応信号演算部９１ａは、受話音声信号RS(=ｘ₁(n))と誤差信号ｅ(n)を用いて適応信号演算を行って適応フィルタ９１ｂの係数値を更新し、適応フィルタ９１ｂは推定受話音声信号ｙ₁(n)を出力する。又、適応信号演算部９２ａは、Ｌチャンネルオーディオ信号AS_L(=ｘ₂(n)＝０)と誤差信号ｅ(n)を用いて適応信号演算を行って適応フィルタ９２ｂの係数値を更新し、適応フィルタ９２ｂはＬチャンネルの推定オーディオ信号ｙ₂(n)＝０を出力する。加算器９３は、第２状態において第１、第２の適応信号処理部９１，９２から出力する信号ｙ₁(n)，ｙ₂(n)を加算して推定受話音声信号ｙ(n)を出力する。
誤差信号発生部８１は、適応信号処理により発生する推定受話音声信号ｙ(n)と検出信号ｄ(n)との誤差信号ｅ(n)を発生し、第１、第２の適応信号処理部９１，９２に入力する。以後、第２状態において上記適応信号処理を繰り返すことにより適応フィルタ９１ｂの特性は受話音声用スピーカ５２_FRからマイクロホン７１までの伝達特性と同等の特性になる。この場合、適応フィルタ９１ｂに前回の第２状態において求めてある最新のフィルタ係数Ｗ'(n)を設定して適応信号処理を行うため、短時間で適応フィルタ９１ｂの特性は該伝達特性と同等の特性になる。
【００３４】
この結果、第１状態から第２状態に切り替わると、直ちに、検出信号ｄ(n)より推定受話音声信号ｙ(n)が除去され、ドライバの音声に応じた信号のみが送話器７５に入力してエコーキャンセルを実現できる（ステップ１０５）。
以後、制御部６３は通話が終了したか監視し（ステップ１０６）、通話が終了すれば、第１の適応信号処理部９１に対し、適応フィルタ９１ｂの係数Ｗ1(n)をＷ'(n)として係数保存部８３に保存し、係数保存部８３に保存してある係数Ｗtemp(n)を適応フィルタ９１ｂにＷ1(n)としてセットするよう指示する。この指示により、第１の信号処理部９１はフィルタ係数の保存および保存されている係数の読み出し／設定を実行する（ステップ１０７，１０８）。尚、フィルタ係数Ｗtemp(n)は前回の第１状態において更新された最新の適応フィルタ係数であり、前回の第１状態における右側スピーカ５２_FR，５２_RRからマイクロホン７１までの伝達特性を模擬する係数である。従って、乗員の姿勢変化などにより上記伝達特性が若干変動していたとしても、この係数をセットされた適応フィルタ特性は該伝達特性に近似している。
【００３５】
ついで、制御部６３はスイッチ６４，６６をオン、スイッチ６５をＡ接点側に切り替える。又、制御部６３は切替器６７を制御し、不要音信号除去部７３から出力する誤差信号ｅ(n)を音声認識装置７４に入力する（ステップ１０９）。
以上により、左側スピーカにＬチャンネルオーディオ信号が入力され、右側スピーカにＲチャンネルオーディオ信号が入力され、各スピーカからオーディオ再生音が車室内に放射される。以後、前述の第１状態におけるオーディオ音キャンセル処理が行われる。この場合、適応フィルタ９１ｂに前回の第１状態において求めてあるフィルタ係数Ｗtemp(n)を設定して適応信号処理を行うため、乗員の姿勢変化などにより右側スピーカからマイクロホンまでの伝達特性が変動していても、短時間で適応フィルタ９１ｂの特性は該伝達特性と同等の特性になり、検出信号よりオーディオ再生音に応じた信号をキャンセルする(ステップ１１０）。以後、はじめに戻って以降の処理が繰り返される。
【００３６】
以上、第１実施例によれば、オーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合でき、しかも、フィルタ係数を保存しておき必要時に読み出して適応信号処理を継続するため、音声認識時にオーディオ音を直ちにキャンセルして音声認識を行え、又、ハンズフリー通話時に受話音声を直ちにキャンセルしてエコーキャンセルを行うことができる。
又、第１実施例によれば、通話中、全スピーカからのオーディオ再生音の出力を停止し、かつ、運転席部のスピーカから相手受話音声を出力する構成であっても、オーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合することができる。
【００３７】
（Ｂ）第２実施例
第１実施例ではハンズフリー通話状態（第２の状態）において全スピーカからオーディオ再生音の出力を停止する場合であるが、第２実施例はオーディオ再生音を聴取しながら電話できるようにしたものである。
図５は第２実施例の状態説明図であり、図１と同一部分には同一符号を付している。第２実施例においては、第１状態はハンズフリー電話機による通話状態以外の状態であり、第２状態はハンズフリー電話機による通話状態である。
【００３８】
第２実施例の第１状態は第１実施例の第１状態と同様に、図５（Ａ）に示すように左側スピーカ５２_FL，５２_RLにＬチャンネルオーディオ信号が入力され、これらスピーカよりＬチャンネルオーディオ再生音が車室内に放射される。又、右側スピーカ５２_FR，５２_RRにはＲチャンネルオーディオ信号が入力され、これらスピーカよりＲチャンネルオーディオ再生音が車室内に放射される。第１状態において音声認識モードになれば、不要音信号除去部は車室内に設けた音声検出部により検出された信号よりオーディオ再生音に応じた信号を除去して音声認識装置に入力する。
第２状態においては、図５（Ｂ）に示すように左側スピーカ５２_FL，５２_RLにＬチャンネルオーディオ信号が入力され、これらスピーカよりＬチャンネルオーディオ再生音が車室内に放射される。しかし、運転席と同じ側の右側スピーカ５２_FR，５２_RRにはオーディオ信号は入力されず、運転席近傍のスピーカ（受話音声用スピーカ）５２_FRのみにハンズフリー電話機から受話音声信号が入力し、該スピーカより受話音声が車室内に放射される。
【００３９】
図６は第２実施例の不要音信号除去装置の構成図であり、図２の第１実施例と異なる点は、第１実施例のスイッチ６４を除去した点であり、これにより第２状態のハンズフリー通話時、左側スピーカ５２_FL，５２_RLにＬチャンネルオーディオ信号を入力し、受話音声用スピーカ５２_FRにハンズフリー電話機から受話音声信号を入力することができる。
図７は状態変化時の第２実施例の処理フローである。尚、第１状態における動作は第１実施例とまったく同じであるため説明は省略する。
【００４０】
第１の状態（オーディオ状態）において、制御部６３はハンズフリー電話機６２に着信があり、あるいはハンズフリー電話機より発信して通話状態になったか否かを監視する（ステップ２０１）。ハンズフリー電話機が通話状態になれば、制御部は第１の信号処理部９１に対し、適応フィルタ９１ｂのフィルタ係数Ｗ1(n)を係数保存部８３にＷtemp(n)として保存するよう指示すると共に、係数保存部８３に保存されているフィルタ係数Ｗ'(n)を読み出して適応フィルタ９１ｂにフィルタ係数Ｗ1(n)としてセットするよう指示する。この指示に従って、第１の信号処理部９１はフィルタ係数の保存および保存されている係数の読み出し／設定を実行する（ステップ２０２〜２０３）。尚、フィルタ係数Ｗ'(n)は前回の第２状態の適応信号処理により更新された最新の適応フィルタ係数である。このため、乗員の姿勢変化などにより受話音声用スピーカ５２_FRからマイクロホンまでの伝達特性が変動していても、この係数をセットされた適応フィルタ特性は該伝達特性に近似している。
【００４１】
ついで、制御部６３は図５（Ｃ）に示すようにスイッチ（ＳＷ２）６６をオフ、スイッチ（ＳＷ１）６５をＢ接点側に切り替える（ステップ２０４）。これにより、継続して左側スピーカ５２_FL，５２_RLにＬチャンネルオーディオ信号が入力されるが、右側スピーカ５２_FR，５２_RRにはオーディオ信号は入力されず、受話音声用スピーカ５２_FRにハンズフリー電話機６２から受話音声信号RSが可変利得アンプ６９，６８_FRを介して入力し、受話音声用スピーカ５２_FRから受話音声が車室内に放射される。又、制御部６３は切替器６７を制御し、不要音信号除去部７３から出力する誤差信号ｅ(n)をハンズフリー電話機の送話器７５に入力する。
【００４２】
ドライバの口元近傍に設けられたマイクロホン７１は車室内の音、すなわち、Ｌチャンネルスピーカから放射するＬチャンネルオーディオ再生音と受話音声用スピーカ５２_FRから放射されるドライバの通話音声を検出して検出信号ｄ(n)を出力し、遅延部７２は適応信号処理に要する時間分、音声検出信号ｄ(n)を遅延する。一方、適応信号演算部９１ａは、受話音声信号RS(=ｘ₁(n))と誤差信号ｅ(n)を用いて適応信号演算を行って適応フィルタ９１ｂの係数値を更新し、適応フィルタ９１ｂは推定受話音声信号ｙ₁(n)を出力する。又、適応信号演算部９２ａは、Ｌチャンネルオーディオ信号AS_L(=ｘ₂(n))と誤差信号ｅ(n)を用いて適応信号演算を行って適応フィルタ９２ｂの係数値を更新し、適応フィルタ９２ｂはＬチャンネルの推定オーディオ信号ｙ₂(n)を出力する。加算器９３は、第２状態において第１、第２の適応信号処理部９１，９２から出力する信号ｙ₁(n)，ｙ₂(n)を加算して推定合成音信号ｙ(n)を出力する。
【００４３】
誤差信号発生部８１は、適応信号処理により発生する推定合成音信号ｙ(n)と検出信号ｄ(n)との誤差信号ｅ(n)を発生し、第１、第２の適応信号処理部９１，９２に入力する。以後、第２状態において上記適応信号処理を繰り返すことにより適応フィルタ９１ｂの特性は受話音声用スピーカ５２_FRからマイクロホン７１までの伝達特性と同等の特性になる。又、適応フィルタ９２ｂの特性は左側スピーカ５２_FL，５２_RLからマイクロホン７１までの伝達特性と同等の特性になる。この場合、適応フィルタ９１ｂに前回の第２状態において求めてあるフィルタ係数Ｗ'(n)を設定して適応信号処理を行うため、乗員の姿勢変化などにより上記伝達特性が若干変化していても短時間で適応フィルタ９１ｂの特性は該伝達特性と同等の特性になる。この結果、第１状態から第２状態に切り替わると、直ちに、検出信号ｄ(n)より推定合成音信号ｙ(n)が除去され、ドライバの音声に応じた信号のみが送話器７５に入力してエコーキャンセルを実現できる（ステップ２０５）。
【００４４】
以後、制御部６３は通話が終了したか監視し（ステップ２０６）、通話が終了すれば第１の適応信号処理部９１に対し、適応フィルタ９１ｂの係数Ｗ1(n)をＷ'(n)として係数保存部８３に保存し、係数保存部８３に保存してある係数Ｗtemp(n)を適応フィルタ９１ｂにＷ1(n)としてセットするよう指示する。この指示に従って、第１の信号処理部９１はフィルタ係数の保存および保存されている係数の読み出し／設定を実行する（ステップ２０７，２０８）。尚、フィルタ係数Ｗtemp(n)は前回の第１状態において更新された最新の適応フィルタ係数であり、前回の第１状態における右側スピーカ５２_FR，５２_RRからマイクロホンまでの伝達特性を模擬する。従って、乗員の姿勢変化などにより上記伝達特性が若干変動していたとしても、この係数をセットされた適応フィルタの特性は該伝達特性に近似している。
ついで、制御部６３はスイッチ（ＳＷ２）６６をオン、スイッチ（ＳＷ１）６５をＡ接点側に切り替える。又、制御部６３は切替器６７を制御し、不要音信号除去部７３から出力する誤差信号ｅ(n)を音声認識装置７４に入力する（ステップ２０９）。
【００４５】
以上により、左側スピーカにＬチャンネルオーディオ信号が入力され、右側スピーカにＲチャンネルオーディオ信号が入力され、各スピーカからオーディオ再生音が車室内に放射される。以後、第１状態におけるオーディオ音キャンセル処理が行われる。この場合、適応フィルタ９１ｂに前回の第１状態において求めてあるフィルタ係数Ｗtemp(n)を設定して適応信号処理を行うため、乗員の姿勢変化などにより右側スピーカからマイクロホンまでの伝達特性が変動していても、短時間で適応フィルタ９１ｂの特性は該伝達特性と同等の特性になり、検出信号よりオーディオ再生音に応じた信号をキャンセルする(ステップ２１０）。
第２実施例によれば、オーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合でき、しかも、フィルタ係数を保存しておき必要時に読み出して適応信号処理を継続するため、音声認識時にオーディオ音を直ちにキャンセルして音声認識を行え、又、ハンズフリー通話時に受話音声を直ちにキャンセルしてエコーキャンセルを行うことができる。
【００４６】
又、第２実施例によれば、ハンズフリー通話中、助手席側のスピーカからオーディオ再生音を出力する構成であっても、オーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合することができる。この場合、第２実施例によれば、Ｌチャンネルのオーディオキャンセル処理と通話音声のエコーキャンセル処理が同時にできる。
以上では、第１の状態をオーディオ状態、第２の状態を通話状態として説明したが、第１、第２の状態は実施例に限定されるものではない。
以上では、運転席が右側に存在する車両について説明したが、左側に存在する車両にも本発明を適用できることは勿論である。
以上、本発明を実施例により説明したが、本発明は請求の範囲に記載した本発明の主旨に従い種々の変形が可能であり、本発明はこれらを排除するものではない。
【００４７】
【発明の効果】
以上本発明によれば、オーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合でき、この結果、装置規模を小型化でき、しかも、安価な構成とすることができる。
又、本発明によれば、フィルタ係数を保存しておき必要時に読み出して適応信号処理を継続するため、音声認識時にオーディオ音を直ちにキャンセルして音声認識を行え、又、ハンズフリー通話時に受話音声を直ちにキャンセルしてエコーキャンセルを行うことができる。
【００４８】
又、本発明によれば、ハンズフリー通話中、全スピーカからのオーディオ再生音の出力を停止し、かつ、運転席部のスピーカから相手受話音声を出力する構成であっても、オーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合することができる。
又、本発明によれば、ハンズフリー通話中、助手席側のスピーカからオーディオ再生音を出力する構成であっても、オーディオキャンセル装置とＨＦＴ用のエコーキャンセラを統合することができ、Ｌチャンネルのオーディオキャンセル処理と通話音声のエコーキャンセル処理が同時にできる。
【図面の簡単な説明】
【図１】第１実施例の状態説明図である。
【図２】第１実施例の不要音信号除去装置の構成図である。
【図３】適応信号処理部の構成図である。
【図４】第１実施例の状態変化時の処理フローである。
【図５】第２実施例の状態説明図である。
【図６】第２実施例の不要音信号除去装置の構成図である。
【図７】第２実施例の状態変化時の処理フローである。
【図８】従来のオーディオ音キャンセル装置の構成図である。
【図９】適応フィルタの構成図である。
【図１０】車室内スピーカのオーディオ音キャンセル装置の構成図である。
【図１１】エコーキャンセラーの構成図である。
【符号の説明】
５２_FL，５２_RL，５２_FR，５２_RR・・スピーカ
６１・・オーディオソース
６２・・ハンズフリー電話機
６３・・制御部
６４〜６６・・スイッチ
６７・・切替器
７１・・マイクロホン
７３・・不要音信号除去部
７４・・音声認識装置
７５・・ハンズフリー電話機の送話部
８１・・誤差信号発生部
８２・・適応信号処理部
８３・・フィルタ係数保存部
[0001]
[Technical field of the invention]
The present invention relates to an unnecessary sound signal removing device, and in particular to an unnecessary sound signal removing device that detects the sound output into a vehicle interior from a loudspeaker and from a talker, and removes from the detection signal an unnecessary sound signal corresponding to the sound output from the loudspeaker.
[0002]
[Prior art]
In recent years, speech recognition devices have become common as the human interface (input/output means) of in-vehicle equipment such as navigation systems. However, the recognition performance of a speech recognition device degrades under disturbances other than the speech to be recognized (vehicle running noise and audio playback sound), and various noise countermeasures have been implemented. One example is a proposed audio sound canceling device that removes (suppresses) the in-cabin audio playback sound from the signal received by the microphone.
In addition, with the explosive spread of mobile phones in recent years, accidents caused by calls made while driving have increased, and for safety reasons the use of mobile phones and the like while driving is prohibited by law. For this reason, hands-free telephone devices (HFT), which allow a call without picking up the handset, have become widespread. An HFT outputs the received voice from a loudspeaker of the audio system, but the far-end voice reproduced by this loudspeaker is picked up by the microphone and sent back to the far end, which is the so-called echo phenomenon. Echo cancellers that remove this echo have therefore been put into practical use.
[0003]
・ Audio sound canceling device
FIG. 8 is a configuration diagram of a conventional audio sound canceling device. Reference numeral 11 denotes an audio source; 12 a loudspeaker that receives the audio signal and radiates the audio sound into the acoustic space; 13 a microphone that detects the audio sound radiated into the acoustic space and the talker's voice; 14 an adaptive signal processing unit that performs adaptive signal processing so that the power of the error signal e is minimized and updates the coefficients W of the adaptive filter; and 15 an arithmetic unit that calculates the difference between the microphone's detection signal and the adaptive filter output and outputs the error signal e. Reference numeral 16 denotes a speech recognition device that performs speech recognition on the talker's voice from which the audio sound has been canceled.
The adaptive signal processing unit 14 receives the audio signal x(n) as a reference signal together with the error signal e(n) output from the arithmetic unit 15, performs adaptive signal processing so that the power of the error signal is minimized, and outputs a signal y(n). The adaptive signal processing unit 14 comprises an adaptive signal calculation unit 14a and an adaptive filter 14b configured as an FIR digital filter.
[0004]
The adaptive signal calculation unit 14a receives the error signal e(n) at the listening position and the audio signal x(n) as the reference signal and, using these signals, performs an adaptive signal calculation so that the audio signal is canceled at the listening position, thereby determining the coefficients of the adaptive filter 14b. For example, the adaptive signal calculation unit 14a determines the coefficients of the adaptive filter 14b according to the well-known LMS (Least Mean Square) adaptive algorithm so that the power of the error signal e(n) is minimized. The adaptive filter 14b applies digital filtering to the audio signal x(n) according to the coefficients determined by the adaptive signal calculation unit 14a and outputs the signal y(n). Consequently, once the coefficients of the adaptive filter 14b have converged so that the power of the error signal e(n) is minimized, the adaptive filter simulates the transfer characteristic from the loudspeaker to the microphone, its output y(n) becomes equal to the audio signal component output by the microphone 13, and only the talker's voice is passed from the arithmetic unit 15 to the speech recognition device 16.
[0005]
As shown in FIG. 9, the adaptive filter 14b is an N-tap FIR digital filter. It is realized, for example, by (N-1) delay elements $DL_1, DL_2, \dots, DL_{N-1}$ that successively delay the input signal by one sampling period, N multipliers $ML_0, ML_1, \dots, ML_{N-1}$ that multiply the delay element outputs by the coefficients $w_0(n), w_1(n), w_2(n), \dots, w_{N-1}(n)$, and adders $AD_0, AD_1, \dots, AD_{N-1}$ that successively add the multiplier outputs. That is, if the reference signal at the current time $n \cdot T_s$ is $x(n)$, the multiplier coefficients at that time are $w_0(n), w_1(n), w_2(n), \dots, w_{N-1}(n)$, and the output signal is $y(n)$, the adaptive filter 14b computes
$$y(n) = \sum_{i=0}^{N-1} w_i(n)\, x(n-i) \qquad (1)$$
and outputs the signal $y(n)$. Here (n) denotes the value at the current sampling time, (n-1) the value one sampling time earlier, (n-2) the value two sampling times earlier, and so on.
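As a small illustration, the FIR output of equation (1) can be computed as below. The function name `fir_output` and the convention that the history list holds the newest sample first are assumptions made for this sketch.

```python
def fir_output(w, x_hist):
    """Equation (1): y(n) = sum over i of w_i(n) * x(n-i).

    w      -- list of N tap coefficients w_0 .. w_{N-1}
    x_hist -- list of the N most recent reference samples, newest first,
              so that x_hist[i] holds x(n-i)  (assumed convention)
    """
    if len(w) != len(x_hist):
        raise ValueError("need exactly one reference sample per tap")
    return sum(wi * xi for wi, xi in zip(w, x_hist))


# Three-tap example: 0.5*1.0 + 0.25*2.0 + 0.125*4.0
y = fir_output([0.5, 0.25, 0.125], [1.0, 2.0, 4.0])
print(y)  # 1.5
```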
[0006]
The adaptive signal calculation unit 14a determines the coefficients $w_0(n+1), w_1(n+1), w_2(n+1), \dots, w_{N-1}(n+1)$ of the adaptive filter 14b at the next time $(n+1) \cdot T_s$, one sampling period $T_s$ after the current time, from the coefficients $w_0(n), w_1(n), w_2(n), \dots, w_{N-1}(n)$ at the current time $n \cdot T_s$, the error signal $e(n)$, and the input signal $x(n)$, using the coefficient update equation
$$w_j(n+1) = w_j(n) + \alpha\, e(n)\, x(n-j) \qquad (j = 0, \dots, N-1) \qquad (2)$$
Here (n+1) denotes the value one sampling time later, and $\alpha$ is a constant (step size parameter) that determines the step by which the adaptive filter coefficients are updated; it is set to an appropriate value of 1 or less. In processing by the LMS adaptive algorithm, the above calculation is completed within one sampling period and the signal y(n) is output.
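One coefficient update step might be sketched as follows. The sketch uses the standard LMS form, in which tap j is corrected with its own delayed reference sample x(n-j); the function name `lms_update` and the newest-first history list are assumptions for illustration.

```python
def lms_update(w, x_hist, e, alpha):
    """One LMS step: w_j(n+1) = w_j(n) + alpha * e(n) * x(n-j).

    w      -- current coefficients w_0(n) .. w_{N-1}(n)
    x_hist -- reference samples, newest first: x_hist[j] holds x(n-j)
    e      -- error sample e(n)
    alpha  -- step size parameter
    Returns the new coefficient list; the input list is left untouched.
    """
    return [wj + alpha * e * xj for wj, xj in zip(w, x_hist)]


w_next = lms_update([0.0, 0.0], [1.0, 2.0], e=0.5, alpha=0.5)
print(w_next)  # [0.25, 0.5]
```

A larger step size adapts faster but risks instability, which is one reason the step size is kept small in practice.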
[0007]
As described above, the adaptive signal processing unit 14 takes the audio signal x(n) as the reference signal and the signal corresponding to the audio playback sound output by the microphone 13 as the target signal, and updates the coefficients $w_0(n), w_1(n), w_2(n), \dots, w_{N-1}(n)$ of the adaptive filter 14b so that the power of the error signal e is minimized. Through this adaptive signal processing, the adaptive filter 14b comes to simulate the transfer characteristic C from the loudspeaker 12 to the microphone 13, and the signal corresponding to the audio playback sound is removed from the output of the arithmetic unit 15. Therefore, when the talker gives a voice instruction to a navigation system or the like, only the talker's voice is input to the speech recognition device 16, which can then recognize the speech correctly.
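The convergence described here can be tried out with a toy simulation. Everything specific in it is an assumption for illustration: the three-tap loudspeaker-to-microphone path, the uniform random reference signal, the step size, and the absence of talker speech and noise.

```python
import random


def run_canceller(path, n_taps, alpha, n_samples, seed=0):
    """Single-channel LMS canceller on synthetic data (illustrative only).

    path -- assumed loudspeaker-to-microphone impulse response
    Returns the final coefficients and the list of error samples.
    """
    rng = random.Random(seed)
    w = [0.0] * n_taps        # adaptive filter coefficients
    x_hist = [0.0] * n_taps   # reference history, newest sample first
    errors = []
    for _ in range(n_samples):
        x_hist = [rng.uniform(-1.0, 1.0)] + x_hist[:-1]
        d = sum(c * x for c, x in zip(path, x_hist))    # microphone signal (audio only)
        y = sum(wi * xi for wi, xi in zip(w, x_hist))   # estimated audio signal
        e = d - y                                       # error signal
        w = [wi + alpha * e * xi for wi, xi in zip(w, x_hist)]
        errors.append(e)
    return w, errors


w, errors = run_canceller([0.6, -0.3, 0.1], n_taps=4, alpha=0.1, n_samples=2000)
print([round(c, 3) for c in w])   # close to the assumed path, padded with a zero tap
print(abs(errors[-1]) < 1e-3)     # residual audio is negligible after convergence
```

With talker speech added to the microphone signal, the speech would remain in the error signal, and it is this error signal that would be passed on to the recognizer.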
[0008]
・ 4-speaker audio sound canceling device
The above describes the case where the audio signal to be canceled is output from a single loudspeaker. In a vehicle interior, however, playback is normally in stereo, so the audio signal is output from two or more loudspeakers on the left and right. FIG. 10 is a configuration diagram of an audio sound canceling device that cancels the audio signals output from each loudspeaker when there are four loudspeakers; parts identical to those in FIG. 8 carry the same reference numerals.
In FIG. 10, reference numerals 21 and 22 denote the right-side front and rear loudspeakers, which receive the R-channel audio signal $x_1(n)$ from the audio source. Reference numerals 23 and 24 denote the left-side front and rear loudspeakers, which receive the L-channel audio signal $x_2(n)$ from the audio source. Reference numerals 25 and 26 denote adaptive signal processing units for canceling the R-channel and L-channel audio signals, respectively, and 27 denotes an adder that adds the signals $y_1(n)$ and $y_2(n)$ output from the adaptive signal processing units 25 and 26.
[0009]
The adaptive signal processing unit 25 performs the same adaptive signal processing as the adaptive signal processing unit 14 of FIG. 8. That is, it takes the R-channel audio signal $x_1(n)$ as the reference signal and the R-channel signal component detected by the microphone 13 as the target signal, and updates the coefficients $w_{10}(n), w_{11}(n), w_{12}(n), \dots, w_{1,N-1}(n)$ of its adaptive filter (not shown) so that the power of the error signal e(n) is minimized. By repeating this coefficient update, the adaptive filter comes to simulate the transfer characteristic $C_R$ from the loudspeakers 21 and 22 to the microphone 13, and the signal corresponding to the R-channel audio sound is canceled from the error signal e(n).
Similarly, the adaptive signal processing unit 26 performs the same adaptive signal processing as the adaptive signal processing unit 14 of FIG. 8. That is, it takes the L-channel audio signal $x_2(n)$ as the reference signal and the L-channel signal component detected by the microphone 13 as the target signal, and updates the coefficients $w_{20}(n), w_{21}(n), w_{22}(n), \dots, w_{2,N-1}(n)$ of its adaptive filter (not shown) so that the power of the error signal e(n) is minimized. By repeating this coefficient update, the adaptive filter comes to simulate the transfer characteristic $C_L$ from the loudspeakers 23 and 24 to the microphone 13, and the signal corresponding to the L-channel audio sound is canceled from the error signal e(n).
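The two-filter arrangement, in which both adaptive filters are driven by the same error signal, can be sketched as below. The paths, the step size, and in particular the use of statistically independent random signals for the R and L references are assumptions for this toy example; real stereo channels are usually correlated, which can make the individual filters less well determined.

```python
import random


def fir(w, hist):
    """FIR output: sum of w[i] * hist[i], with hist holding the newest sample first."""
    return sum(wi * xi for wi, xi in zip(w, hist))


def run_stereo_canceller(path_r, path_l, n_taps, alpha, n_samples, seed=0):
    """Two LMS filters (R and L) sharing one error signal (illustrative only)."""
    rng = random.Random(seed)
    w_r, w_l = [0.0] * n_taps, [0.0] * n_taps
    hist_r, hist_l = [0.0] * n_taps, [0.0] * n_taps
    e = 0.0
    for _ in range(n_samples):
        hist_r = [rng.uniform(-1.0, 1.0)] + hist_r[:-1]   # R-channel reference x1(n)
        hist_l = [rng.uniform(-1.0, 1.0)] + hist_l[:-1]   # L-channel reference x2(n)
        d = fir(path_r, hist_r) + fir(path_l, hist_l)     # microphone picks up both
        y = fir(w_r, hist_r) + fir(w_l, hist_l)           # adder: y1(n) + y2(n)
        e = d - y                                         # shared error signal
        w_r = [w + alpha * e * x for w, x in zip(w_r, hist_r)]
        w_l = [w + alpha * e * x for w, x in zip(w_l, hist_l)]
    return w_r, w_l, e


w_r, w_l, last_e = run_stereo_canceller(
    [0.5, 0.2, -0.1], [0.3, -0.4, 0.05], n_taps=3, alpha=0.1, n_samples=3000)
print([round(c, 3) for c in w_r], [round(c, 3) for c in w_l])
```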
[0010]
FIG. 10 shows the case where a single adaptive signal processing unit is shared by the R-channel front and rear loudspeakers, but separate adaptive signal processing units may instead be provided to cancel the R-channel audio signal output from the R-channel front loudspeaker and the R-channel audio signal output from the R-channel rear loudspeaker. Likewise, separate adaptive signal processing units may be provided to cancel the L-channel audio signal output from the L-channel front loudspeaker and the L-channel audio signal output from the L-channel rear loudspeaker. In that case, the adaptive filters of the four adaptive signal processing units come to simulate the transfer characteristics $C_{FR}$, $C_{RR}$, $C_{FL}$, $C_{RL}$ from the loudspeakers 21 to 24 to the microphone 13, respectively, and the audio sound is correctly canceled from the error signal e(n).
[0011]
・ Echo canceller
FIG. 11 is a configuration diagram of an echo canceller that cancels echoes in the hands-free telephone device. Reference numeral 31 denotes a hands-free telephone device (HFT), which includes a received voice output unit 31a and a speaker voice transmission unit 31b. 32 is a speaker (received-voice speaker) that receives the received voice signal from the received voice output unit and radiates the received voice into the acoustic space; 33 is a microphone that detects the received voice radiated into the acoustic space and the voice of the talker (driver); 34 is an adaptive signal processing unit that performs adaptive signal processing so that the power of the error signal e(n) is minimized and updates the coefficients W of an adaptive filter (not shown); and 35 is a unit that outputs the error signal e(n) by calculating the difference between the detection signal of the microphone and the adaptive filter output. The adaptive signal processing unit 34 performs adaptive signal processing in the same manner as in the audio sound cancellation of FIG. 8 to remove the received voice from the error signal e(n). That is, taking the received voice signal x'(n) input from the received voice output unit as the reference signal and the received voice signal detected by the microphone 33 as the target signal, the adaptive signal processing unit 34 updates the coefficients w_0(n), w_1(n), w_2(n), ..., w_N-1(n) of the adaptive filter (not shown) so that the power of the error signal e(n) is minimized. By repeating this coefficient update, the adaptive filter simulates the transfer characteristic C' from the speaker 32 to the microphone 33, and the signal corresponding to the received voice is canceled from the error signal e(n). Therefore, when the driver makes a call, only the voice of the driver is input to the speaker voice transmission unit 31b of the hands-free telephone device 31, and the echo is canceled.
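Written out, the processing described above amounts to an FIR estimate of the echo followed by a subtraction. The coefficient update rule is not specified in this passage; the LMS step in the last line, with step size \mu, is an assumption chosen only to make the minimization of the error power concrete. The symbols d(n) (microphone detection signal), c'_k (impulse response of the path C'), s(n) (driver's voice) and \mu are introduced here for illustration.

```latex
\begin{align}
d(n) &= \sum_{k} c'_k \, x'(n-k) + s(n) \\
y(n) &= \sum_{k=0}^{N-1} w_k(n) \, x'(n-k) \\
e(n) &= d(n) - y(n) \\
w_k(n+1) &= w_k(n) + \mu \, e(n) \, x'(n-k), \qquad k = 0, \dots, N-1
\end{align}
```

Once w_k(n) is close to c'_k, y(n) approximates the received voice as picked up by the microphone and e(n) is approximately s(n), the driver's voice alone.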
[0012]
[Problems to be solved by the invention]
The mechanism of the audio cancel device for speech recognition and the echo canceller for HFT is the same in principle, but a system in which these are integrated has not been realized. For this reason, at present, an audio canceling device and an echo canceller are provided separately, which is not preferable because the system becomes large and costs increase. The reason why it cannot be integrated is explained below.
[0013]
The audio cancel device is needed for voice recognition and is not strictly required while no voice recognition processing is performed. However, to cancel the audio sound, the device must update the adaptive filter coefficients by adaptive signal processing so that the adaptive filter characteristic approximates or matches the acoustic transfer characteristic between the speaker and the microphone, which changes from moment to moment (identification of the transfer characteristic). In adaptive signal processing it takes time for the adaptive filter coefficients to converge and for the filter to exhibit the same characteristic as that transfer characteristic. That is, a considerable time elapses between starting the audio cancel device and the point at which the audio sound is canceled and correct voice recognition becomes possible; voice recognition processing has to be suspended during that time, so voice input is not possible. For this reason, the audio cancel device continues the adaptive signal processing even while voice recognition processing is not performed, keeping the adaptive filter characteristic equivalent to the acoustic transfer characteristic between the speaker and the microphone, so that voice recognition processing can be performed immediately when it is needed.
[0014]
If the audio cancel device and the echo canceller are integrated so as to share the adaptive signal processing, the integrated system is at times used as an echo canceller and therefore cannot operate as an audio cancel device at all times. This would not matter if the filter characteristic (filter coefficients) required for audio cancellation and that required for echo cancellation were the same, but they differ. That is, in HFT the other party's received voice is normally output only from a speaker near the driver's seat, whereas audio sound is output from two or more (usually four) speakers, so audio cancellation and echo cancellation require different filter characteristics. Consequently, when the integrated system switches from operating as an echo canceller to operating as an audio cancel device, the filter characteristic takes time to converge and voice recognition processing cannot be performed immediately. Conversely, when the integrated system switches from operating as an audio cancel device to operating as an echo canceller, the filter characteristic takes time to converge and the echo cannot be sufficiently canceled at the start of a call. This is why the two could not be integrated.
[0015]
In view of the above, an object of the present invention is to make it possible, even when the audio cancel device and the HFT echo canceller are integrated, to cancel the audio sound immediately and perform voice recognition, and to cancel the received voice immediately and perform echo cancellation from the start of a call.
Another object of the present invention is to enable integration of the audio cancel device and the HFT echo canceller in a configuration that stops the output of audio playback sound from all speakers during a hands-free call and outputs the other party's received voice only from the speaker near the driver's seat.
Another object of the present invention is to enable integration of an audio cancellation device and an HFT echo canceller even when audio reproduction sound is output from a passenger side speaker during a hands-free call.
Another object of the present invention is to enable good echo cancellation without lowering the S/N ratio of the call speech, even when audio playback sound is output from a speaker on the passenger seat side during a hands-free call and that sound is picked up by the microphone.
[0016]
[Means for Solving the Problems]
The present invention is an unnecessary sound signal removing device that detects sound output from a speaker (loudspeaker) and from a talker and removes, from the detection signal, a signal corresponding to the sound output from the speaker. The device comprises: (1) a sound detection unit that detects the sound output from the speaker and the talker and outputs a detection signal; (2) an unnecessary sound signal removal unit that, in a first state, generates by adaptive signal processing a first signal corresponding to the sound output from a first speaker group and removes it from the detection signal, and, in a second state, generates by adaptive signal processing a second signal corresponding to the sound output from a second speaker group consisting of at least one speaker and removes it from the detection signal; and (3) a storage unit that, when switching from the first state to the second state, stores a first parameter (adaptive filter coefficients) updated by the adaptive signal processing in the first state. (4) When switching from the second state to the first state, the unnecessary sound signal removal unit performs adaptive signal processing using the stored first parameter. In addition, when switching from the second state to the first state, the unnecessary sound signal removal unit stores in the storage unit a second parameter updated by the adaptive signal processing in the second state, and when the state is next switched to the second state, it performs adaptive signal processing using the stored second parameter.
[0017]
Specifically, the second state is a call state using a hands-free telephone, and the first state is any state other than the second state (audio state). During voice recognition in the first state, the signal obtained by removing the first signal from the detection signal is input to the voice recognition device; during a hands-free call in the second state, the signal obtained by removing the second signal from the detection signal is input to the telephone.
In this way, the audio cancel device and the HFT echo canceller can be integrated, and even when the state changes, the audio sound can be canceled immediately for voice recognition, and the received voice can be canceled immediately so that echo cancellation works from the start of the call.
Further, even when audio playback sound is output from the speaker on the passenger seat side in the call state (second state) by the hands-free telephone, the audio signal and the received voice signal can be canceled immediately from the detection signal, Good echo cancellation can be performed without reducing the S / N ratio.
[0018]
DETAILED DESCRIPTION OF THE INVENTION
(A) First embodiment
(a) First and second states
FIG. 1 is an explanatory diagram of the states of the first embodiment. Speakers 52_FR, 52_FL, 52_RR, 52_RL are provided in the vehicle interior. The speakers 52_FR and 52_RR are the front and rear speakers on the right side of the vehicle interior, to which the R channel audio signal is input, and the front speaker 52_FR is provided in the vicinity of the driver's seat 53. The speaker 52_FR near the driver's seat also receives the received voice signal during a call using the hands-free telephone. The speakers 52_FL and 52_RL are the front and rear speakers on the left side of the vehicle interior, to which the L channel audio signal is input, and the front speaker 52_FL is provided in the vicinity of the passenger seat 54. Reference numeral 55 denotes a rear seat.
[0019]
In the present invention, a state other than a call state with a hands-free telephone is defined as a first state (audio state), and a call state with a hands-free telephone is defined as a second state. The driver cannot instruct the navigation device or the like to operate by voice during a call with a hands-free telephone. That is, voice recognition is required in the first state and not necessary in the second state.
In the first state, as shown in FIG. 1, the L channel audio signal is input to the left speakers 52_FL, 52_RL, and the L channel audio reproduction sound is radiated from these speakers into the vehicle interior. The R channel audio signal is input to the right speakers 52_FR, 52_RR, and the R channel audio reproduction sound is radiated from these speakers into the vehicle interior. If the voice recognition mode is set in this first state, an unnecessary sound signal removal unit (described later) removes the signal corresponding to the audio reproduction sound from the signal detected by the voice detection unit provided in the vehicle interior and inputs the result to the voice recognition device.
In the second state, as shown in FIG. 1B, no audio signal is input to any speaker; only the received voice signal from the hands-free telephone is input to the speaker 52_FR near the driver's seat, and the received voice is radiated from that speaker into the vehicle interior.
[0020]
(b) Unnecessary sound signal removing apparatus of the first embodiment
FIG. 2 is a block diagram of the unnecessary sound signal removing apparatus according to the first embodiment. During voice recognition in the first state, the apparatus removes the signal corresponding to the audio reproduction sound from the detection signal of the voice detection unit provided in the vehicle interior and inputs the result to the voice recognition device; during a hands-free call in the second state, it removes the signal corresponding to the received voice from the detection signal and inputs the result to the hands-free telephone.
The audio source 61 outputs an L channel audio signal AS_L and an R channel audio signal AS_R, and the hands-free telephone 62 outputs the voice signal from the communication partner (received voice signal) RS during a call. The hands-free telephone 62 also inputs a call signal SP to the control unit 63 when it enters the call state on an outgoing or incoming call, and inputs a call end signal SE to the control unit 63 when the call ends. Based on the call signal SP and the call end signal SE, the control unit 63 monitors whether the current state is the first state (a state other than the call state) or the second state (the call state), and, based on the voice recognition start/end signal SRSE, monitors whether the current state is a voice recognition state. The control unit 63 controls the switches 64 to 66 and the switch 67 according to the current state.
[0021]
In the first state (audio state), the switches (SW1 to SW3) 64 to 66 operate as shown in FIG. 1C and input the audio signals to the speakers. That is, in the first state the switches (SW1, SW3) 64 and 66 are turned on and the switch (SW2) 65 is set to the terminal A side. As a result, the L channel audio signal is input to the left speakers 52_FL, 52_RL via the variable gain amplifiers 68_FL, 68_RL, and the R channel audio signal is input to the right speakers 52_FR, 52_RR via the variable gain amplifiers 68_FR, 68_RR. In the second state, the switches 64 and 66 are turned off and the switch 65 is set to the terminal B side. As a result, in the second state (HFT call state), no audio signal is input to any of the speakers 52_FL, 52_RL, 52_FR, 52_RR, and only the received voice signal RS from the hands-free telephone 62 is input, via the variable gain amplifiers 69 and 68_FR, to the speaker 52_FR near the driver's seat. Therefore, in the first state (audio state), the L channel audio reproduction sound is radiated into the vehicle interior from the left speakers 52_FL, 52_RL and the R channel audio reproduction sound from the right speakers 52_FR, 52_RR. In the second state (call state), the received voice is radiated into the vehicle interior from the speaker near the driver's seat (received-voice speaker) 52_FR.
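The effect of the switch settings on the speaker inputs can be summarized in a few lines. This is only a sketch of the routing described above for the first embodiment: the function name and state labels are illustrative, signals are represented by single sample values, and the variable gain amplifiers are left out.

```python
AUDIO, CALL = "first state (audio)", "second state (call)"

def speaker_inputs(state, as_l, as_r, rs):
    """Return the signal fed to each speaker for the given state."""
    if state == AUDIO:
        # Switches 64 and 66 on, switch 65 on terminal A:
        # L channel to the left speakers, R channel to the right speakers.
        return {"52_FL": as_l, "52_RL": as_l, "52_FR": as_r, "52_RR": as_r}
    # Switches 64 and 66 off, switch 65 on terminal B:
    # no audio anywhere, received voice RS to the speaker near the driver's seat.
    return {"52_FL": 0.0, "52_RL": 0.0, "52_FR": rs, "52_RR": 0.0}

in_audio = speaker_inputs(AUDIO, as_l=0.2, as_r=0.4, rs=0.9)
in_call = speaker_inputs(CALL, as_l=0.2, as_r=0.4, rs=0.9)
```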
[0022]
The microphone 71 provided in the vicinity of the driver's mouth detects the sound in the vehicle interior, that is, the sound radiated from each speaker and the voice uttered by the driver, and outputs a sound detection signal d(n). The delay unit 72 delays the sound detection signal d(n) by the time needed for the adaptive signal processing described later to be carried out, for example by a number of sampling periods equal to about half the number of taps of the filter 91.
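A delay of this kind can be sketched as a simple sample-by-sample delay line. The tap count of 8 below is an assumed toy value, and make_delay is an illustrative name rather than anything defined in this document.

```python
from collections import deque

def make_delay(num_samples):
    """Return a function that delays its input by num_samples samples."""
    line = deque([0.0] * num_samples)
    def step(sample):
        line.append(sample)
        return line.popleft()
    return step

NUM_TAPS = 8                        # assumed adaptive filter length
delay = make_delay(NUM_TAPS // 2)   # about half the number of taps
delayed = [delay(v) for v in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]]
# delayed == [0.0, 0.0, 0.0, 0.0, 1.0, 2.0]
```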
In the first state (audio state), the unnecessary sound signal removing unit 73 generates by adaptive signal processing a signal y(n) (referred to as the estimated audio signal) corresponding to the audio reproduction sound radiated into the vehicle interior from the speakers 52_FL, 52_RL, 52_FR, 52_RR, and removes the estimated audio signal y(n) from the detection signal d(n). In the second state (call state), the unnecessary sound signal removing unit 73 generates by adaptive signal processing a signal y(n) (referred to as the estimated received voice signal) corresponding to the received voice radiated into the vehicle interior from the received-voice speaker 52_FR, and removes the estimated received voice signal from the detection signal d(n).
Based on a control signal from the control unit 63, the switch 67 inputs the signal output from the unnecessary sound signal removal unit 73 to the voice recognition device 74 during voice recognition in the first state, and to the transmitter 75 of the hands-free telephone during a call in the second state.
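The state tracking by the control unit 63 and the routing by the switch 67 can be sketched as a small state holder. The event strings for the SRSE signal and the choice to route nowhere when neither a call nor voice recognition is active are assumptions made for illustration; the text does not specify them.

```python
class ControlUnit:
    """Toy model of control unit 63; event names are illustrative."""

    def __init__(self):
        self.in_call = False        # False: first state, True: second state
        self.recognizing = False    # voice recognition mode flag

    def on_event(self, event):
        if event == "SP":           # call signal from the hands-free telephone
            self.in_call = True
        elif event == "SE":         # call end signal
            self.in_call = False
        elif event == "SRSE_start": # voice recognition start (assumed encoding)
            self.recognizing = True
        elif event == "SRSE_end":   # voice recognition end (assumed encoding)
            self.recognizing = False

    def switch_67_target(self):
        """Destination of the output of the removal unit 73."""
        if self.in_call:
            return "transmitter 75"
        if self.recognizing:
            return "voice recognition device 74"
        return None                 # assumption: no destination otherwise

unit = ControlUnit()
unit.on_event("SRSE_start")
target_audio = unit.switch_67_target()
unit.on_event("SP")
target_call = unit.switch_67_target()
```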
[0023]
(c) Unnecessary sound signal removal unit
In the unnecessary sound signal removing unit 73, the error signal generating unit 81 generates, in the first state, the error signal e(n) between the detection signal d(n) and the estimated audio signal y(n) generated by the adaptive signal processing, and, in the second state, the error signal e(n) between the detection signal d(n) and the estimated received voice signal y(n) generated by the adaptive signal processing.
The adaptive signal processing unit 82 estimates the audio signal by adaptive signal processing and outputs the estimated audio signal y(n) in the first state, and estimates the received voice signal by adaptive signal processing and outputs the estimated received voice signal y(n) in the second state.
When switching from the first state to the second state, the filter coefficient storage unit 83 stores the adaptive filter coefficients updated by the adaptive signal processing in the first state; when switching from the second state to the first state, it stores the adaptive filter coefficients updated by the adaptive signal processing in the second state.
[0024]
(d) Adaptive signal processing unit
The adaptive signal processing unit 82 includes first and second adaptive signal processing units 91 and 92 and an adder 93. In the first state, the adder 93 adds the signals output from the first and second adaptive signal processing units 91 and 92 and outputs the estimated audio signal y(n). In the second state, the adder 93 adds the signals output from the first and second adaptive signal processing units 91 and 92 and outputs the estimated received voice signal y(n).
As shown in FIG. 3, the first adaptive signal processing unit 91 (FIG. 3A) includes an adaptive signal calculation unit 91a and an adaptive filter 91b, and the second adaptive signal processing unit 92 includes an adaptive signal calculation unit 92a and an adaptive filter 92b. Both execute the adaptive signal processing described above.
[0025]
That is, in the first state (audio state), the adaptive signal calculation unit 91a performs an adaptive signal calculation using the R channel audio signal AS_R (= x_1(n)) and the error signal e(n) to update the coefficient values of the adaptive filter 91b, and the adaptive filter 91b outputs the R channel estimated audio signal y_1(n). Likewise, the adaptive signal calculation unit 92a performs an adaptive signal calculation using the L channel audio signal AS_L (= x_2(n)) and the error signal e(n) to update the coefficient values of the adaptive filter 92b, and the adaptive filter 92b outputs the L channel estimated audio signal y_2(n). The adder 93 adds the signals y_1(n) and y_2(n) output from the first and second adaptive signal processing units 91 and 92 in the first state and outputs the estimated audio signal y(n).
[0026]
In the second state (call state), the adaptive signal calculation unit 91a performs an adaptive signal calculation using the received voice signal RS (= x_1(n)) and the error signal e(n) to update the coefficient values of the adaptive filter 91b, and the adaptive filter 91b outputs the estimated received voice signal y_1(n). The adaptive signal calculation unit 92a performs an adaptive signal calculation using the L channel audio signal AS_L (= x_2(n) = 0) and the error signal e(n) to update the coefficient values of the adaptive filter 92b, and the adaptive filter 92b outputs the L channel estimated audio signal y_2(n) = 0. The adder 93 adds the signals y_1(n) and y_2(n) output from the first and second adaptive signal processing units 91 and 92 in the second state and outputs the estimated received voice signal y(n).
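The two-branch structure with a shared error signal can be sketched as follows. A plain LMS step is assumed here for the coefficient update (the document does not name the algorithm), and the class name, step size, and sample values are illustrative. The sketch also shows that, under this assumed update rule, a zero reference x_2(n) = 0 gives y_2(n) = 0 and leaves the coefficients of the second branch untouched.

```python
class Branch:
    """One adaptive signal processing unit: calculation unit plus FIR filter."""

    def __init__(self, num_taps):
        self.w = [0.0] * num_taps      # adaptive filter coefficients
        self.buf = [0.0] * num_taps    # recent reference samples, newest first

    def output(self, x_n):
        self.buf = [x_n] + self.buf[:-1]
        return sum(wk * xk for wk, xk in zip(self.w, self.buf))

    def update(self, e_n, mu=0.1):
        # Plain LMS step (an assumption); a zero reference leaves w unchanged.
        self.w = [wk + mu * e_n * xk for wk, xk in zip(self.w, self.buf)]

def step(unit_91, unit_92, x1_n, x2_n, d_n):
    y1 = unit_91.output(x1_n)          # y_1(n)
    y2 = unit_92.output(x2_n)          # y_2(n)
    y = y1 + y2                        # adder 93
    e = d_n - y                        # error signal generating unit 81
    unit_91.update(e)
    unit_92.update(e)
    return y1, y2, e

# Second state: x_1(n) is the received voice RS, x_2(n) = 0.
unit_91, unit_92 = Branch(4), Branch(4)
unit_92.w = [0.3, 0.1, 0.0, 0.0]       # pretend values learned in the first state
for rs_n in [0.5, -0.2, 0.7, 0.1, -0.4]:
    y1, y2, e = step(unit_91, unit_92, rs_n, 0.0, d_n=0.8 * rs_n)
```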
[0027]
(e) Overall operation
In the first state (audio state), the R channel audio signal AS_R is input to the right speakers 52_FR, 52_RR, and the R channel audio reproduction sound is radiated into the vehicle interior from these speakers. The L channel audio signal AS_L is input to the left speakers 52_FL, 52_RL, and the L channel audio reproduction sound is radiated into the vehicle interior from these speakers. The microphone 71 provided in the vicinity of the driver's mouth detects the sound in the vehicle interior, that is, the audio reproduction sound radiated from each speaker, and outputs the detection signal d(n), and the delay unit 72 delays the detection signal d(n) by the time needed to satisfy causality.
[0028]
The adaptive signal calculation unit 91a (FIG. 3A) of the adaptive signal processing unit 91 performs an adaptive signal calculation using the R channel audio signal AS_R (= x_1(n)) and the error signal e(n) to update the coefficient values of the adaptive filter 91b, and the adaptive filter 91b outputs the R channel estimated audio signal y_1(n). The adaptive signal calculation unit 92a of the adaptive signal processing unit 92 performs an adaptive signal calculation using the L channel audio signal AS_L (= x_2(n)) and the error signal e(n) to update the coefficient values of the adaptive filter 92b, and the adaptive filter 92b outputs the L channel estimated audio signal y_2(n). The adder 93 adds the R channel and L channel estimated audio signals y_1(n) and y_2(n) output from the first and second adaptive signal processing units 91 and 92, that is, from the adaptive filters 91b and 92b, and outputs the estimated audio signal y(n).
The error signal generator 81 generates the error signal e(n) between the estimated audio signal y(n) generated by the adaptive signal processing and the detection signal d(n), and inputs it to the first and second adaptive signal processing units 91 and 92.
[0029]
Thereafter, as the adaptive signal processing is repeated in the first state, the characteristic of the adaptive filter 91b becomes equivalent to the transfer characteristic from the right speakers 52_FR, 52_RR to the microphone 71. Because the adaptive signal processing runs continuously, the adaptive filter 91b keeps simulating this transfer characteristic even if it changes, for example due to a change in the posture of an occupant. Similarly, as the adaptive signal processing is repeated in the first state, the characteristic of the adaptive filter 92b becomes equivalent to the transfer characteristic from the left speakers 52_FL, 52_RL to the microphone 71, and the adaptive filter 92b likewise keeps simulating this transfer characteristic even if it changes due to a change in the posture of an occupant. As a result, the error signal e(n) output by the error signal generator 81 becomes substantially zero, and the audio signal is canceled.
[0030]
In this state, if a voice instruction needs to be given to the navigation device or the like, the driver operates a predetermined switch to set the voice recognition mode. The control unit 63 then controls the switch 67 so that the unnecessary-sound-removed signal (error signal) e(n) output from the unnecessary sound signal removal unit 73 is input to the voice recognition device 74. When the driver gives an instruction to the navigation device by voice in this mode, the microphone 71 detects the combined sound of the driver's voice and the audio reproduction sound, and the detection signal d(n) is input to the error signal generator 81. The estimated audio signal y(n) is also input to the error signal generator 81 from the adaptive signal processing unit 82. The error signal generator 81 subtracts the estimated audio signal y(n) from the detection signal d(n) and inputs the resulting signal to the voice recognition device 74 via the switch 67. Since, as described above, the estimated audio signal is substantially equal to the actual audio signal, only the driver's voice signal is input to the voice recognition device 74, which can therefore recognize the voice correctly.
[0031]
Next, the operation when the state changes will be described according to the processing flow of FIG.
In the first state (audio state), the control unit 63 monitors whether the hands-free telephone 62 has entered a call state through an incoming call or an outgoing call (step 101). When the hands-free telephone enters a call state, the control unit instructs the first adaptive signal processing unit 91 to store the filter coefficient W1(n) of the adaptive filter 91b in the coefficient storage unit 83 as Wtemp(n), and to read out the filter coefficient W'(n) stored in the coefficient storage unit 83 and set it in the adaptive filter 91b as the filter coefficient W1(n). In accordance with this instruction, the first adaptive signal processing unit 91 stores the filter coefficient and reads out and sets the stored coefficient (steps 102 to 103). The filter coefficient W'(n) is the latest adaptive filter coefficient updated by the adaptive signal processing in the previous second state. Therefore, even if the transfer characteristic from the received-voice speaker 52_FR to the microphone 71 has changed due to a change in the posture of an occupant, the characteristic of the adaptive filter in which this coefficient is set approximates that transfer characteristic.
[0032]
Next, the control unit 63 turns off the switches 64 and 66 and switches the switch 65 to the B contact side (step 104). As a result, no audio signal is input to any speaker, and only the received voice signal RS from the hands-free telephone 62 is input, via the variable gain amplifiers 69 and 68_FR, to the received-voice speaker 52_FR near the driver's seat, so that the received voice is radiated into the passenger compartment from the received-voice speaker 52_FR. Further, the control unit 63 controls the switch 67 to input the error signal e(n) output from the unnecessary sound signal removal unit 73 to the transmitter 75 of the hands-free telephone.
[0033]
The microphone 71 provided in the vicinity of the driver's mouth detects the sound in the passenger compartment, that is, the received voice radiated from the received-voice speaker 52_FR and the driver's call voice, and outputs the detection signal d(n), and the delay unit 72 delays the detection signal d(n) by the time needed to satisfy causality. Meanwhile, the adaptive signal calculation unit 91a performs an adaptive signal calculation using the received voice signal RS (= x_1(n)) and the error signal e(n) to update the coefficient values of the adaptive filter 91b, and the adaptive filter 91b outputs the estimated received voice signal y_1(n). The adaptive signal calculation unit 92a performs an adaptive signal calculation using the L channel audio signal AS_L (= x_2(n) = 0) and the error signal e(n) to update the coefficient values of the adaptive filter 92b, and the adaptive filter 92b outputs the L channel estimated audio signal y_2(n) = 0. The adder 93 adds the signals y_1(n) and y_2(n) output from the first and second adaptive signal processing units 91 and 92 in the second state and outputs the estimated received voice signal y(n).
The error signal generator 81 generates the error signal e(n) between the estimated received voice signal y(n) generated by the adaptive signal processing and the detection signal d(n), and inputs it to the first and second adaptive signal processing units 91 and 92. Thereafter, as the adaptive signal processing is repeated in the second state, the characteristic of the adaptive filter 91b becomes equivalent to the transfer characteristic from the received-voice speaker 52_FR to the microphone 71. Because the adaptive filter 91b starts the adaptive signal processing from the latest filter coefficient W'(n) obtained in the previous second state, its characteristic becomes equivalent to the transfer characteristic in a short time.
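The benefit of starting from stored coefficients can be illustrated numerically. The sketch below assumes an NLMS update and uses toy 3-tap values in which the stored coefficients are close to, but not equal to, the current path (as after a small change in occupant posture). It compares the accumulated squared error of a cold start against a start from the stored coefficients; all names and numbers are illustrative assumptions.

```python
import random

def run_nlms(w_init, x, d, mu=0.5, eps=1e-8):
    """Adapt from w_init; return the squared error at each sample."""
    w = list(w_init)
    buf = [0.0] * len(w)
    sq_err = []
    for x_n, d_n in zip(x, d):
        buf = [x_n] + buf[:-1]
        e_n = d_n - sum(wk * xk for wk, xk in zip(w, buf))
        norm = sum(xk * xk for xk in buf) + eps
        w = [wk + mu * e_n * xk / norm for wk, xk in zip(w, buf)]
        sq_err.append(e_n * e_n)
    return sq_err

random.seed(1)
path_now = [0.60, -0.30, 0.10]   # current speaker-to-microphone path (toy values)
stored = [0.55, -0.28, 0.12]     # coefficients saved at the end of the last call
x = [random.uniform(-1.0, 1.0) for _ in range(50)]
d = [sum(path_now[k] * x[n - k] for k in range(3) if n - k >= 0)
     for n in range(50)]

cold = sum(run_nlms([0.0, 0.0, 0.0], x, d))   # start from scratch
warm = sum(run_nlms(stored, x, d))            # start from the stored coefficients
```

In this toy setting the residual energy at the start of the call is smaller with the stored coefficients, which is the effect the coefficient storage is intended to achieve.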
[0034]
As a result, when the first state is switched to the second state, the estimated received voice signal y (n) is immediately removed from the detection signal d (n), and only the signal corresponding to the driver's voice is input to the transmitter 75. Thus, echo cancellation can be realized (step 105).
Thereafter, the control unit 63 monitors whether the call has ended (step 106). When the call ends, the control unit instructs the first adaptive signal processing unit 91 to store the coefficient W1(n) of the adaptive filter 91b in the coefficient storage unit 83 as W'(n), and to read out the coefficient Wtemp(n) stored in the coefficient storage unit 83 and set it in the adaptive filter 91b as W1(n). In response to this instruction, the first adaptive signal processing unit 91 stores the filter coefficient and reads out and sets the stored coefficient (steps 107 and 108). The filter coefficient Wtemp(n) is the latest adaptive filter coefficient updated in the previous first state, and simulates the transfer characteristic from the right speakers 52_FR, 52_RR to the microphone 71 as it was in that state. Therefore, even if the transfer characteristic has changed slightly due to a change in the posture of an occupant, the characteristic of the adaptive filter in which this coefficient is set approximates the transfer characteristic.
[0035]
Next, the control unit 63 turns on the switches 64 and 66 and switches the switch 65 to the A contact side. Further, the control unit 63 controls the switch 67 to input the error signal e (n) output from the unnecessary sound signal removal unit 73 to the voice recognition device 74 (step 109).
As described above, the L channel audio signal is input to the left speaker, the R channel audio signal is input to the right speaker, and the audio reproduction sound is radiated from each speaker into the vehicle interior. Thereafter, the audio sound canceling process in the first state is performed. In this case, since the adaptive signal processing is performed by setting the filter coefficient Wtemp (n) obtained in the previous first state to the adaptive filter 91b, the transfer characteristic from the right speaker to the microphone fluctuates due to a change in the posture of the occupant. Even in this case, the characteristic of the adaptive filter 91b becomes equivalent to the transfer characteristic in a short time, and the signal corresponding to the audio reproduction sound is canceled from the detection signal (step 110). Thereafter, returning to the beginning, the subsequent processing is repeated.
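The coefficient exchange performed at the two state transitions (steps 102 to 103 and steps 107 to 108) can be sketched as below. The class and function names are illustrative, coefficients are plain lists of toy values, and initializing W'(n) to zeros before the first call is an assumption not stated in the text.

```python
class CoefficientStore:
    """Stand-in for the coefficient storage unit 83 (names are illustrative)."""

    def __init__(self, num_taps):
        self.w_temp = [0.0] * num_taps   # Wtemp(n): last first-state coefficients
        self.w_call = [0.0] * num_taps   # W'(n): last second-state coefficients

def enter_call(filter_w, store):
    """Steps 102-103: save first-state coefficients, load second-state ones."""
    store.w_temp = list(filter_w)
    return list(store.w_call)

def leave_call(filter_w, store):
    """Steps 107-108: save second-state coefficients, load first-state ones."""
    store.w_call = list(filter_w)
    return list(store.w_temp)

store = CoefficientStore(num_taps=3)
w1 = [0.6, -0.3, 0.1]          # adapted during the audio state (toy values)
w1 = enter_call(w1, store)     # call starts: W1 <- W'
w1 = [0.2, 0.05, 0.0]          # adapted during the call (toy values)
w1 = leave_call(w1, store)     # call ends: W1 <- Wtemp
```

After the call ends, w1 again holds the coefficients learned in the audio state, and the coefficients learned during the call remain in the store for the next call.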
[0036]
As described above, according to the first embodiment, the audio cancel device and the HFT echo canceller can be integrated. Because the filter coefficients are stored and then read out when needed so that the adaptive signal processing continues from them, the audio sound can be canceled immediately for voice recognition, and the received voice can be canceled immediately for echo cancellation during a hands-free call.
Further, according to the first embodiment, the audio cancel device and the HFT echo canceller can be integrated in a configuration that stops the output of audio reproduction sound from all speakers during a call and outputs the other party's received voice from the speaker on the driver's seat side.
[0037]
(B) Second embodiment
In the first embodiment, the output of audio playback sound from all speakers is stopped in the hands-free call state (second state). The second embodiment allows a call to be made while audio playback sound can still be heard.
FIG. 5 is an explanatory diagram of the states of the second embodiment; components identical to those of the first embodiment carry the same reference numerals. In the second embodiment as well, the first state is a state other than a call state using a hands-free telephone, and the second state is a call state using a hands-free telephone.
[0038]
The first state of the second embodiment is the same as the first state of the first embodiment. As shown in FIG. 5, the L channel audio signal is input to the left speakers 52_FL, 52_RL, and the L channel audio reproduction sound is radiated from these speakers into the vehicle interior. The R channel audio signal is input to the right speakers 52_FR, 52_RR, and the R channel audio reproduction sound is radiated from these speakers into the vehicle interior. If the voice recognition mode is set in the first state, the unnecessary sound signal removal unit removes the signal corresponding to the audio reproduction sound from the signal detected by the voice detection unit provided in the passenger compartment and inputs the result to the voice recognition device.
In the second state, as shown in FIG. 5, the L channel audio signal is input to the left speakers 52_FL, 52_RL, and the L channel audio reproduction sound is radiated from these speakers into the vehicle interior. However, no audio signal is input to the right speakers 52_FR, 52_RR on the same side as the driver's seat; only the received voice signal from the hands-free telephone is input to the speaker (received-voice speaker) 52_FR near the driver's seat, and the received voice is radiated from that speaker into the passenger compartment.
[0039]
FIG. 6 is a block diagram of the unnecessary sound signal removing apparatus of the second embodiment. The difference from the first embodiment of FIG. 2 is that the switch 64 of the first embodiment is removed. As a result, during a hands-free call the L channel audio signal is input to the left speakers 52_FL and 52_RL, while the received voice signal from the hands-free telephone can be input to the received-voice speaker 52_FR.
FIG. 7 is a processing flow of the second embodiment when the state changes. Since the operation in the first state is exactly the same as that of the first embodiment, description thereof is omitted.
[0040]
In the first state (audio state), the control unit 63 monitors whether there is an incoming call to the hands-free telephone 62 or whether an outgoing call has been placed from it (step 201). When the hands-free telephone enters the call state, the control unit instructs the first signal processing unit 91 to store the filter coefficient W1(n) of the adaptive filter 91b in the coefficient storage unit 83 as Wtemp(n), and then to read out the filter coefficient W′(n) stored in the coefficient storage unit 83 and set it in the adaptive filter 91b as the filter coefficient W1(n). In accordance with this instruction, the first signal processing unit 91 stores the filter coefficient and reads out and sets the stored coefficient (steps 202 to 203). The filter coefficient W′(n) is the latest adaptive filter coefficient updated by the adaptive signal processing in the previous second state. Therefore, even if the transfer characteristic from the received-voice speaker 52_FR to the microphone has fluctuated due to a change in the occupant's posture, the characteristic of the adaptive filter set with this coefficient approximates that transfer characteristic.
[0041]
Next, the control unit 63 turns off the switch (SW2) 66 and switches the switch (SW1) 65 to the B contact side as shown in FIG. 5C (step 204). As a result, the L channel audio signal continues to be input to the left speakers 52_FL and 52_RL, and no audio signal is input to the right speakers 52_FR and 52_RR. The received voice signal RS from the hands-free telephone 62 is input to the received-voice speaker 52_FR via the variable gain amplifiers 69 and 68_FR, and the received voice is radiated into the passenger compartment. Further, the control unit 63 controls the switch 67 so that the error signal e(n) output from the unnecessary sound signal removal unit 73 is input to the transmitter 75 of the hands-free telephone.
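The switch settings for the two states can be summarized as a small lookup. The following is only an illustrative sketch: the `Routing` type, the `routing_for` function, and the string labels are hypothetical names, and the first-state settings are the ones the description gives for step 209 (switch 66 on, switch 65 at the A contact, error signal to the voice recognition device 74).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Routing:
    sw1_contact: str   # switch (SW1) 65: "A" or "B" contact
    sw2_on: bool       # switch (SW2) 66: on or off
    error_sink: str    # destination chosen by switch 67 for e(n)


# Settings taken from steps 204 and 209 of the description.
ROUTING = {
    "first": Routing(sw1_contact="A", sw2_on=True,
                     error_sink="voice_recognition_device_74"),
    "second": Routing(sw1_contact="B", sw2_on=False,
                      error_sink="transmitter_75"),
}


def routing_for(call_active: bool) -> Routing:
    """Return the switch settings for the call state (second) or any other state (first)."""
    return ROUTING["second" if call_active else "first"]
```

Keeping the settings in one table makes it easy to check that each state change flips all three switches together.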
[0042]
A microphone 71 provided near the driver's mouth detects the sound in the passenger compartment, that is, the L channel audio playback sound radiated from the L channel speakers, the received voice radiated from the received-voice speaker 52_FR, and the driver's call voice, and outputs a detection signal d(n). The delay unit 72 delays the detection signal d(n) by the time required for adaptive signal processing. Meanwhile, the adaptive signal calculation unit 91a performs an adaptive signal calculation using the received voice signal RS (= x₁(n)) and the error signal e(n) to update the coefficient values of the adaptive filter 91b, and the adaptive filter 91b outputs the signal y₁(n). Likewise, the adaptive signal calculation unit 92a performs an adaptive signal calculation using the L channel audio signal AS_L (= x₂(n)) and the error signal e(n) to update the coefficient values of the adaptive filter 92b, and the adaptive filter 92b outputs the L channel estimated audio signal y₂(n). In the second state, the adder 93 adds the signals y₁(n) and y₂(n) output from the first and second adaptive signal processing units 91 and 92 and outputs the estimated synthesized sound signal y(n).
[0043]
The error signal generator 81 generates an error signal e(n) between the estimated synthesized sound signal y(n) generated by the adaptive signal processing and the detection signal d(n), and inputs it to the first and second adaptive signal processing units 91 and 92. Thereafter, as the adaptive signal processing is repeated in the second state, the characteristic of the adaptive filter 91b converges to the transfer characteristic from the received-voice speaker 52_FR to the microphone 71, and the characteristic of the adaptive filter 92b converges to the transfer characteristic from the left speakers 52_FL and 52_RL to the microphone 71. Because the adaptive signal processing starts with the filter coefficient W′(n) obtained in the previous second state set in the adaptive filter 91b, the characteristic of the adaptive filter 91b becomes equivalent to the transfer characteristic in a short time even if the transfer characteristic has changed slightly due to a change in the occupant's posture. As a result, when the first state is switched to the second state, the estimated synthesized sound signal y(n) is immediately removed from the detection signal d(n), and only the signal corresponding to the driver's voice is input to the transmitter 75, so that echo cancellation is realized (step 205).
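The second-state signal flow (two adaptive filters, the adder 93, and the shared error signal) can be sketched as follows. This is a minimal illustration, not the patented implementation: the description does not name the update rule, so a normalized LMS update is assumed here, the delay unit 72 is omitted, the driver's own voice is left out of d(n), and the two speaker-to-microphone paths `h1` and `h2` are made-up 4-tap responses. All function and class names are hypothetical.

```python
import random


class AdaptiveFilter:
    """FIR filter with adjustable coefficients (stands in for 91b or 92b)."""

    def __init__(self, taps):
        self.w = [0.0] * taps   # coefficient vector, e.g. W1(n)
        self.x = [0.0] * taps   # reference history, newest sample first

    def push(self, sample):
        self.x = [sample] + self.x[:-1]

    def output(self):
        return sum(w * x for w, x in zip(self.w, self.x))

    def update(self, e, step):
        self.w = [w + step * e * x for w, x in zip(self.w, self.x)]


def cancel_second_state(rs, as_l, d, taps=4, mu=0.5, eps=1e-9):
    """Run both filters against d(n); return final coefficients and e(n)."""
    f1 = AdaptiveFilter(taps)   # driven by the received voice signal x1(n)
    f2 = AdaptiveFilter(taps)   # driven by the L channel audio signal x2(n)
    errors = []
    for x1, x2, dn in zip(rs, as_l, d):
        f1.push(x1)
        f2.push(x2)
        y = f1.output() + f2.output()   # adder 93: y(n) = y1(n) + y2(n)
        e = dn - y                      # error signal generator 81
        # Assumed NLMS step, normalized by the power of both references.
        power = sum(v * v for v in f1.x + f2.x) + eps
        f1.update(e, mu / power)
        f2.update(e, mu / power)
        errors.append(e)
    return f1.w, f2.w, errors


def through_path(signal, h):
    """Pass a speaker signal through a hypothetical speaker-to-microphone path."""
    hist = [0.0] * len(h)
    out = []
    for s in signal:
        hist = [s] + hist[:-1]
        out.append(sum(a * b for a, b in zip(h, hist)))
    return out


random.seed(0)
rs = [random.gauss(0, 1) for _ in range(3000)]     # received voice stand-in
as_l = [random.gauss(0, 1) for _ in range(3000)]   # L channel audio stand-in
h1 = [0.5, -0.3, 0.2, 0.1]      # assumed path: speaker 52_FR to microphone
h2 = [0.4, 0.25, -0.15, 0.05]   # assumed path: left speakers to microphone
d = [a + b for a, b in zip(through_path(rs, h1), through_path(as_l, h2))]
w1, w2, errors = cancel_second_state(rs, as_l, d)
```

In this noise-free toy setup `w1` and `w2` approach `h1` and `h2`, which mirrors the statement that each filter's characteristic comes to match its speaker-to-microphone transfer characteristic. With the driver's voice present, e(n) would carry that voice to the transmitter instead of decaying toward zero.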
[0044]
Thereafter, the control unit 63 monitors whether or not the call has finished (step 206). When the call has finished, the control unit instructs the first adaptive signal processing unit 91 to store the coefficient W1(n) of the adaptive filter 91b in the coefficient storage unit 83 as W′(n), and then to read out the coefficient Wtemp(n) stored in the coefficient storage unit 83 and set it in the adaptive filter 91b as W1(n). In accordance with this instruction, the first signal processing unit 91 stores the filter coefficient and reads out and sets the stored coefficient (steps 207 and 208). The filter coefficient Wtemp(n) is the latest adaptive filter coefficient updated in the previous first state, and it simulates the transfer characteristic from the right speakers 52_FR and 52_RR to the microphone in that state. Therefore, even if the transfer characteristic has changed slightly due to a change in the occupant's posture, the characteristic of the adaptive filter to which this coefficient is set approximates the transfer characteristic.
Next, the control unit 63 turns on the switch (SW2) 66 and switches the switch (SW1) 65 to the A contact side. In addition, the control unit 63 controls the switch 67 and inputs the error signal e (n) output from the unnecessary sound signal removal unit 73 to the voice recognition device 74 (step 209).
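The coefficient exchange in steps 202 to 203 and 207 to 208 amounts to swapping two saved coefficient sets for the adaptive filter 91b. The sketch below is a hypothetical illustration of that bookkeeping. The class and function names are invented, and initializing both stored sets to zero before the first call is an assumption, since the description does not say what W′(n) holds before any call has taken place.

```python
class CoefficientStore:
    """Stand-in for the coefficient storage unit 83."""

    def __init__(self, taps):
        # Assumption: both sets start at zero until a state has been visited.
        self.w_temp = [0.0] * taps    # Wtemp(n): last first-state coefficients
        self.w_prime = [0.0] * taps   # W'(n): last second-state coefficients


def on_call_start(store, w1):
    """First state -> second state: save W1(n) as Wtemp(n), return W'(n) to load."""
    store.w_temp = list(w1)       # step 202
    return list(store.w_prime)    # step 203


def on_call_end(store, w1):
    """Second state -> first state: save W1(n) as W'(n), return Wtemp(n) to load."""
    store.w_prime = list(w1)      # step 207
    return list(store.w_temp)     # step 208
```

A caller would assign the returned list to the filter's coefficient vector before resuming adaptation, so each state picks up from where it last left off rather than from scratch.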
[0045]
As described above, the L channel audio signal is input to the left speaker, the R channel audio signal is input to the right speaker, and the audio reproduction sound is radiated from each speaker into the vehicle interior. Thereafter, the audio sound cancellation process in the first state is performed. In this case, since the adaptive signal processing is performed by setting the filter coefficient Wtemp (n) obtained in the previous first state to the adaptive filter 91b, the transfer characteristic from the right speaker to the microphone fluctuates due to a change in the posture of the occupant. Even in this case, the characteristic of the adaptive filter 91b becomes equivalent to the transfer characteristic in a short time, and the signal corresponding to the audio reproduction sound is canceled from the detection signal (step 210).
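The benefit of restarting from stored coefficients can be illustrated numerically. The sketch below is an assumption-laden toy: it uses a single normalized LMS filter (the description does not specify the update rule), made-up 4-tap paths `h_before` and `h_after` that differ slightly to imitate a posture change, and white noise in place of audio. It compares the accumulated squared error when adaptation starts from zero with the error when it starts from the previously learned coefficients.

```python
import random


def speaker_to_mic(x, h):
    """Pass a speaker signal through a hypothetical speaker-to-microphone path."""
    hist = [0.0] * len(h)
    out = []
    for xn in x:
        hist = [xn] + hist[:-1]
        out.append(sum(a * b for a, b in zip(h, hist)))
    return out


def nlms_error_energy(x, d, w0, mu=0.5, eps=1e-9):
    """Adapt from initial coefficients w0 and return the sum of e(n)^2."""
    w = list(w0)
    hist = [0.0] * len(w)
    energy = 0.0
    for xn, dn in zip(x, d):
        hist = [xn] + hist[:-1]
        e = dn - sum(wi * hi for wi, hi in zip(w, hist))
        p = sum(h * h for h in hist) + eps
        w = [wi + mu * e * hi / p for wi, hi in zip(w, hist)]
        energy += e * e
    return energy


random.seed(1)
x = [random.gauss(0, 1) for _ in range(200)]   # audio signal stand-in
h_before = [0.5, -0.3, 0.2, 0.1]      # path learned in the previous first state
h_after = [0.52, -0.28, 0.19, 0.11]   # slightly changed path after the call
d = speaker_to_mic(x, h_after)

cold = nlms_error_energy(x, d, [0.0] * 4)   # restart from zero coefficients
warm = nlms_error_energy(x, d, h_before)    # restart from stored Wtemp(n)
```

Here `warm` comes out smaller than `cold` because the stored coefficients already sit close to the changed path, which is the effect the description relies on for immediate cancellation after a state change.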
According to the second embodiment, the audio cancel device and the HFT echo canceller can be integrated, and the filter coefficients are stored and read out when necessary so that the adaptive signal processing continues from them. As a result, the audio sound can be canceled immediately for voice recognition, and the received voice can be canceled immediately for echo cancellation during a hands-free call.
[0046]
Further, according to the second embodiment, the audio canceling device and the echo canceller for HFT can be integrated even when the audio reproduction sound is output from the speaker on the passenger seat side during the hands-free call. In this case, according to the second embodiment, the L channel audio canceling process and the call voice echo canceling process can be performed simultaneously.
In the above description, the first state is the audio state and the second state is the call state, but the first and second states are not limited to the embodiment.
In the above, the vehicle having the driver's seat on the right side has been described, but the present invention can of course be applied to a vehicle on the left side.
The present invention has been described with reference to the embodiments. However, the present invention can be variously modified in accordance with the gist of the present invention described in the claims, and the present invention does not exclude these.
[0047]
[Effect of the invention]
As described above, according to the present invention, the audio cancel device and the HFT echo canceller can be integrated. As a result, the scale of the apparatus can be reduced and its configuration simplified.
Further, according to the present invention, the filter coefficients are stored and read out when necessary so that the adaptive signal processing continues from them. The audio sound can therefore be canceled immediately for voice recognition, and the received voice can be canceled immediately for echo cancellation during a hands-free call.
[0048]
Further, according to the present invention, the audio cancel device and the HFT echo canceller can be integrated even in a configuration in which, during a hands-free call, the output of the audio playback sound from all the speakers is stopped and the other party's received voice is output from the speaker near the driver's seat.
In addition, according to the present invention, the audio cancel device and the HFT echo canceller can be integrated even in a configuration in which the audio playback sound is output from the passenger-side speakers during a hands-free call, and in that case the audio cancel processing and the call echo cancel processing can be performed simultaneously.
[Brief description of the drawings]
FIG. 1 is an explanatory diagram of a state of a first embodiment.
FIG. 2 is a configuration diagram of an unnecessary sound signal removing apparatus according to the first embodiment.
FIG. 3 is a configuration diagram of an adaptive signal processing unit.
FIG. 4 is a processing flow at the time of a state change of the first embodiment.
FIG. 5 is a state explanatory diagram of a second embodiment.
FIG. 6 is a configuration diagram of an unnecessary sound signal removing apparatus according to a second embodiment.
FIG. 7 is a processing flow at the time of a state change of the second embodiment.
FIG. 8 is a configuration diagram of a conventional audio sound canceling apparatus.
FIG. 9 is a configuration diagram of an adaptive filter.
FIG. 10 is a configuration diagram of an audio sound canceling apparatus for a vehicle interior speaker.
FIG. 11 is a configuration diagram of an echo canceller.
[Explanation of symbols]
52_FL, 52_RL, 52_FR, 52_RR .. Speakers
61 .. Audio source
62 .. Hands-free telephone
63 .. Control unit
64-66 .. Switches
67 .. Switching device
71 .. Microphone
73 .. Unnecessary sound signal removal unit
74 .. Voice recognition device
75 .. Hands-free telephone transmitter
81 .. Error signal generator
82 .. Adaptive signal processing unit
83 .. Filter coefficient storage unit

Claims

In an unnecessary sound signal removing device that detects sound output from a speaker and from a human speaker and removes, from the detection signal, a signal corresponding to the sound output from the speaker,
A sound detection unit that detects the sound output from the speaker and the human speaker and outputs a detection signal;
A first signal corresponding to the sound output from the first speaker group in the first state is generated by adaptive signal processing, the first signal is removed from the detection signal, and in the second state An unnecessary sound signal removing unit that generates a second signal corresponding to sound output from at least one second speaker group by adaptive signal processing, and removes the second signal from the detection signal;
A storage unit that stores the first parameter updated by the adaptive signal processing in the first state when switching from the first state to the second state;
And the unnecessary sound signal removing unit performs adaptive signal processing using the stored first parameter when switching from the second state to the first state.
The unnecessary sound signal removal apparatus characterized by the above-mentioned.

When switching from the second state to the first state, the second parameter updated by the adaptive signal processing in the second state is stored in the storage unit, and the unnecessary sound signal removing unit is started from the first state. When switching to the second state, adaptive signal processing is performed using the stored second parameter.
The unnecessary sound signal removing apparatus according to claim 1.

Means for inputting, to the speech recognition device, the signal obtained in the first state by removing the first signal from the detection signal during speech recognition, and for inputting, to the hands-free telephone, the signal obtained in the second state by removing the second signal from the detection signal during a hands-free call,
The unnecessary sound signal removing apparatus according to claim 2, further comprising:

A state monitoring unit that monitors the current state with the hands-free telephone call state as the second state and the non-second state as the first state;
The unnecessary sound signal removing apparatus according to claim 3, further comprising:

The unnecessary sound signal removing unit is
An error signal between the first signal generated by adaptive signal processing in the first state and the detection signal is generated, and the second signal generated by adaptive signal processing in the second state and the detection signal An error signal generator for generating an error signal,
An adaptive signal processing unit for performing adaptive signal processing,
The adaptive signal processing unit comprises:
In the first state, a signal input to the first speaker group is input to generate the first signal, and in the second state, a signal input to the second speaker group is input to the second signal. Adaptive filter, which generates
Adaptive signal calculation in which the adaptive signal calculation is performed using the signals input to the first and second speaker groups and the error signal in each of the first and second states, and the coefficient value of the adaptive filter in each state is updated. Part,
And using the coefficient value of the adaptive filter as the parameter,
The unnecessary sound signal removing apparatus according to claim 1, 2, or 3.

The first speaker group is all the speakers in the passenger compartment and the second speaker group is the speakers near the driver's seat; when the first state is an audio state in which the first signal corresponding to the audio reproduction sound output from all the speakers is removed from the detection signal, and the second state is a hands-free call state in which the second signal corresponding to the received voice from the hands-free telephone is removed from the detection signal, the device has means for inputting, to the voice recognition device, the signal obtained by removing the first signal from the detection signal during voice recognition in the first state, and for inputting, to the hands-free telephone, the signal obtained by removing the second signal from the detection signal in the second state;
The unnecessary sound signal removing apparatus according to claim 1 or 2, wherein:

The unnecessary sound signal removing unit is
An error signal between the first signal generated by adaptive signal processing in the first state and the detection signal is generated, and an error signal between the second signal generated by adaptive signal processing in the second state and the detection signal. An error signal generator,
An adaptive signal processing unit for performing adaptive signal processing,
The adaptive signal processing unit
First and second adaptive signal processing units;
In the first state, the signals output from the first and second adaptive signal processing units are added to output the first signal, and in the second state, the signals output from the first and second adaptive signal processing units. An adder for adding and outputting the second signal;
The first adaptive signal processing unit includes:
In the first state, an audio signal input to the speaker on the same side as the driver's seat is input, a signal corresponding to the audio reproduction sound output from the speaker on the same side as the driver's seat is generated, and in the second state A first adaptive filter that receives a received voice signal and generates a signal corresponding to the received voice output from a speaker near the driver's seat;
In the first state, the adaptive signal calculation is performed using the signal input to the speaker on the same side as the driver's seat and the error signal to update the coefficient value of the first adaptive filter, and in the second state, A first adaptive signal calculation unit that performs adaptive signal calculation using a received voice signal input to a speaker in the vicinity of the driver's seat and the error signal to update a coefficient value of the first adaptive filter;
The second adaptive signal processing unit includes:
A second adaptive filter that receives an audio signal input to a speaker opposite to the driver's seat in each of the first and second states and generates a signal corresponding to the audio reproduction sound output from the speaker;
In each of the first and second states, the adaptive signal calculation is performed using the audio signal input to the speaker opposite to the driver's seat and the error signal, and the coefficient value of the second adaptive filter in each state is updated. A second adaptive signal calculation unit;
And using the coefficient value of the first adaptive filter as the parameter,
The unnecessary sound signal removing apparatus according to claim 6.

In the second state, the audio signal is not input to any speaker.
The unnecessary sound signal removing apparatus according to claim 7.

In the second state, an audio signal is input to the speaker on the opposite side to the driver's seat, and the audio signal is not input to the speaker on the same side as the driver's seat.
The unnecessary sound signal removing apparatus according to claim 7.

In the first state, the audio signal of one of the L channel and the R channel is input to the speaker on the side opposite to the driver's seat, the audio signal of the other channel is input to the speaker on the same side as the driver's seat, and the second In this state, it has means for inputting the audio signal of the one channel to the speaker on the side opposite to the driver's seat and not inputting the audio signal to the speaker on the same side as the driver's seat,
The unnecessary sound signal removing apparatus according to claim 9.

In an unnecessary sound signal removing apparatus that detects the sound output from the vehicle interior speakers and from a human speaker, removes from the detection signal a first unnecessary sound signal corresponding to the audio reproduction sound output from the vehicle interior speakers during voice recognition and inputs the result to a voice recognition device, and removes a second unnecessary sound signal corresponding to the received voice output from the received-voice speaker during a call by a hands-free telephone and inputs the result to the hands-free telephone,
A sound detection unit that, with the hands-free telephone call state as the second state and any other state as the first state, detects in each state the sound output from the speakers and the human speaker and outputs a detection signal,
An unnecessary sound signal removing unit that generates, by adaptive signal processing, a first unnecessary sound signal corresponding to the audio reproduction sound output from the vehicle interior speakers in the first state and removes the first unnecessary sound signal from the detection signal, and that generates, by adaptive signal processing, a second unnecessary sound signal corresponding to the received voice output from the received-voice speaker in the second state and removes the second unnecessary sound signal from the detection signal;
Means for inputting a detection signal from which the first unnecessary sound signal has been removed in the first state to a voice recognition device, and inputting the detection signal from which the second unnecessary sound signal has been removed in the second state to a hands-free telephone;
A parameter storage unit that stores, when switching from the first state to the second state, the first parameter updated by the adaptive signal processing in the first state, and stores, when switching from the second state to the first state, the second parameter updated by the adaptive signal processing in the second state,
The unnecessary sound signal removal unit performs adaptive signal processing using the stored first parameter when switching from the second state to the first state, and performs the second state from the first state. When switching to, the adaptive signal processing is performed using the stored second parameter,
The unnecessary sound signal removal apparatus characterized by the above-mentioned.

The unnecessary sound signal removing unit is
An error signal between the first unnecessary sound signal and the detection signal generated by adaptive signal processing in the first state is generated, and the second unnecessary sound signal generated by adaptive signal processing in the second state and the An error signal generator for generating an error signal with the detection signal;
An adaptive signal processing unit for performing adaptive signal processing,
The adaptive signal processing unit comprises:
In the first state, an audio signal to be input to the vehicle interior speaker is input to generate the first unnecessary sound signal, and in the second state, a signal to be input to the reception voice speaker is input to input the second unnecessary sound. An adaptive filter that generates a signal,
In each of the first and second states, an adaptive signal that performs adaptive signal calculation using the signal input to the vehicle interior speaker and the reception voice speaker and the error signal, and updates the coefficient value of the adaptive filter in each state. Arithmetic unit,
And using the coefficient value of the adaptive filter as the parameter,
The unnecessary sound signal removing apparatus according to claim 11.