JP2007288565A

JP2007288565A - Vehicle-mounted equipment

Info

Publication number: JP2007288565A
Application number: JP2006114235A
Authority: JP
Inventors: Shingo Kiuchi; 真吾木内; Nozomi Saito; 望齊藤
Original assignee: Alpine Electronics Inc
Current assignee: Alpine Electronics Inc
Priority date: 2006-04-18
Filing date: 2006-04-18
Publication date: 2007-11-01
Anticipated expiration: 2026-04-18
Also published as: JP4817949B2

Abstract

PROBLEM TO BE SOLVED: To provide vehicle-mounted equipment whose audio sound cancellation (ASC) system needs no filter bank of its own for producing sub-band audio data.
SOLUTION: The equipment has an audio function that reproduces audio sound and outputs it from a speaker, and an audio cancellation function that cancels the audio sound and outputs the talker's voice. A filter bank converts uncompressed audio data into a plurality of sub-band audio data, which are compressed and stored in a memory. The same sub-band audio data used for the compression are also input to the audio sound cancellation system, so that the filter bank in the ASC system can be eliminated.
COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to an in-vehicle device, and more particularly to an in-vehicle device having an audio function that reproduces audio sound and outputs it from a speaker, and an audio cancellation function that cancels the audio sound and outputs the voice of the person speaking (the talker).

In recent years, AVN (Audio Visual Navigation) systems, which integrate an audio function and a navigation function, have become the mainstream of in-vehicle devices. If the audio data is compressed audio data such as MP3 or AAC, the audio device in such an AVN system restores the raw audio signal by decompression and reproduces it. If the audio data is not compressed (uncompressed audio data), the device converts it to analog and reproduces it, and at the same time compresses the uncompressed audio data and records it on a hard disk (ripping). Once ripped, the music can thereafter be reproduced from the compressed audio data recorded on the hard disk.
The navigation device in the AVN system displays a map of the area around the vehicle position, searches for a route to a destination, displays the route on the map, and performs route guidance. Increasingly, such navigation devices are equipped with a speech recognition device, in addition to a remote control and a touch panel, as a means by which the user instructs execution of a given control. The recognition rate of such a speech recognition device improves when it receives the talker's voice with noise suppressed. For this reason, an audio sound cancellation system (ASC system) is provided in the stage preceding the speech recognition device. Using an adaptive filter, the ASC system suppresses only the audio sound in the mixture of reproduced audio and talker's voice picked up by the microphone, and inputs the result to the speech recognition device.

FIG. 8 is a configuration diagram of a conventional AVN system. The detection sound signal Sd detected by the microphone 2 and the audio signal Sa output from the audio unit 3 are input to the first and second input terminals of the ASC system 1, respectively. By adaptive signal processing, the ASC system 1 suppresses the reproduced audio sound in the detection sound signal detected by the microphone 2 and inputs the result to the speech recognition device 4. In the audio unit 3, audio data read from an audio source 5 such as a CD deck, or from the hard disk 6, is input to the output switching unit 8 via the source switching unit 7. The output switching unit 8 selects the output destination according to whether the input audio data is compressed: it inputs compressed audio data to the compressed audio decoding unit 9, and inputs uncompressed audio data (PCM audio data) to the compressed audio encoding unit 10 and the input switching unit 11.
The compressed audio decoding unit 9 applies decompression (restoration) processing to the compressed audio data, restores the audio data, and inputs it to the input switching unit 11. The input switching unit 11 inputs the audio data received from the compressed audio decoding unit 9, or the audio data received from the output switching unit 8, to both the ASC system 1 and the audio circuit 12. The audio circuit 12 converts the input audio data Sa to analog and outputs the reproduced sound from the speaker 13 into the acoustic space.
In the case of uncompressed audio data, in parallel with the above, the compressed audio encoding unit 10 compresses the uncompressed audio data (PCM audio data) and inputs the result to the disk control unit 14. The disk control unit 14 records the input compressed audio data on the hard disk 6 (ripping). In response to a request from an operation unit (not shown), the disk control unit 14 also reads the compressed audio data of a given piece of music recorded on the hard disk 6 and inputs it to the source switching unit 7.

FIG. 9 is a configuration diagram of the ASC system 1 (Patent Document 1). The audio data Sa is input to the speaker 13 via the audio circuit 12, and the speaker radiates the reproduced sound into the acoustic space of the vehicle interior. The microphone 2 detects the radiated audio sound and inputs it to the error calculation unit 23 via the amplifier 21 and the AD converter 22. The audio data Sa is also input to the adaptive signal processing unit 24 as a reference signal. The adaptive signal processing unit 24 includes an adaptive filter 24a, which filters the audio data x(n) serving as the reference signal and outputs the result, and an LMS calculation unit (coefficient update unit) 24b, which performs LMS adaptive signal processing so that the power of the error signal e(n) output from the error calculation unit 23 is minimized and thereby determines the filter coefficients of the adaptive filter 24a.

The adaptive filter 24a applies digital filtering to the reference signal x(n) according to the coefficients determined by the LMS calculation unit 24b and outputs an audio sound cancellation signal y(n). The error calculation unit 23 outputs the difference between the microphone detection signal d(n) and the audio sound cancellation signal y(n) as the error signal e(n). The adaptive filter 24a is an N-tap FIR digital filter.
The LMS calculation unit 24b determines the coefficients W(n+1) of the adaptive filter 24a at the next time (n+1)·Ts, one sampling period Ts later, from the coefficients W(n) at the current time n·Ts and the error signal e(n). Written in the standard LMS form (with any constant factor absorbed into μ), the update is

$$\mathbf{W}(n+1) = \mathbf{W}(n) + \mu\, e(n)\, \mathbf{X}(n)$$

where the bold letters W and X denote the vectors

$$\mathbf{W}(n) = [w_0(n), w_1(n), \dots, w_{N-1}(n)]^T, \qquad \mathbf{X}(n) = [x(n), x(n-1), \dots, x(n-N+1)]^T$$

T denotes transposition, and μ is a constant of 1 or less (the step size parameter) that determines the update step of the adaptive filter coefficients.
With the audio cancellation system of FIG. 9, the coefficients of the adaptive filter 24a are determined so as to model the transfer characteristic from the speaker to the microphone, in other words, so as to cancel the audio sound. As a result, when the talker speaks while audio sound is being output, the microphone 2 detects the combined sound of the audio and the speech, and the error calculation unit 23 inputs the error signal e(n), obtained by removing the audio sound cancellation signal y(n) from the combined sound signal d(n), to the speech recognition device 4 as the speech signal. The speech recognition device 4 performs speech recognition on the basis of that speech signal.
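The adaptive cancellation of FIG. 9 can be illustrated with a short Python sketch. It is not the patent's implementation: the function name `lms_cancel`, the tap count, the step size, and the toy speaker-to-microphone path `h` are all assumptions made for illustration, and the update used is the plain LMS rule W(n+1) = W(n) + μ·e(n)·X(n) with e(n) = d(n) − y(n).

```python
import random

def lms_cancel(x, d, num_taps=4, mu=0.05):
    """Return (error signal e(n), final coefficients) for a plain LMS canceller.

    x: reference signal (the audio data), d: microphone signal.
    num_taps and mu are illustrative values, not taken from the patent.
    """
    w = [0.0] * num_taps      # W(n): adaptive FIR coefficients
    buf = [0.0] * num_taps    # X(n): x(n), x(n-1), ..., x(n-N+1)
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]                              # shift in the newest sample
        y = sum(wi * xi for wi, xi in zip(w, buf))         # y(n) = W^T(n) X(n)
        e = dn - y                                         # e(n) = d(n) - y(n)
        w = [wi + mu * e * xi for wi, xi in zip(w, buf)]   # LMS coefficient update
        errors.append(e)
    return errors, w

# Toy scenario: the microphone hears only the reference audio, passed through
# a hypothetical 4-tap speaker-to-microphone path h (no talker, no noise).
random.seed(0)
h = [0.6, 0.3, -0.1, 0.05]
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
     for n in range(len(x))]
e, w = lms_cancel(x, d)
residual_power = sum(v * v for v in e[-100:]) / 100
```

In this noise-free toy case the coefficients converge toward `h` and the residual power of e(n) falls toward zero. If a talker's voice were added to `d`, it would remain in e(n), which is the signal passed on to speech recognition.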

The audio sound cancellation processing in the ASC system 1 of FIG. 8 and the encoding/decoding processing in the audio unit 3 are implemented on a DSP (Digital Signal Processor). Multi-rate signal processing is a known technique for reducing the DSP's processing load per unit time in the ASC system 1. It divides the signal into bands with a filter bank, which lowers the sampling frequency at which the signal processing runs and thereby reduces the processing load per unit time, making complex signal processing possible even on a low-specification DSP.

FIG. 10 is a configuration diagram of the ASC system 1 employing multi-rate signal processing. The audio data Sa and the microphone detection sound signal Sd are each divided into n bands (for example, 32) by the filter banks 31 and 32, adaptive signal processing is performed for each band, and the adaptive filter outputs of the bands are synthesized and output. The filter banks 31 and 32 each have n bandpass filters BPF and divide the audio data Sa and the microphone detection sound signal Sd, respectively, into n bands. The downsampling units 33 and 34 each have n sampling units DN, which downsample the signal component of each band and output it.

The audio sound cancellation control unit 35 has an adaptive signal processing unit ASP for each band and performs LMS adaptive signal processing so that the power of the error signal is minimized in each band. The error signal generation unit 36 has an error calculation unit COM for each band, calculates the difference between the output signal of the adaptive signal processing unit of the corresponding band and the microphone detection sound signal, and feeds it back to the adaptive signal processing unit ASP as the error signal. In the audio sound cancellation control unit 35, the adaptive filter (FIR) of each band's adaptive signal processing unit ASP filters the audio data of the corresponding band and outputs the result, and the LMS calculation unit (LMS) performs LMS adaptive signal processing so that the power of the error signal output from the error calculation unit of the corresponding band is minimized, thereby determining the filter coefficients of the adaptive filter FIR.
The post filter 37 applies a predetermined bandpass filter to the error signal of each band, the synthesis unit 38 synthesizes the bandpass filter outputs of the post filter 37, and the upsampling unit (UP) 39 upsamples the synthesized signal and outputs it.
As a result, when the microphone 2 detects only the reproduced audio sound, filter coefficients that minimize the power of the audio sound in each band are set in the adaptive filters FIR. When the microphone 2 then detects a combined sound in which reproduced audio and the talker's voice are mixed, the reproduced audio is suppressed and the talker's voice signal is output from the upsampling unit (UP) 39.
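Why band splitting lowers the processing load can be seen from a rough operation count. The sketch below is only a back-of-the-envelope model under stated assumptions: the function names are hypothetical, each LMS filter is charged N multiply-accumulates for filtering plus N for the coefficient update per sample, every band is decimated by the number of bands, the per-band filter length shrinks by the same factor, and the cost of the filter banks themselves is ignored. The 1024-tap length and 44.1 kHz rate are illustrative values, not figures from the patent.

```python
def lms_macs_per_second(num_taps, sample_rate):
    """Approximate multiply-accumulates per second for one LMS adaptive filter:
    N for FIR filtering plus N for the coefficient update, per input sample."""
    return 2 * num_taps * sample_rate

def subband_lms_macs_per_second(num_taps, sample_rate, num_bands, decimation):
    """Same count when the signal is split into num_bands bands, each running
    at sample_rate / decimation with a filter shortened by the same factor."""
    taps_per_band = -(-num_taps // decimation)   # ceiling division
    band_rate = sample_rate / decimation
    return num_bands * lms_macs_per_second(taps_per_band, band_rate)

# Illustrative numbers only: a 1024-tap full-band filter at 44.1 kHz
# versus 32 bands, each decimated by 32.
full_band = lms_macs_per_second(1024, 44100)
sub_band = subband_lms_macs_per_second(1024, 44100, num_bands=32, decimation=32)
saving_factor = full_band / sub_band
```

Under these assumptions the adaptive-filter workload drops by a factor equal to the number of bands. The filter banks add their own cost, which is the overhead the invention aims to avoid on the audio-data side.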
Patent Document 1: JP 2001-236090 A

With multi-rate signal processing, dividing the signal into bands with a filter bank reduces the processing load per unit time, but additional filter banks are required.
In view of the above, an object of the present invention is to eliminate the filter bank that outputs the band signal components of the audio data in the ASC system, thereby reducing the hardware burden of the system.

The present invention is an in-vehicle device having an audio function that reproduces audio and outputs it from a speaker, and an audio cancellation function that cancels the audio sound and outputs the talker's voice. The device comprises: an audio circuit that receives uncompressed audio data and radiates audio sound into an acoustic space; a filter bank that converts the audio data into a plurality of subband audio data and outputs them; a compression unit that compresses the audio data of each subband output from the filter bank and stores it in a memory; and an audio sound cancellation unit that applies audio sound cancellation processing to the audio data of each subband output from the filter bank and synthesizes the results.
The in-vehicle device of the present invention further comprises: a determination unit that determines whether the audio data output from an audio source is compressed; a decoding unit that decodes compressed audio data; a decompression unit that applies decompression processing to the compressed audio data output from the decoding unit and inputs the result to the audio circuit; and a switching unit that inputs the output data of the decoding unit to the audio sound cancellation unit if the audio data output from the audio source is compressed, and inputs the subband audio data output from the filter bank to the audio sound cancellation unit if it is not compressed.

According to the present invention, the filter bank divides the audio data into audio data of a plurality of subbands, and each subband audio data output from the filter bank is compressed and stored in the memory. The ASC system uses the same subband audio data output from the filter bank to cancel the audio sound by multi-rate signal processing. The filter bank used for compression in the audio unit can therefore be shared with the audio sound cancellation processing, and the corresponding filter bank in the ASC system can be omitted.

(A) Main configuration of the in-vehicle device
FIG. 1 is a configuration diagram of the main part of the in-vehicle device of the present invention. The ASC system 51 employs multi-rate signal processing, as described later. The detection sound signal Sd detected by the microphone 52 is input to its first input terminal side via the filter bank unit 53, and the audio signal Sa from the audio unit 54 is input to its second input terminal side. A feature of the ASC system 51 is that, as described later, the filter bank 72a used for compression in the audio unit is shared with the audio sound cancellation processing.
The filter bank unit 53 divides the microphone detection sound signal Sd into n bands, for example 32 subbands, and outputs them. Although not shown, it consists of a filter bank and a downsampling unit (see the filter bank 32 and the downsampling unit 34 in FIG. 10). That is, the filter bank unit 53 divides the microphone detection sound signal Sd into 32 subbands, downsamples the signal component of each subband, and inputs it to the ASC system 51.
In the audio unit 54, audio data read from an audio playback unit 61 such as a CD deck, or from the hard disk 62, is input to the output switching unit 65 via the source switching unit 63 and the audio source type determination unit 64. The source switching unit 63 selects the audio data from a given audio source. The audio source type determination unit 64 determines whether that audio data is compressed or uncompressed and inputs the determination result to the output switching unit 65 and to the first and second input switching units 66 and 67.
The output switching unit 65 selects the output destination according to whether the input audio data is compressed. If it is compressed, the audio data is input to the decoding/decompression unit 71. If it is not compressed, the audio data (PCM audio data) is input to the second input switching unit 67 and the compression/encoding unit 72.

If the input audio data is compressed, the decoder 71a of the decoding/decompression unit 71 decodes the encoded compressed audio data, inputs the decoding result to the ASC system 51 via the first input switching unit 66, and also inputs the decoding result to the decompression unit 71b. The decompression unit 71b applies decompression processing to the compressed audio data to restore the original audio data and inputs it to the audio circuit 74 via the second input switching unit 67. The audio circuit 74 converts the input audio data to analog and outputs the reproduced sound from the speaker 75 into the acoustic space.
If the input audio data is not compressed, the second input switching unit 67 inputs the uncompressed audio data (PCM audio data) to the audio circuit 74, which converts it to analog and outputs the reproduced sound from the speaker 75 into the acoustic space. In addition, the analysis unit (filter bank unit) 72a of the compression/encoding unit 72 divides the uncompressed audio data into n bands (subbands), for example 32 subbands, inputs them to the ASC system 51 via the first input switching unit 66, and also inputs each subband audio data to the compression unit 72b. The compression unit 72b performs compression/encoding according to an MPEG audio compression method, for example the 32-subband coding (band division coding) method, and inputs the result to the disk control unit 76. The disk control unit 76 records the input compressed audio data on the hard disk 62 (ripping). In response to a request from an operation unit (not shown), the disk control unit 76 also reads the compressed audio data of a given piece of music recorded on the hard disk 62 and inputs it to the source switching unit 63 for reproduction.
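The two signal paths of FIG. 1 can be summarized as a small routing table. The Python sketch below is purely illustrative: the function name `route_audio` and the dictionary keys are hypothetical, and the strings simply restate which unit feeds which destination in each case.

```python
def route_audio(is_compressed):
    """Summarize where the audio data flows, depending on the result of the
    audio source type determination unit 64 (illustrative model only)."""
    if is_compressed:
        return {
            # Decoder 71a output reaches the ASC system via input switching unit 66.
            "asc_input": "decoder 71a",
            # Decompression unit 71b output reaches the audio circuit via unit 67.
            "audio_circuit_input": "decompression unit 71b",
            # Already compressed data is not re-encoded for the hard disk.
            "ripping": False,
        }
    return {
        # Filter bank 72a output reaches the ASC system via input switching unit 66.
        "asc_input": "filter bank 72a",
        # PCM data reaches the audio circuit via input switching unit 67.
        "audio_circuit_input": "PCM audio data",
        # The same filter bank output is compressed by 72b and recorded on disk.
        "ripping": True,
    }

compressed_route = route_audio(True)
uncompressed_route = route_audio(False)
```

In both cases the ASC system receives subband data without a filter bank of its own on the audio-data side. Only the microphone signal still passes through the filter bank unit 53.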

(B) Compression/encoding unit
FIG. 2 is a configuration diagram of the compression/encoding unit 72, which consists of an analysis unit (filter bank unit) 72a and a compression unit 72b.
The filter bank unit 72a has 32 bandpass filters BPF and outputs the audio data as 32 subband audio data. The 32 subband audio data are input to the ASC system 51 via the first input switching unit 66. The compression unit 72b compresses and encodes the data by an MPEG audio compression method, for example the 32-subband division coding method, which achieves highly efficient compression by exploiting psychoacoustic characteristics.

・32-subband division coding method
The human ear cannot hear sounds below a certain level. The characteristic curve obtained by plotting this level for each band is called the minimum masking threshold curve (minimum audible limit curve) MTC (see FIG. 3). The masking effect changes with the surrounding sound conditions: even a sound whose level is above the minimum masking threshold curve MTC becomes inaudible if it is quiet and a loud sound is present. This is because the loud sound changes the masking threshold curve to something like MTC′ in FIG. 3. The sound components A and B below that curve are masked and cannot be heard by the human ear, whereas the sound components C and D above the masking threshold curve MTC′ can be heard.
Taking this into account, the sounds A and B at or below the masking threshold level MTC′ are not quantized, and the sounds C and D above the masking threshold level are quantized. When quantizing, a number of quantization bits is allocated according to the size of the difference between the audio level and the masking threshold level in each subband, and the quantized data, the number of allocated bits, and so on are output.
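The allocation rule described above can be sketched in Python. This is a simplified illustration, not the MPEG audio allocation procedure: the function name `allocate_bits`, the rule of thumb of roughly 6 dB of signal-to-mask margin per quantization bit, the 15-bit cap, and the example levels are all assumptions.

```python
import math

def allocate_bits(signal_db, mask_db, db_per_bit=6.02, max_bits=15):
    """Allocate quantization bits per subband from the signal-to-mask margin.

    Subbands whose level does not exceed the masking threshold get 0 bits
    (they are not quantized); the others get more bits the further the
    signal lies above the threshold. db_per_bit and max_bits are assumptions.
    """
    bits = []
    for level, mask in zip(signal_db, mask_db):
        margin = level - mask
        if margin <= 0:
            bits.append(0)                      # masked: carry no sample data
        else:
            bits.append(min(max_bits, math.ceil(margin / db_per_bit)))
    return bits

# Four hypothetical components, in the spirit of A, B (masked) and C, D (audible).
signal_levels = [20.0, 35.0, 60.0, 48.0]
mask_levels = [30.0, 40.0, 30.0, 36.0]
allocation = allocate_bits(signal_levels, mask_levels)
```

With these example levels, the first two components receive no bits and the last two receive 5 and 2 bits respectively, mirroring the treatment of A, B and C, D.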

Specifically, as shown in FIG. 4, one frame consists of the audio signal of 36 subframes (32 samples per subframe). The audio signal of each subframe is subdivided into 32 subbands (bands), and 32-band subband coding is performed. That is, the entire band is divided into 32 frequency widths of equal spacing, and each sample signal is quantized and encoded according to the number of quantization bits of its subband, described below. One frame thus contains 1152 (= 36 × 32) samples of data.
One scale factor is determined in common for the 36 samples of one subband. That is, the 36 sample values are normalized so that their maximum value becomes 1.0, and the normalization factor is encoded as the scale factor.
The number of quantization bits for each subband is also determined and becomes the number of allocated bits. The masking effect is exploited most effectively by specifying a quantization accuracy (number of quantization bits) that goes right up to the masking level, taking the critical bandwidth into account. For a band that, as a result of masking, contains only signals at a level not perceived by the auditory system, the information can be discarded entirely, and in that case no bits are allocated for sample data. In other words, when the number of quantization bits for the sample data of a subband is 0, no sample data exists for that subband.
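The per-subband scale factor can be illustrated as follows. The Python sketch is an assumption-laden simplification: the function name `normalize_subband` is hypothetical, and the peak value itself is used as the scale factor, whereas a real MPEG encoder selects the scale factor from a quantized table.

```python
import math

SAMPLES_PER_SUBBAND = 36   # samples sharing one scale factor within a frame
NUM_SUBBANDS = 32
FRAME_SAMPLES = SAMPLES_PER_SUBBAND * NUM_SUBBANDS   # 1152 samples per frame

def normalize_subband(samples):
    """Return (scale_factor, normalized_samples) for one subband.

    The samples are divided by their peak magnitude so that the largest
    normalized magnitude is 1.0; an all-zero subband is left unchanged.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return 1.0, list(samples)
    return peak, [s / peak for s in samples]

# Hypothetical subband content: one cycle of a sine wave with amplitude 0.25.
subband = [0.25 * math.sin(2 * math.pi * k / SAMPLES_PER_SUBBAND)
           for k in range(SAMPLES_PER_SUBBAND)]
scale, normalized = normalize_subband(subband)
```

The decoder reverses the step by multiplying the normalized samples by the transmitted scale factor.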

FIG. 5 illustrates the structure of one frame of the audio bit stream. Reference numeral 100 denotes the smallest unit that can be decoded into an audio signal on its own; it always contains a fixed number of samples, 1152 (= 36 × 32). The minimum unit 100 consists of a 32-bit header section 101, an error check code (optional) 102, and an audio data section 103. The audio data section 103 contains, for each subband, the number of quantization bits (allocation data) 103a, the scale factor 103b, and the sample data 103c. The header section 101 contains a 12-bit synchronization word 101a of all "1"s, an ID 101b that is always "1", a layer identification 101c, and further information such as the bit rate index, the sampling frequency, and the mode.
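Reading the header fields named above can be sketched in Python. The bit positions used here follow the MPEG-1 audio frame header layout (ISO/IEC 11172-3) rather than anything stated in this text, so they are an assumption about the stream format; the function name `parse_header` and the example bytes are hypothetical.

```python
def parse_header(header):
    """Extract a few fields from a 4-byte (32-bit) audio frame header.

    Assumed layout, most significant bit first: 12-bit sync word, 1-bit ID,
    2-bit layer, 1-bit protection, 4-bit bit rate index, 2-bit sampling
    frequency index, padding bit, private bit, 2-bit mode, then 6 more bits.
    """
    if len(header) < 4:
        raise ValueError("header must contain at least 4 bytes")
    word = int.from_bytes(header[:4], "big")
    return {
        "sync_ok": (word >> 20) == 0xFFF,          # 12 bits, all ones
        "id": (word >> 19) & 0x1,
        "layer_bits": (word >> 17) & 0x3,
        "protection_bit": (word >> 16) & 0x1,
        "bitrate_index": (word >> 12) & 0xF,
        "sampling_frequency_index": (word >> 10) & 0x3,
        "mode": (word >> 6) & 0x3,
    }

# Hypothetical header bytes chosen for illustration.
fields = parse_header(b"\xff\xfd\x90\x04")
```

A decoder would first check `sync_ok` to locate the start of a frame and then use the remaining fields to interpret the allocation data, scale factors, and sample data that follow.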

・Compression unit
Returning to FIG. 2, in the compression unit 72b the psychoacoustic model 81 obtains the masking threshold characteristic MTC′ (see FIG. 3) each time one frame of audio data, m (= 1152) samples, is input. From the mask level and the signal level in each subband of this masking threshold characteristic MTC′, it determines the number of quantization bits and the scale factor for each subband (N = 32). The adjustment units 82₁ to 82₃₂ of the subbands determine the bit shift amount on the basis of the scale factor, and the bit shift units 83₁ to 83₃₂ shift the bits of each subband signal and input the result to the encoding unit 84. The encoding unit 84 encodes the shifted subband data with the number of quantization bits, and also encodes and outputs the number of quantization bits and the scale factor of each subband.

(C) Decoding/decompression unit
FIG. 6 is a configuration diagram of the decoding/decompression unit 71, which consists of a decoder 71a and a decompression unit 71b. The decoder 71a decodes the encoded compressed audio data and inputs the compressed audio data and scale factor of each subband to the decompression unit 71b and also to the first input switching unit 66.
In the decompression unit 71b, the adjustment units 85₁ to 85₃₂ of the subbands determine the bit shift amount on the basis of the scale factor, and the bit shift units 86₁ to 86₃₂ decompress the data by shifting the bits of each subband data in the direction opposite to that used during compression. The post filter (bandpass filter BPF) of each subband gives a predetermined BPF characteristic to the decompressed subband signal, and the synthesis unit 88 synthesizes all the post filter outputs to restore and output the audio signal.

(D) ASC system
FIG. 7 is a configuration diagram of the ASC system 51 of the present invention. The microphone detection sound signal Sd and the audio signal Sa are input from the filter bank unit 53 and the input switching unit 66, respectively, each divided into 32 subbands. That is, the filter bank unit 53 divides the microphone detection sound signal Sd into signal components of 32 subbands, downsamples the signal component of each subband, and inputs it to the ASC system 51. The input switching unit 66 inputs to the ASC system 51 the 32 subband audio data output from the filter bank unit 72a of the compression/encoding unit 72 when uncompressed audio data is being reproduced, and the 32 subband audio data and scale factors output from the decoder 71a of the decoding/decompression unit 71 when compressed audio data is being reproduced.

・Reproduction of uncompressed audio data
When uncompressed audio data is reproduced, the input switching unit 66 inputs the 32-subband audio data output from the filter bank unit 72a of the compression/encoding unit 72 to the ASC system 51. The downsampling unit 91 of the ASC system 51 has a downsampler (DS) for each subband, and downsamples and outputs each subband's audio data.
The audio sound cancel control unit 92 has an adaptive signal processing unit ASP for each subband and performs LMS adaptive signal processing so that the power of the error signal is minimized in each subband. The error signal generation unit 93 has an error calculation unit COM for each subband; it calculates the difference between the output signal of the corresponding subband's adaptive signal processing unit and the microphone detection sound signal, and feeds this difference back to the adaptive signal processing unit ASP as the error signal. Within the audio sound cancel control unit 92, the adaptive filter (FIR) of each subband's adaptive signal processing unit ASP filters the audio data of that subband and outputs the result, and the LMS calculation unit (LMS) performs LMS adaptive signal processing to determine the coefficients of the adaptive filter FIR so that the power of the error signal output from the corresponding subband's error calculation unit is minimized. Note that the scale factor is 0 when uncompressed audio data is reproduced. The bit shift amount is therefore 0, and the bit shift units 94₁ to 94₃₂ of the respective subbands perform no bit shift.
The post filter 96 applies predetermined bandpass filtering to the error signal of each subband, the synthesis unit 97 combines the bandpass filter outputs of the post filter 96, and the upsampling unit (UP) 98 upsamples the combined signal and outputs it to the voice recognition device as the speaker's voice.
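The per-subband LMS adaptation described above can be sketched for a single subband as follows. This is a minimal textbook LMS canceller in Python, offered under stated assumptions: the tap count, the step size `mu`, the sign convention (error = microphone signal minus filter output), and the synthetic test signal are all illustrative choices that do not come from the patent, and `SubbandCanceller` is a hypothetical name.

```python
import random


class SubbandCanceller:
    """One adaptive signal processing unit (FIR filter plus LMS update)."""

    def __init__(self, taps: int = 4, mu: float = 0.1):
        self.w = [0.0] * taps   # adaptive FIR coefficients
        self.x = [0.0] * taps   # most recent audio samples, newest first
        self.mu = mu            # LMS step size (assumed value)

    def step(self, audio: float, mic: float) -> float:
        # Push the new subband audio sample into the delay line.
        self.x = [audio] + self.x[:-1]
        # FIR output: estimate of the audio component picked up by the mic.
        y = sum(w * x for w, x in zip(self.w, self.x))
        # Error signal: what remains after removing the estimated audio.
        e = mic - y
        # LMS update, driving the error power toward its minimum.
        self.w = [w + self.mu * e * x for w, x in zip(self.w, self.x)]
        return e


# Synthetic demonstration: the "microphone" hears only a filtered copy of
# the audio (no talker), so the residual error should decay toward zero.
rng = random.Random(0)
audio = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [0.5 * a + 0.25 * b for a, b in zip(audio, [0.0] + audio[:-1])]
canceller = SubbandCanceller()
errors = [canceller.step(a, d) for a, d in zip(audio, mic)]
```

In the described device one such canceller would run for each of the 32 subbands at the downsampled rate; when a talker is present, the error signal carries the talker's voice with the audio component suppressed.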

・Reproduction of compressed audio data
When compressed audio data is reproduced, the input switching unit 66 inputs the 32-subband audio data and scale factors output from the decoder 71a of the decoding/decompression unit 71 to the ASC system 51. The downsampling unit 91 of the ASC system 51 downsamples and outputs the audio data of each subband.
The audio sound cancel control unit 92 has an adaptive signal processing unit ASP for each subband and performs LMS adaptive signal processing so that the power of the error signal is minimized in each subband. The adjustment units 95₁ to 95₃₂ of the respective subbands determine the bit shift amount on the basis of the scale factor, and the bit shift units 94₁ to 94₃₂ decompress the data output from each subband's adaptive filter FIR by shifting its bits in the direction opposite to that used during compression. The error signal generation unit 93 has an error calculation unit COM for each subband; it calculates the difference between the adaptive filter output signal of the corresponding subband's adaptive signal processing unit ASP and the microphone detection sound signal, and feeds this difference back to that adaptive signal processing unit ASP as the error signal.
Within the audio sound cancel control unit 92, the adaptive filter (FIR) of each subband's adaptive signal processing unit ASP filters the audio data of that subband and outputs the result, and the LMS calculation unit (LMS) performs LMS adaptive signal processing to determine the coefficients of the adaptive filter FIR so that the power of the error signal output from the corresponding subband's error calculation unit is minimized. The post filter 96 applies predetermined bandpass filtering to the error signal of each subband, the synthesis unit 97 combines the bandpass filter outputs of the post filter 96, and the upsampling unit (UP) 98 upsamples the combined signal and outputs it to the voice recognition device as the speaker's voice.
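The order of operations in the compressed case (filter the still-compressed subband data, expand the filter output by the scale-factor bit shift, then subtract it from the microphone signal) can be sketched as below. Integer coefficients and samples are used only to keep the bit shift exact; the function name `error_sample`, the direct use of the scale factor as the shift count, and the sign convention are assumptions for illustration.

```python
from typing import Sequence


def error_sample(coeffs: Sequence[int], history: Sequence[int],
                 mic_sample: int, scale_factor: int) -> int:
    # Adaptive FIR applied to the compressed (not yet expanded) subband data.
    y = sum(c * x for c, x in zip(coeffs, history))
    # Expand the filter output: shift opposite to the compression shift.
    # Assumption: the scale factor is the shift count itself.
    y_expanded = y << scale_factor
    # Error signal fed back to the LMS unit and on to the post filter.
    return mic_sample - y_expanded
```

With `scale_factor` equal to 0 the shift is a no-op, which matches the uncompressed case described earlier.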

As described above, according to the present invention, a filter bank divides uncompressed audio data into a plurality of subband audio data, each subband's audio data output from the filter bank is compressed and stored on a hard disk or the like, and the ASC system cancels the audio sound by multirate signal processing using the same subband audio data output from that filter bank. The filter bank used for compression in the audio unit can therefore be shared with the audio sound cancellation processing, and the filter bank that would otherwise supply subband audio data in the ASC system can be omitted.

FIG. 1 is a block diagram of the in-vehicle device of the present invention.
FIG. 2 is a block diagram of the compression/encoding unit.
FIG. 3 is an explanatory diagram of the minimum masking threshold curve (minimum audible limit curve).
FIG. 4 is an explanatory diagram of the 32-subband division coding scheme.
FIG. 5 is an explanatory diagram of the structure of one frame of an audio bit stream in the 32-subband division coding scheme.
FIG. 6 is a block diagram of the decoding/decompression unit.
FIG. 7 is a block diagram of the ASC system of the present invention.
FIG. 8 is a block diagram of a conventional AVN system.
FIG. 9 is a block diagram of an ASC system.
FIG. 10 is a block diagram of an ASC system employing multirate signal processing.

Explanation of symbols

51 ASC system
52 Microphone
53 Filter bank unit
54 Audio unit
61 Audio playback unit such as a CD deck
62 Hard disk
63 Source switching unit
64 Audio source type determination unit
65 Output switching unit
66, 67 First and second input switching units
71 Decoding/decompression unit
71a Decoder
71b Decompression unit
72 Compression/encoding unit
72a Filter bank unit
72b Compression unit
74 Audio unit
76 Disk control unit

Claims

An in-vehicle device having an audio function for reproducing audio sound and outputting it from a loudspeaker, and an audio cancel function for canceling the audio sound and outputting the speaker's voice, the in-vehicle device comprising:
an audio circuit that receives uncompressed audio data and radiates audio sound into an acoustic space;
a filter bank that outputs the audio data as a plurality of subband audio data;
a compression unit that compresses the audio data of each subband output from the filter bank and stores it in a memory; and
an audio sound cancellation unit that performs audio sound cancellation processing on the audio data of each subband output from the filter bank.

The in-vehicle device according to claim 1, further comprising:
a determination unit that determines whether the audio data output from an audio source is compressed;
a decoding unit that decodes the compressed audio data;
a decompression unit that decompresses the compressed audio data output from the decoding unit and inputs the decompressed audio data to the audio circuit; and
a switching unit that inputs the output data of the decoding unit to the audio sound cancellation unit if the audio data output from the audio source is compressed, and inputs the subband audio data output from the filter bank to the audio sound cancellation unit if it is not compressed.