JP5757093B2

JP5757093B2 - Signal processing device

Info

Publication number: JP5757093B2
Application number: JP2011011520A
Authority: JP
Inventors: 真樹片山
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2011-01-24
Filing date: 2011-01-24
Publication date: 2015-07-29
Anticipated expiration: 2031-01-24
Also published as: JP2012156610A

Description

The present invention relates to a signal processing device that performs virtual sound image localization processing.

Conventionally, processing such as that described in Patent Document 1 and Patent Document 2 has been known as a way of giving a sense of spaciousness to sound reproduced from a monaural speaker.

Patent Document 1 describes extracting only the low-frequency component from the difference component between the L channel input signal and the R channel input signal and adding it to the L + R signal with a predetermined gain, thereby emphasizing a component that does not contribute to localization and giving a sense of spaciousness.

Patent Document 2 describes giving a predetermined delay (a delay long enough for a person to perceive) to the difference component between the L channel input signal and the R channel input signal and adding it to the L + R signal, thereby providing a sound field.

Patent Document 1: Japanese Patent No. 4526757
Patent Document 2: JP-A-3-266598

Because the method of Patent Document 1 emphasizes the low-frequency range of the non-in-phase component, it does not separate the L channel from the R channel.

In the method of Patent Document 2 as well, the sound image remains monaural, and the L channel and the R channel are not clearly localized.

Accordingly, an object of the present invention is to provide a signal processing device that can give a sense of spaciousness by giving individual, clear localization to a plurality of signal input channels, even when audio signals of a plurality of channels are combined and output from a speaker.

The signal processing device of the present invention includes: an input unit that inputs audio signals of a plurality of channels; a characteristic imparting unit that imparts, to the audio signals of the plurality of channels, differential characteristics of a plurality of head-related transfer functions that localize sound images in directions having an elevation angle difference equal to or greater than a predetermined angle; and an output unit that combines and outputs the audio signals of the plurality of channels to which the characteristic imparting unit has imparted the differential characteristics. The characteristic imparting unit imparts a different differential characteristic to each channel.

For example, the differential characteristic between a head-related transfer function that localizes a sound image to the front at an elevation angle of 0 degrees and a head-related transfer function that localizes a sound image to the front at an elevation angle of +30 degrees is imparted to one channel (for example, the L channel), and the differential characteristic between the head-related transfer function that localizes a sound image to the front at an elevation angle of 0 degrees and a head-related transfer function that localizes a sound image to the front at an elevation angle of -30 degrees is imparted to another channel (for example, the R channel). Imparting a differential characteristic extracts only the component of sound source localization that contributes to elevation, so localization is conveyed more effectively than when the head-related transfer function for a frontal elevation angle (for example, +30 degrees) is imparted as it is. Therefore, even with a monaural speaker, individual, clear localization can be given to a plurality of signal input channels. If the elevation angle difference is less than 15 degrees, it becomes difficult to perceive the channels as clearly separated, so the difference is preferably 15 degrees or more.
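The differential characteristic described above can be sketched as a per-frequency subtraction of two magnitude responses in dB. This is only a minimal illustration: the function name, the frequency grid, and the dB values are made up, and the sign convention (target elevation minus the 0-degree reference) is an assumption, since the text does not state a formula.

```python
def differential_characteristic_db(target_db, reference_db):
    """Per-bin dB difference: target HRTF magnitude minus reference HRTF magnitude."""
    if len(target_db) != len(reference_db):
        raise ValueError("both responses must share the same frequency grid")
    return [t - r for t, r in zip(target_db, reference_db)]


# Illustrative (not measured) magnitude responses in dB on a shared grid.
freqs_hz = [1000, 5000, 7000, 10000, 15000]
hrtf_0_deg_db = [0.0, 2.0, -3.0, -22.0, -5.0]    # reference: front, elevation 0
hrtf_plus30_db = [0.0, 1.0, -8.0, -4.0, -10.0]   # target: front, elevation +30

# Curve that would be imparted to one channel (for example, L).
diff_l_db = differential_characteristic_db(hrtf_plus30_db, hrtf_0_deg_db)
```

With these placeholder values, the deep dip of the reference response near 10 kHz turns into a large positive peak in the difference curve, while features common to both responses cancel.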

Because the differential characteristic is based on head-related transfer functions, it has peaks or dips at particular frequencies. These peaks and dips deviate greatly from a flat response and, left as they are, have a large effect on sound quality, so it is desirable to relax them (for example, to a gain of about ±6 dB) to an extent that does not affect sound source localization.
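One possible way to relax the peaks and dips is to scale the whole dB curve so that its largest deviation from flat equals the limit. The text only says the peaks and dips are relaxed to roughly ±6 dB; uniform scaling (rather than, say, hard clipping) and the function name are assumptions made for this sketch.

```python
def relax_peaks_and_dips(diff_db, limit_db=6.0):
    """Scale a dB curve uniformly so its largest deviation from 0 dB is limit_db.

    Uniform scaling keeps the positions and relative sizes of peaks and dips;
    this choice is an assumption, not something the source specifies.
    """
    if not diff_db:
        return []
    largest = max(abs(v) for v in diff_db)
    if largest <= limit_db:
        return list(diff_db)  # already within the limit, leave unchanged
    scale = limit_db / largest
    return [v * scale for v in diff_db]


# Illustrative curve with an 18 dB peak, relaxed to the +/-6 dB range.
relaxed_db = relax_peaks_and_dips([0.0, -1.0, -5.0, 18.0, -5.0])
```

Scaling preserves the shape of the curve, which is one plausible reading of relaxing the characteristic without disturbing localization; clipping would instead flatten only the extremes.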

The signal processing device of the present invention preferably further includes a separation unit that adds the audio signals of the plurality of channels input by the input unit and separates, from the summed audio signal, an audio signal at or below a predetermined band, with the output unit combining and outputting the audio signals of the plurality of channels to which the characteristic imparting unit has imparted the differential characteristics together with the separated audio signal. The predetermined band is preferably a frequency band that does not affect localization.

In this way, the frequency band that does not affect localization is separated and combined without the differential characteristic being imparted, which mitigates the effect on sound quality. In particular, the gain of the separated audio signal can be controlled independently of the audio signals to which the differential characteristics are imparted, so adjustments can be made to suit the source or the listener's preference. For example, if the source contains human speech, raising the gain of the separated audio signal strengthens the monaural component (L channel + R channel) and thus relatively emphasizes the in-phase (center-localized) component. Conversely, if the source is background music (BGM), the gain of the separated audio signal can be set relatively low to strengthen the sense of localization of the L channel and the R channel.

The frequency band that does not affect localization is roughly the band at or below 700 Hz to 1.5 kHz, and in practice depends on the characteristics of the speaker. For example, with a highly directional speaker, the speaker itself tends to be localized as a sound source, so the separation frequency is lowered and the sense of localization of the sound image is left to the differential characteristics. Conversely, with an omnidirectional speaker, the sense of localization of the speaker itself is weaker, so the sense of localization of each channel's sound image can be maintained even if the separation frequency is raised.

According to the present invention, even when audio signals of a plurality of channels are combined and output from a speaker, a broad surround feeling can be provided by giving individual, clear localization to the plurality of signal input channels.

FIG. 1 is a block diagram showing the configuration of the virtual sound image localization system.
FIG. 2 is a block diagram showing the configuration of the signal processing device.
FIG. 3 is a diagram showing the frequency characteristics for elevation angles of ±30 degrees and their differences.
FIG. 4 is a diagram showing the frequency characteristics for elevation angles of ±15 degrees and their differences.

FIG. 1 is a block diagram showing the configuration of a virtual sound image localization system that includes the signal processing device of the present invention. The virtual sound image localization system of this embodiment performs multichannel reproduction using a monaural speaker. As shown in FIG. 1, the virtual sound image localization system of this embodiment includes a signal processing device 1 and a speaker 2. In this embodiment, unless otherwise noted, all audio signals are described as digital signals, and the DA converter and power amplifier are omitted from the description.

The signal processing device 1 applies, to the audio signal, signal processing based on head-related transfer functions (hereinafter, HRTFs), which express the differences in loudness, arrival time, and frequency characteristics of sound traveling from a plurality of speakers installed at given positions to the listener's ears, and supplies the result to the speaker 2. Although the speaker 2 is a monaural speaker installed on the median plane (in front of the listener), this gives individual, clear localization to the plurality of signal input channels and thereby provides a broad surround feeling.

As shown in FIG. 2, the signal processing device 1 includes an input interface (I/F) 11, a high-pass filter (HPF) 12, an FIR filter 13, an HPF 14, an FIR filter 15, a level adjuster 16, a level adjuster 17, an adder 18, an LPF 19, a level adjuster 20, an adder 21, and an output I/F 22.

The input I/F 11 acquires an audio signal from another device or from a content reproduction unit (not shown) within the device itself. Of the acquired audio signals, the left channel (L) signal is supplied to the HPF 12 and the level adjuster 16, and the right channel (R) signal is supplied to the HPF 14 and the level adjuster 17.

The L signal input to the HPF 12 passes through the FIR filter 13 and is output to the adder 21. The R signal input to the HPF 14 passes through the FIR filter 15 and is output to the adder 21.

The FIR filter 13 and the FIR filter 15 correspond to the characteristic imparting unit of the present invention and impart predetermined transfer characteristics to the L signal and the R signal, respectively. The FIR filter 13 is set with filter coefficients that realize the differential characteristic between the HRTF that localizes a sound image at the VSL placed at a front elevation angle of +30 degrees shown in FIG. 1 and the HRTF that localizes a sound image at the VSC placed at a front elevation angle of 0 degrees. The FIR filter 15 is set with filter coefficients that realize the differential characteristic between the HRTF that localizes a sound image at the VSR placed at a front elevation angle of -30 degrees shown in FIG. 1 and the HRTF that localizes a sound image at the VSC placed at a front elevation angle of 0 degrees. As a result, the L channel signal can be localized in the positive elevation direction and the R channel signal in the negative elevation direction. The detailed characteristics are described later with reference to FIG. 3.
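The text does not say how the filter coefficients are derived from the differential characteristic. As one hedged illustration, the sketch below uses the classic frequency-sampling method to turn a desired magnitude curve (in dB, on uniformly spaced bins) into symmetric linear-phase FIR taps. The function name, the uniform bin layout, and the choice of method are all assumptions.

```python
import math


def fir_from_magnitude_db(mag_db):
    """Linear-phase FIR taps from desired gains at bins k = 0..M (frequency sampling).

    mag_db[k] is the desired gain in dB at frequency k * fs / N, where
    N = 2 * M + 1 is the (odd) number of taps that is produced.
    """
    m = len(mag_db) - 1
    n_taps = 2 * m + 1
    amp = [10.0 ** (db / 20.0) for db in mag_db]  # dB to linear amplitude
    taps = []
    for n in range(n_taps):
        acc = amp[0]
        for k in range(1, m + 1):
            # Cosine terms centered on the middle tap give a symmetric response.
            acc += 2.0 * amp[k] * math.cos(2.0 * math.pi * k * (n - m) / n_taps)
        taps.append(acc / n_taps)
    return taps


# A flat 0 dB target yields a pure delay (unit impulse at the center tap).
flat_taps = fir_from_magnitude_db([0.0, 0.0, 0.0, 0.0, 0.0])
# A shaped target: 0 dB at DC, +6 dB and -6 dB at the next two bins.
shaped_taps = fir_from_magnitude_db([0.0, 6.0, -6.0])
```

In practice the target curve would come from measured HRTF data on a much denser grid; the short curves here only show the mechanics.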

Meanwhile, the L signal input to the level adjuster 16 and the R signal input to the level adjuster 17 are added by the adder 18 and input to the LPF 19 as an L + R signal. The gains of the level adjuster 16 and the level adjuster 17 are each set to 0.5 so that the level of the in-phase component is the same before and after the addition.
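The reason for the 0.5 gains can be checked with a short worked example: when L and R carry the same (in-phase) signal, 0.5·L + 0.5·R reproduces that signal at its original level. The function name and sample values below are illustrative only.

```python
def downmix(left, right, gain_l=0.5, gain_r=0.5):
    """Weighted sum of two channels, sample by sample."""
    return [gain_l * l + gain_r * r for l, r in zip(left, right)]


center = [0.2, -0.4, 0.8]           # content identical in both channels
mono_in_phase = downmix(center, center)

inverted = [-s for s in center]     # same content with opposite polarity
mono_anti_phase = downmix(center, inverted)
```

The in-phase case returns the original samples unchanged, while fully anti-phase content cancels to zero in this path.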

The L + R signal output from the LPF 19 is level-adjusted by the level adjuster 20 and output to the adder 21. The gain of the level adjuster 20 is an arbitrary value of 0 or more. This gain is set to adjust the level relative to the L signal and the R signal that are the output signals of the FIR filter 13 and the FIR filter 15. Increasing the gain of the level adjuster 20 emphasizes the L + R signal and therefore the in-phase component. For example, if the input audio signal (source) contains human speech, the gain of the level adjuster 20 may be raised to emphasize the speech; conversely, if the source is BGM, the gain of the level adjuster 20 may be lowered to emphasize the L signal and the R signal (the output signals of the FIR filters) and thus the sense of localization. Alternatively, the setting may follow the listener's preference.

The adder 21 combines the L signal output from the FIR filter 13, the R signal output from the FIR filter 15, and the L + R signal output from the level adjuster 20, and outputs the result to the output I/F 22. The output I/F 22 outputs the combined signal to the speaker 2 (although not shown, the combined signal is D/A converted and amplified by a power amplifier before being output). As a result, the speaker 2 outputs the L signal to which the differential characteristic between the front elevation angles of 0 degrees and +30 degrees has been imparted, the R signal to which the differential characteristic between the front elevation angles of 0 degrees and -30 degrees has been imparted, and the gain-adjusted L + R signal.
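The signal flow of FIG. 2 described above (HPF then FIR per channel, plus a low-passed and gain-adjusted L + R path, all summed into one output) can be sketched as follows. The one-pole filters, the 48 kHz sample rate, and all function names are assumptions made for illustration; the text does not specify filter orders or a sample rate.

```python
import math


def one_pole_lowpass(x, cutoff_hz, fs_hz):
    """Simple one-pole low-pass (stand-in for LPF 19; filter type is assumed)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs_hz)
    y, state = [], 0.0
    for s in x:
        state = (1.0 - a) * s + a * state
        y.append(state)
    return y


def one_pole_highpass(x, cutoff_hz, fs_hz):
    """Complementary high-pass (stand-in for HPF 12 / HPF 14)."""
    low = one_pole_lowpass(x, cutoff_hz, fs_hz)
    return [s - l for s, l in zip(x, low)]


def fir(x, taps):
    """Direct-form FIR convolution, truncated to the input length."""
    return [sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(x))]


def process(left, right, taps_l, taps_r,
            cutoff_hz=1500.0, fs_hz=48000.0, mono_gain=1.0):
    """Mono output: FIR(HPF(L)) + FIR(HPF(R)) + mono_gain * LPF(0.5*L + 0.5*R)."""
    l_path = fir(one_pole_highpass(left, cutoff_hz, fs_hz), taps_l)   # FIR filter 13
    r_path = fir(one_pole_highpass(right, cutoff_hz, fs_hz), taps_r)  # FIR filter 15
    mono = [0.5 * l + 0.5 * r for l, r in zip(left, right)]           # adders 16 to 18
    low = one_pole_lowpass(mono, cutoff_hz, fs_hz)                    # LPF 19
    return [a + b + mono_gain * c for a, b, c in zip(l_path, r_path, low)]


# Usage: identity taps stand in for the real differential-characteristic filters.
left = [1.0, 0.5, -0.25, 0.0, 0.75]
right = [0.0] * len(left)
out = process(left, right, taps_l=[1.0], taps_r=[1.0], mono_gain=2.0)
```

With identity taps, a silent R channel, and a low-band gain of 2.0 (which offsets the 0.5 downmix gain), the high and low bands recombine to the original L signal, which is a convenient sanity check for the band split.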

Level adjusters may also be provided downstream of the FIR filter 13 and the FIR filter 15 so that their gains can be adjusted individually.

The HPF 12 and the HPF 14 have the same characteristics and are set to the same cutoff frequency. The cutoff frequency of the LPF 19 is set to correspond to that of the HPF 12 and the HPF 14. In this embodiment, the cutoff frequencies of the HPF 12, the HPF 14, and the LPF 19 are all set to 1.5 kHz. The FIR filter 13 and the FIR filter 15 impart the differential characteristics only to the frequency band that affects localization, while the frequency band that does not affect localization is separated, gain-adjusted without any differential characteristic being imparted, and then combined, which mitigates the effect on sound quality. In particular, as described above, the gain of the separated L + R signal can be controlled independently of the L signal and the R signal to which the differential characteristics are imparted, so the gain can be adjusted to suit the source or the listener's preference. However, the bands may be made to overlap, for example by lowering the cutoff frequency of the HPF 12 and the HPF 14 and raising that of the LPF 19; conversely, the bands may be separated by a gap, for example by raising the cutoff frequency of the HPF 12 and the HPF 14 and lowering that of the LPF 19.
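The three cutoff arrangements mentioned above (matched, overlapping, and separated bands) can be captured in a small configuration sketch. The class and field names are hypothetical; only the 1.5 kHz default comes from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crossover:
    """Cutoff settings for the high-pass (HPF 12/14) and low-pass (LPF 19) paths."""
    hpf_cutoff_hz: float = 1500.0
    lpf_cutoff_hz: float = 1500.0

    def relation(self) -> str:
        """Describe how the two bands relate to each other."""
        if self.hpf_cutoff_hz < self.lpf_cutoff_hz:
            return "overlapping"   # both paths carry the band between the cutoffs
        if self.hpf_cutoff_hz > self.lpf_cutoff_hz:
            return "separated"     # neither path passes the band between the cutoffs
        return "matched"


default_relation = Crossover().relation()
overlap_relation = Crossover(hpf_cutoff_hz=1000.0, lpf_cutoff_hz=2000.0).relation()
gap_relation = Crossover(hpf_cutoff_hz=2000.0, lpf_cutoff_hz=1000.0).relation()
```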

The frequency band in which localization is not affected by differences in frequency-level characteristics is roughly the band at or below 700 Hz to 1.5 kHz, and in practice depends on the characteristics of the speaker. For example, with a highly directional speaker, the speaker itself tends to be localized as a sound source, so the separation frequency is lowered and the localization of the sound image is left to the effect of the differential characteristics. Conversely, with an omnidirectional speaker, the localization of the speaker itself is weaker, so the sense of localization of each channel's sound image can be maintained even if the separation frequency is raised.

Next, the frequency characteristics realized by the FIR filter 13 and the FIR filter 15 are described with reference to FIG. 3. FIG. 3(A) shows the HRTF (frequency characteristic) for a front elevation angle of 0 degrees, the HRTF for a front elevation angle of +30 degrees, and their differential characteristic. FIG. 3(B) shows the HRTF for a front elevation angle of 0 degrees, the HRTF for a front elevation angle of -30 degrees, and their differential characteristic.

As shown in these figures, the HRTF for a front elevation angle of 0 degrees has a large dip of about -20 to -25 dB near 10 kHz, as well as gain rises and drops at other frequencies. The HRTF for a front elevation angle of +30 degrees has large dips of about -5 to -10 dB near 7 kHz and about -10 dB at a little over 10 kHz, as well as gain rises and drops at other frequencies. Similarly, the HRTF for a front elevation angle of -30 degrees has large dips of about -15 to -20 dB near 8 to 9 kHz and about -15 to -20 dB just under 20 kHz, as well as gain rises and drops at other frequencies.

These complex characteristics include, for example, effects specific to the dummy head used when the HRTFs were measured. In other words, an HRTF that localizes a sound image at a given elevation angle has various features besides the peaks and dips that affect elevation localization. In the signal processing device 1 of this embodiment, the differential characteristic between the HRTF for a front elevation angle of 0 degrees and the HRTF for a front elevation angle of +30 degrees, and the differential characteristic between the HRTF for a front elevation angle of 0 degrees and the HRTF for a front elevation angle of -30 degrees, are imparted to the L signal and the R signal, respectively. This extracts only the components that affect elevation localization and gives a sense of localization more effectively than simply applying an HRTF for the same elevation angle. Therefore, even with a monaural speaker, clear localization can be given to the L channel and the R channel, providing a broad surround feeling.

The differential characteristic imparted to the L signal has a peak of about 15 to 20 dB near 10 kHz. The differential characteristic imparted to the R signal has a peak of about 15 to 20 dB near 10 kHz and a dip of about -15 dB at 8 to 9 kHz (and also at about 18 kHz). Left as they are, these peaks and dips have a large effect on sound quality, so it is desirable to relax them (for example, to a gain of about ±6 dB) to an extent that does not affect sound source localization.

In this embodiment, a sense of separation is given by localizing the L signal in the positive elevation direction and the R signal in the negative elevation direction, but the R signal may of course be localized in the positive elevation direction and the L signal in the negative elevation direction instead.

If the elevation angle difference is less than 15 degrees, perception is more easily influenced by the visible position of the speaker and it becomes difficult to perceive the channels as clearly separated, so the difference is preferably 15 degrees or more. FIG. 4(A) shows the HRTF (frequency characteristic) for a front elevation angle of 0 degrees, the HRTF for a front elevation angle of 15 degrees, and their differential characteristic. FIG. 4(B) shows the HRTF for a front elevation angle of 0 degrees, the HRTF for a front elevation angle of -15 degrees, and their differential characteristic.

As shown in these figures, the HRTF for a front elevation angle of 15 degrees has a large dip of about -15 dB at a little over 10 kHz, as well as gain rises and drops at other frequencies. Similarly, the HRTF for a front elevation angle of -15 degrees has a large dip exceeding -30 dB near 9 kHz, as well as gain rises and drops at other frequencies.

In this case as well, as in the example of FIG. 3, the differential characteristic between the HRTF for a front elevation angle of 0 degrees and the HRTF for a front elevation angle of 15 degrees, and the differential characteristic between the HRTF for a front elevation angle of 0 degrees and the HRTF for a front elevation angle of -15 degrees, are imparted to the L signal and the R signal, respectively. This extracts only the components that affect elevation localization and gives localization more effectively than simply applying an HRTF for the same elevation angle. Therefore, even when differential characteristics with an elevation angle difference of about 15 degrees are imparted, clear localization can be given to the L channel and the R channel, providing a broad surround feeling.

In the example of FIG. 4 as well, the differential characteristic imparted to the L signal has a peak of about 10 to 15 dB near 10 kHz, and the differential characteristic imparted to the R signal has a peak of about 15 dB near 10 kHz and a dip exceeding -20 dB at 9 kHz. Left as they are, these peaks and dips also have a large effect on sound quality, so it is desirable to relax them (for example, to a gain of about ±6 dB) to an extent that does not affect sound source localization.

This embodiment shows an example of two-channel reproduction of the L channel and the R channel using a monaural speaker, but reproduction of a larger number of channels is also possible. For example, when the audio signals of a rear L channel and a rear R channel are output from a single speaker installed behind the listener, imparting the differential characteristics described in this embodiment gives clear localization to the rear L channel and the rear R channel and provides a broad surround feeling.

Description of symbols
1 ... Signal processing device
2 ... Speaker
11 ... Input I/F
12 ... HPF
13 ... FIR filter
14 ... HPF
15 ... FIR filter
16 ... Level adjuster
17 ... Level adjuster
18 ... Adder
19 ... LPF
20 ... Level adjuster
21 ... Adder
22 ... Output I/F

Claims

A signal processing apparatus comprising:
an input unit that inputs audio signals of a plurality of channels;
a characteristic imparting unit that imparts, to the audio signals of the plurality of channels, differential characteristics of a plurality of head-related transfer functions that each localize a sound image in a direction having an elevation angle difference equal to or greater than a predetermined angle; and
an output unit that combines and outputs the audio signals of the plurality of channels to which the characteristic imparting unit has imparted the differential characteristics,
wherein the characteristic imparting unit imparts a different differential characteristic to each channel.

The signal processing apparatus according to claim 1, wherein the predetermined angle is 15 degrees.

The signal processing apparatus according to claim 1, further comprising a separation unit that adds the audio signals of the plurality of channels input by the input unit and separates, from the summed audio signal, an audio signal at or below a predetermined band,
wherein the output unit combines and outputs the audio signals of the plurality of channels to which the characteristic imparting unit has imparted the differential characteristics and the separated audio signal.

The signal processing apparatus according to claim 3, wherein the predetermined band is a frequency band that does not affect localization.

The signal processing apparatus according to claim 1, wherein the characteristic imparting unit relaxes a peak or a dip in the difference characteristic.