JP2019169835A

JP2019169835A - Out-of-head localization processing apparatus, out-of-head localization processing method, and program

Info

Publication number: JP2019169835A
Application number: JP2018055768A
Authority: JP
Inventors: Takahiro Shimojo; Toshiko Murata; Yumi Fujii; Masaya Konishi; Kuniaki Kochi; Toshiaki Nagai
Original assignee: JVCKenwood Corp
Current assignee: JVCKenwood Corp
Priority date: 2018-03-23
Filing date: 2018-03-23
Publication date: 2019-10-03
Anticipated expiration: 2038-03-23
Also published as: JP6981330B2

Abstract

To provide an out-of-head localization processing apparatus, method, and program capable of appropriately performing out-of-head localization processing. SOLUTION: An out-of-head localization processing apparatus 100 includes: a left unit 43L having a built-in microphone 48; a reference signal acquisition unit 122 that acquires a reference signal based on a sound collection signal collected by the built-in microphone 48; an ECTF acquisition unit 132 that acquires a transfer characteristic based on a sound collection signal collected by a left microphone 2L in a measurement state; a first storage unit 141 that stores a first reference signal acquired by the reference signal acquisition unit 122 in the measurement state; a second storage unit 142 that stores a second reference signal acquired by the reference signal acquisition unit 122 in a listening state; a conversion function calculation unit 151 that calculates a conversion function of frequency characteristics based on the first and second reference signals; and a correction unit 152 that corrects the frequency characteristic of the ECTF using the conversion function. SELECTED DRAWING: Figure 3

Description

The present invention relates to an out-of-head localization processing apparatus, an out-of-head localization processing method, and a program.

One sound image localization technique is out-of-head localization, in which headphones are used to localize a sound image outside the listener's head. Out-of-head localization cancels the characteristic from the headphones to the ear (eardrum), that is, the ear canal transfer characteristic, and applies the four characteristics from the stereo speakers to the ears (the spatial acoustic transfer characteristics), thereby localizing the sound image outside the head.

In out-of-head localization reproduction, a measurement signal (an impulse sound or the like) emitted from the headphones is recorded with a microphone placed in the listener's own ear (Patent Document 1). Based on the sound collection signal obtained by this impulse response measurement, a processing device measures the headphone characteristic and creates an inverse filter for it. The processing device realizes out-of-head localization reproduction by convolving the spatial acoustic transfer characteristics and then further convolving the inverse filter.

Patent Document 1: JP 2014-38608 A

However, the ear canal transfer characteristic (also called the ear canal transfer function) from the speaker unit of the headphones to the eardrum may change depending on how the headphones are worn. For example, the wearing position may shift each time the headphones are put on. The position of the speaker unit then shifts, and the ear canal transfer characteristic changes with it. In other words, the wearing position of the headphones differs between measurement, when a microphone is worn in the ear to measure the ear canal transfer characteristic, and listening, when the microphone has been removed from the ear for out-of-head localization listening. In that case the ear canal transfer characteristic changes, and out-of-head localization processing may no longer be performed appropriately.

The present embodiment has been made in view of the above points, and its object is to provide an out-of-head localization processing apparatus, an out-of-head localization processing method, and a program capable of appropriately performing out-of-head localization processing.

The out-of-head localization processing apparatus according to the present embodiment includes: an audio output unit having a reference microphone; a reference signal acquisition unit that acquires a reference signal based on a sound collection signal collected by the reference microphone; a first storage unit that stores a first reference signal acquired by the reference signal acquisition unit in a measurement state in which a measurement microphone is placed in the ear canal of a user wearing the audio output unit; a second storage unit that stores a second reference signal acquired by the reference signal acquisition unit in a listening state in which the measurement microphone has been removed from the ear canal; a third storage unit that stores a transfer characteristic acquired based on a sound collection signal collected by the measurement microphone in the measurement state; a conversion function calculation unit that calculates a conversion function of frequency characteristics based on the first reference signal and the second reference signal; a correction unit that corrects the frequency characteristic of the transfer characteristic using the conversion function; a filter generation unit that generates a filter based on the corrected frequency characteristic; and an out-of-head localization processing unit that performs out-of-head localization processing using the filter.

The out-of-head localization processing method according to the present embodiment includes the steps of: acquiring a transfer characteristic based on a sound collection signal collected by a measurement microphone in a measurement state in which the measurement microphone is placed in the ear canal of a user wearing an audio output unit having a reference microphone; acquiring, in the measurement state, a first reference signal based on a sound collection signal collected by the reference microphone; acquiring, in a listening state in which the measurement microphone has been removed from the ear canal, a second reference signal based on a sound collection signal collected by the reference microphone; calculating a conversion function of frequency characteristics based on the first reference signal and the second reference signal; correcting the frequency characteristic of the transfer characteristic using the conversion function; generating a filter based on the corrected frequency characteristic; and performing out-of-head localization processing using the filter.

The program according to the present embodiment causes a computer to execute the steps of: acquiring a transfer characteristic based on a sound collection signal collected by a measurement microphone in a measurement state in which the measurement microphone is placed in the ear canal of a user wearing an audio output unit having a reference microphone; acquiring, in the measurement state, a first reference signal based on a sound collection signal collected by the reference microphone; acquiring, in a listening state in which the measurement microphone has been removed from the ear canal, a second reference signal based on a sound collection signal collected by the reference microphone; calculating a conversion function of frequency characteristics based on the first reference signal and the second reference signal; correcting the frequency characteristic of the transfer characteristic using the conversion function; generating a filter based on the corrected frequency characteristic; and performing out-of-head localization processing using the filter.

According to the present embodiment, an out-of-head localization processing apparatus, an out-of-head localization processing method, and a program capable of appropriately performing out-of-head localization processing can be provided.

A block diagram showing the out-of-head localization processing apparatus according to the present embodiment.
A diagram showing the configuration for measuring the ear canal transfer characteristic.
A diagram showing the configuration for correcting the ear canal transfer characteristic.
A flowchart showing the process for correcting the ear canal transfer characteristic.
A flowchart showing the process for calculating the conversion function.
A diagram showing the vectors between the extreme values of a power spectrum.
A diagram showing the vector conversion method.
A diagram showing the logarithmic power spectra of the reference signals and the ear canal transfer characteristic.

An outline of sound image localization processing using filters is described first. The out-of-head localization processing according to the present embodiment uses spatial acoustic transfer characteristics and ear canal transfer characteristics. A spatial acoustic transfer characteristic is the transfer characteristic from a sound source such as a speaker to the ear canal. The ear canal transfer characteristic is the transfer characteristic from the speaker unit of headphones or earphones to the eardrum, and is also called the headphone characteristic. The present embodiment is characterized by the process of measuring the ear canal transfer characteristic while the headphones or earphones are worn and generating a filter from the measurement data.

The out-of-head localization processing according to the present embodiment is executed on a user terminal such as a personal computer, a smartphone, or a tablet PC. The user terminal is an information processing apparatus having processing means such as a processor, storage means such as a memory or a hard disk, display means such as a liquid crystal monitor, and input means such as a touch panel, buttons, a keyboard, or a mouse. The user terminal may have a communication function for transmitting and receiving data. An audio output unit (headphones or earphones) having left and right output units is connected to the user terminal. The configuration described below uses headphones as the audio output unit.

(Out-of-head localization processing apparatus)
FIG. 1 shows an out-of-head localization processing apparatus 100, which is an example of a sound field reproducing apparatus according to the present embodiment. FIG. 1 is a block diagram of the out-of-head localization processing apparatus 100. The apparatus reproduces a sound field for a user U wearing headphones 43. To do so, it performs sound image localization processing on the Lch and Rch stereo input signals XL and XR. The stereo input signals XL and XR are analog audio reproduction signals output from a CD (Compact Disc) player or the like, or digital audio data such as mp3 (MPEG Audio Layer-3). Audio reproduction signals and digital audio data are collectively referred to as reproduction signals. That is, the Lch and Rch stereo input signals XL and XR are the reproduction signals.

The out-of-head localization processing apparatus 100 is not limited to a physically single device, and part of the processing may be performed by a different device. For example, part of the processing may be performed by a personal computer or the like, and the rest by a DSP (Digital Signal Processor) or the like built into the headphones 43.

The out-of-head localization processing apparatus 100 includes an out-of-head localization processing unit 10, a filter unit 41, a filter unit 42, and the headphones 43. The out-of-head localization processing unit 10, the filter unit 41, and the filter unit 42 can be implemented by a processor or the like.

The out-of-head localization processing unit 10 includes convolution operation units 11, 12, 21, and 22 and adders 24 and 25. The convolution operation units 11, 12, 21, and 22 perform convolution processing using the spatial acoustic transfer characteristics. The stereo input signals XL and XR from a CD player or the like are input to the out-of-head localization processing unit 10, in which the spatial acoustic transfer characteristics are set. The out-of-head localization processing unit 10 convolves filters of the spatial acoustic transfer characteristics (hereinafter also called spatial acoustic filters) with the stereo input signals XL and XR of each channel. The spatial acoustic transfer characteristics may be head-related transfer functions (HRTFs) measured on the head and auricles of the person being measured, or may be the head-related transfer functions of a dummy head or a third party.

The four spatial acoustic transfer characteristics Hls, Hlo, Hro, and Hrs taken as one set are called the spatial acoustic transfer function. The data used for convolution in the convolution operation units 11, 12, 21, and 22 are the spatial acoustic filters. The spatial acoustic filters are generated by cutting out the spatial acoustic transfer characteristics Hls, Hlo, Hro, and Hrs with a predetermined filter length.

Each of the spatial acoustic transfer characteristics Hls, Hlo, Hro, and Hrs is acquired in advance by impulse response measurement or the like. For example, the user U wears a microphone in each of the left and right ears. Left and right speakers placed in front of the user U each output an impulse sound for the impulse response measurement. The microphones collect the measurement signal, such as the impulse sound, output from the speakers. The spatial acoustic transfer characteristics Hls, Hlo, Hro, and Hrs are acquired from the sound collection signals of the microphones. Four characteristics are measured: Hls between the left speaker and the left microphone, Hlo between the left speaker and the right microphone, Hro between the right speaker and the left microphone, and Hrs between the right speaker and the right microphone.

The convolution operation unit 11 convolves the spatial acoustic filter corresponding to the spatial acoustic transfer characteristic Hls with the Lch stereo input signal XL and outputs the convolution data to the adder 24. The convolution operation unit 21 convolves the spatial acoustic filter corresponding to the spatial acoustic transfer characteristic Hro with the Rch stereo input signal XR and outputs the convolution data to the adder 24. The adder 24 adds the two sets of convolution data and outputs the sum to the filter unit 41.

The convolution operation unit 12 convolves the spatial acoustic filter corresponding to the spatial acoustic transfer characteristic Hlo with the Lch stereo input signal XL and outputs the convolution data to the adder 25. The convolution operation unit 22 convolves the spatial acoustic filter corresponding to the spatial acoustic transfer characteristic Hrs with the Rch stereo input signal XR and outputs the convolution data to the adder 25. The adder 25 adds the two sets of convolution data and outputs the sum to the filter unit 42.
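The routing through the four convolution operation units and the two adders can be sketched as follows. This is a minimal illustration, assuming the signals and spatial acoustic filters are plain lists of samples; the function names `convolve`, `add`, and `spatial_stage` are hypothetical and do not come from the source.

```python
def convolve(x, h):
    """Direct-form linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def add(a, b):
    """Sample-wise sum; the shorter list is zero-padded."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    return [ai + bi for ai, bi in zip(a, b)]


def spatial_stage(xl, xr, hls, hlo, hro, hrs):
    """Hypothetical model of the out-of-head localization processing unit 10."""
    # Units 11 (XL with Hls) and 21 (XR with Hro) feed adder 24.
    to_filter_41 = add(convolve(xl, hls), convolve(xr, hro))
    # Units 12 (XL with Hlo) and 22 (XR with Hrs) feed adder 25.
    to_filter_42 = add(convolve(xl, hlo), convolve(xr, hrs))
    return to_filter_41, to_filter_42
```

A practical implementation would more likely use block-based FFT convolution for long filters; direct convolution is used here only to keep the data flow visible.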

Inverse filters that cancel the headphone characteristic (the characteristic between the speaker unit of the headphones and the microphone) are set in the filter units 41 and 42. The inverse filters are convolved with the reproduction signals (convolution signals) processed by the out-of-head localization processing unit 10. The filter unit 41 convolves the inverse filter of the Lch-side headphone characteristic with the Lch signal from the adder 24. Similarly, the filter unit 42 convolves the inverse filter of the Rch-side headphone characteristic with the Rch signal from the adder 25. The inverse filters cancel the headphone characteristic from the headphone unit to the microphone when the headphones 43 are worn. The microphone may be placed anywhere between the entrance of the ear canal and the eardrum. As described later, the inverse filters are calculated from measurements of the user U's own characteristics.

The filter unit 41 outputs the processed Lch signal YL to the left unit 43L of the headphones 43. The filter unit 42 outputs the processed Rch signal YR to the right unit 43R of the headphones 43. The user U wears the headphones 43. The headphones 43 output the Lch signal YL and the Rch signal YR (hereinafter also collectively called the stereo signal) toward the user U. A sound image localized outside the head of the user U can thereby be reproduced.

In this way, the out-of-head localization processing apparatus 100 performs out-of-head localization processing using the spatial acoustic filters corresponding to the spatial acoustic transfer characteristics Hls, Hlo, Hro, and Hrs and the inverse filters of the headphone characteristics. In the following description, these spatial acoustic filters and inverse filters are collectively called out-of-head localization filters. For a 2ch stereo reproduction signal, the out-of-head localization filters consist of four spatial acoustic filters and two inverse filters. The out-of-head localization processing apparatus 100 executes out-of-head localization processing by convolving the stereo reproduction signal with these six out-of-head localization filters in total. The out-of-head localization filters are preferably based on measurements of the individual user U. For example, they are set based on sound collection signals collected by microphones worn in the ears of the user U.
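The six-filter signal flow described above can be written compactly. In the following sketch, $*$ denotes convolution and $h_{ls}, h_{lo}, h_{ro}, h_{rs}$ denote the spatial acoustic filters cut out from Hls, Hlo, Hro, and Hrs; the symbols $f_L$ and $f_R$ for the inverse filters of the filter units 41 and 42 are notation introduced here for illustration and do not appear in the source.

```latex
\begin{aligned}
Y_L &= f_L * \left( h_{ls} * X_L + h_{ro} * X_R \right) \\
Y_R &= f_R * \left( h_{lo} * X_L + h_{rs} * X_R \right)
\end{aligned}
```

The inner sums correspond to the outputs of the adders 24 and 25, respectively.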

The spatial acoustic filters and the inverse filters of the headphone characteristics are thus filters for audio signals. The out-of-head localization processing apparatus 100 executes out-of-head localization processing by convolving these filters with the reproduction signals (the stereo input signals XL and XR).

(Measuring apparatus for the ear canal transfer characteristic)
Next, a measuring apparatus 200 that measures the ear canal transfer characteristic in order to generate the inverse filters is described with reference to FIG. 2. FIG. 2 shows the configuration for measuring the ear canal transfer characteristic of a person 1 being measured. The measuring apparatus 200 includes a microphone unit 2, the headphones 43, and a processing device 201. Here, the person 1 being measured is the same person as the user U in FIG. 1.

The microphone unit 2 and the headphones 43 are connected to the processing device 201. The microphone unit 2 may be detachably attached to the headphones 43. The microphone unit 2 includes a left microphone 2L and a right microphone 2R. The left microphone 2L is worn in the left ear 9L of the person 1 being measured, and the right microphone 2R is worn in the right ear 9R. The processing device 201 may be the same device as the out-of-head localization processing apparatus 100 or a different one. The following description assumes that the out-of-head localization processing apparatus 100 and the processing device 201 are the same device.

The headphones 43 have a headphone band 43B, a left unit 43L, and a right unit 43R. The left unit 43L and the right unit 43R are output units that output sound to the left and right ears 9L and 9R, respectively. The headphone band 43B connects the left unit 43L and the right unit 43R. The left unit 43L outputs sound toward the left ear 9L of the person 1 being measured, and the right unit 43R outputs sound toward the right ear 9R. The headphones 43 may be of any type, such as closed, open, semi-open, or semi-closed. The person 1 being measured puts on the headphones 43 while wearing the microphone unit 2. That is, the left unit 43L and the right unit 43R of the headphones 43 are placed over the left ear 9L and the right ear 9R in which the left microphone 2L and the right microphone 2R are worn. The headphone band 43B generates a biasing force that presses the left unit 43L and the right unit 43R against the left ear 9L and the right ear 9R, respectively.

The left microphone 2L collects the sound output from the left unit 43L of the headphones 43. The right microphone 2R collects the sound output from the right unit 43R of the headphones 43. The microphone portions of the left microphone 2L and the right microphone 2R are placed at sound collection positions near the external ear openings. The left microphone 2L and the right microphone 2R are configured so as not to interfere with the headphones 43. That is, the person 1 being measured can put on the headphones 43 while the left microphone 2L and the right microphone 2R are placed at appropriate positions in the left ear 9L and the right ear 9R.

The processing device 201 outputs a measurement signal to the headphones 43, causing the headphones 43 to emit an impulse sound or the like. Specifically, the impulse sound output from the left unit 43L is measured by the left microphone 2L, and the impulse sound output from the right unit 43R is measured by the right microphone 2R. The impulse response measurement is carried out by having the microphones 2L and 2R acquire sound collection signals while the measurement signal is output.

The processing device 201 stores the sound collection signals based on the impulse response measurement in a memory or the like. This yields the transfer characteristic between the left unit 43L and the left microphone 2L (the ear canal transfer characteristic of the left ear) and the transfer characteristic between the right unit 43R and the right microphone 2R (the ear canal transfer characteristic of the right ear). The ear canal transfer characteristic of the left ear acquired with the left microphone 2L is called the Lch (left channel) ear canal transfer characteristic, and that of the right ear acquired with the right microphone 2R is called the Rch (right channel) ear canal transfer characteristic. The processing device 201 obtains filter coefficients by cutting out the measurement data of the transfer characteristics with a predetermined filter length. From the filter coefficients, the processing device 201 calculates inverse filters that cancel the ear canal transfer characteristics (headphone characteristics).
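The passage above does not say how the inverse filter is computed from the filter coefficients. As one possible illustration, the sketch below truncates a measured response to a filter length and inverts it in the frequency domain with a small regularization term. The regularized inversion, the parameter `eps`, and all function names are assumptions made for this example and are not taken from the source.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (fine for short illustrative filters)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def inverse_filter(measured, filter_len, n_fft, eps=1e-6):
    """Cut the measured response to filter_len samples and invert it.

    eps is a hypothetical regularization constant that keeps the inverse
    bounded where the measured spectrum is close to zero.
    """
    if n_fft < filter_len:
        raise ValueError("n_fft must be at least filter_len")
    coeffs = list(measured[:filter_len])           # cut out with the filter length
    padded = coeffs + [0.0] * (n_fft - len(coeffs))
    spectrum = dft(padded)
    inverted = [h.conjugate() / (abs(h) ** 2 + eps) for h in spectrum]
    return idft_real(inverted)


def circular_convolve(a, b):
    """Circular convolution of two equal-length lists, used to check the result."""
    n = len(a)
    return [sum(a[j] * b[(t - j) % n] for j in range(n)) for t in range(n)]
```

Convolving the zero-padded coefficients with the returned filter (circularly, over `n_fft` samples) should give approximately a unit impulse, which is the sense in which the inverse filter cancels the measured characteristic.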

The processing device 201 has a memory or the like that stores the measurement data of each transfer characteristic. As the measurement signal for measuring the ear canal transfer characteristic, the processing device 201 generates an impulse signal, a TSP (Time Stretched Pulse) signal, or the like. The measurement signal includes a measurement sound such as an impulse sound.
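The source names the TSP signal but does not define it. A minimal sketch follows, assuming one commonly used construction: a flat-magnitude spectrum with a quadratic phase, made conjugate-symmetric so that the time signal is real. The function name `tsp_signal` and the parameters `n` (signal length) and `m` (integer stretch parameter) are hypothetical.

```python
import cmath


def tsp_signal(n, m):
    """Return a length-n real TSP whose DFT has unit magnitude in every bin.

    n must be even and m an integer, so that the Nyquist bin is real.
    """
    if n % 2 != 0:
        raise ValueError("n must be even")
    spectrum = [0j] * n
    # Quadratic phase on the lower half of the spectrum (bins 0 .. n/2).
    for k in range(n // 2 + 1):
        spectrum[k] = cmath.exp(-4j * cmath.pi * m * k * k / (n * n))
    # The upper half mirrors the lower half as complex conjugates.
    for k in range(n // 2 + 1, n):
        spectrum[k] = spectrum[n - k].conjugate()
    # Naive inverse DFT; the imaginary part is only rounding error.
    signal = []
    for t in range(n):
        acc = sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                  for k in range(n))
        signal.append((acc / n).real)
    return signal
```

Because every frequency bin has the same magnitude, a signal of this kind excites the whole band evenly while spreading its energy over time, which is why swept signals are often preferred to a single impulse for this sort of measurement.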

(Correction of the ear canal transfer characteristic)
The out-of-head localization processing apparatus 100 according to the present embodiment corrects the ear canal transfer characteristic according to how the headphones 43 are worn, and calculates the inverse filter from the corrected ear canal transfer characteristic. The out-of-head localization processing apparatus 100 can thus perform out-of-head localization processing using an inverse filter adapted to the wearing state of the headphones.

図３を用いて、外耳道伝達特性の補正処理について説明する。図３は、外耳道伝達特性を補正するための構成を示す図である。図３では、説明の簡略化のため、左ユニット４３Ｌのみを示しているが、右ユニット４３Ｒについても同様の構成となっている。従って、右ユニット４３Ｒに関する説明を適宜省略する。 The correction process of the ear canal transfer characteristic will be described with reference to FIG. FIG. 3 is a diagram showing a configuration for correcting the ear canal transfer characteristics. In FIG. 3, only the left unit 43L is shown for simplification of explanation, but the right unit 43R has the same configuration. Therefore, the description regarding the right unit 43R is omitted as appropriate.

頭外定位処理装置１００は、測定信号生成部１１１と、Ｄ／Ａコンバータ１１２と、Ａ／Ｄコンバータ１２１と、参照信号取得部１２２と、Ａ／Ｄコンバータ１３１と、ＥＣＴＦ取得部１３２と、メモリ１４０と、変換関数算出部１５１と、補正部１５２と、逆フィルタ生成部１５３と、を備えている。メモリ１４０は、第１の記憶部１４１と、第２の記憶部１４２と、第３の記憶部１４３と、を備えている。 The out-of-head localization processing apparatus 100 includes a measurement signal generation unit 111, a D / A converter 112, an A / D converter 121, a reference signal acquisition unit 122, an A / D converter 131, an ECTF acquisition unit 132, and a memory. 140, a conversion function calculation unit 151, a correction unit 152, and an inverse filter generation unit 153. The memory 140 includes a first storage unit 141, a second storage unit 142, and a third storage unit 143.

左ユニット４３Ｌは、ハウジング４５と、ヘッドホンスピーカ４６と、内蔵マイク４８とを備えている。ハウジング４５は、耳を覆うイヤーカップを備えている。ハウジング４５には、ヘッドホンスピーカ４６と、内蔵マイク４８とが設けられている。ヘッドホンスピーカ４６は、磁器回路や振動板等を有しており、ユーザＵの左耳に対して音を出力する。 The left unit 43L includes a housing 45, a headphone speaker 46, and a built-in microphone 48. The housing 45 includes an ear cup that covers the ear. The headphone speaker 46 and the built-in microphone 48 are provided in the housing 45. The headphone speaker 46 has a magnetic circuit, a diaphragm, and the like, and outputs sound to the left ear of the user U.

内蔵マイク４８は、左ユニット４３Ｌに内蔵されている。内蔵マイク４８は、ハウジング４５で覆われた空間に配置されている。内蔵マイク４８は、ヘッドホンスピーカ４６から出力された音を収音する。内蔵マイク４８で収音した信号を参照用収音信号とする。つまり、内蔵マイク４８は、参照用収音信号を収音するための参照用マイクである。内蔵マイク４８と、ヘッドホンスピーカ４６は、ハウジング４５に固定されている。よって、ヘッドホンスピーカ４６に対する内蔵マイク４８の位置は一定となる。 The built-in microphone 48 is built in the left unit 43L. The built-in microphone 48 is disposed in a space covered with the housing 45. The built-in microphone 48 collects sound output from the headphone speaker 46. The signal collected by the built-in microphone 48 is used as a reference sound collection signal. That is, the built-in microphone 48 is a reference microphone for collecting a reference sound collection signal. The built-in microphone 48 and the headphone speaker 46 are fixed to the housing 45. Therefore, the position of the built-in microphone 48 with respect to the headphone speaker 46 is constant.

なお、ヘッドホン４３はＢｌｕｅｔｏｏｔｈ（登録商標）等の無線通信を用いたワイヤレスタイプであってもよい。さらに、ヘッドホン４３は、一部の処理を実施するＤＳＰを備えていてもよい。ヘッドホン４３のＤＳＰ等が図３に示すブロックの一部、又は全ての処理を行ってもよい。例えば、Ｄ／Ａコンバータ１１２、Ａ／Ｄコンバータ１２１、Ａ／Ｄコンバータ１３１等がヘッドホン４３に内蔵されていてもよい。 The headphones 43 may be a wireless type using wireless communication such as Bluetooth (registered trademark). Furthermore, the headphones 43 may include a DSP that performs a part of the processing. The DSP or the like of the headphone 43 may perform a part or all of the processing shown in FIG. For example, the D / A converter 112, the A / D converter 121, the A / D converter 131, and the like may be built in the headphones 43.

さらに、左耳９Ｌ（図３では不図示）には、外耳道伝達特性を測定するための左マイク２Ｌが配置されている。左マイク２Ｌは、図２に示したように、外耳道伝達特性を測定するための測定用マイクである。左マイク２Ｌは左耳９Ｌに対して脱着可能に設けられている。外耳道伝達特性の測定時には、左耳９Ｌに左マイク２Ｌが装着される。また、ヘッドホン４３により音楽を頭外定位受聴する時（以下、単に受聴時とする）、左マイク２Ｌは、左耳９Ｌから取り外される。なお、左マイク２Ｌは、左ユニット４３Ｌに脱着可能に取り付けられていてもよく、左ユニット４３Ｌとは独立した構成となっていてもよい。例えば、左マイク２Ｌは、左耳９Ｌに直接取り付けられていてもよい。 Further, a left microphone 2L for measuring the external auditory canal transfer characteristic is disposed in the left ear 9L (not shown in FIG. 3). As shown in FIG. 2, the left microphone 2L is a measurement microphone for measuring the ear canal transfer characteristics. The left microphone 2L is detachably attached to the left ear 9L. At the time of measuring the external ear canal transfer characteristic, the left microphone 2L is attached to the left ear 9L. In addition, when listening to music out of the head with headphones 43 (hereinafter simply referred to as listening), the left microphone 2L is removed from the left ear 9L. The left microphone 2L may be detachably attached to the left unit 43L, or may be configured independently of the left unit 43L. For example, the left microphone 2L may be directly attached to the left ear 9L.

外耳道伝達特性の測定時には、左耳９Ｌに左マイク２Ｌを装着した装着状態となり、この状態を測定状態とする。ヘッドホン４３により音楽を頭外定位受聴する受聴時には、左耳９Ｌから左マイク２Ｌを取り外した状態を非装着状態となり、この状態を受聴状態とする。 When the ear canal transfer characteristic is measured, the left microphone 2L is worn in the left ear 9L; this worn state is referred to as the measurement state. When the user listens to music with out-of-head localization through the headphones 43, the left microphone 2L is removed from the left ear 9L; this non-worn state is referred to as the listening state.

測定信号生成部１１１は、測定信号を生成する。測定信号生成部１１１で生成された測定信号は、Ｄ／Ａコンバータ１１２でＤ／Ａ変換されて、ヘッドホンスピーカ４６に出力される。ヘッドホンスピーカ４６が伝達特性を測定するための測定信号を出力する。 The measurement signal generator 111 generates a measurement signal. The measurement signal generated by the measurement signal generator 111 is D / A converted by the D / A converter 112 and output to the headphone speaker 46. The headphone speaker 46 outputs a measurement signal for measuring the transfer characteristic.

測定状態において、マイク２Ｌは、ヘッドホンスピーカ４６からの測定信号を収音する。マイク２Ｌで収音された収音信号（特性用収音信号ともいう）は、Ａ／Ｄコンバータ１３１でＡ／Ｄ変換される。Ａ／Ｄ変換された特性用収音信号は、ＥＣＴＦ取得部１３２に出力される。なお、インパルス応答測定を複数回行って、特性用収音信号を同期加算してもよい。 In the measurement state, the microphone 2L collects the measurement signal from the headphone speaker 46. A sound collection signal (also referred to as a characteristic sound collection signal) collected by the microphone 2L is A / D converted by the A / D converter 131. The A / D converted characteristic sound pickup signal is output to the ECTF acquisition unit 132. Note that the impulse response measurement may be performed a plurality of times, and the characteristic sound pickup signals may be synchronously added.
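The synchronous addition of repeated impulse response measurements mentioned above can be sketched as follows. This is a minimal illustration, not the apparatus's implementation: it assumes the repeated recordings are already time-aligned and of equal length, and it divides the sum by the number of recordings so that the level stays comparable. The function name `synchronous_average` and the sample values are hypothetical.

```python
def synchronous_average(recordings):
    """Add time-aligned recordings sample by sample and divide by their count."""
    if not recordings:
        raise ValueError("at least one recording is required")
    length = len(recordings[0])
    if any(len(r) != length for r in recordings):
        raise ValueError("all recordings must have the same length")
    count = len(recordings)
    # zip(*recordings) groups the samples that share the same time index.
    return [sum(samples) / count for samples in zip(*recordings)]


# Three hypothetical takes of the same response with small deviations.
takes = [
    [0.0, 1.0, 0.5, 0.0],
    [0.0, 1.2, 0.4, 0.0],
    [0.0, 0.8, 0.6, 0.0],
]
averaged = synchronous_average(takes)
```

Uncorrelated noise tends to shrink relative to the repeatable response as more takes are added, which is the usual motivation for this step.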

ＥＣＴＦ取得部１３２は、特性用収音信号に基づいて、外耳道伝達特性（ＥＣＴＦ）を取得する。例えば、ＥＣＴＦ取得部１３２は、ＦＦＴ（高速フーリエ変換）により、時間領域の特性用収音信号から周波数領域の外耳道伝達特性を算出する。これにより、外耳道伝達特性のパワー特性（パワースペクトル）と、位相特性（位相スペクトル）が生成される。なお、パワースペクトルの代わりに振幅スペクトルを生成してもよい。なお、ＥＣＴＦ取得部１３２は、離散フーリエ変換や離散コサイン変換等により、特性用収音信号を周波数領域のデータ（周波数特性）に変換することができる。 The ECTF acquisition unit 132 acquires the ear canal transfer characteristic (ECTF) based on the characteristic sound pickup signal. For example, the ECTF acquisition unit 132 calculates the external auditory canal transfer characteristic in the frequency domain from the collected sound signal for the characteristic in the time domain by FFT (Fast Fourier Transform). Thereby, the power characteristic (power spectrum) of the ear canal transmission characteristic and the phase characteristic (phase spectrum) are generated. An amplitude spectrum may be generated instead of the power spectrum. The ECTF acquisition unit 132 can convert the characteristic sound pickup signal into frequency domain data (frequency characteristics) by discrete Fourier transform, discrete cosine transform, or the like.
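As a rough illustration of how a time-domain sound pickup signal can be converted into a power spectrum and a phase spectrum, the sketch below uses a direct discrete Fourier transform written with the Python standard library. It is an assumption-laden example: a practical implementation would normally use an FFT routine, and the helper names (`dft`, `power_and_phase`, `to_log_power`) and the `floor` parameter are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Direct O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def power_and_phase(signal):
    """Return the power spectrum |X|^2 and the phase spectrum arg(X)."""
    spectrum = dft(signal)
    power = [abs(c) ** 2 for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    return power, phase


def to_log_power(power, floor=1e-12):
    """Convert power values to decibels; `floor` avoids log10(0)."""
    return [10.0 * math.log10(max(p, floor)) for p in power]


# A unit impulse has a flat spectrum: power 1 (0 dB) and zero phase in every bin.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
power, phase = power_and_phase(impulse)
log_power = to_log_power(power)
```

The same conversion applies to the reference sound pickup signal collected by the built-in microphone 48.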

測定状態、及び、受聴状態の両方で、内蔵マイク４８は、ヘッドホンスピーカ４６からの測定信号を収音する。内蔵マイク４８で収音された収音信号（参照用収音信号ともいう）は、Ａ／Ｄコンバータ１２１でＡ／Ｄ変換される。Ａ／Ｄ変換された参照用収音信号は、参照信号取得部１２２に出力される。具体的には、受聴者Ｕが左マイク２Ｌを装着した測定状態と、装着していない受聴状態との両方で、インパルス応答測定が実施される。なお、インパルス応答測定を複数回行って、参照用収音信号を同期加算してもよい。 The built-in microphone 48 picks up the measurement signal from the headphone speaker 46 in both the measurement state and the listening state. A sound collection signal (also referred to as a reference sound collection signal) collected by the built-in microphone 48 is A / D converted by the A / D converter 121. The A / D converted reference sound pickup signal is output to the reference signal acquisition unit 122. Specifically, impulse response measurement is performed both in the measurement state in which the listener U wears the left microphone 2L and in the listening state in which the listener U is not wearing. Note that the impulse response measurement may be performed a plurality of times, and the reference collected sound signal may be synchronously added.

参照信号取得部１２２は、参照用収音信号に基づいて、参照信号を取得する。例えば、参照信号取得部１２２は、ＦＦＴ（高速フーリエ変換）により、時間領域の参照用収音信号から周波数領域の参照信号を算出する。これにより、参照用収音信号のパワー特性（パワースペクトル）と、位相特性（位相スペクトル）が生成される。なお、パワースペクトルの代わりに振幅スペクトルを生成してもよい。なお、参照信号取得部１２２は、離散フーリエ変換や離散コサイン変換等により、参照用収音信号を周波数領域のデータ（周波数特性）に変換することができる。 The reference signal acquisition unit 122 acquires a reference signal based on the reference sound collection signal. For example, the reference signal acquisition unit 122 calculates a reference signal in the frequency domain from the reference sound pickup signal in the time domain by FFT (Fast Fourier Transform). As a result, a power characteristic (power spectrum) and a phase characteristic (phase spectrum) of the reference collected sound signal are generated. An amplitude spectrum may be generated instead of the power spectrum. Note that the reference signal acquisition unit 122 can convert the collected sound signal for reference into frequency domain data (frequency characteristics) by discrete Fourier transform, discrete cosine transform, or the like.

ここで、測定状態において、内蔵マイク４８で収音された参照用収音信号を第１の参照用収音信号とする。第１の参照用収音信号に基づいて取得された参照信号を第１の参照信号とする。第１の参照用収音信号は、特性用収音信号とは、実質的に同時に収音される。つまり、同じインパルス応答測定での測定された第１の参照用収音信号は、特性用収音信号に基づいて、第１の参照信号と外耳道伝達特性とが取得される。 Here, the reference sound pickup signal collected by the built-in microphone 48 in the measurement state is referred to as the first reference sound pickup signal. The reference signal acquired based on the first reference sound pickup signal is referred to as the first reference signal. The first reference sound pickup signal is collected substantially simultaneously with the characteristic sound pickup signal. That is, the first reference signal and the ear canal transfer characteristic are acquired from the first reference sound pickup signal and the characteristic sound pickup signal measured in the same impulse response measurement.

受聴状態において、内蔵マイク４８で収音された参照用収音信号を第２の参照用収音信号とする。第２の参照用収音信号に基づいて取得された参照信号を第２の参照信号とする。第１の参照信号と第２の参照信号は、ヘッドホン４３の装着状態に応じて変化する信号である。換言すると、第１の参照信号と第２の参照信号との差が、ヘッドホン４３の装着状態の違いに相当する。 In the listening state, the reference sound collection signal collected by the built-in microphone 48 is set as a second reference sound collection signal. The reference signal acquired based on the second reference sound collection signal is set as the second reference signal. The first reference signal and the second reference signal are signals that change according to the wearing state of the headphones 43. In other words, the difference between the first reference signal and the second reference signal corresponds to the difference in the wearing state of the headphones 43.

メモリ１４０は、第１の記憶部１４１と、第２の記憶部１４２と、第３の記憶部１４３とを備えている。第１の記憶部１４１は、第１の参照信号を記憶する。第２の記憶部１４２は、第２の参照信号を記憶する。第３の記憶部１４３は、外耳道伝達特性を記憶する。第１の記憶部１４１、第２の記憶部１４２、及び第３の記憶部１４３は、それぞれメモリ１４０において静的に確保しておいてもよく、任意の領域を動的に確保してもよい。 The memory 140 includes a first storage unit 141, a second storage unit 142, and a third storage unit 143. The first storage unit 141 stores the first reference signal. The second storage unit 142 stores the second reference signal. The third storage unit 143 stores the ear canal transfer characteristic. The first storage unit 141, the second storage unit 142, and the third storage unit 143 may each be statically allocated in the memory 140, or arbitrary areas may be dynamically allocated for them.

具体的には、第３の記憶部１４３は、外耳道伝達特性のパワースペクトル、及び位相スペクトルを記憶する。第１の記憶部１４１は、第１の参照信号のパワースペクトルを記憶する。第２の記憶部１４２は、第２の参照信号のパワースペクトルを記憶する。第１の記憶部１４１、及び第２の記憶部１４２は、それぞれ参照信号の位相スペクトルを記憶していてもよく、記憶していなくてもよい。 Specifically, the third storage unit 143 stores the power spectrum and phase spectrum of the ear canal transfer characteristic. The first storage unit 141 stores the power spectrum of the first reference signal. The second storage unit 142 stores the power spectrum of the second reference signal. Each of the first storage unit 141 and the second storage unit 142 may or may not store the phase spectrum of the reference signal.

第１の記憶部１４１と、第２の記憶部１４２と、第３の記憶部１４３とは、物理的に単一なメモリ装置であってもよく、異なる装置であってもよい。例えば、１つのメモリ１４０が、外耳道伝達特性、第１の参照信号、及び第２の参照信号を記憶していてもよい。あるは、２つ以上のメモリに分けて、外耳道伝達特性、第１の参照信号、及び第２の参照信号が記憶されていてもよい。 The first storage unit 141, the second storage unit 142, and the third storage unit 143 may be physically single memory devices or different devices. For example, one memory 140 may store the ear canal transfer characteristic, the first reference signal, and the second reference signal. Alternatively, the ear canal transfer characteristic, the first reference signal, and the second reference signal may be stored separately in two or more memories.

変換関数算出部１５１は、第１の記憶部１４１、及び第２の記憶部１４２から第１の参照信号と第２の参照信号を読み出す。そして、変換関数算出部１５１は、第１の参照信号と第２の参照信号とに基づいて、周波数特性の変換関数を算出する。変換関数算出部１５１は、２つのパワースペクトルに基づいて、変換関数を算出している。つまり、変換関数算出部１５１は、２つの参照信号のパワースペクトルを比較することで、変換関数を算出している。 The conversion function calculation unit 151 reads the first reference signal and the second reference signal from the first storage unit 141 and the second storage unit 142. Then, the conversion function calculation unit 151 calculates a frequency characteristic conversion function based on the first reference signal and the second reference signal. The conversion function calculation unit 151 calculates a conversion function based on the two power spectra. That is, the conversion function calculation unit 151 calculates the conversion function by comparing the power spectra of the two reference signals.

補正部１５２は、第３の記憶部１４３から外耳道伝達特性を読み出す。そして、補正部１５２は、変換関数を用いて、外耳道伝達特性を補正する。補正部１５２は、外耳道伝達特性のパワースペクトルを補正している。ここでは、補正部１５２は、パワースペクトルのみを補正しており、位相スペクトルについては補正していないが、位相スペクトルについて補正してもよい。 The correction unit 152 reads the ear canal transfer characteristic from the third storage unit 143. Then, the correction unit 152 corrects the ear canal transfer characteristic using the conversion function. The correction unit 152 corrects the power spectrum of the ear canal transfer characteristic. Here, the correction unit 152 corrects only the power spectrum and does not correct the phase spectrum, but may correct the phase spectrum.

逆フィルタ生成部１５３は、補正された外耳道伝達特性（パワースペクトル）を用いて、逆フィルタを算出する。具体的には、逆フィルタ生成部１５３は、逆離散フーリエ変換により、補正後のパワースペクトル（振幅特性）と位相スペクトル（位相特性）とを用いて時間信号を算出する。逆フィルタ生成部１５３は、時間信号に基づいて、所定のフィルタ長の逆フィルタを算出する。 The inverse filter generation unit 153 calculates an inverse filter using the corrected ear canal transfer characteristic (power spectrum). Specifically, the inverse filter generation unit 153 calculates a time signal using the corrected power spectrum (amplitude characteristic) and phase spectrum (phase characteristic) by inverse discrete Fourier transform. The inverse filter generation unit 153 calculates an inverse filter having a predetermined filter length based on the time signal.

以下、逆フィルタを生成するための処理に基づいて、図４を用いて説明する。図４は、逆フィルタを生成するための処理を示すフローチャートである。 Hereinafter, the processing for generating the inverse filter will be described with reference to FIG. 4. FIG. 4 is a flowchart showing the processing for generating the inverse filter.

まず、測定状態での測定により、ＥＣＴＦ取得部１３２、及び参照信号取得部１２２が、外耳道伝達特性ＥＣＴＦ、及び第１の参照信号Ｒｅｆ１を取得する（Ｓ１１）。つまり、ユーザＵがマイク２Ｌ、及びヘッドホン４３を装着した装着状態で、頭外定位処理装置１００がインパルス応答測定を実施する。これにより、マイク２Ｌ、及び内蔵マイク４８がそれぞれ収音信号を収音する。そして、特性用収音信号に基づいて、ＥＣＴＦ取得部１３２が外耳道伝達特性を算出する。参照用収音信号に基づいて、参照信号取得部１２２が第１の参照信号Ｒｅｆ１を算出する。 First, by measurement in the measurement state, the ECTF acquisition unit 132 and the reference signal acquisition unit 122 acquire the ear canal transfer characteristic ECTF and the first reference signal Ref1 (S11). That is, with the user U wearing the microphone 2L and the headphones 43, the out-of-head localization processing apparatus 100 performs impulse response measurement. As a result, the microphone 2L and the built-in microphone 48 each pick up a sound pickup signal. Then, based on the characteristic sound pickup signal, the ECTF acquisition unit 132 calculates the ear canal transfer characteristic. Based on the reference sound pickup signal, the reference signal acquisition unit 122 calculates the first reference signal Ref1.

次に、受聴状態での測定により、参照信号取得部１２２が第２の参照信号Ｒｅｆ２を取得する（Ｓ１２）。つまり、ユーザＵがマイク２Ｌを取り外した状態で、ヘッドホン４３を装着する。ユーザＵがヘッドホン４３のみを装着した状態で、インパルス応答測定が実施される。参照用収音信号に基づいて、参照信号取得部１２２が第２の参照信号Ｒｅｆ２を算出する。 Next, the reference signal acquisition unit 122 acquires the second reference signal Ref2 by measurement in the listening state (S12). That is, the user U wears the headphones 43 with the microphone 2L removed. Impulse response measurement is performed with the user U wearing only the headphones 43. Based on the reference sound pickup signal, the reference signal acquisition unit 122 calculates the second reference signal Ref2.

次に、変換関数算出部１５１が、第１の参照信号Ｒｅｆ１と第２の参照信号Ｒｅｆ２から、変換関数を算出する（Ｓ１３）。補正部１５２が、外耳道伝達特性ＥＣＴＦに対して、変換関数を適用して、補正後の外耳道伝達特性（以下、補正特性ＡｄＥＣＴＦとする）を算出する（Ｓ１４）。ここでは、対数パワースペクトルに対して、変換関数が適用されている。つまり、補正部１５２が、対数パワースペクトルのみを補正している。逆フィルタ生成部１５３が補正特性ＡｄＥＣＴＦから逆フィルタを算出する（Ｓ１５）。 Next, the conversion function calculation unit 151 calculates a conversion function from the first reference signal Ref1 and the second reference signal Ref2 (S13). The correction unit 152 applies a conversion function to the ear canal transfer characteristic ECTF to calculate a corrected ear canal transfer characteristic (hereinafter, referred to as a correction characteristic AdECTF) (S14). Here, a conversion function is applied to the logarithmic power spectrum. That is, the correction unit 152 corrects only the logarithmic power spectrum. The inverse filter generation unit 153 calculates an inverse filter from the correction characteristic AdECTF (S15).

次に、Ｓ１３〜Ｓ１５の処理について、図５を用いて詳細に説明する。図５は、Ｓ１３〜Ｓ１５の処理の１例を示すフローチャートである。つまり、図５は、変換関数算出部１５１、補正部１５２、及び逆フィルタ生成部１５３における処理を示すフローチャートである。 Next, the processing of S13 to S15 will be described in detail with reference to FIG. 5. FIG. 5 is a flowchart illustrating an example of the processing of S13 to S15. That is, FIG. 5 is a flowchart illustrating the processing in the conversion function calculation unit 151, the correction unit 152, and the inverse filter generation unit 153.

まず、変換関数算出部１５１が外耳道伝達特性ＥＣＴＦ、第１の参照信号Ｒｅｆ１、第２の参照信号Ｒｅｆ２の対数パワースペクトルを計算し、正規化する（Ｓ２１）。変換関数算出部１５１、補正部１５２、逆フィルタ生成部１５３は、正規化後のデータについて、以下の処理を実施する。 First, the conversion function calculation unit 151 calculates and normalizes the logarithmic power spectrum of the ear canal transfer characteristic ECTF, the first reference signal Ref1, and the second reference signal Ref2 (S21). The conversion function calculation unit 151, the correction unit 152, and the inverse filter generation unit 153 perform the following processing on the normalized data.

変換関数算出部１５１が第１の参照信号Ｒｅｆ１と第２の参照信号の対数パワースペクトルＰＳＤ１、ＰＳＤ２において、極値ＥＸ１、ＥＸ２を求める（Ｓ２２）。変換関数算出部１５１は、第１の参照信号Ｒｅｆ１の対数パワースペクトルＰＳＤ１の極値ＥＸ１と、第２の参照信号Ｒｅｆ２の対数パワースペクトルＰＳＤ２の極値ＥＸ２を求める。図６に極値ＥＸ１，ＥＸ２を示す。図６は、対数パワースペクトルＰＳＤ１、ＰＳＤ２の一部を示す図であり、横軸が周波数（Ｈｚ）、縦軸がパワー（ｄＢ）となっている。 The conversion function calculation unit 151 obtains extreme values EX1 and EX2 in the logarithmic power spectra PSD1 and PSD2 of the first reference signal Ref1 and the second reference signal (S22). The conversion function calculation unit 151 obtains the extreme value EX1 of the logarithmic power spectrum PSD1 of the first reference signal Ref1 and the extreme value EX2 of the logarithmic power spectrum PSD2 of the second reference signal Ref2. FIG. 6 shows extreme values EX1 and EX2. FIG. 6 is a diagram illustrating a part of the logarithmic power spectrum PSD1, PSD2, where the horizontal axis represents frequency (Hz) and the vertical axis represents power (dB).

対数パワースペクトルＰＳＤ１、ＰＳＤ２は通常、複数の極値を有している。ここで、対数パワースペクトルＰＳＤ１の複数の極値ＥＸ１を低周波数側から順にＥＸ１−１、ＥＸ１−２、ＥＸ１−３、・・・・ＥＸ１−Ｎとする。同様に、対数パワースペクトルＰＳＤ２の複数の極値ＥＸ２を低周波数側から順にＥＸ２−１、ＥＸ２−２、ＥＸ２−３、・・・・ＥＸ２−Ｎとする。なお、図６ではＮ＝４となっており、極値ＥＸ１−１〜ＥＸ１−４と極値ＥＸ２−１〜ＥＸ２−４が図示されている。 The logarithmic power spectra PSD1 and PSD2 usually have a plurality of extreme values. Here, a plurality of extreme values EX1 of the logarithmic power spectrum PSD1 are assumed to be EX1-1, EX1-2, EX1-3,... EX1-N in order from the low frequency side. Similarly, a plurality of extreme values EX2 of the logarithmic power spectrum PSD2 are set as EX2-1, EX2-2, EX2-3,... EX2-N in order from the low frequency side. In FIG. 6, N = 4, and extreme values EX1-1 to EX1-4 and extreme values EX2-1 to EX2-4 are illustrated.
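One simple way to locate the extreme values of a logarithmic power spectrum is to compare each bin with its two neighbours. The sketch below is only an illustration under that assumption; the text does not specify how the extreme values are detected, and a practical implementation would likely smooth the spectrum first. The function name `find_extrema` and the sample data are hypothetical.

```python
def find_extrema(freqs, log_power):
    """Return (frequency, level, kind) for each interior local maximum or minimum.

    Results are ordered from the low-frequency side, matching the
    EX1-1, EX1-2, ... numbering used in the description.
    """
    extrema = []
    for i in range(1, len(log_power) - 1):
        prev, cur, nxt = log_power[i - 1], log_power[i], log_power[i + 1]
        if cur > prev and cur > nxt:
            extrema.append((freqs[i], cur, "max"))
        elif cur < prev and cur < nxt:
            extrema.append((freqs[i], cur, "min"))
    return extrema


# Hypothetical logarithmic power spectrum PSD1 sampled at seven frequencies (Hz, dB).
freqs = [100, 200, 300, 400, 500, 600, 700]
psd1 = [-3.0, 0.0, -6.0, -2.0, -9.0, -4.0, -5.0]
ex1 = find_extrema(freqs, psd1)
```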

変換関数算出部１５１は、極値ＥＸ１、ＥＸ２について、対応する極値間の変化ベクトルＶを算出する（Ｓ２３）。つまり、図６に示すように、変換関数算出部１５１は、極値ＥＸ１−１と極値ＥＸ２−１の変化ベクトルを変化ベクトルＶ１として求める。同様に、変換関数算出部１５１は、極値ＥＸ１−２と極値ＥＸ２−２の変化ベクトルを変化ベクトルＶ２として求める。変換関数算出部１５１は、変化ベクトルＶ１〜ＶＮを算出する。図６の例では、Ｎ＝４であるため、４つの変化ベクトルＶ１〜Ｖ４が図示されている。変化ベクトルＶは、周波数と対数パワーとを要素とする２次元ベクトルである。 The conversion function calculation unit 151 calculates a change vector V between corresponding extreme values for the extreme values EX1 and EX2 (S23). That is, as shown in FIG. 6, the conversion function calculation unit 151 obtains a change vector between the extreme value EX1-1 and the extreme value EX2-1 as a change vector V1. Similarly, the conversion function calculation unit 151 obtains a change vector between the extreme value EX1-2 and the extreme value EX2-2 as a change vector V2. The conversion function calculation unit 151 calculates change vectors V1 to VN. In the example of FIG. 6, since N = 4, four change vectors V1 to V4 are illustrated. The change vector V is a two-dimensional vector whose elements are frequency and logarithmic power.

なお、対数パワースペクトルＰＳＤ１の極値ＥＸ１と、対数パワースペクトルＰＳＤ２の極値ＥＸ２の数が異なる場合、最も近い極値間の変化ベクトルＶを求めればよい。例えば、対数パワースペクトルＰＳＤの極値ＥＸ１の数が、対数パワースペクトルＰＳＤ２の極値ＥＸ２の数よりも小さい場合、対数パワースペクトルＰＳＤ１の極大値に最も近い対数パワースペクトルＰＳＤの極大値をペアとして、変化ベクトルが求められる。同様に、対数パワースペクトルＰＳＤ１の極小値に最も近い対数パワースペクトルＰＳＤの極小値をペアとして、変化ベクトルが求められる。 If the number of extreme values EX1 of the logarithmic power spectrum PSD1 differs from the number of extreme values EX2 of the logarithmic power spectrum PSD2, the change vector V may be obtained between the nearest extreme values. For example, when the number of extreme values EX1 of the logarithmic power spectrum PSD1 is smaller than the number of extreme values EX2 of the logarithmic power spectrum PSD2, each local maximum of PSD1 is paired with the nearest local maximum of PSD2, and a change vector is obtained for the pair. Similarly, each local minimum of PSD1 is paired with the nearest local minimum of PSD2, and a change vector is obtained for the pair.
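The pairing of extreme values and the resulting change vectors can be sketched as below. Several points are assumptions made for illustration: "nearest" is measured along the frequency axis only, each change vector points from the extreme value of PSD1 to the paired extreme value of PSD2, and maxima are only ever paired with maxima (minima with minima). The function name `change_vectors` and the sample values are hypothetical.

```python
def change_vectors(ex1, ex2):
    """Pair extrema of the same kind and return (delta_frequency, delta_level) vectors.

    ex1 and ex2 are lists of (frequency, level, kind) tuples with kind "max" or "min".
    The list with fewer extrema drives the pairing; every vector points from EX1 to EX2.
    """
    swap = len(ex1) > len(ex2)
    fewer, more = (ex2, ex1) if swap else (ex1, ex2)
    vectors = []
    for f_a, p_a, kind in fewer:
        candidates = [e for e in more if e[2] == kind]
        if not candidates:
            continue  # no extremum of the same kind to pair with
        f_b, p_b, _ = min(candidates, key=lambda e: abs(e[0] - f_a))
        if swap:
            # (f_a, p_a) belongs to EX2 and (f_b, p_b) to EX1.
            vectors.append((f_a - f_b, p_a - p_b))
        else:
            vectors.append((f_b - f_a, p_b - p_a))
    return vectors


# Hypothetical extrema: PSD1 has two, PSD2 has three.
ex1 = [(200.0, 0.0, "max"), (300.0, -6.0, "min")]
ex2 = [(220.0, 1.0, "max"), (310.0, -8.0, "min"), (450.0, -1.0, "max")]
vectors = change_vectors(ex1, ex2)
```

Each returned tuple is a two-dimensional vector whose elements are a frequency shift (Hz) and a level shift (dB), in line with the definition of the change vector V.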

次に、変換関数算出部１５１は、変化ベクトルＶに基づいて、変換関数を求める（Ｓ２４）。ここでは、変換関数算出部１５１は、ベクトル変換手法を用いて、変換関数を算出している。具体的には、図７に示すように、対数パワースペクトルのグラフ内に格子状の制御点を配置する。そして、複数の変化ベクトルＶ１〜ＶＮに基づいて、制御点のメッシュ構造を変化させる。 Next, the conversion function calculation unit 151 obtains a conversion function based on the change vector V (S24). Here, the conversion function calculation unit 151 calculates a conversion function using a vector conversion method. Specifically, as shown in FIG. 7, grid-like control points are arranged in a logarithmic power spectrum graph. Then, the mesh structure of the control points is changed based on the plurality of change vectors V1 to VN.

例えば、（５，５）にある制御点が、（６，６）に変化した場合、隣接した制御点がその変化に連動して移動するようにする。変換関数算出部１５１は、極値に近い制御点を変化ベクトルＶに基づいて移動させる。変換関数算出部１５１は、複数の変化ベクトルＶ１〜ＶＮに基づいて、対数パワースペクトルのグラフ上における全制御点の移動先を求める。つまり、変換関数算出部１５１は、変換関数（変換用メッシュ）を設定する。本ベクトル変換手法は、例えば、画像の形状変化や３次元データのモーフィング（Ｍｏｒｐｈｉｎｇ）などに用いられるメッシュ構造と同等の手法であるため、詳細な説明は省略する。 For example, when the control point at (5, 5) changes to (6, 6), the adjacent control point moves in conjunction with the change. The conversion function calculation unit 151 moves the control point close to the extreme value based on the change vector V. The conversion function calculation unit 151 obtains the movement destinations of all control points on the logarithmic power spectrum graph based on the plurality of change vectors V1 to VN. That is, the conversion function calculation unit 151 sets a conversion function (conversion mesh). Since this vector conversion method is a method equivalent to a mesh structure used for, for example, image shape change or morphing of three-dimensional data, detailed description thereof is omitted.

補正部１５２は、外耳道伝達特性ＥＣＴＦの対数パワースペクトルＰＳＤに変換関数を適用する（Ｓ２５）。ここでは、説明のため、外耳道伝達特性ＥＣＴＦの対数パワースペクトルをＰＳＤとし、位相スペクトルをＡＳＤとする。補正部１５２は、変換関数を用いて、対数パワースペクトルＰＳＤを補正する。補正後の対数パワースペクトルをＡｄＰＳＤとする。つまり、補正部１５２は、対数パワースペクトルＰＳＤに、変換関数を適用することで、補正後の対数パワースペクトルＡｄＰＳＤを算出する。なお、外耳道伝達特性ＥＣＴＦを補正した補正特性ＡｄＥＣＴＦは、位相スペクトルＡＳＤと、補正後の対数パワースペクトルＡｄＰＳＤとから構成される。 The correcting unit 152 applies a conversion function to the logarithmic power spectrum PSD of the ear canal transfer characteristic ECTF (S25). Here, for the sake of explanation, the logarithmic power spectrum of the ear canal transfer characteristic ECTF is PSD and the phase spectrum is ASD. The correction unit 152 corrects the logarithmic power spectrum PSD using the conversion function. Let the logarithmic power spectrum after correction be AdPSD. That is, the correction unit 152 calculates a corrected logarithmic power spectrum AdPSD by applying a conversion function to the logarithmic power spectrum PSD. The correction characteristic AdECTF obtained by correcting the ear canal transmission characteristic ECTF is composed of a phase spectrum ASD and a corrected logarithmic power spectrum AdPSD.

逆フィルタ生成部１５３は、補正後の対数パワースペクトルＡｄＰＳＤに基づいて、逆フィルタを生成する（Ｓ２６）。具体的には、逆フィルタ生成部１５３は、逆離散フーリエ変換又は逆離散コサイン変換等により、補正後の振幅特性（対数パワースペクトルＡｄＰＳＤ）と位相特性（位相スペクトルＡＳＤ）から時間信号を算出する。この時間信号が補正された外耳道伝達特性ＡｄＥＣＴＦ（以下、時間領域の補正特性ＡｄＥＣＴＦとも称する）を示す。逆フィルタ生成部１５３は、時間領域の補正特性ＡｄＥＣＴＦから、振幅と位相が反転する逆フィルタを生成する。逆フィルタの生成方法については公知の手法を用いることができるため説明を省略する。 The inverse filter generation unit 153 generates an inverse filter based on the corrected logarithmic power spectrum AdPSD (S26). Specifically, the inverse filter generation unit 153 calculates a time signal from the corrected amplitude characteristic (logarithmic power spectrum AdPSD) and the phase characteristic (phase spectrum ASD) by inverse discrete Fourier transform, inverse discrete cosine transform, or the like. This time signal represents the corrected ear canal transfer characteristic AdECTF (hereinafter also referred to as the time-domain correction characteristic AdECTF). From the time-domain correction characteristic AdECTF, the inverse filter generation unit 153 generates an inverse filter whose amplitude and phase are inverted. Since a known method can be used to generate the inverse filter, its description is omitted.
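The reconstruction of a time signal from the corrected logarithmic power spectrum AdPSD and the phase spectrum ASD can be sketched as follows. The text leaves the inverse filter design to known methods, so the regularized spectral inversion shown here is just one common choice used for illustration, not the method of this embodiment. The helper names, the regularization constant `eps`, and the flat test spectrum are assumptions; a real spectrum would need conjugate symmetry to yield a real-valued filter, and a practical design may also add a delay before truncation.

```python
import cmath
import math


def idft(spectrum):
    """Direct inverse DFT; returns the real part of each time-domain sample."""
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)).real / n
        for t in range(n)
    ]


def spectrum_from_log_power(log_power_db, phase):
    """Combine a log power spectrum (dB) and a phase spectrum (rad) into complex bins."""
    # 10*log10(|H|^2) = dB, so |H| = 10 ** (dB / 20).
    return [cmath.rect(10.0 ** (db / 20.0), ph) for db, ph in zip(log_power_db, phase)]


def inverse_filter(log_power_db, phase, filter_length, eps=1e-6):
    """Regularized inversion conj(H) / (|H|^2 + eps), truncated to filter_length taps."""
    h = spectrum_from_log_power(log_power_db, phase)
    inv = [c.conjugate() / (abs(c) ** 2 + eps) for c in h]
    return idft(inv)[:filter_length]


# A flat 0 dB, zero-phase spectrum corresponds to a unit impulse in the time domain.
ad_psd = [0.0] * 8
asd = [0.0] * 8
time_signal = idft(spectrum_from_log_power(ad_psd, asd))
inv = inverse_filter(ad_psd, asd, filter_length=4)
```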

この逆フィルタが図１で示したフィルタ部４１に設定される。また、右ユニット４３に対して同様の処理を行うことで求められた逆フィルタは、フィルタ部４２に設定される。 This inverse filter is set in the filter unit 41 shown in FIG. 1. The inverse filter obtained by performing the same processing for the right unit 43R is set in the filter unit 42.

図８に、外耳道伝達特性ＥＣＴＦの対数パワースペクトルＰＳＤと補正特性ＡｄＥＣＴＦの対数パワースペクトルＡｄＰＳＤと、を示す。さらに、図８は、第１の参照信号Ｒｅｆ１の対数パワースペクトルＰＳＤ１と、第２の参照信号Ｒｅｆ２の対数パワースペクトルＰＳＤ２と、を示す。このように、対数パワースペクトルＰＳＤ１、ＰＳＤ２に基づく変換関数を、対数パワースペクトルＰＳＤに適用することで、補正後の対数パワースペクトルＡｄＰＳＤを求めることができる。 FIG. 8 shows the logarithmic power spectrum PSD of the ear canal transfer characteristic ECTF and the logarithmic power spectrum AdPSD of the correction characteristic AdECTF. FIG. 8 also shows the logarithmic power spectrum PSD1 of the first reference signal Ref1 and the logarithmic power spectrum PSD2 of the second reference signal Ref2. In this way, the corrected logarithmic power spectrum AdPSD can be obtained by applying the conversion function based on the logarithmic power spectra PSD1 and PSD2 to the logarithmic power spectrum PSD.

対数パワースペクトルＰＳＤ１と対数パワースペクトルＰＳＤ２の差は、ヘッドホン４３の装着状態の変化に対応する。補正部１５２が、対数パワースペクトルＰＳＤ１と対数パワースペクトルＰＳＤ２とに基づく変換関数を用いることで、適切に外耳道伝達特性を補正することができる。逆フィルタ生成部１５３が補正後の外耳道伝達特性から逆フィルタを算出している。従って、頭外定位処理装置１００が精度の高い頭外定位処理を行うことができる。 The difference between the logarithmic power spectrum PSD1 and the logarithmic power spectrum PSD2 corresponds to a change in the wearing state of the headphones 43. The correction unit 152 can appropriately correct the ear canal transfer characteristic by using a conversion function based on the logarithmic power spectrum PSD1 and the logarithmic power spectrum PSD2. The inverse filter generation unit 153 calculates an inverse filter from the corrected ear canal transfer characteristics. Therefore, the out-of-head localization processing apparatus 100 can perform out-of-head localization processing with high accuracy.

具体的には、頭外定位受聴を行う前の事前測定として、測定状態でのインパルス応答測定を行う。これにより、メモリ１４０に、第１の参照信号Ｒｅｆ１、及び外耳道伝達特性ＥＣＴＦを予め記憶させておくことができる。事前測定が終了したら、ユーザＵがマイクユニット２を取り外す。そして、ユーザＵがヘッドホン４３を装着すると、受聴状態でのインパルス応答測定を実施して、第２の参照信号Ｒｅｆ２を取得する。ヘッドホン４３を装着する毎に、参照信号取得部１２２が、第２の参照信号Ｒｅｆ２を取得することが好ましい。つまり、ユーザＵがヘッドホン４３を装着して、頭外定位受聴を行う前に、参照信号取得部１２２が、第２の参照信号Ｒｅｆ２を取得する。 Specifically, impulse response measurement in a measurement state is performed as a pre-measurement before listening to out-of-head localization. As a result, the first reference signal Ref1 and the ear canal transfer characteristic ECTF can be stored in the memory 140 in advance. When the preliminary measurement is completed, the user U removes the microphone unit 2. Then, when the user U wears the headphones 43, the impulse response measurement in the listening state is performed to obtain the second reference signal Ref2. It is preferable that the reference signal acquisition unit 122 acquires the second reference signal Ref2 every time the headphones 43 are worn. That is, the reference signal acquisition unit 122 acquires the second reference signal Ref2 before the user U wears the headphones 43 and performs out-of-head localization listening.

上記の処理により、頭外定位処理装置１００が、外耳道伝達特性ＥＣＴＦ、第１の参照信号Ｒｅｆ１、及び第２の参照信号Ｒｅｆ２から逆フィルタを算出する。このようにすることで、適応化された逆フィルタを用いて頭外定位処理を行うことができる。つまり、ヘッドホン４３の装着状態が変化しても、頭外定位処理装置１００が、適切な逆フィルタを用いて、頭外定位処理を行うことができる。なお、受聴時にヘッドホン４３を装着したことを自動検知して、受聴状態での測定を自動で行ってもよく。あるいは、ユーザＵが、タッチパネルなど操作部を操作して、受聴状態での測定を行うことを指示してもよい。 Through the above processing, the out-of-head localization processing apparatus 100 calculates an inverse filter from the ear canal transfer characteristic ECTF, the first reference signal Ref1, and the second reference signal Ref2. By doing so, it is possible to perform out-of-head localization processing using an adapted inverse filter. That is, even if the wearing state of the headphones 43 changes, the out-of-head localization processing apparatus 100 can perform out-of-head localization processing using an appropriate inverse filter. Note that it is possible to automatically detect that the headphones 43 are worn at the time of listening and automatically perform the measurement in the listening state. Alternatively, the user U may instruct to perform measurement in the listening state by operating an operation unit such as a touch panel.

上記のように、変換関数算出部１５１は、パワースペクトルに基づいて、変換関数を算出している。したがって、第１の記憶部１４１、及び第２の記憶部１４２は、それぞれパワースペクトルのみを記憶していればよい。すなわち、第１の記憶部１４１と第２の記憶部１４２は、位相スペクトルを記憶していなくてもよい。あるいは、第１の記憶部１４１と第２の記憶部１４２は、時間領域の収音信号をそのまま参照信号として記憶していてもよい。同様に、第３の記憶部１４３は、時間領域の収音信号を外耳道伝達特性として記憶していてもよい。そして、変換関数算出部１５１が変換関数を算出する都度、離散フーリエ変換等を行って、参照信号及び外耳道伝達特性のパワースペクトルを算出するようにしてもよい。 As described above, the conversion function calculation unit 151 calculates the conversion function based on the power spectra. Therefore, the first storage unit 141 and the second storage unit 142 each need to store only the power spectrum. That is, the first storage unit 141 and the second storage unit 142 do not have to store the phase spectrum. Alternatively, the first storage unit 141 and the second storage unit 142 may store the time-domain sound pickup signals as the reference signals as they are. Similarly, the third storage unit 143 may store the time-domain sound pickup signal as the ear canal transfer characteristic. Then, each time the conversion function calculation unit 151 calculates the conversion function, a discrete Fourier transform or the like may be performed to calculate the power spectra of the reference signals and the ear canal transfer characteristic.

なお、変換関数の算出方法は上記の手法に限定されるものではない。変換関数算出部１５１は、上記したベクトル変換に限らず、様々なパラメータや変換手法によって、変換関数を算出することができる。例えば、対数パワースペクトルＰＳＤ１と対数パワースペクトルＰＳＤ２の差に基づいて、変換関数を求めてもよい。 Note that the calculation method of the conversion function is not limited to the above method. The conversion function calculation unit 151 can calculate the conversion function not only by the above-described vector conversion but also by various parameters and conversion methods. For example, the conversion function may be obtained based on the difference between the logarithmic power spectrum PSD1 and the logarithmic power spectrum PSD2.

Using the logarithmic power spectrum PSD of the ear canal transfer characteristic ECTF before correction, the logarithmic power spectrum PSD1 of the first reference signal Ref1, and the logarithmic power spectrum PSD2 of the second reference signal Ref2, the following equation (1), which gives the logarithmic power spectrum AdPSD of the corrected ear canal transfer characteristic ECTF, can be used as the conversion function.
AdPSD = PSD + (PSD2 - PSD1) ... (1)
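
As a worked illustration of the log-domain correction in equation (1), the sketch below applies the difference between the two reference spectra to the measured spectrum bin by bin. The function name `correct_log_psd` and the sample values are assumptions made for illustration; the three inputs are assumed to be logarithmic power spectra sampled at the same frequency bins.

```python
def correct_log_psd(psd, psd1, psd2):
    """Apply AdPSD = PSD + (PSD2 - PSD1) to each frequency bin.

    psd  : log power spectrum of the ECTF before correction
    psd1 : log power spectrum of the first reference signal Ref1
    psd2 : log power spectrum of the second reference signal Ref2
    """
    if not (len(psd) == len(psd1) == len(psd2)):
        raise ValueError("all spectra must have the same number of bins")
    return [p + (p2 - p1) for p, p1, p2 in zip(psd, psd1, psd2)]


# Hypothetical three-bin spectra in dB.
psd = [0.0, -3.0, -6.0]
psd1 = [1.0, 1.0, 1.0]
psd2 = [1.0, 3.0, 0.0]
ad_psd = correct_log_psd(psd, psd1, psd2)
print(ad_psd)  # [0.0, -1.0, -7.0]
```

Where the two reference spectra agree (the first bin here), the ECTF is left unchanged; elsewhere it is shifted by the change observed at the built-in microphone.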

Note that the processing device 201 in FIG. 2 may be the same device as the out-of-head localization processing apparatus 100 or a different device. When the out-of-head localization processing apparatus 100 and the processing device 201 are different devices, the out-of-head localization processing apparatus 100 may use the ear canal transfer characteristic ECTF and the first reference signal Ref1 acquired by the processing device 201.

Specifically, the processing device 201 performs impulse response measurement in the measurement state in order to acquire the ear canal transfer characteristic ECTF and the first reference signal Ref1. The processing device 201 then transmits the ear canal transfer characteristic ECTF and the first reference signal Ref1 to the out-of-head localization processing apparatus 100 wirelessly or by wire. Alternatively, part or all of the data of the ear canal transfer characteristic ECTF and the first reference signal Ref1 may be stored externally. The processing device 201 stores the ear canal transfer characteristic ECTF and the first reference signal Ref1 in an external storage device, a cloud network, or the like. The out-of-head localization processing apparatus 100 acquires the ear canal transfer characteristic ECTF and the first reference signal Ref1 by reading or receiving the data stored in the external storage device, cloud network, or the like. In this case, the memory 140 may be a memory that temporarily stores the ear canal transfer characteristic ECTF, the first reference signal Ref1, and the second reference signal Ref2. That is, each piece of data may be deleted after it has been used.

Before listening through the headphones 43, the out-of-head localization processing apparatus 100 performs impulse response measurement in the listening state to acquire the second reference signal Ref2. Using the ear canal transfer characteristic ECTF, the first reference signal Ref1, and the second reference signal Ref2 acquired in this way, the out-of-head localization processing apparatus 100 performs the correction processing described above. The out-of-head localization processing apparatus 100 can therefore obtain an inverse filter adapted to the headphone wearing state. As a result, out-of-head localization processing can be performed appropriately, and a highly accurate sound image localization effect can be obtained.
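
To make the last step concrete, the sketch below derives per-bin inverse amplitude gains from a corrected logarithmic power spectrum. This passage does not say how the inverse filter generation unit 153 builds its filter, so the sketch shows only the amplitude part of one common approach (negating the level in the log domain) and ignores phase. The function name `inverse_amplitude`, the `max_gain_db` cap, and the sample values are assumptions.

```python
def inverse_amplitude(ad_psd_db, max_gain_db=20.0):
    """Return linear amplitude gains that flatten a corrected spectrum.

    ad_psd_db   : corrected log power spectrum in dB, one value per bin
    max_gain_db : assumed cap on the boost, so that deep notches in the
                  measured characteristic do not produce huge gains
    """
    gains = []
    for level_db in ad_psd_db:
        # Inverting a characteristic negates its level in the log domain.
        gain_db = min(-level_db, max_gain_db)
        # A power level in dB maps to linear amplitude via 10^(dB / 20).
        gains.append(10.0 ** (gain_db / 20.0))
    return gains


# Hypothetical corrected spectrum: flat bin, 6 dB dip, 40 dB notch.
inv = inverse_amplitude([0.0, -6.0, -40.0])
```

With these inputs the flat bin keeps unity gain, the 6 dB dip is boosted by 6 dB, and the 40 dB notch is boosted by only the capped 20 dB.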

Part or all of the above processing may be executed by a computer program. The program described above can be stored on various types of non-transitory computer-readable media and supplied to a computer. Non-transitory computer-readable media include various types of tangible storage media. Examples of non-transitory computer-readable media include magnetic recording media (for example, flexible disks, magnetic tapes, and hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memory (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory)). The program may also be supplied to a computer by various types of transitory computer-readable media. Examples of transitory computer-readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer-readable medium can supply the program to a computer via a wired communication path such as an electric wire or an optical fiber, or via a wireless communication path.

Although the invention made by the present inventors has been specifically described above based on the embodiment, it goes without saying that the present invention is not limited to the above embodiment and can be modified in various ways without departing from its gist.

U user
1 person to be measured
10 out-of-head localization processing unit
11 convolution operation unit
12 convolution operation unit
21 convolution operation unit
22 convolution operation unit
24 adder
25 adder
41 filter unit
42 filter unit
43 headphones
45 housing
46 headphone speaker
48 built-in microphone
100 out-of-head localization processing apparatus
111 measurement signal generation unit
112 D/A converter
121 A/D converter
122 reference signal acquisition unit
131 A/D converter
132 ECTF acquisition unit
140 memory
141 first storage unit
142 second storage unit
143 third storage unit
151 conversion function calculation unit
152 correction unit
153 inverse filter generation unit

Claims

An out-of-head localization processing apparatus comprising:
an audio output unit having a reference microphone;
a reference signal acquisition unit that acquires a reference signal based on a collected sound signal collected by the reference microphone;
a first storage unit that stores a first reference signal acquired by the reference signal acquisition unit in a measurement state in which a measurement microphone is arranged in the ear canal of a user wearing the audio output unit;
a second storage unit that stores a second reference signal acquired by the reference signal acquisition unit in a listening state in which the measurement microphone is removed from the ear canal;
a third storage unit that stores a transfer characteristic acquired based on a collected sound signal collected by the measurement microphone in the measurement state;
a conversion function calculation unit that calculates a conversion function of a frequency characteristic based on the first reference signal and the second reference signal;
a correction unit that corrects the frequency characteristic of the transfer characteristic using the conversion function;
a filter generation unit that generates a filter based on the corrected frequency characteristic; and
an out-of-head localization processing unit that performs out-of-head localization processing using the filter.

The conversion function calculation unit obtains a change vector between extreme values of power spectra of the first reference signal and the second reference signal,
The out-of-head localization processing apparatus according to claim 1, wherein the conversion function is calculated based on the change vector.

An out-of-head localization processing method comprising:
acquiring a transfer characteristic based on a collected sound signal collected by a measurement microphone in a measurement state in which the measurement microphone is arranged in the ear canal of a user wearing a sound output unit having a reference microphone;
obtaining a first reference signal based on a collected sound signal collected by the reference microphone in the measurement state;
obtaining a second reference signal based on a collected sound signal collected by the reference microphone in a listening state in which the measurement microphone is removed from the ear canal;
calculating a conversion function of a frequency characteristic based on the first reference signal and the second reference signal;
correcting the frequency characteristic of the transfer characteristic using the conversion function;
generating a filter based on the corrected frequency characteristic; and
performing out-of-head localization processing using the filter.

A program causing a computer to execute:
acquiring a transfer characteristic based on a collected sound signal collected by a measurement microphone in a measurement state in which the measurement microphone is arranged in the ear canal of a user wearing a sound output unit having a reference microphone;
obtaining a first reference signal based on a collected sound signal collected by the reference microphone in the measurement state;
obtaining a second reference signal based on a collected sound signal collected by the reference microphone in a listening state in which the measurement microphone is removed from the ear canal;
calculating a conversion function of a frequency characteristic based on the first reference signal and the second reference signal;
correcting the frequency characteristic of the transfer characteristic using the conversion function;
generating a filter based on the corrected frequency characteristic; and
performing out-of-head localization processing using the filter.