JP2009260574A

JP2009260574A - Sound signal processing device, sound signal processing method and mobile terminal equipped with the sound signal processing device

Info

Publication number: JP2009260574A
Application number: JP2008106125A
Authority: JP
Inventors: Jin Chin; 迅陳
Original assignee: Sony Ericsson Mobile Communications Japan Inc
Current assignee: Sony Corp
Priority date: 2008-04-15
Filing date: 2008-04-15
Publication date: 2009-11-05

Abstract

PROBLEM TO BE SOLVED: To reduce the configuration and the amount of computation required for processing, while still reproducing sound properly, when reproducing stereophonic sound.
SOLUTION: Head-related transfer functions measured with a dummy head are thinned to a limited number of samples and stored as a head-related transfer function database in a memory section 32. A processing section 33 extracts the transfer function for an indicated sound source position from the limited-sample head-related transfer functions in the stored database. The input sound signal is multiplied by the extracted transfer function, and a calculation section 37 obtains a two-channel sound signal for creating binaural stereophonic sound.
COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to an audio signal processing device and an audio signal processing method suitable for relatively small electronic devices that handle audio signals, such as mobile phone terminals, and to a mobile terminal equipped with the audio signal processing device. In particular, it relates to a technique for reproducing stereophonic sound.

Various portable audio devices that handle audio signals have been put into practical use. For example, mobile phone terminals that download and store music data and play it back through attached headphones have become widespread. Various so-called portable playback devices, which store and play back music data but have no mobile phone function, have also been put into practical use.

A device that can play back music data and the like is generally connected to headphones, and the sound is reproduced from those headphones. Some devices also have a small built-in speaker and output the sound from it.

When music is played back on this type of device, the input audio signal is generally a two-channel audio signal. When the sound is reproduced from headphones, the two-channel signal is therefore supplied unchanged to the left and right channel units of the headphones. However, an ordinary two-channel audio signal is a so-called stereophonic signal, which gives correct stereophonic sound only when the listener faces loudspeakers placed a certain distance apart.

In contrast, a binaural audio signal is known as an audio signal for headphone reproduction: it is reproduced with correct stereophonic sound when the headphones are worn on the listener's head.
Generating a two-channel binaural audio signal has become possible with recent DSPs implemented as integrated circuits, and has been put into practical use in high-function audio processing apparatus. For example, headphone devices used in combination with video playback devices, which reproduce stereophonic sound when a video program such as a movie is viewed, are in practical use.

Patent Document 1 discloses processing of audio signals collected by a two-channel binaural method.
Patent Document 1: JP 2005-223713 A

As described above, mobile phone terminals are in widespread use as one kind of portable audio playback device. If a processing circuit that generates the two-channel binaural audio signal described above were built into such a terminal, music and the like could be reproduced with correct stereophonic sound through headphones connected to the terminal, which would be desirable.

However, the conventional processing configuration for generating a binaural audio signal requires very large-scale arithmetic processing using an integrated circuit with a very large circuit configuration, called a DSP. Building it into a portable electronic device such as a mobile phone terminal has therefore been impractical in terms of both processing capability and cost. A large amount of arithmetic processing also shortens battery life, which is undesirable.

A further problem arises because the conventional processing for generating a binaural audio signal is based on data obtained by analyzing actually recorded sound. The analysis data is obtained with a dummy head shaped like a human head: sound from a real sound source is picked up by microphones mounted at the pinnae of the dummy head. Consequently, a listener whose head size roughly matches that of the dummy head hears appropriate stereophonic sound, but the reproduction may not be appropriate for a listener with a different head size.

The present invention has been made in view of these points. Its object is to reduce the configuration and the amount of computation needed for stereophonic sound reproduction processing while still achieving good reproduction.

In the present invention, head-related transfer functions measured with a dummy head are thinned to a limited number of samples and stored as a head-related transfer function database. The transfer function for an indicated sound source position is then extracted from the limited-sample head-related transfer functions in the stored database. The extracted transfer function is convolved with the input audio signal to obtain a two-channel audio signal for generating binaural stereophonic sound.

According to the present invention, preparing a database of head-related transfer functions thinned to a limited number of samples reduces, by that amount, the storage capacity required in the device that performs the audio signal processing. In addition, performing the arithmetic processing with the thinned head-related transfer functions reduces the amount of computation needed to generate binaural stereophonic sound.

According to the present invention, both the storage capacity required in the device that performs the audio signal processing and the amount of computation needed to generate binaural stereophonic sound can be reduced. The invention therefore simplifies the device configuration and yields a configuration suitable for incorporation in a portable electronic device such as a mobile phone terminal.

Embodiments of the present invention are described below with reference to the accompanying drawings.
In this embodiment, the invention is applied to a mobile phone terminal, that is, a wireless communication terminal built small for portable use. The processing described below is executed in an audio signal processing function unit built into the mobile phone terminal.

First, an example of the overall configuration of the mobile phone terminal of this embodiment is described with reference to FIG. 2.
As shown in FIG. 2, the terminal includes a control unit 11, which controls the processing operations of each unit in the mobile phone terminal. The control unit 11 exchanges data with each unit in the terminal via a control line 28.

The mobile phone terminal of this embodiment also includes a communication unit 12 that performs the wireless communication processing required of a communication terminal, and an antenna 13 is connected to the communication unit 12. The communication unit 12 communicates wirelessly with a base station for wireless telephony and carries out bidirectional data transmission with it. The communication unit 12 sends data received from the base station to each unit in the terminal via a data line 29. It also transmits to the base station the data sent over the data line 29 from each unit 17 in the terminal.

In addition to the communication unit 12, a memory 14, a display unit 15, an audio processing unit 17, and a stereophonic sound processing unit 21 are connected to the data line 29. The memory 14 stores the programs needed to operate the terminal of this embodiment, various data stored by the user, and so on. The memory 14 also stores audio signals such as music data obtained by downloading. The memory 14 may also serve as the storage means for the database described later.
The display unit 15 uses a liquid crystal display, an organic EL display, or the like as its display means, and displays various information under the control of the control unit 11. During the setting operation described later, the user performs the setting operation on the operation unit 16 by following what is shown on the display unit 15.
The operation unit 16 consists of dial keys for the numbers and symbols needed in a mobile phone terminal, various function keys, and so on. Operation information for each key of the operation unit 16 is supplied to the control unit 11.

The audio processing unit 17 processes audio signals, and a speaker 18 and a microphone 19 are connected to it. The speaker 18 and the microphone 19 serve as the handset during a call. That is, audio data supplied from the communication unit 12 to the audio processing unit 17 is demodulated by the audio processing unit 17 into an analog audio signal, subjected to analog processing such as amplification, and emitted from the speaker 18. The audio signal picked up by the microphone 19 is modulated into digital audio data by the audio processing unit 17, and the modulated audio data is supplied to the communication unit 12 for wireless transmission and the like.
Of the audio data supplied to the audio processing unit 17, audio that is to be output as stereophonic sound is supplied to the stereophonic sound processing unit 21, described next, for processing.

The mobile phone terminal of this embodiment includes the stereophonic sound processing unit 21, which generates a two-channel audio signal as binaural stereophonic sound. The audio signal processed by the stereophonic sound processing unit 21 may be any audio signal: it may be supplied from the audio processing unit 17, read from the memory 14 or the like and supplied via the data line 29, or received by the communication unit 12 and supplied via the data line 29. The specific processing by which the stereophonic sound processing unit 21 generates the two-channel binaural audio signal is described later with reference to FIG. 1 and other figures.

The audio signal generated by the stereophonic sound processing unit 21 is output either from two speakers 22L and 22R for the left and right channels built into the terminal body, or from headphones (not shown) connected to an output terminal 23. Because the speakers 22L and 22R are built into the terminal body, they use relatively small speaker units, but their output is amplified enough for listeners around the terminal to hear the reproduced sound.
For headphone output, besides a wired connection in which the headphones are plugged directly into the output terminal 23, the terminal may incorporate a short-range wireless communication unit that communicates with the headphones by, for example, the Bluetooth (trademark) method, and supply the audio signal to the headphones through that unit.

Next, a configuration example of the stereophonic sound processing unit 21 shown in FIG. 2 is described with reference to FIG. 1.
FIG. 1 shows an example of the overall configuration of the stereophonic sound processing unit 21 according to this embodiment.
Following the signal flow from the left side of FIG. 1, the unit first has a sound source direction setting unit 31. The sound source direction setting unit 31 sets the position at which the sound source is localized in the stereophonic sound produced by the output audio signal. The sound source direction is, for example, a position determined in advance for each audio signal to be processed, under the control of the control unit 11 shown in FIG. 2. Alternatively, if the additional information of the audio signal to be processed contains an instruction about the sound source position, that position is used. The sound source position may also be made freely settable by a user operation on the operation unit 16 shown in FIG. 2.

The sound source position data output by the sound source direction setting unit 31 is supplied to an HRTF processing unit 33. The HRTF processing unit 33 handles head-related transfer functions (HRTF: Head-Related Transfer Function) and extracts an appropriate head-related transfer function from those stored in an HRTF database 32. The HRTF database 32 stores left- and right-channel head-related transfer functions for sound source positions covering the full 360° horizontal circle centered on the listener. The database 32 uses, for example, the memory 14 shown in FIG. 2. Alternatively, a dedicated storage unit may be provided within the stereophonic sound processing unit 21. In this embodiment, the head-related transfer functions stored in the HRTF database 32 are obtained by thinning the sampled values of the original head-related transfer functions measured with a dummy head, so the amount of data is greatly reduced. The sound source positions in the database are also spaced relatively coarsely, for example at 10° intervals around the 360° circle.

The head-related transfer function extracted by the HRTF processing unit 33 is supplied to a personalization unit 34. The personalization unit 34 corrects the head-related transfer function supplied from the HRTF processing unit 33 using head-related transfer function model data calculated by an HRTF model calculation unit 36. The HRTF model calculation unit 36 calculates this model data from the listener's head size data set by a size setting unit 35. The personalization unit 34 therefore applies a correction based on the listener's head size set by the size setting unit 35. The listener's head size is set in the size setting unit 35 by, for example, operating the operation unit 16 shown in FIG. 2 to display a stereophonic sound setting screen on the display unit 15, and then selecting the listener's head size and pinna size by user operation. The selected head and pinna size settings are stored in the memory 14 or the like and read out from it.

The head-related transfer function corrected by the personalization unit 34 is supplied to a stereophonic sound calculation unit 37. The stereophonic sound calculation unit 37 applies arithmetic processing that uses the supplied head-related transfer function to the audio signal input to an audio signal input unit 38, and obtains a two-channel audio signal as a binaural signal rendered in stereophonic sound.

The two-channel audio signal obtained by the stereophonic sound calculation unit 37 is supplied to headphones 24 connected to the output terminal 23 and output from them. Alternatively, the two-channel audio signal is supplied to a crosstalk cancellation unit 39, which removes the crosstalk components between the two channels, and is then output from the speakers 22L and 22R built into the terminal body. As already explained, transmission from the terminal body to the headphones 24 may be wireless.

Next, specific examples of each part of the stereophonic sound processing unit 21 are described with reference to FIG. 3 and subsequent figures.
FIG. 3 shows an example of the processing that generates the head-related transfer functions (HRTFs) stored in the HRTF database 32. Because the processing configuration in FIG. 3 generates the data to be stored in the database, it is something the manufacturer of the mobile phone terminal prepares when producing the software to be stored in the terminal.

As shown in FIG. 3, an HRTF database 51 storing measured head-related transfer functions is prepared first. The head-related transfer functions in the database 51 are based on measurements in which microphones mounted at both ears of a dummy head pick up the impulse response of sound emitted at each sound source position. The dummy head used for this measurement is of a standard size. With a sound pickup processing configuration (not shown), the sound picked up from each sound source position is turned into a signal in which the impulse response is measured at 512 sample points with a predetermined sampling period. The head-related transfer function made up of these 512 sample points is stored as the head-related transfer function for one sound source position. The sound source positions are set, for example, at 5° intervals around the 360° horizontal circle surrounding the dummy head. The head-related transfer function for each sound source position contains binaural time difference information (ITD: Inter-aural Time Differences) and amplitude information (MPS: Minimum Phase Systems).
The head-related transfer functions stored in the database 51 are of a generally known kind; if an existing set of head-related transfer functions is available, it may be used as is.

From the head-related transfer functions stored in the database 51, a binaural time difference extraction unit 52 extracts the binaural time difference information ITD for the data of each sound source position. The amplitude information MPS in the stored head-related transfer functions is extracted by a minimum phase system conversion unit 53 and converted into the required data format. The converted amplitude information MPS is supplied to a sample number conversion unit 54, which reduces the number of samples. Here, the 512-sample signal is thinned to 32 samples. A number of samples other than 32 may also be used.
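The text does not say how the 512 samples are thinned to 32. The following is a minimal sketch under one assumption: because a minimum-phase response concentrates its energy in the earliest samples, thinning is modeled here as keeping the first 32 samples. The function names and the synthetic decaying response are illustrative only and do not come from the source.

```python
def thin_impulse_response(ir, n_keep=32):
    """Keep the first n_keep samples of an impulse response.

    Assumption: truncation of a minimum-phase response; the source only
    says the 512 samples are thinned to 32.
    """
    if n_keep <= 0:
        raise ValueError("n_keep must be positive")
    return list(ir[:n_keep])


def retained_energy(ir, n_keep=32):
    """Fraction of the total energy that lies in the first n_keep samples."""
    total = sum(x * x for x in ir)
    kept = sum(x * x for x in ir[:n_keep])
    return kept / total if total else 1.0


# Synthetic, hypothetical 512-sample response that decays quickly.
ir_512 = [0.8 ** k for k in range(512)]
ir_32 = thin_impulse_response(ir_512)
```

For a rapidly decaying response like this synthetic one, `retained_energy(ir_512)` is very close to 1, which is the property that makes such aggressive thinning plausible.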

The binaural time difference information ITD extracted by the binaural time difference extraction unit 52 and the amplitude information MPS whose sample count was converted by the sample number conversion unit 54 are supplied to a spatial resampling unit 55, which forms, for each sound source position, a head-related transfer function made up of a 32-sample impulse response. The spatial resampling unit 55 also thins the sound source positions, converting the data at 5° intervals into data at 10° intervals.
The head-related transfer functions obtained in this way by the spatial resampling unit 55, each a 32-sample signal at one of the 10°-spaced sound source positions, are stored in the processed HRTF database 32 in the mobile phone terminal shown in FIG. 1.
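The effect of the two thinning steps on the size of the amplitude data can be checked with a short sketch. The dictionary layout, field names, and helper functions below are hypothetical; only the figures of 5° and 10° spacing, 512 and 32 samples, and two ears come from the description. ITD values are left out of the count.

```python
def spatially_resample(db, step_out=10):
    """Keep only the azimuths that fall on the coarser grid (assumed rule)."""
    return {az: entry for az, entry in db.items() if az % step_out == 0}


def amplitude_values(num_positions, samples_per_ear, ears=2):
    """Number of stored amplitude samples, excluding ITD values."""
    return num_positions * ears * samples_per_ear


# Hypothetical database keyed by azimuth in degrees, 5-degree spacing.
fine_db = {
    az: {"itd": 0.0, "mps_left": [0.0] * 32, "mps_right": [0.0] * 32}
    for az in range(0, 360, 5)
}
coarse_db = spatially_resample(fine_db)

measured = amplitude_values(72, 512)            # 5-degree grid, 512 samples
stored = amplitude_values(len(coarse_db), 32)   # 10-degree grid, 32 samples
```

Under these assumptions the measured set holds 73,728 amplitude values and the stored set 2,304, a 32-fold reduction.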

Next, an example of the processing configuration in the HRTF processing unit 33 shown in FIG. 1 is described with reference to FIG. 4.
As shown in FIG. 4, the sound source instruction data from the sound source direction setting unit 31 is supplied to a candidate extraction unit 61 in the HRTF processing unit 33. From the head-related transfer functions stored in the HRTF database 32, the candidate extraction unit 61 extracts several whose sound source positions are close to the instructed position. For example, when the instructed sound source position is 13° to the right of the front, the 10° and 20° head-related transfer functions stored in the processed HRTF database 32 are extracted.
Of the extracted head-related transfer functions, the binaural time difference information is supplied to an ITD processing unit 62 and the amplitude information to an MPS processing unit 63.
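The candidate extraction for the 13° example can be sketched as a lookup of the two stored grid positions that bracket the instructed azimuth. The function name and the wrap-around handling at 360° are assumptions; the source gives only the 13° example.

```python
def neighbouring_azimuths(target_deg, step=10):
    """Return the two stored azimuths that bracket target_deg.

    Assumes positions are stored every `step` degrees from 0 up to
    360 - step, and that the grid wraps around at 360 degrees.
    """
    target = target_deg % 360
    lower = int(target // step) * step
    upper = (lower + step) % 360
    return lower, upper
```

For the example in the text, `neighbouring_azimuths(13)` gives `(10, 20)`; near the wrap-around point, `neighbouring_azimuths(355)` gives `(350, 0)`.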

Each piece of information is then supplied to an interpolation processing unit 64. Using the multiple head-related transfer functions supplied from the ITD processing unit 62 and the MPS processing unit 63, the interpolation processing unit 64 generates by interpolation the head-related transfer function for the sound source position specified by the sound source direction setting unit 31. For example, when the sound source position is 13° to the right of the front, the 10° and 20° head-related transfer functions are each multiplied by a ratio corresponding to their position to generate the head-related transfer function for 13°. When the instructed sound source position nearly coincides with a position stored in the HRTF database 32, no interpolation is performed. As an alternative configuration, the interpolation processing unit 64 may perform no interpolation at all, and the instructed sound source position may instead be approximated by a position stored in the processed HRTF database 32.
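The text says only that the two neighbouring transfer functions are weighted by ratios that depend on position. A minimal sketch, assuming linear weights (0.7 for 10° and 0.3 for 20° when the target is 13°) and a hypothetical entry layout with an `itd` value and an `mps` sample list:

```python
def interpolate_hrtf(entry_lo, entry_hi, az_lo, az_hi, target):
    """Linearly blend two HRTF entries for a target azimuth between them.

    Linear weighting is an assumption; the source states only that the
    ratios correspond to the positions.
    """
    span = (az_hi - az_lo) % 360
    if span == 0:
        # Both neighbours are the same position: nothing to interpolate.
        return {"itd": entry_lo["itd"], "mps": list(entry_lo["mps"])}
    w_hi = ((target - az_lo) % 360) / span
    w_lo = 1.0 - w_hi
    itd = w_lo * entry_lo["itd"] + w_hi * entry_hi["itd"]
    mps = [w_lo * a + w_hi * b
           for a, b in zip(entry_lo["mps"], entry_hi["mps"])]
    return {"itd": itd, "mps": mps}


# Hypothetical two-sample entries, just to show the weighting.
h10 = {"itd": 100.0, "mps": [1.0, 0.0]}
h20 = {"itd": 200.0, "mps": [0.0, 1.0]}
h13 = interpolate_hrtf(h10, h20, 10, 20, 13)
```

The ITD and the amplitude samples are blended with the same weights here for simplicity; whether the two are actually treated identically is not stated in the description.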

FIG. 5 shows an example of the processing configuration in the personalization unit 34 shown in FIG. 1.
The processing in the personalization unit 34 is executed based on the listener's head and pinna size data already set by the size setting unit 35 shown in FIG. 1.
An example of how the size is set is described here. For example, the operation unit 16 of the mobile phone terminal is operated to enter a head size setting mode, and a size setting screen is displayed on the display unit 15.
FIG. 8 shows an example of the setting screen in that case. In this example, the user can select the head size from three options: "large", "standard", and "small". The pinna size can likewise be selected from "large", "standard", and "small". In the example of FIG. 8, "standard" is selected for the head size and "small" for the pinna size.
The selection may be made finer than in the example of FIG. 8. For example, each size may be selectable in four or more steps instead of three. The horizontal and vertical head sizes may be selectable separately. Alternatively, the user may first select an approximate head shape, such as round or elongated, and then select a size such as "large", "standard", or "small".

The personalization unit 34, which performs the size-dependent correction, receives from the interpolation processing unit 64 shown in FIG. 4 the binaural time difference information ITD and the amplitude information MPS that make up the interpolated (or non-interpolated) head-related transfer function. It also receives the listener's head and pinna size data already set by the size setting unit 35 shown in FIG. 1.
In addition, HRTF model measurement data 36a is prepared, storing data on how the head-related transfer function changes with each size. From this stored data, an extraction unit 36b extracts the data corresponding to the size set at that time.

Then, using the data extracted by the extraction unit 36b, the personalization unit 34 corrects the binaural time difference information ITD and the amplitude information MPS supplied from the interpolation processing unit 64 according to the currently set size, yielding corrected binaural time difference information ITD′ and amplitude information MPS′. Details of the size correction processing are not described here, but the head size mainly affects the head-related transfer function in the mid-to-low frequency range, while the pinna size mainly affects it in the high frequency range.
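The source does not describe the correction itself, so the following is only a minimal sketch under assumed values. It scales the ITD by a head-size factor and applies a band-dependent gain to the MPS, with head size acting below a crossover frequency and pinna size acting above it. Treating MPS as a list of per-bin magnitudes, the scale factors, the crossover frequency, and all names (`HEAD_ITD_SCALE`, `personalize`, and so on) are hypothetical and not taken from the patent.

```python
from typing import List, Tuple

# Hypothetical correction factors; the source gives no numerical values.
HEAD_ITD_SCALE = {"large": 1.1, "standard": 1.0, "small": 0.9}
HEAD_GAIN = {"large": 1.05, "standard": 1.0, "small": 0.95}
PINNA_GAIN = {"large": 1.05, "standard": 1.0, "small": 0.95}
CROSSOVER_HZ = 4000.0  # assumed boundary between head- and pinna-dominated bands


def personalize(itd: float, mps: List[float], bin_hz: float,
                head: str, pinna: str) -> Tuple[float, List[float]]:
    """Return (ITD', MPS') for the selected head and pinna sizes.

    itd    -- interaural time difference in seconds
    mps    -- magnitude per frequency bin (bin k is centred at k * bin_hz)
    bin_hz -- spacing between frequency bins in Hz
    """
    itd_corrected = itd * HEAD_ITD_SCALE[head]
    mps_corrected = []
    for k, magnitude in enumerate(mps):
        # Head size is applied to the lower bins, pinna size to the upper bins.
        if k * bin_hz < CROSSOVER_HZ:
            gain = HEAD_GAIN[head]
        else:
            gain = PINNA_GAIN[pinna]
        mps_corrected.append(magnitude * gain)
    return itd_corrected, mps_corrected
```

With both sizes set to "standard", this sketch returns its inputs unchanged, which matches the idea that the stored data corresponds to a standard dummy head.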

The corrected head-related transfer function from the personalization unit 34, that is, the binaural time difference information ITD′ and the amplitude information MPS′, is supplied to the stereophonic sound calculation unit 37 shown in FIG. 1. There, the head-related transfer function is convolved with the audio signal input to the audio input unit 38, producing a two-channel audio signal that reproduces stereophonic sound by the binaural method.

FIG. 6 is a diagram illustrating a configuration example of the stereophonic sound calculation unit 37.
The binaural time difference information ITD′ corrected by the personalization unit 34 is supplied to the phase information processing unit 71, which gives the audio signal input from the audio input unit 38 the left-right time difference indicated by ITD′ and thereby produces a two-channel audio signal. The left and right channel audio signals L and R obtained by the phase information processing unit 71 are then supplied to the FIR filters 72L and 72R of the respective channels. Each of the FIR filters 72L and 72R adjusts the amplitude based on the supplied amplitude information MPS′, yielding an audio signal that reproduces stereophonic sound by the binaural method.
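The two stages described above, a left-right delay followed by a per-channel FIR filter, can be sketched as follows. This is an illustrative sketch only: it assumes that MPS′ has already been converted into FIR tap coefficients for each ear, rounds the delay to whole samples, and uses a sign convention for the ITD that the source does not specify. The function names are hypothetical.

```python
from typing import List, Tuple


def fir_filter(signal: List[float], taps: List[float]) -> List[float]:
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n - k], same length as input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


def render_binaural(mono: List[float], itd_seconds: float, sample_rate: int,
                    taps_left: List[float], taps_right: List[float]
                    ) -> Tuple[List[float], List[float]]:
    """Apply the interaural delay, then the per-ear FIR filters.

    A positive itd_seconds delays the right channel (source on the left);
    a negative value delays the left channel. This convention is an assumption.
    """
    delay = round(abs(itd_seconds) * sample_rate)  # whole samples only
    delayed = ([0.0] * delay + mono)[:len(mono)]   # keep the original length
    if itd_seconds >= 0:
        left, right = mono, delayed
    else:
        left, right = delayed, mono
    return fir_filter(left, taps_left), fir_filter(right, taps_right)
```

Splitting the time difference from the amplitude filtering in this way means the FIR filters only need to carry the magnitude shaping, which is consistent with the document's aim of keeping the stored data and the computation small.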

FIG. 7 shows a configuration example of an output unit that outputs the two-channel binaural audio signal generated in this way.
In this example, the two-channel audio signals L and R are supplied to the output terminal 23 via the changeover switches 81L and 81R, respectively, and are emitted from the left and right driver units of the headphones 24 connected to the output terminal 23. A listener wearing the headphones 24 thus hears the sound as coming from the direction set by the sound source direction setting unit 31 (FIG. 1).

The two-channel audio signals L and R can also be supplied to the crosstalk cancellation unit 39 via the changeover switches 81L and 81R, respectively. The crosstalk cancellation unit 39 consists of coefficient multipliers 82L and 82R, adders 83L and 83R, and amplifiers 84L and 84R. It cancels the crosstalk components of the two channels and produces an ordinary two-channel audio signal. The left and right channel audio signals from which the crosstalk components have been cancelled are output from the speakers 22L and 22R built into the mobile phone terminal body for the respective channels. With the sound output from the speakers 22L and 22R as well, a listener facing the speakers hears the sound as coming from the direction set by the sound source direction setting unit 31 (FIG. 1).
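The source names the components of the crosstalk cancellation unit (coefficient multipliers, adders, amplifiers) but not their wiring or coefficient values. The sketch below assumes one simple arrangement: each output channel subtracts a scaled copy of the opposite channel and is then amplified. The function name, the `coeff` and `gain` parameters, and their default values are hypothetical. Practical cancellers usually use frequency-dependent filters and delays rather than a single scalar.

```python
from typing import List, Tuple


def cancel_crosstalk(left: List[float], right: List[float],
                     coeff: float = 0.3, gain: float = 1.0
                     ) -> Tuple[List[float], List[float]]:
    """First-order crosstalk canceller (assumed structure, scalar coefficients).

    coeff -- weight of the opposite channel (role of multipliers 82L, 82R)
    gain  -- output amplification (role of amplifiers 84L, 84R)
    The subtraction plays the role of the adders 83L, 83R.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    out_left = [gain * (l - coeff * r) for l, r in zip(left, right)]
    out_right = [gain * (r - coeff * l) for l, r in zip(left, right)]
    return out_left, out_right
```

Setting `coeff` to zero reduces this sketch to a plain gain stage, which corresponds to passing the binaural signal through without any cancellation.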

As described above, according to the present embodiment, the stereophonic sound calculation unit 37 that generates a binaural audio signal is built into the mobile phone terminal, so a listener wearing headphones can hear stereophonic sound located at the designated sound source position. In the present embodiment, the head-related transfer function database 32 holds data in which the number of samples and the number of sound source positions have been greatly reduced from the original head-related transfer functions, as shown in FIG. 3, so the amount of information stored in the database 32 can be greatly reduced. Likewise, because the processing that reads the head-related transfer function from the database 32 and performs the calculation operates on a head-related transfer function with a small amount of information, the load on the circuits in the mobile phone terminal is reduced. It is therefore possible to build a stereophonic sound processing circuit into an electronic device such as the mobile phone terminal shown in FIG. 2 without greatly increasing the circuit load.

Reducing the number of samples of the head-related transfer function degrades the accuracy with which the stereophonic sound is reproduced. In this example, however, the size of the listener's head is set and a correction is performed based on that setting, which raises the accuracy and compensates for the degradation caused by the reduced number of samples. Setting the pinna size in addition to the head size further improves the conversion accuracy.

Further, in the present embodiment, the crosstalk cancellation unit 39 is provided as shown in FIG. 7, so that the same stereophonic sound as that reproduced from headphones can also be output from the speakers in the terminal body. The device can therefore also be used without headphones.

In the embodiment described so far, an example in which the stereophonic sound processing circuit is built into a mobile phone terminal has been described. However, the stereophonic sound processing unit described in the above embodiment may be built into various other electronic devices that reproduce audio signals. For example, it may be built into a portable music player that stores and plays back music data.

FIG. 1 is a block diagram showing an example of an audio signal processing configuration according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration example of a mobile phone terminal to which an embodiment of the present invention is applied.
FIG. 3 is a block diagram showing an example of a database creation processing configuration according to an embodiment of the present invention.
FIG. 4 is a block diagram showing an example of a head-related transfer function processing configuration according to an embodiment of the present invention.
FIG. 5 is a block diagram showing an example of a head-related transfer function correction processing configuration according to an embodiment of the present invention.
FIG. 6 is a block diagram showing an example of an audio signal processing configuration using a head-related transfer function according to an embodiment of the present invention.
FIG. 7 is a block diagram showing a configuration example of an output unit according to an embodiment of the present invention.
FIG. 8 is an explanatory diagram showing a display example of a setting screen according to an embodiment of the present invention.

Explanation of symbols

11 ... control unit, 12 ... communication unit, 13 ... antenna, 14 ... memory, 15 ... display unit, 16 ... operation unit, 17 ... audio processing unit, 18 ... speaker, 19 ... microphone, 21 ... stereophonic sound processing unit, 22L, 22R ... speakers, 23 ... output terminal, 24 ... headphones, 28 ... control line, 29 ... data line, 31 ... sound source direction setting unit, 32 ... processed HRTF database, 33 ... HRTF processing unit, 34 ... personalization unit, 35 ... size setting unit, 36 ... HRTF model calculation unit, 37 ... stereophonic sound calculation unit, 38 ... audio input unit, 39 ... crosstalk cancellation unit, 51 ... HRTF database, 52 ... binaural time difference extraction unit, 53 ... minimum phase system conversion unit, 54 ... sample number conversion unit, 55 ... spatial resampling unit, 61 ... candidate extraction unit, 62 ... ITD processing unit, 63 ... MPS processing unit, 64 ... interpolation processing unit, 71 ... phase information processing unit, 72L, 72R ... FIR filters, 81L, 81R ... changeover switches

Claims

An audio signal processing device comprising:
a head-related transfer function database that stores head-related transfer functions measured using a dummy head and thinned out to a limited number of samples;
a transfer function extraction unit that extracts the transfer function for a designated sound source position from the limited number of samples of the head-related transfer functions stored in the head-related transfer function database;
a stereophonic sound processing unit that convolves the transfer function extracted by the transfer function extraction unit with an input audio signal to obtain a two-channel audio signal for generating binaural stereophonic sound; and
an output unit that outputs the two-channel audio signal obtained by the stereophonic sound processing unit.

The audio signal processing device according to claim 1, further comprising a correction unit that corrects the head-related transfer function extracted by the transfer function extraction unit according to a designated head size, and supplies the corrected head-related transfer function to the stereophonic sound processing unit.

The audio signal processing device according to claim 2, wherein the correction unit also corrects the head-related transfer function according to a designated pinna size.

The audio signal processing device according to claim 1, wherein the output unit is a terminal or a transmission processing unit that outputs the audio signal to headphones.

The audio signal processing device according to claim 1 or 2, wherein the output unit includes a cancel processing unit that cancels crosstalk of the two-channel audio signal, and two speakers that output the two-channel audio signal from which crosstalk has been cancelled by the cancel processing unit.

An audio signal processing method comprising:
thinning out head-related transfer functions measured using a dummy head to a limited number of samples and storing them as a head-related transfer function database;
extracting the transfer function for a designated sound source position from the limited number of samples of the head-related transfer functions in the stored head-related transfer function database; and
convolving the extracted transfer function with an input audio signal to obtain a two-channel audio signal for generating binaural stereophonic sound.

A mobile terminal comprising an audio signal processing device, the audio signal processing device including:
a head-related transfer function database that stores head-related transfer functions measured using a dummy head and thinned out to a limited number of samples;
a transfer function extraction unit that extracts the transfer function for a designated sound source position from the limited number of samples of the head-related transfer functions stored in the head-related transfer function database;
a stereophonic sound processing unit that convolves the transfer function extracted by the transfer function extraction unit with an input audio signal to obtain a two-channel audio signal for generating binaural stereophonic sound; and
an output unit that outputs the two-channel audio signal obtained by the stereophonic sound processing unit.