JP2006270649A

JP2006270649A - Voice acoustic signal processing apparatus and method thereof

Info

Publication number: JP2006270649A
Application number: JP2005087223A
Authority: JP
Inventors: Kei Kikuiri; 圭菊入; Nobuhiko Naka; 信彦仲; Tomoyuki Oya; 智之大矢
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2005-03-24
Filing date: 2005-03-24
Publication date: 2006-10-05

Abstract

PROBLEM TO BE SOLVED: To provide a technique for revising or correcting a sound image position with low delay and a small amount of computation.

SOLUTION: A minority channel signal decoding section 101 of a voice acoustic signal processing apparatus decodes a minority channel signal coded sequence into a minority channel signal. An inter-channel parameter decoding section 102 decodes an inter-channel parameter coded sequence, obtained by coding parameters that represent the relationship among channels. An inter-channel parameter revision section 103 revises the decoded inter-channel parameters, on the basis of the original sound image position and a desired sound image position, so that the sound image position of the finally decoded multi-channel signal becomes the desired sound image position. A multi-channel signal synthesis section 104 receives the revised inter-channel parameters and the decoded minority channel signal, and synthesizes the multi-channel signal from the minority channel signal by using the revised inter-channel parameters.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a speech/acoustic signal processing apparatus and method, and more particularly to a speech/acoustic signal processing apparatus and method based on a parametric coding scheme.

Many methods and apparatuses exist today for compressing, coding, and decoding speech/acoustic signals with high efficiency. Many coding schemes also exist for multi-channel speech/acoustic signals. For two-channel (stereo) signals, for example, there are the M/S (Middle/Side) stereo coding scheme and the intensity stereo (IS) coding scheme (see, for example, Non-Patent Document 1).

In the M/S coding scheme, two channel signals that are assumed to be strongly correlated are converted into their sum signal and difference signal, which weakens the inter-channel correlation before coding. This removes redundant information contained in the original two-channel signal. In the IS coding scheme, the sum signal of the two channels is scaled, and its spectral information and the intensity difference between the two channels are coded. There is also the parametric stereo (PS) coding scheme, which exploits the correlation between the two channels to represent the two-channel signal by a one-channel signal (for example, the sum of the original two channel signals) and parameters describing the relationship between the channels, and which codes the one-channel signal and the parameters (see, for example, Non-Patent Document 2).
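The sum/difference conversion used by M/S coding can be sketched as follows. This is a minimal illustration, not the procedure of the cited standard: the function names `ms_encode` and `ms_decode` and the 1/2 scaling are assumptions made here so that the round trip is exact.

```python
def ms_encode(left, right):
    """Convert left/right samples into mid (sum) and side (difference) signals.

    The factor 1/2 is an assumed normalization, chosen so that
    ms_decode() restores the input without further scaling.
    """
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode(): left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Strongly correlated channels give a side signal close to zero,
# which is the redundancy that M/S coding exploits.
left = [0.5, 0.25, -0.75, 1.0]
right = [0.5, 0.0, -0.5, 1.0]
mid, side = ms_encode(left, right)
rec_left, rec_right = ms_decode(mid, side)
```

Where the two channels are identical, the side signal is exactly zero, so only the mid signal carries information.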

The PS scheme is applicable not only to two-channel signals but also to multi-channel signals. Non-Patent Document 2 uses the following parameters to represent the relationship between channels: the Inter-channel Intensity Difference (IID), which is the intensity difference between the two channels; the Inter-channel Phase Difference (IPD), which is the phase difference between the two channels; the Overall Phase Difference (OPD), which is the phase difference between the coded one-channel signal and one of the two channel signals; and the Inter-Channel Coherence (ICC), which is the similarity between the two channels. Because the PS scheme codes a one-channel signal plus parameters, it can generally code with less information than the M/S scheme, which codes a two-channel signal. The IS scheme is very similar to the PS scheme but differs in what is left uncoded, such as the inter-channel phase information.
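To make the parameters concrete, the sketch below computes IID, IPD, and ICC for one subband from complex spectral bins. The formulas are common textbook definitions adopted here as assumptions; the exact definitions, sign conventions, and quantization in Non-Patent Document 2 may differ, and the function name `ps_parameters` is hypothetical.

```python
import cmath
import math


def ps_parameters(left_bins, right_bins):
    """Return (iid_db, ipd_rad, icc) for one subband.

    Assumed definitions (not taken from the cited standard):
      IID = 10*log10(E_left / E_right)       in dB
      IPD = angle of sum(L * conj(R))        in radians, positive if left leads
      ICC = |sum(L * conj(R))| / sqrt(E_left * E_right)
    """
    e_left = sum(abs(x) ** 2 for x in left_bins)
    e_right = sum(abs(x) ** 2 for x in right_bins)
    cross = sum(l * r.conjugate() for l, r in zip(left_bins, right_bins))
    iid_db = 10.0 * math.log10(e_left / e_right)
    ipd = cmath.phase(cross)
    icc = abs(cross) / math.sqrt(e_left * e_right)
    return iid_db, ipd, icc


# Right channel = left channel at half amplitude, delayed in phase by pi/4.
left_bins = [1 + 0j, 0 + 1j, -1 + 0j]
scale = 0.5 * cmath.exp(-1j * math.pi / 4)
right_bins = [scale * x for x in left_bins]
iid_db, ipd, icc = ps_parameters(left_bins, right_bins)
```

In this example the right channel is a scaled, phase-shifted copy of the left, so the coherence is 1, the IID is about 6 dB, and the IPD is pi/4.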

Stereophonic techniques that use multi-channel signals to reproduce a speech/acoustic signal three-dimensionally, as if it were arriving from a particular direction, have long been known. One example is the 5.1-channel reproduction system, in which speakers are placed at the front left and right, the center, and the rear left and right, together with a speaker for low-frequency reproduction, in order to recreate the sound heard at a given location. Another example is the binaural reproduction system, which simulates the sound that a person's two ears actually receive when a sound arrives from a given direction and reproduces it over headphones as a two-channel signal. When a person judges the direction from which a sound arrives, the differences between the signals at the two ears are used.

These interaural differences consist of an intensity difference, a time difference (a phase difference in the frequency domain), and a difference in frequency spectrum. A binaural signal used in the binaural reproduction system can therefore be coded by the PS coding scheme described above.

When a three-dimensional sound image is reproduced using multiple channels, including binaural signals as described above, the direction from which the sound is heard (the sound image position) sometimes needs to be changed or corrected. Examples include the case where the listener's head orientation at playback differs from the orientation assumed when the signal was created, and the case where the playback side moves the sound image to a direction of its choice. Conventionally, the main way to change or correct the sound image position has been for the reproducing side to notify the signal-creating side of the amount of change or correction, have the creating side generate and transmit a new signal with the changed or corrected sound image position, and then reproduce that new signal.

Non-Patent Document 1: ISO/IEC International Standard 14496-3, Information Technology, Coding of audio-visual objects, Part 3: Audio, 2001.
Non-Patent Document 2: ISO/IEC International Standard 14496-3, Information Technology, Coding of audio-visual objects, Part 3: Audio, AMENDMENT 2: Parametric Coding for high quality audio, 2004.

With the conventional method, however, changing the sound image position requires going back to the signal-creating side, changing the sound image position there, and then performing coding and the subsequent steps. This delays signal reproduction, so the listener perceives an unnatural mismatch between head movement and the movement of the sound. The problem is particularly serious for binaural signals, whose creation generally requires processing by FIR filters with several hundred or more delay elements.

The present invention has been made in view of this problem. Its object is to change or correct a sound image position with low delay and a small amount of computation. To that end, for a coded sequence obtained by coding a multi-channel stereophonic signal using parameters that represent the relationship between channels, the inter-channel parameters are modified on the reproducing side so that the sound image position is changed or corrected to a desired direction.

To achieve this object, the invention according to claim 1 is a speech/acoustic signal processing apparatus that processes a signal obtained by converting a plurality of channel signals into a smaller number of channel signals and parameters indicating the relationship between the plurality of channels, the apparatus comprising: parameter processing means for performing an operation that changes the parameters indicating the relationship between the plurality of channels, thereby changing the sound image position formed by the plurality of channels by a predetermined angle; and multi-channel signal generating means for generating, from the smaller number of channel signals and the parameters on which the operation has been performed, a plurality of channel signals whose sound image position has been changed by the predetermined angle.

The invention according to claim 2 is the speech/acoustic signal processing apparatus according to claim 1, further comprising sound image position receiving means for receiving original-signal sound image position information, which indicates the sound image position of the original signal, and desired sound image position information, which indicates a desired sound image position, wherein the parameter processing means performs the operation that changes the parameters indicating the relationship between the plurality of channels on the basis of the original-signal sound image position information and the desired sound image position information.

The invention according to claim 3 is the speech/acoustic signal processing apparatus according to claim 1, further comprising coding means for applying predetermined coding to the signal to be processed, wherein the parameter processing means includes: preceding decoding means for decoding the signal to which the predetermined coding has been applied; and post-decoding operation means for operating on the parameters, decoded by the preceding decoding means, that indicate the relationship between the plurality of channels.

The invention according to claim 4 is the speech/acoustic signal processing apparatus according to claim 3, further comprising: sound image position estimating means for estimating original-signal sound image position information, which indicates the sound image position of the original signal, on the basis of the parameters indicating the relationship between the plurality of channels decoded by the preceding decoding means; and sound image position receiving means for receiving desired sound image position information indicating a desired sound image position, wherein the parameter processing means performs the operation that changes the parameters indicating the relationship between the plurality of channels on the basis of the original-signal sound image position information and the desired sound image position information.

The invention according to claim 5 is the speech/acoustic signal processing apparatus according to claim 1 or 2, further comprising coding means for applying predetermined coding to the signal to be processed, wherein the parameter processing means includes: pre-decoding operation means for operating on the coded parameters indicating the relationship between the plurality of channels; and subsequent decoding means for decoding the coded parameters on which the operation has been performed.

The invention according to claim 6 is the speech/acoustic signal processing apparatus according to any one of claims 1 to 5, wherein the parameter processing means determines, on the basis of the sound image position formed by the plurality of channels and the sound image position changed by the predetermined angle, which of a plurality of preset parameter change factors to use, and performs the operation by applying the determined parameter change factor.

The invention according to claim 7 is the speech/acoustic signal processing apparatus according to any one of claims 1 to 5, wherein the parameter processing means determines, on the basis of the sound image position formed by the plurality of channels and the sound image position changed by the predetermined angle, which of a plurality of preset parameter change factors to use; further corrects the determined parameter change factor on the basis of those two sound image positions to generate a corrected parameter change factor; and performs the operation by applying the generated corrected parameter change factor.

The invention according to claim 8 is the speech/acoustic signal processing apparatus according to any one of claims 1 to 7, wherein the plurality of channel signals are binaural signals.

The invention according to claim 9 is the speech/acoustic signal processing apparatus according to any one of claims 1 to 8, further comprising listening direction detecting means for detecting the listening direction of a listener who listens, as sound, to the plurality of channel signals generated by the multi-channel signal generating means, wherein the parameter processing means determines the predetermined angle on the basis of the output of the listening direction detecting means.

The invention according to claim 10 is a speech/acoustic signal processing method for processing a signal obtained by converting a plurality of channel signals into a smaller number of channel signals and parameters indicating the relationship between the plurality of channels, the method comprising: a parameter processing step of performing an operation that changes the parameters indicating the relationship between the plurality of channels, thereby changing the sound image position formed by the plurality of channels by a predetermined angle; and a multi-channel signal generating step of generating, from the smaller number of channel signals and the parameters on which the operation has been performed, a plurality of channel signals whose sound image position has been changed by the predetermined angle.

As described above, the present invention provides parameter processing means that performs an operation changing the parameters indicating the relationship between a plurality of channels, thereby changing the sound image position formed by the plurality of channels by a predetermined angle, and multi-channel signal generating means that generates, from the smaller number of channel signals and the parameters on which the operation has been performed, a plurality of channel signals whose sound image position has been changed by the predetermined angle. The invention can thereby provide a speech/acoustic signal processing apparatus and method that change or correct a sound image position with low delay and a small amount of computation.

An embodiment of the present invention is described below with reference to the drawings. In the following description, signals are digital signals obtained by analog-to-digital conversion. Each of the following embodiments is described as including processing that codes a signal into a coded sequence and decodes it. The present invention does not, however, depend on coding and decoding, and it goes without saying that the invention operates and achieves its effects even without such a configuration.

(First embodiment)
FIG. 1 shows, as one embodiment of the present invention, a decoding apparatus that decodes a coded sequence obtained by coding a multi-channel speech/acoustic signal. The multi-channel signal has been converted into a smaller number of channel signals and parameters representing the relationship between the channels of the original multi-channel signal, and then coded. The PS coding scheme is one example of such a scheme. Parameters representing the relationship between channels include, for example, IID, IPD, OPD, ICC, and the Inter-channel Time Difference (ITD), which represents the time difference between two channels.

The minority channel signal coded sequence, obtained by coding the minority channel signal, is decoded into the minority channel signal by the minority channel signal decoding unit 101. The inter-channel parameter coded sequence, obtained by coding the parameters representing the relationship between channels, is decoded into those parameters by the inter-channel parameter decoding unit 102. The decoded inter-channel parameters are input to the inter-channel parameter changing unit 103, which forms part of the parameter processing means and constitutes the post-decoding operation means.

The sound image position of the original signal before coding and the desired sound image position are also input to the inter-channel parameter changing unit 103. On the basis of these two inputs, the changing unit 103 changes the inter-channel parameters from the decoding unit 102 so that the sound image position of the finally decoded multi-channel signal becomes the desired sound image position. The changed inter-channel parameters and the minority channel signal decoded by the decoding unit 101 are input to the multi-channel signal synthesis unit 104, which serves as the multi-channel signal generating means. The synthesis unit 104 synthesizes the multi-channel signal from the minority channel signal using the changed inter-channel parameters and outputs it.
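The order of operations in FIG. 1 (change the decoded parameters first, then synthesize) can be sketched for a single complex subband bin as follows. Everything concrete in this sketch is an assumption made for illustration: the patent does not specify the synthesis rule, so an energy-preserving split with the IPD shared symmetrically between the two channels is used, and the names `synthesize_bin`, `decode_subband`, and `modify` are hypothetical.

```python
import cmath
import math


def synthesize_bin(mono, iid_db, ipd):
    """Stand-in for synthesis unit 104: build left/right bins from one mono bin.

    Assumed rule: the gains satisfy g_left / g_right = 10**(iid_db / 20) and
    g_left**2 + g_right**2 = 2; the IPD is split evenly between the channels.
    """
    ratio = 10.0 ** (iid_db / 20.0)
    g_right = math.sqrt(2.0 / (1.0 + ratio ** 2))
    g_left = ratio * g_right
    left = mono * g_left * cmath.exp(1j * ipd / 2)
    right = mono * g_right * cmath.exp(-1j * ipd / 2)
    return left, right


def decode_subband(mono, iid_db, ipd, modify):
    """Unit 103 followed by unit 104: modify the parameters, then synthesize."""
    iid_db, ipd = modify(iid_db, ipd)
    return synthesize_bin(mono, iid_db, ipd)


# Hypothetical modification: shift the IPD by 0.3 rad and keep the IID.
left, right = decode_subband(1 + 0j, 6.0, 0.2,
                             lambda iid, ipd: (iid, ipd + 0.3))
```

Because only a few parameters per subband are modified before synthesis, no re-filtering of the audio signal itself is needed, which is the source of the low delay and low computation that the text claims.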

FIG. 14 shows a first example of the encoder-side configuration, and FIG. 15 shows a second example. FIG. 14 shows a configuration that generates a multi-channel stereophonic signal from a speech/acoustic signal and codes it. The multi-channel signal generation unit 112 generates a multi-channel stereophonic signal from the input speech/acoustic signal in accordance with separately input sound image position information, and the generated signal is coded by the multi-channel signal coding unit 117. Within the coding unit 117, the minority channel signal conversion unit 113 converts the input multi-channel signal into a signal with fewer channels. The resulting minority channel signal is coded by the minority channel signal coding unit 114, which outputs the minority channel coded sequence. The inter-channel parameter calculation unit 115 calculates the inter-channel parameters from the multi-channel stereophonic signal. The calculated parameters are coded by the inter-channel parameter coding unit 116, which outputs the inter-channel parameter coded sequence.

The sound image position information is coded by the sound image position information coding unit 118, which outputs a sound image position coded sequence. The decoder side decodes this sequence and uses the resulting sound image position information as the sound image position information of the original signal. Although an example that codes the sound image position information is shown here, the information need not be coded, and the way it is communicated is not limited to this. The sound image position information is, for example, the angle of the sound source in the horizontal plane relative to the listener's front. As an example, consider changing the sound image position of a sound source contained in the multi-channel signal so that it follows head movement. As shown in FIG. 16, the listener's head orientation α(t) at time t is reported to the encoder side, and the encoder generates and codes a multi-channel stereophonic signal based on the absolute arrival direction β(t+m) of the sound source at time t+m (m > 0) and the head orientation α(t) at time t. The decoder side changes this into a multi-channel stereophonic signal based on the head orientation α(t+n) at time t+n (n > m) and the absolute arrival direction β(t+m). The arrival direction of the original signal can then be written as β(t+m) − α(t), and the desired sound image position as β(t+m) − α(t+n).
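The two angles in the head-tracking example can be computed as in the following minimal sketch. The function names and the wrapping of angles to [-180, 180) degrees are assumptions added here; the text only gives the two differences.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the range [-180, 180) (an added assumption)."""
    return (angle + 180.0) % 360.0 - 180.0


def image_positions(beta_tm, alpha_t, alpha_tn):
    """Return (original, desired) sound image positions relative to the head.

    beta_tm  : absolute source direction beta(t+m) used by the encoder
    alpha_t  : head orientation alpha(t) reported to the encoder
    alpha_tn : head orientation alpha(t+n) measured at playback
    """
    original = wrap_deg(beta_tm - alpha_t)    # beta(t+m) - alpha(t)
    desired = wrap_deg(beta_tm - alpha_tn)    # beta(t+m) - alpha(t+n)
    return original, desired


# The source is at 30 degrees; the head turned from 0 to 20 degrees
# between encoding and playback.
original, desired = image_positions(30.0, 0.0, 20.0)
```

In this example the signal was coded for a source 30 degrees from the listener's front, but by playback time the source should appear only 10 degrees from the front, so the decoder must move the sound image by the difference.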

FIG. 15 shows a configuration that codes a multi-channel stereophonic signal picked up by a plurality of microphones. The signals picked up by N microphones are formed into a multi-channel signal by the multi-channel signal pickup unit 119, and the multi-channel signal is coded by the multi-channel signal coding unit 117, which operates as in the first example of the encoder-side configuration. The arrival direction estimation unit 120 estimates the arrival direction from the multi-channel signal. The estimate is coded by the sound image position information coding unit 118, which outputs a sound image position information coded sequence. The arrival direction may also be estimated using the inter-channel parameters calculated by the inter-channel parameter calculation unit 115. The decoder side handles the sound image position information coded sequence as in the first example of the encoder-side configuration.

In this embodiment the arrival direction is estimated on the encoder side, but it may instead be estimated on the decoder side. For example, a circuit (not shown) that estimates the sound image position of the original signal may be placed after the decoding units 101 and 102 in FIG. 1, and the sound image position may be changed by inputting the desired sound image position to the inter-channel parameter changing unit 103.

In the first and second examples of the encoder-side configuration, the sound image position information, that is, the information on the sound image position of the original signal, need not be sent to the decoder side as auxiliary information. The decoder side may instead estimate it from the decoded multi-channel signal, the inter-channel parameters, or the like, in the same way as the estimation in the second encoder-side example. The present invention does not limit how the decoder side obtains the sound image position of the original signal.

FIG. 2 shows a first example of the inter-channel parameter changing unit 103 shown in FIG. 1. The sound image position of the original signal before coding and the desired sound image position are input to the inter-channel parameter change factor selection unit 106. From a plurality of inter-channel parameter change factors held in advance, the selection unit 106 selects an appropriate factor on the basis of these two positions. The selected factor and the inter-channel parameters decoded by the inter-channel parameter decoding unit 102 are input to the inter-channel parameter change factor application unit 105. The application unit 105 applies the factor to the inter-channel parameters and outputs the changed inter-channel parameters as the output of the inter-channel parameter changing unit 103. An example of selecting and applying a change factor is given below, but the selection and application methods are not limited to it. For instance, the example below adds the change factor to the parameter, but the changed inter-channel parameter may instead be obtained as the product of the change factor and the inter-channel parameter.

For example, suppose that the IPD is used as the inter-channel parameter and that a signal generated by encoding the inter-channel parameter for each subband has been decoded. When the arrival direction of the original signal in the k-th subband is a_k, the decoded IPD of the k-th subband is ipd(a_k, k). When the desired arrival direction is b_k, δ(a_k, b_k, k) is selected from the inter-channel parameter change factors held in advance. If δ(a_k, b_k, k) is not held, the change factor δ(c_k, d_k, k) whose c_k and d_k are closest to a_k and b_k is selected instead, for example. The selected change factor is applied in the inter-channel parameter change factor application unit 105, and the inter-channel parameter after the sound image position change is obtained by adding it to the decoded parameter:

ipd(b_k, k) = ipd(a_k, k) + δ(a_k, b_k, k)   (k = 0, 1, …, K-1)

This is done for all K subbands, and the changed inter-channel parameters for k = 0, 1, …, K-1 are output.
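The selection and additive application described in this first example can be sketched as follows. This is a minimal illustration, not the patented implementation: the table layout (one dictionary per subband keyed by a pair of angles), the Euclidean nearest-pair rule, and the names `select_factor` and `apply_factors` are all assumptions made for the example, and the factor values are made up.

```python
import math

def select_factor(table, a, b):
    """Return the stored factor for (a, b), or the factor of the nearest stored pair."""
    if (a, b) in table:
        return table[(a, b)]
    # Fall back to the stored pair (c, d) closest to (a, b) (assumed distance measure).
    nearest = min(table, key=lambda cd: math.hypot(cd[0] - a, cd[1] - b))
    return table[nearest]

def apply_factors(ipd, tables, original, desired):
    """Add the selected change factor to the decoded IPD of every subband k."""
    return [
        ipd[k] + select_factor(tables[k], original[k], desired[k])
        for k in range(len(ipd))
    ]

# Hypothetical factor tables for K = 2 subbands, keyed by (from_deg, to_deg).
tables = [
    {(0, 30): 0.20, (0, 60): 0.35},
    {(0, 30): 0.40, (0, 60): 0.70},
]
changed = apply_factors([0.0, 0.1], tables, original=[0, 0], desired=[30, 58])
```

In the second subband the pair (0, 58) is not stored, so the factor for the nearest stored pair (0, 60) is used. A multiplicative variant would replace the addition with a product, as the text notes.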

FIG. 3 shows a second example of the inter-channel parameter changing unit 103. The sound image position of the original signal before encoding and the desired sound image position are input to the inter-channel parameter change factor selection unit 106, which selects an appropriate inter-channel parameter change factor from a plurality of change factors held in advance, based on these two positions. The selected change factor is sent to the inter-channel parameter change factor correction unit 107, which corrects it based on the sound image position of the original signal and the desired sound image position when correction is needed.

An example of correcting the inter-channel parameter change factor is given below, but the correction method is not limited to it. As in the first example of the inter-channel parameter changing unit, let the sound image position of the original signal in the k-th subband be a_k and the desired sound image position be x_k, and suppose that the inter-channel parameter change factor selection unit 106 has selected the change factor δ(a_k, b_k, k), which changes the sound image position from a_k to b_k, where a_k < x_k < b_k. In this case δ(a_k, b_k, k) is corrected to δ(a_k, x_k, k).

This is done for all K subbands. The corrected inter-channel parameter change factor is input to the inter-channel parameter change factor application unit 105, which applies it to the inter-channel parameter and outputs the resulting changed inter-channel parameter as the output of the inter-channel parameter changing unit 103.
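The correction equation itself is not reproduced in this text. As one plausible illustration only, the sketch below assumes that the change factor varies linearly with angle between a_k and b_k, so that the stored factor is scaled by the fraction of the angular distance actually required. The linear rule and the name `correct_factor` are assumptions, not the method stated in the source.

```python
def correct_factor(delta_ab, a, b, x):
    """Scale delta(a, b, k) to approximate delta(a, x, k) for a < x < b.

    Assumption: the factor grows linearly with angle between a and b.
    """
    if not a < x < b:
        raise ValueError("expected a < x < b")
    return delta_ab * (x - a) / (b - a)

# Stored factor moves the image from 0 to 30 degrees; the listener wants 15 degrees.
corrected = correct_factor(0.4, a=0.0, b=30.0, x=15.0)
```

Under this assumption a target halfway between a_k and b_k uses half of the stored factor.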

FIG. 4 shows a third example of the inter-channel parameter changing unit 103 of the embodiment described above. The sound image position of the original signal before encoding and the desired sound image position are input to the inter-channel parameter change factor calculation unit 108, which calculates the inter-channel parameter change factor from these two positions.

An example of the calculation is given below, but the calculation method is not limited to it. Suppose that the multi-channel signal is a binaural signal, that the IPD (ipd) is used as the inter-channel parameter, and that the listener's head is spherical as shown in FIG. 5. Then ipd is the phase difference corresponding to the inter-channel arrival distance difference d in FIG. 5, and the ipd for an arrival direction of angle x is given by Equation (2), where f is the frequency, r is the radius of the spherical head, and c is the speed of sound. As in the first example of the inter-channel parameter changing unit, let the arrival direction of the original signal in the k-th subband be a_k and the desired arrival direction be x_k. The change factor δ^ipd(a_k, x_k, k) for the IPD of the k-th subband is then calculated using Equation (2), where f_k is the center frequency of the k-th subband.

The calculated inter-channel parameter change factor is input to the inter-channel parameter change factor application unit 105, which applies it to the inter-channel parameter and outputs the resulting changed inter-channel parameter as the output of the inter-channel parameter changing unit 103.
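Equation (2) is not reproduced in this text, so the following is only a sketch under stated assumptions. It assumes the classic spherical-head path difference d = r(x + sin x), with x in radians measured from straight ahead, converts that distance to a phase at frequency f, and takes the change factor as the difference between the IPD at the desired direction and the IPD at the original direction, which is what an additive factor must equal. The default head radius, the speed of sound, and the function names are illustrative choices; phase wrapping is not handled.

```python
import math

def ipd_spherical(angle_rad, freq_hz, head_radius_m=0.0875, speed_of_sound=343.0):
    """IPD in radians for a spherical head (assumed model: d = r * (x + sin x))."""
    d = head_radius_m * (angle_rad + math.sin(angle_rad))  # arrival distance difference
    return 2.0 * math.pi * freq_hz * d / speed_of_sound    # distance -> phase at freq_hz

def ipd_change_factor(a_k, x_k, f_k, **model):
    """Additive factor that moves the IPD from direction a_k to direction x_k."""
    return ipd_spherical(x_k, f_k, **model) - ipd_spherical(a_k, f_k, **model)

# Subband centred at 500 Hz, image moved from straight ahead to 30 degrees.
delta = ipd_change_factor(0.0, math.radians(30.0), 500.0)
```

Because the factor is computed directly from the two angles, no table of stored factors is needed in this variant.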

FIG. 17 shows an example of a playback device that uses the coding apparatus of this embodiment. The playback device 1701 includes a processor 1702, a decoding device 1703, an original signal sound image position signal decoder 1704, and a desired sound image position information input unit 1705. Using the original signal sound image position information obtained from the decoder 1704 and the desired sound image position information obtained from the input unit 1705, the processor 1702 determines whether the inter-channel parameter needs to be changed. If so, it instructs the decoding device 1703 to apply the change before outputting the multi-channel signal, which is then passed through a predetermined digital-to-analog conversion and reproduced as stereophonic sound by the speaker 1706. The desired sound image position information can be calculated automatically, for example from the orientation of the listener as described later, or the listener can be asked to enter a desired angle. FIG. 17 shows only one application of this embodiment; applications are not limited to such a playback device, and various other devices and forms are conceivable.

FIG. 6 is a flowchart showing the processing of the first embodiment of the present invention. First, the minority channel signal coded sequence is decoded into a minority channel signal (S601), and the inter-channel parameter coded sequence is decoded into an inter-channel parameter (S602). Based on the sound image position of the original signal before encoding and the desired sound image position, it is determined whether the decoded inter-channel parameter needs to be changed (S603). If it does, the inter-channel parameter is changed based on those two positions (S604). A multi-channel signal is then synthesized from the changed inter-channel parameter and the decoded minority channel signal, and output (S605). If the inter-channel parameter does not need to be changed, the multi-channel signal is synthesized using the decoded inter-channel parameter as is. Steps S601 and S602 may be performed in the reverse order or in parallel.
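The control flow of steps S601 to S605 can be outlined as below. The decoder, the parameter change, and the synthesis stage are passed in as callables because the text does not fix their implementations. The function name `decode_frame` and the test "positions differ" for step S603 are assumptions made for this sketch.

```python
def decode_frame(signal_code, param_code, original_pos, desired_pos,
                 decode_signal, decode_params, change_params, synthesize):
    """Decode one frame, changing the inter-channel parameters only when needed."""
    signal = decode_signal(signal_code)    # S601: minority channel signal
    params = decode_params(param_code)     # S602: inter-channel parameters
    if original_pos != desired_pos:        # S603: is a change needed? (assumed criterion)
        params = change_params(params, original_pos, desired_pos)  # S604
    return synthesize(signal, params)      # S605: multi-channel signal
```

Since S601 and S602 are independent of each other, the first two calls could equally be swapped or run concurrently.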

FIG. 7 is a flowchart showing a first processing example of step S604. Based on the sound image position of the original signal before encoding and the desired sound image position, an appropriate inter-channel parameter change factor is selected from the inter-channel parameter change factors held in advance (S701). The selected change factor is applied to the decoded inter-channel parameter to modify it (S702), and the modified inter-channel parameter is output (S703).

FIG. 8 is a flowchart showing a second processing example of step S604. First, step S701 is performed. Then, based on the sound image position of the original signal before encoding and the desired sound image position, it is determined whether the selected inter-channel parameter change factor needs to be corrected (S801). If it does, the change factor is corrected based on the sound image position of the original signal and the desired sound image position (S802). The method of the second example of the inter-channel parameter changing unit 103 can be used as one correction method. Steps S702 to S703 are then performed with the corrected change factor. If it is determined in step S801 that no correction is needed, steps S702 to S703 are performed with the change factor selected in step S701.

FIG. 9 is a flowchart showing a third processing example of step S604. Based on the sound image position of the original signal before encoding and the desired sound image position, an inter-channel parameter change factor is calculated (S901). The method of the third example of the inter-channel parameter changing unit 103 is one possible calculation method. Steps S702 to S703 are then performed with the calculated change factor.

The above describes a method of changing the encoded inter-channel parameter after decoding it. In addition to the effects of the present invention described above, this makes it possible to provide a listening direction detection unit (not shown) that detects the movement of the listener's head and to change the sound image position according to the listener's listening direction.

Furthermore, when the apparatus or method of this embodiment is used for wireless communication between two parties, the sound image position of the speaker can be set in the direction in which the other party is located by using portable terminals (not shown) provided with detection means (not shown) for detecting the position information of both listeners. With this approach the direction of the other party is known, so the two parties can easily approach and find each other, for example when they are close together but cannot find each other in a crowd.
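How a desired sound image angle might be derived from the two positions and the listener's facing direction is sketched below. The text does not specify a coordinate system, so this example assumes planar (east, north) coordinates in metres and a heading measured clockwise from north; the function name `relative_bearing_deg` is hypothetical.

```python
import math

def relative_bearing_deg(listener_xy, other_xy, heading_deg):
    """Angle of the other party relative to the listener's facing direction.

    Positive values are to the listener's right; the result lies in [-180, 180).
    """
    dx = other_xy[0] - listener_xy[0]            # east offset
    dy = other_xy[1] - listener_xy[1]            # north offset
    bearing = math.degrees(math.atan2(dx, dy))   # clockwise from north
    return (bearing - heading_deg + 180.0) % 360.0 - 180.0

# Other party is due east; the listener faces north, so the image goes 90 degrees right.
angle = relative_bearing_deg((0.0, 0.0), (10.0, 0.0), heading_deg=0.0)
```

Recomputing this angle whenever the heading changes also covers the head-movement case mentioned earlier: turning toward the other party brings the angle to zero.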

(Second Embodiment)
FIG. 10 shows an apparatus according to the second embodiment of the present invention that decodes a coded sequence obtained by encoding multi-channel audio/acoustic signals. The inter-channel parameter coded sequence, obtained by encoding a parameter representing the relationship between channels, is input to the inter-channel parameter coded sequence changing unit 109, which constitutes part of the parameter processing means and the post-decoding operation processing means. The sound image position of the original signal before encoding and the desired sound image position are also supplied to the changing unit 109. Based on these inputs, the inter-channel parameter coded sequence is changed into another inter-channel parameter coded sequence, which is input to the inter-channel parameter decoding unit 102. However, when it is determined from the two inputs that the coded sequence does not need to be changed, for example because the sound image position of the original signal and the desired sound image position are judged to coincide, the coded sequence is input to the decoding unit 102 unchanged. In FIG. 10, elements that operate in the same way as in the first embodiment shown in FIG. 1 are given the same reference numerals. The basic difference is that the first embodiment decodes first and then changes the sound image position, whereas this embodiment changes the sound image position first and then decodes. Examples of the encoder-side configuration for the second embodiment are the first and second examples of the encoder-side configuration of the first embodiment described above, shown in FIGS. 13 and 14.

FIG. 11 shows a first example of the inter-channel parameter coded sequence changing unit 109. The sound image position of the original signal before encoding and the desired sound image position are input to the change rule selection unit 111. Based on these two inputs, the change rule selection unit 111 selects an appropriate change rule from the change rules for the inter-channel parameter coded sequence held in advance and notifies the change rule application unit 110 of it. When it is determined from the sound image position of the original signal and the desired sound image position that the coded sequence does not need to be changed, the selection unit 111 instead notifies the change rule application unit 110 that the coded sequence is not to be changed. The change rule application unit 110 changes the inter-channel parameter coded sequence into another inter-channel parameter coded sequence according to the notified change rule and outputs it. When notified that the coded sequence is not to be changed, it outputs the coded sequence unchanged.
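The text does not say what form a change rule takes. Purely as an illustration, the sketch below assumes that the coded sequence is a list of quantizer indices, one per subband, for a uniformly quantized cyclic parameter such as the IPD, and that a change rule is a per-subband index offset applied modulo the number of quantizer levels. The rule table, the `None` convention for "do not change", and the function names are all hypothetical.

```python
def select_rule(rules, original_pos, desired_pos):
    """Pick the stored rule for this position pair, or None when no change is needed."""
    if original_pos == desired_pos:
        return None
    return rules[(original_pos, desired_pos)]

def apply_rule(codes, rule, levels):
    """Shift each subband's quantizer index by the rule's offset, without decoding."""
    if rule is None:
        return list(codes)  # pass the coded sequence through unchanged
    return [(code + offset) % levels for code, offset in zip(codes, rule)]

# Hypothetical rule: moving the image from 0 to 30 degrees shifts the two indices by 1 and 2.
rules = {(0, 30): [1, 2]}
new_codes = apply_rule([3, 7], select_rule(rules, 0, 30), levels=8)
```

Operating on indices keeps the change to integer arithmetic before the decoder, which is in keeping with the low-computation aim stated in the document.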

FIG. 12 is a flowchart showing the processing of the second embodiment of the present invention. Step S601 is the same as in the flowchart of the first embodiment. Based on the sound image position of the original signal before encoding and the desired sound image position, it is determined whether the inter-channel parameter coded sequence needs to be changed (S1201). If it does, the inter-channel parameter coded sequence changing unit 109 changes the coded sequence based on the sound image position of the original signal and the desired sound image position (S1202), and processing proceeds to step S602. If it does not, processing proceeds to step S602 with the coded sequence unchanged. In step S602, the inter-channel parameter decoding unit 102 decodes the inter-channel parameter coded sequence into an inter-channel parameter. Steps S602 and S605 are otherwise the same as in the flowchart of the first embodiment. Step S601 and steps S1201 to S602 may be performed in the reverse order or in parallel.

FIG. 13 is a flowchart showing a first processing example of step S604. Based on the sound image position of the original signal before encoding and the desired sound image position, an appropriate change rule is selected from the change rules for the inter-channel parameter coded sequence held in advance (S1301). The selected change rule is applied to the inter-channel parameter coded sequence to change it (S1302), and the changed inter-channel parameter coded sequence is output (S1303).

As described above, according to the present invention, for a coded sequence obtained by encoding a multi-channel stereophonic signal using parameters that represent the relationship between channels, the inter-channel parameter is changed on the playback side so that the sound image position is changed or corrected to a desired direction. The sound image position can thus be changed or corrected with low delay and a low amount of computation.

FIG. 1 shows an apparatus according to an embodiment of the present invention that decodes a coded sequence obtained by encoding multi-channel audio/acoustic signals.
FIG. 2 shows a first example of the inter-channel parameter changing unit according to an embodiment of the present invention.
FIG. 3 shows a second example of the inter-channel parameter changing unit according to an embodiment of the present invention.
FIG. 4 shows a third example of the inter-channel parameter changing unit according to an embodiment of the present invention.
FIG. 5 shows an example of the listener's head in this embodiment.
FIG. 6 is a flowchart showing the processing of the first embodiment of the present invention.
FIG. 7 is a flowchart showing a first processing example of step S604 of this embodiment.
FIG. 8 is a flowchart showing a second processing example of step S604 of this embodiment.
FIG. 9 is a flowchart showing a third processing example of step S604 of this embodiment.
FIG. 10 shows an apparatus according to the second embodiment of the present invention that decodes a coded sequence obtained by encoding multi-channel audio/acoustic signals.
FIG. 11 shows a first example of the inter-channel parameter coded sequence changing unit of the second embodiment of the present invention.
FIG. 12 is a flowchart showing the processing of the second embodiment of the present invention.
FIG. 13 is a flowchart showing a first processing example of step S604 of this embodiment.
FIG. 14 shows a configuration according to an embodiment of the present invention that generates and encodes a multi-channel stereophonic signal from an audio/acoustic signal.
FIG. 15 shows a configuration according to an embodiment of the present invention that encodes a multi-channel stereophonic signal picked up by a plurality of microphones.
FIG. 16 shows an example of the listener's head according to an embodiment of the present invention.
FIG. 17 shows an example of a playback device that uses the coding apparatus of this embodiment.

Explanation of symbols

101 Minority channel signal decoding unit
102 Inter-channel parameter decoding unit
103 Inter-channel parameter changing unit
104 Multi-channel signal synthesis unit
105 Inter-channel parameter change factor application unit
106 Inter-channel parameter change factor selection unit
107 Inter-channel parameter change factor correction unit
108 Inter-channel parameter change factor calculation unit
109 Inter-channel parameter coded sequence changing unit
110 Change rule application unit
111 Change rule selection unit
112 Multi-channel signal generation unit
113 Minority channel signal conversion unit
114 Minority channel signal encoding unit
115 Inter-channel parameter calculation unit
116 Inter-channel parameter encoding unit
117 Multi-channel signal encoding unit
118 Sound image position information encoding unit
119 Multi-channel signal sound collection unit
120 Arrival direction estimation unit
1701 Playback device
1702 Processor
1703 Decoding device
1704 Original signal sound image position signal decoder
1705 Desired sound image position information input unit
1706 Speaker

Claims

An audio/acoustic signal processing apparatus for processing a signal obtained by converting a plurality of channel signals into channel signals fewer in number than the plurality of channels and a parameter indicating a relationship between the plurality of channels, the apparatus comprising:
parameter processing means for performing an operation that changes the parameter indicating the relationship between the plurality of channels so that a sound image position formed by the plurality of channels is changed by a predetermined angle; and
multi-channel signal generating means for generating, based on the channel signals fewer in number than the plurality of channels and the parameter indicating the relationship between the plurality of channels on which the operation has been performed, the plurality of channel signals whose sound image position has been changed by the predetermined angle.

The audio/acoustic signal processing apparatus according to claim 1, further comprising sound image position receiving means for receiving original signal sound image position information indicating the sound image position of the original signal and desired sound image position information indicating the desired sound image position,
wherein the parameter processing means performs the operation that changes the parameter indicating the relationship between the plurality of channels based on the original signal sound image position information and the desired sound image position information.

The audio/acoustic signal processing apparatus according to claim 1, wherein the parameter processing means includes preceding decoding means for decoding a signal that has undergone predetermined encoding, and post-decoding operation processing means for performing the operation on the parameter indicating the relationship between the plurality of channels decoded by the preceding decoding means.

The audio/acoustic signal processing apparatus, further comprising: sound image position estimation means for estimating original signal sound image position information indicating the sound image position of the original signal, based on the parameter indicating the relationship between the plurality of channels decoded by the preceding decoding means; and sound image position receiving means for receiving desired sound image position information indicating the desired sound image position,
wherein the parameter processing means performs the operation that changes the parameter indicating the relationship between the plurality of channels based on the original signal sound image position information and the desired sound image position information.

The audio/acoustic signal processing apparatus according to claim 1, wherein the parameter processing means includes pre-decoding operation processing means for performing the operation on an encoded parameter indicating the relationship between the plurality of channels, and subsequent decoding means for decoding the encoded parameter indicating the relationship between the plurality of channels on which the operation has been performed.

The audio/acoustic signal processing apparatus according to claim 1, wherein the parameter processing means determines, based on the sound image position formed by the plurality of channels and the sound image position changed by the predetermined angle, which of a plurality of preset parameter change factors to use, and performs the operation by applying the determined parameter change factor.

The audio/acoustic signal processing apparatus according to claim 1, wherein the parameter processing means determines, based on the sound image position formed by the plurality of channels and the sound image position changed by the predetermined angle, which of a plurality of preset parameter change factors to use, corrects the determined parameter change factor based on the sound image position formed by the plurality of channels and the sound image position changed by the predetermined angle to generate a corrected parameter change factor, and performs the operation by applying the generated corrected parameter change factor.

The audio / acoustic signal processing apparatus according to claim 1, wherein the plurality of channel signals are binaural signals.

The audio/acoustic signal processing apparatus according to claim 1, further comprising listening direction detecting means for detecting the listening direction of a listener who listens to the plurality of channel signals generated by the multi-channel signal generating means as sound,
wherein the parameter processing means determines the predetermined angle based on the listening direction detected by the listening direction detecting means.

An audio/acoustic signal processing method for processing a signal obtained by converting a plurality of channel signals into channel signals fewer in number than the plurality of channels and a parameter indicating a relationship between the plurality of channels, the method comprising:
a parameter processing step of performing an operation that changes the parameter indicating the relationship between the plurality of channels so that a sound image position formed by the plurality of channels is changed by a predetermined angle; and
a multi-channel signal generation step of generating, based on the channel signals fewer in number than the plurality of channels and the parameter indicating the relationship between the plurality of channels on which the operation has been performed, the plurality of channel signals whose sound image position has been changed by the predetermined angle.