JP2017163458A

JP2017163458A - Up-mix device and program

Info

Publication number: JP2017163458A
Application number: JP2016048051A
Authority: JP
Inventors: Toshiyuki Nishiguchi (西口敏行); Akira Sasaki (佐々木陽); Chiaki Mori (森千晶); Kazuo Ono (小野一穂)
Original assignee: Nippon Hoso Kyokai NHK
Current assignee: Japan Broadcasting Corp
Priority date: 2016-03-11
Filing date: 2016-03-11
Publication date: 2017-09-14
Anticipated expiration: 2036-03-11
Also published as: JP6630599B2

Abstract

PROBLEM TO BE SOLVED: To obtain an up-mixed output acoustic signal from an input acoustic signal without mixing work, with little sound quality deterioration even when the output is subsequently down-mixed.

SOLUTION: An up-mix device 1 comprises: a reverberation separator 11 for separating an input acoustic signal into a direct sound and an indirect sound; an impulse response estimation section 12 for estimating an impulse response of the original sound field in which the input acoustic signal was recorded; an impulse response generation section 13 for generating multiple similar impulse responses for which the maximum absolute value of the correlation function with the impulse response is smaller than 1; an indirect sound generation section 14 for generating multiple similar indirect sounds by convolving the direct sound with the multiple similar impulse responses; and a mixer 15 for generating an output acoustic signal by a linear operation using the direct sound, the indirect sound, and the similar indirect sounds.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an upmix device and a program for generating an output acoustic signal having more channels than an input acoustic signal.

Conventionally, a technique called upmixing is known that creates an audio signal with a larger number of channels from an audio signal with a smaller number of channels, such as converting 2ch stereo to 5.1ch surround (see, for example, Patent Documents 1 and 2).

Various algorithms for creating a 5.1ch surround signal from a 2ch stereo signal have been proposed, and some have already been commercialized. Such equipment is used, for example, to apply existing 2-channel audio material to 5.1ch surround production in post-production, or to convert karaoke tracks produced in 2 channels to 5.1ch for use in a 5.1ch surround music program. Besides upmixing at the material level, there are also cases in which an entire program produced in 2-channel stereo is upmixed into a 5.1ch surround program.

Furthermore, in recent years, acoustic systems with many more channels than 5.1ch, such as 22.2ch sound for 8K Super Hi-Vision, have been under development. However, none of the upmix devices currently on the market supports 22.2ch sound. Techniques for generating an M-channel audio signal from an N-channel audio signal (N ≤ M) have also been proposed (see, for example, Patent Documents 3 and 4), but most of them assume a system in which speakers are arranged on a two-dimensional plane, such as 5.1ch surround. Whether they give a natural three-dimensional spatial impression when applied to a three-dimensional acoustic system that also has speakers in the height direction has not been verified.

Upmix technology is thus important in sound production for combining and interchanging material produced in different channel formats. In particular, to make sound production more efficient, an automatic upmix technology is essential that obtains a 22.2ch audio signal (audio data) from conventional 2ch stereo or 5.1ch audio signals (audio data) carrying no additional information (side information) for upmixing, without a sound engineer performing mixing work.

Patent Document 5 therefore discloses a technique for obtaining a 22.2ch audio signal, without mixing work, from a conventional 2ch stereo or 5.1ch surround audio signal that carries no additional information for upmixing.

Conversely, creating an audio signal with fewer channels, such as converting 22.2ch to 5.1ch or 2ch stereo, is called downmixing. Equipment that does not support 22.2ch playback (receivers, audio amplifiers, and the like) performs downmix processing, which mixes the signals of multiple channels according to a predetermined distribution (downmix coefficients) and reconstructs them into a smaller number of channels that can be played back (see, for example, Non-Patent Document 1).
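The downmix processing described above can be sketched as a weighted sum of input channels. This is only an illustration: the function name `downmix`, the three input channels, and the gain values are hypothetical and are not the coefficients defined in ARIB STD-B32.

```python
def downmix(channels, coefficients):
    """Mix named input channels into fewer output channels.

    coefficients[out_name][in_name] is the gain applied to that input.
    """
    n = len(next(iter(channels.values())))
    return {
        out_name: [sum(g * channels[name][t] for name, g in gains.items())
                   for t in range(n)]
        for out_name, gains in coefficients.items()
    }

# Hypothetical two-sample signals and hypothetical downmix coefficients.
channels = {"FL": [1.0, 0.0], "FR": [0.0, 1.0], "FC": [0.5, 0.5]}
coefficients = {"L": {"FL": 1.0, "FC": 0.5}, "R": {"FR": 1.0, "FC": 0.5}}
stereo = downmix(channels, coefficients)
```

Each output sample is a linear combination of input samples, so any two input channels that carry the same signal end up summed in the output.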

Patent Document 1: Japanese Patent No. 4792086
Patent Document 2: Japanese Patent No. 4664431
Patent Document 3: Japanese Patent No. 4989468
Patent Document 4: Japanese Translation of PCT International Application Publication No. 2012-511845
Patent Document 5: Japanese Patent Laying-Open No. 2015-76857

Non-Patent Document 1: ARIB STD-B32 version 3.2, "Video coding, audio coding and multiplexing system in digital broadcasting"

According to the upmix device described in Patent Document 5, a 22.2ch audio signal can be generated, without mixing work, from a conventional 2ch stereo or 5.1ch surround audio signal that carries no additional information for upmixing. However, when the resulting 22.2ch signal is automatically downmixed using the downmix coefficients described above, the sound quality deteriorates for the following reason.

The upmix device described in Patent Document 5 applies delay processing to the indirect sound separated from the original signal, and generates a plurality of new indirect sounds for upmixing by varying the delay time. When the upmixed signal is then downmixed using downmix coefficients, identical signals that differ only in delay time are added together. This forms a comb filter: peaks or dips appear in the frequency response at frequency intervals equal to the reciprocal of the delay time, and the sound quality deteriorates.
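The comb-filter effect can be checked numerically with a minimal sketch. It assumes two equal-gain copies of one signal, one of them delayed, are summed; the function name `summed_gain` and the 1 ms delay are illustrative assumptions, not values from the patent.

```python
import cmath
import math

def summed_gain(freq_hz, delay_s):
    """Magnitude response of y(t) = x(t) + x(t - delay_s) at freq_hz."""
    return abs(1 + cmath.exp(-2j * math.pi * freq_hz * delay_s))

delay_s = 0.001  # hypothetical 1 ms delay between two otherwise identical signals
# Peaks fall at multiples of 1/delay_s and dips halfway between them.
peak_freqs = [k / delay_s for k in range(3)]          # 0, 1000, 2000 Hz
dip_freqs = [(k + 0.5) / delay_s for k in range(3)]   # 500, 1500, 2500 Hz
peak_gains = [summed_gain(f, delay_s) for f in peak_freqs]
dip_gains = [summed_gain(f, delay_s) for f in dip_freqs]
```

With a 1 ms delay the peaks (gain close to 2) and the dips (gain close to 0) each repeat every 1000 Hz, which is the reciprocal of the delay time.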

The present invention was made in view of these circumstances. Its object is to provide an upmix device and a program that can obtain, without mixing work, an upmixed output acoustic signal from an input acoustic signal such as conventional 2ch stereo or 5.1ch surround that carries no additional information for upmixing, with little sound quality deterioration even when the output is downmixed.

To solve the above problem, an upmix device according to the present invention is an upmix device that generates an output acoustic signal having more channels than an input acoustic signal, and comprises: a reverberation separator that separates the input acoustic signal into a direct sound and an indirect sound; an impulse response estimation unit that estimates an impulse response of the original sound field in which the input acoustic signal was recorded; an impulse response generation unit that generates a plurality of similar impulse responses for which the maximum absolute value of the correlation function with the impulse response is smaller than 1; an indirect sound generation unit that generates a plurality of similar indirect sounds by convolving the direct sound with the plurality of similar impulse responses; and a mixer that performs a linear operation using the direct sound, the indirect sound, and the similar indirect sounds to generate the output acoustic signal.

Furthermore, in the upmix device according to the present invention, the impulse response generation unit divides the impulse response into segments of a predetermined duration and rearranges them using random numbers to generate the similar impulse responses.

Furthermore, the upmix device according to the present invention further comprises a multiplier that adjusts the volume level of each channel of the output acoustic signal generated by the mixer.

To solve the above problem, a program according to the present invention causes a computer to function as the above upmix device.

According to the present invention, an upmixed output acoustic signal having more channels than the input acoustic signal can be obtained, without mixing work, from an input acoustic signal such as conventional 2ch stereo or 5.1ch surround that carries no additional information for upmixing. In addition, a blind upmix is realized whose output acoustic signal suffers little sound quality deterioration even when downmixed.

FIG. 1 is a block diagram showing a configuration example of the upmix device according to the first embodiment of the present invention.
FIG. 2 is a diagram showing an example of the speaker arrangement, channel numbers, and channel labels used when reproducing a 22.2ch signal.
FIG. 3 is a block diagram showing a configuration example of the impulse response generation unit in the upmix device according to the first embodiment of the present invention.
FIG. 4 is a block diagram showing a configuration example of the upmix device according to the second embodiment of the present invention.

The upmix device of the present invention generates an output acoustic signal having more channels than an input acoustic signal. Embodiments of the present invention are described in detail below with reference to the drawings.

(First embodiment)
In the first embodiment, the input acoustic signal (original signal) is a 2ch stereo signal in which front left and right sound is recorded, and the output acoustic signal is a 22.2ch signal. FIG. 1 shows a configuration example of the upmix device according to the first embodiment of the present invention. In the example shown in FIG. 1, the upmix device 1 includes a reverberation separator 11, an impulse response estimation unit 12, an impulse response generation unit 13, an indirect sound generation unit 14, a mixer 15, a multiplier 16, and a low-pass filter 17. The multiplier 16, however, is not an essential component.

FIG. 2 shows the speaker arrangement and the channel numbers and labels used when reproducing a 22.2ch signal.

Let y_1(t), y_2(t), ..., y_24(t) denote the signals for channels 1 to 24 in FIG. 2 obtained by upmixing, and let x_L(t) and x_R(t) denote the left and right signals of the 2ch stereo original signal. The original signals x_L(t), x_R(t) are expressed in terms of their direct sounds d_L(t), d_R(t) and indirect sounds (reverberation) r_L(t), r_R(t) as in Equation (1).

$$x_L(t) = d_L(t) + r_L(t), \qquad x_R(t) = d_R(t) + r_R(t) \tag{1}$$

The reverberation separator 11 separates the input original signals x_L(t), x_R(t) into the direct sounds d_L(t), d_R(t) and the indirect sounds r_L(t), r_R(t), respectively, and outputs them to the impulse response estimation unit 12, the mixer 15, and the low-pass filter 17. A known method can be used for the reverberation separation. For example, the separation is performed by the method disclosed in K. Kinoshita et al., "Blind Upmix Of Stereo Music Signals Using Multi-Step Linear Prediction Based Reverberation Extraction", ICASSP, pp. 49-52, 2010.

The impulse response estimation unit 12 estimates the impulse responses h_L(t), h_R(t) of the original sound field (the sound field in which the input acoustic signal was recorded, excluding the direct sound), either from the direct sounds d_L(t), d_R(t) and the indirect sounds r_L(t), r_R(t) separated by the reverberation separator 11, or from the original signals x_L(t), x_R(t) and the direct sounds d_L(t), d_R(t), and outputs them to the impulse response generation unit 13.

The indirect sounds r_L(t), r_R(t) are expressed by Equation (2) using the impulse responses h_L(t), h_R(t) of the original sound field.

$$r_L(t) = d_L(t) * h_L(t), \qquad r_R(t) = d_R(t) * h_R(t) \tag{2}$$

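Here, * denotes convolution. In the frequency domain, Equation (2) is written as Equation (3), where R_L(ω), R_R(ω), D_L(ω), D_R(ω), H_L(ω), and H_R(ω) are the frequency-domain representations of the corresponding time-domain signals.

$$R_L(\omega) = D_L(\omega)\,H_L(\omega), \qquad R_R(\omega) = D_R(\omega)\,H_R(\omega) \tag{3}$$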

From Equation (3), the transfer functions H_L(ω), H_R(ω) of the original sound field are given by Equation (4).

$$H_L(\omega) = \frac{R_L(\omega)}{D_L(\omega)}, \qquad H_R(\omega) = \frac{R_R(\omega)}{D_R(\omega)} \tag{4}$$

The impulse response estimation unit 12 obtains the transfer functions H_L(ω), H_R(ω) from Equation (4) and converts them back to the time domain to calculate the impulse responses h_L(t), h_R(t). If the frequency components D_L(ω), D_R(ω) of the direct sound contain zeros, Equation (4) cannot be evaluated. In that case, the impulse responses h_L(t), h_R(t) can still be estimated from the indirect sounds r_L(t), r_R(t) and the direct sounds d_L(t), d_R(t) of the original signal by applying a known method such as the cross-spectrum method or an adaptive filter method.
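The frequency-domain estimation can be sketched for one channel as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it uses a naive DFT on short zero-padded sequences, and it replaces the plain division of the indirect-sound spectrum by the direct-sound spectrum with a regularized division (the constant `eps` is an assumption) so that the result stays finite when the direct-sound spectrum is close to zero. All names and sample values are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, adequate for short sequences."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def estimate_impulse_response(direct, indirect, eps=1e-9):
    """Estimate h from r = d * h by regularized spectral division.

    eps is an assumed regularization constant; with eps = 0 this reduces
    to dividing the indirect-sound spectrum by the direct-sound spectrum.
    """
    D = dft(direct)
    R = dft(indirect)
    H = [r * d.conjugate() / (abs(d) ** 2 + eps) for d, r in zip(D, R)]
    return [v.real for v in idft(H)]

# Hypothetical data, zero-padded so circular and linear convolution agree.
direct = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
h_true = [0.0, 0.6, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
indirect = [0.0, 0.6, 0.6, 0.15, 0.0, 0.0, 0.0, 0.0]  # direct convolved with h_true
h_est = estimate_impulse_response(direct, indirect)
```

For real recordings a frame-averaged cross-spectrum estimate or an adaptive filter, as named in the text, would normally be preferred over a single-frame division.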

The impulse response generation unit 13 generates a plurality of similar impulse responses h_Lm(t), h_Rn(t), which are impulse responses for which the maximum absolute value of the correlation coefficient with the impulse response estimated by the impulse response estimation unit 12 is smaller than 1, and outputs them to the indirect sound generation unit 14. The method described in Patent Document 5 generates a plurality of indirect sounds by varying the delay time, but in that method the maximum absolute value of the correlation function between the impulse response before the delay and the impulse response after the delay is 1. The processing of the impulse response generation unit 13 therefore does not include the method described in Patent Document 5.
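The distinction between a pure delay and a genuinely different impulse response can be checked with a small sketch. It assumes the correlation function is the cross-correlation normalized by the energies of both sequences and evaluated over all lags; the function name and the sample values are hypothetical.

```python
import math

def max_abs_correlation(a, b):
    """Maximum over all lags of |normalized cross-correlation| of a and b."""
    norm = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    best = 0.0
    for lag in range(-(len(b) - 1), len(a)):
        acc = 0.0
        for i, bv in enumerate(b):
            j = i + lag
            if 0 <= j < len(a):
                acc += a[j] * bv
        best = max(best, abs(acc) / norm)
    return best

h = [1.0, 0.5, -0.25, 0.125]            # hypothetical impulse response
h_delayed = [0.0, 0.0] + h              # same response delayed by two samples
h_reordered = [-0.25, 1.0, 0.125, 0.5]  # same samples in a different order

corr_delayed = max_abs_correlation(h, h_delayed)
corr_reordered = max_abs_correlation(h, h_reordered)
```

The delayed copy reaches a maximum of 1 at the lag that undoes the delay, whereas the reordered response stays below 1 at every lag.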

For example, the impulse response generation unit 13 generates the similar impulse responses h_Lm(t), h_Rn(t) by dividing the original impulse responses h_L(t), h_R(t) into segments of a predetermined duration and rearranging the segments using random numbers.
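The divide-and-rearrange step can be sketched as below. The segment length, the seeds, and the sample values are illustrative assumptions; the text does not specify how the random numbers are drawn.

```python
import random

def shuffle_segments(h, segment_len, seed):
    """Split h into fixed-length segments and reorder them pseudo-randomly."""
    segments = [h[i:i + segment_len] for i in range(0, len(h), segment_len)]
    rng = random.Random(seed)  # a different seed gives a different ordering
    rng.shuffle(segments)
    return [sample for seg in segments for sample in seg]

h = [0.9, 0.7, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]  # hypothetical impulse response
similar = [shuffle_segments(h, segment_len=2, seed=s) for s in range(1, 4)]
```

Every output contains exactly the same samples as the input, so its energy is unchanged; only the temporal order of the segments differs.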

FIG. 3 shows, as one example of the impulse response generation unit 13, the configuration disclosed in Japanese Patent No. 3657340. The impulse response generation unit 13 is given the impulse responses h_L(t) and h_R(t), but for convenience the following description assumes that a single impulse response h(t) is given and explains how m similar impulse responses h_1(t), h_2(t), ..., h_m(t) are generated from it.

In the example shown in FIG. 3, the impulse response generation unit 13 includes a signal separation unit 131, n multiplication units 132 (132-1 to 132-n), m random number generation units 133 (133-1 to 133-m), n time shift units 134 (134-1 to 134-n), a signal addition unit 135, a switch 136, and a switch 137.

The signal separation unit 131 separates the given impulse response into an early reflection part e(t) and a reverberation part r(t).

The multiplication units 132 multiply the reverberation part r(t) separated by the signal separation unit 131 by time windows obtained by shifting a window of predetermined finite length by a predetermined time each, generating the signals r_1(t), r_2(t), ..., r_n(t).

The random number generation units 133 supply the time shift amounts as a random time series.

The time shift units 134 shift the outputs of the multiplication units 132 along the time axis by the random time shift amounts from the random number generation unit 133, generating the signals r_1^(k)(t), r_2^(k)(t), ..., r_n^(k)(t).

The signal addition unit 135 adds together the early reflection part e(t) separated by the signal separation unit 131 and all of the outputs of the time shift units 134.

The switch 136 selects among the different outputs of the random number generation units 133. The switch 137 operates in conjunction with the switch 136 and routes the signal from the signal addition unit 135 to the corresponding output.
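The signal flow of FIG. 3 can be approximated by the following sketch. It is a simplification under stated assumptions: the windows are rectangular and non-overlapping, the random shifts are non-negative integers in samples, and the parameter names `split`, `window_len`, and `max_shift` are hypothetical. Choosing a different seed plays the role of selecting a different random number generation unit.

```python
import random

def similar_impulse_response(h, split, window_len, max_shift, seed):
    """Keep the early part, then window the late part and jitter each window.

    split, window_len, and max_shift are assumed parameters in samples.
    """
    rng = random.Random(seed)
    out = [0.0] * (len(h) + max_shift)
    # Early reflection part e(t) is passed through unchanged.
    for t in range(split):
        out[t] += h[t]
    # Reverberation part r(t): rectangular windows r_1..r_n, each shifted
    # by its own random amount before being summed back.
    for start in range(split, len(h), window_len):
        shift = rng.randint(0, max_shift)
        for t in range(start, min(start + window_len, len(h))):
            out[t + shift] += h[t]
    return out

h = [1.0, 0.8, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]  # hypothetical impulse response
h_1 = similar_impulse_response(h, split=2, window_len=2, max_shift=3, seed=1)
h_2 = similar_impulse_response(h, split=2, window_len=2, max_shift=3, seed=2)
```

Because only the late part is rearranged, the early reflections that shape the perceived source character are preserved in every generated response.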

As shown in Equation (5), the indirect sound generation unit 14 convolves the direct sounds d_L(t), d_R(t) with the similar impulse responses h_Lm(t), h_Rn(t) to generate a plurality of similar indirect sounds r_Lm(t), r_Rn(t), which are indirect sounds with a timbre similar to that of the indirect sound, and outputs them to the mixer 15.

$$r_{Lm}(t) = d_L(t) * h_{Lm}(t), \qquad r_{Rn}(t) = d_R(t) * h_{Rn}(t) \tag{5}$$
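The generation of similar indirect sounds by convolution can be sketched as follows. The direct sound, the two similar impulse responses, and the choice of channel indices 5 and 7 as dictionary keys are hypothetical illustration values.

```python
def convolve(x, h):
    """Full linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    return y

d_L = [1.0, 0.0, 0.5]  # hypothetical left direct sound
# Hypothetical similar impulse responses h_Lm(t), keyed by channel index m.
h_L_variants = {5: [0.0, 0.4, 0.2], 7: [0.0, 0.2, 0.4]}
r_L_variants = {m: convolve(d_L, h) for m, h in h_L_variants.items()}
```

Each similar indirect sound is derived from the same direct sound, so the timbre stays related, while the differing impulse responses keep the outputs from being mere delayed copies of one another.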

Using similar indirect sounds that resemble the indirect sound gives a perceived three-dimensional sense of spaciousness. In addition, unlike simple time delays, the similar indirect sounds r_Lm(t), r_Rn(t) generated from the similar impulse responses h_Lm(t), h_Rn(t) do not form a comb filter when mixed, so the sound quality deterioration that would otherwise occur on downmixing can be avoided.

The low-pass filter 17 extracts the low-frequency components l_dL(t), l_dR(t) of the direct sounds d_L(t), d_R(t) and the low-frequency components l_rL(t), l_rR(t) of the indirect sounds r_L(t), r_R(t) input from the reverberation separator 11, and outputs them to the mixer 15. The cutoff frequency of the low-pass filter 17 is, for example, 80 to 120 Hz. In this example, the output of the low-pass filter 17 is used for the signals sent to the LFE (Low Frequency Effects) channels, LFE1 (4ch) and LFE2 (10ch) in FIG. 2.
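The text does not specify the filter type, so the following sketch assumes a first-order (one-pole) IIR low-pass filter purely for illustration. The 100 Hz cutoff lies within the stated 80 to 120 Hz range; the 48 kHz sample rate and all names are assumptions.

```python
import math

def one_pole_lowpass(x, cutoff_hz, sample_rate_hz):
    """First-order IIR low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    y, state = [], 0.0
    for sample in x:
        state += alpha * (sample - state)
        y.append(state)
    return y

fs = 48000.0  # assumed sample rate
steady = one_pole_lowpass([1.0] * 4000, 100.0, fs)      # DC input passes
fast = one_pole_lowpass([1.0, -1.0] * 2000, 100.0, fs)  # 24 kHz input is attenuated
```

A practical LFE feed would likely use a steeper filter; the one-pole form is chosen here only because it fits in a few lines.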

The mixer 15 mixes the direct sounds d_L(t), d_R(t), the indirect sounds r_L(t), r_R(t), and the similar indirect sounds r_Lm(t), r_Rn(t) (that is, it performs a linear operation using the direct sounds, the indirect sounds, and the similar indirect sounds) to generate the signal for each channel. The linear operation for a given channel need only use at least one of the direct sound, the indirect sound, and the similar indirect sound. The mixer 15 also multiplies the indirect sound included in the signal for each channel by the coefficients a_r1, a_r2, ..., a_r24, thereby adjusting how much indirect sound is added to each channel. For example, for the channels in which direct and indirect sound are mixed (1ch, 2ch, 3ch, 4ch, and 10ch in this embodiment), the ratio of direct sound to indirect sound can be adjusted. When the upmix device 1 does not include the multiplier 16, the signal output by the mixer 15 is the output acoustic signal.

The multiplier 16 multiplies the signals output from the mixer 15 by the coefficients a_1, a_2, ..., a_24 to adjust the volume level of each channel, and outputs the resulting output acoustic signals to the respective 22.2ch speakers. When the 22.2ch speakers are arranged at the positions shown in FIG. 2, the output signals y_1(t), y_2(t), ..., y_24(t) output by the multiplier 16 are expressed, for example, as in Equation (6).

In the example of Equation (6), to generate the signal for FL (1ch) shown in FIG. 2, the mixer 15 adds the indirect sound r_L(t) multiplied by the coefficient a_r1 to the direct sound d_L(t). Similarly, to generate the signal for FR (2ch), the mixer 15 adds the indirect sound r_R(t) multiplied by the coefficient a_r2 to the direct sound d_R(t).

FC (3ch) is a signal located at the left-right center. In view of the reproducibility of the directional characteristics of the indirect sound of the original signal and the channel arrangement of 22.2ch sound, the mixer 15 therefore takes d_L(t) + d_R(t), the mix of the left and right direct sounds of the original signal, and adds to it {r_L(t) + r_R(t)}, the mix of the left and right indirect sounds of the original signal, multiplied by the coefficient a_r3.

Similarly, for BC (9ch), TpFC (15ch), TpC (16ch), TpBC (21ch), and BtFC (22ch), which are also located at the left-right center, the mixer 15 adds an indirect sound obtained by mixing the similar indirect sounds r_Lm(t), r_Rn(t) (m = n = 9, 15, 16, 21, 22) based on h_Lm(t), h_Rn(t), which are generated from the left and right impulse responses h_L(t), h_R(t) obtained from the original signal. To the channels located left of center, the mixer 15 adds the similar indirect sounds r_Lm(t) (m = 5, 7, 11, 13, 17, 19, 23) based on h_Lm(t) generated from the left impulse response h_L(t) of the original signal. To the channels located right of center, it adds the similar indirect sounds r_Rn(t) (n = 6, 8, 12, 14, 18, 20, 24) based on h_Rn(t) generated from the right impulse response h_R(t) of the original signal. Because the similar indirect sounds have low correlation with one another and have little effect on sound localization, either the left or the right similar indirect sound alone may be added instead.
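The mixing rules stated for the three front channels can be sketched as follows. Only FL, FR, and FC are covered, the per-channel volume coefficients of the multiplier 16 are omitted, and the function name and all sample and coefficient values are hypothetical.

```python
def mix_front_channels(d_L, d_R, r_L, r_R, a_r):
    """Form FL (1ch), FR (2ch), and FC (3ch) from direct and indirect sounds.

    a_r maps a channel number to its indirect-sound coefficient a_r1..a_r3.
    """
    n = len(d_L)
    fl = [d_L[t] + a_r[1] * r_L[t] for t in range(n)]
    fr = [d_R[t] + a_r[2] * r_R[t] for t in range(n)]
    fc = [d_L[t] + d_R[t] + a_r[3] * (r_L[t] + r_R[t]) for t in range(n)]
    return {1: fl, 2: fr, 3: fc}

# Hypothetical two-sample signals and coefficients.
front = mix_front_channels(
    d_L=[1.0, 0.0], d_R=[0.0, 1.0],
    r_L=[0.5, 0.5], r_R=[0.25, 0.25],
    a_r={1: 0.5, 2: 0.5, 3: 0.5},
)
```

The remaining channels would follow the same pattern, with similar indirect sounds in place of the original indirect sounds.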

(Second embodiment)
Next, a second embodiment of the present invention is described. In the second embodiment, the input acoustic signal (original signal) is a 5.1ch surround signal consisting of the L, R, C, SL, SR, and LFE channels, and the output acoustic signal is a 22.2ch signal. FIG. 4 shows a configuration example of the upmix device according to the second embodiment of the present invention. In the example shown in FIG. 4, the upmix device 2 includes a reverberation separator 21, an impulse response estimation unit 22, an impulse response generation unit 23, an indirect sound generation unit 24, a mixer 25, and a multiplier 26. The multiplier 26, however, is not an essential component.

The 5.1ch surround original signals x_L(t), x_R(t), x_C(t), x_SL(t), x_SR(t), x_LFE(t) are expressed in terms of their direct sounds d_L(t), d_R(t), d_C(t), d_SL(t), d_SR(t), d_LFE(t) and indirect sounds r_L(t), r_R(t), r_C(t), r_SL(t), r_SR(t), r_LFE(t) as in Equation (7).

$$x_c(t) = d_c(t) + r_c(t), \qquad c \in \{L, R, C, SL, SR, LFE\} \tag{7}$$

The reverberation separator 21 separates the original signals into the direct sounds d_L(t), d_R(t), d_C(t), d_SL(t), d_SR(t), d_LFE(t) and the indirect sounds r_L(t), r_R(t), r_C(t), r_SL(t), r_SR(t), r_LFE(t), respectively, and outputs them to the impulse response estimation unit 22 and the mixer 25.

The impulse response estimation unit 22 estimates the impulse responses h_L(t), h_R(t), h_C(t), h_SL(t), h_SR(t), h_LFE(t) of the original sound field, either from the direct sounds and indirect sounds separated by the reverberation separator 21 or from the original signals and the direct sounds, and outputs them to the impulse response generation unit 23.

The indirect sounds r_L(t), r_R(t), r_C(t), r_SL(t), r_SR(t), r_LFE(t) of the original signal are expressed by Equation (8) using the impulse responses h_L(t), h_R(t), h_C(t), h_SL(t), h_SR(t), h_LFE(t) of the original sound field.

$$r_c(t) = d_c(t) * h_c(t), \qquad c \in \{L, R, C, SL, SR, LFE\} \tag{8}$$

Here, * denotes convolution. Using the frequency-domain representation of Equation (8), the transfer functions H_L(ω), H_R(ω), H_C(ω), H_SL(ω), H_SR(ω), H_LFE(ω) of the original sound field are given by Equation (9), where R_c(ω) and D_c(ω) are the frequency-domain representations of r_c(t) and d_c(t).

$$H_c(\omega) = \frac{R_c(\omega)}{D_c(\omega)}, \qquad c \in \{L, R, C, SL, SR, LFE\} \tag{9}$$

As in the first embodiment, the impulse response estimation unit 22 estimates the impulse responses h_L(t), h_R(t), h_C(t), h_SL(t), h_SR(t), h_LFE(t) on the basis of the transfer functions of Equation (9), applying a known method such as the cross-spectrum method or an adaptive filter method.

The impulse response generation unit 23 generates a plurality of similar impulse responses h_Lk(t), h_Rl(t), h_Cm(t), h_SLn(t), h_SRp(t), h_LFEq(t), which are impulse responses for which the maximum absolute value of the correlation coefficient with the impulse response estimated by the impulse response estimation unit 22 is smaller than 1, and outputs them to the indirect sound generation unit 24. For example, it generates the similar impulse responses h_Lk(t), h_Rl(t), h_Cm(t), h_SLn(t), h_SRp(t), h_LFEq(t) by dividing the original impulse responses h_L(t), h_R(t), h_C(t), h_SL(t), h_SR(t), h_LFE(t) into segments of a predetermined duration and rearranging the segments using random numbers.

As shown in Equation (10), the indirect sound generation unit 24 convolves the direct sounds d_L(t), d_R(t), d_C(t), d_SL(t), d_SR(t), d_LFE(t) with the similar impulse responses h_Lk(t), h_Rl(t), h_Cm(t), h_SLn(t), h_SRp(t), h_LFEq(t) to generate a plurality of similar indirect sounds r_Lk(t), r_Rl(t), r_Cm(t), r_SLn(t), r_SRp(t), r_LFEq(t), which are indirect sounds with a timbre similar to that of the indirect sound, and outputs them to the mixer 25. Using similar indirect sounds that resemble the indirect sound gives a perceived three-dimensional sense of spaciousness.

$$\begin{aligned}
r_{Lk}(t) &= d_L(t) * h_{Lk}(t), & r_{Rl}(t) &= d_R(t) * h_{Rl}(t), \\
r_{Cm}(t) &= d_C(t) * h_{Cm}(t), & r_{SLn}(t) &= d_{SL}(t) * h_{SLn}(t), \\
r_{SRp}(t) &= d_{SR}(t) * h_{SRp}(t), & r_{LFEq}(t) &= d_{LFE}(t) * h_{LFEq}(t)
\end{aligned} \tag{10}$$

The mixer 25 mixes the direct sounds d_L(t), d_R(t), d_C(t), d_SL(t), d_SR(t), and d_LFE(t), the indirect sounds r_L(t), r_R(t), r_C(t), r_SL(t), r_SR(t), and r_LFE(t), and the similar indirect sounds r_Lk(t), r_Rl(t), r_Cm(t), r_SLn(t), r_SRp(t), and r_LFEq(t) (that is, it performs a linear operation using the direct sounds, the indirect sounds, and the similar indirect sounds) to generate a signal for each channel. Here, the linear operation need only use at least one of the direct sounds, the indirect sounds, and the similar indirect sounds. The mixer 25 also adjusts how much indirect sound is added to each channel by multiplying the indirect sound contained in the signal for each channel by the coefficients a_r1, a_r2, …, a_r24. When the upmix device 2 does not include the multiplier 26, the signals output by the mixer 25 are the output acoustic signals.
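The linear operation performed by the mixer can be sketched in Python as a pair of matrix products followed by a per-channel scaling of the reverberant part. The routing matrices `w_direct` and `w_reverb` and the function name are hypothetical; only the per-channel indirect sound coefficients, here `a_r`, correspond to a quantity named in the text.

```python
import numpy as np

def mix_channels(direct, reverb, w_direct, w_reverb, a_r):
    """Linear mix of direct and (similar) indirect sounds.

    direct   : (n_direct, n_samples) direct sounds
    reverb   : (n_reverb, n_samples) indirect and similar indirect sounds
    w_direct : (n_out, n_direct) routing weights for the direct sounds
    w_reverb : (n_out, n_reverb) routing weights for the reverberant sounds
    a_r      : (n_out,) per-channel amount of indirect sound
    """
    direct_part = w_direct @ direct
    reverb_part = w_reverb @ reverb
    return direct_part + a_r[:, np.newaxis] * reverb_part
```

Setting a row of `w_direct` to zero gives an output channel that carries only indirect and similar indirect sound, which is consistent with the statement that the linear operation need only use at least one of the three signal types.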

The multiplier 26 multiplies the signals output from the mixer 25 by the coefficients a_1, a_2, …, a_24 to adjust the volume level of each channel, and outputs the resulting output acoustic signals to the respective 22.2ch loudspeakers. When the 22.2ch loudspeakers are arranged at the positions shown in FIG. 2, the output signals y_1(t), y_2(t), …, y_24(t) of the multiplier 26 are expressed, for example, as in Equation (11).

In the example of Equation (10), in view of reproducing the directional characteristics of the indirect sound of the original signal and in view of the 22.2ch channel layout, the indirect sounds and similar indirect sounds generated from the impulse response h_L(t) on the left side of the original sound field are used for the left-side channels of the 22.2ch signal generated by the upmix, and those generated from the impulse response h_R(t) on the right side of the original sound field are used for the right-side channels of the 22.2ch signal. Similarly, the indirect sounds and similar indirect sounds generated from h_C(t) are assigned to the channels at the left-right center, those generated from h_SL(t) and h_SR(t) are assigned to the rear left and rear right, respectively, and those generated from h_LFE(t) are assigned to the LFE channels. Because the indirect sounds and the similar indirect sounds are only weakly correlated with one another and have little effect on sound localization, other assignments are also possible.
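One way to picture such an assignment is as a lookup table from each source impulse response to a group of 22.2ch loudspeakers, as in the Python sketch below. The grouping shown is purely an assumption made for illustration (the specification's own assignment is the one given by its equations, which are not reproduced here), and the channel labels follow the commonly used 22.2ch naming (FL, FR, FC, and so on) rather than anything stated in this text.

```python
# Hypothetical grouping of the 24 channels of a 22.2ch layout by the
# impulse response whose indirect and similar indirect sounds feed them.
ASSIGNMENT = {
    "h_L":   ["FL", "FLc", "SiL", "TpFL", "TpSiL", "BtFL"],   # left side
    "h_R":   ["FR", "FRc", "SiR", "TpFR", "TpSiR", "BtFR"],   # right side
    "h_C":   ["FC", "BC", "TpFC", "TpC", "TpBC", "BtFC"],     # left-right center
    "h_SL":  ["BL", "TpBL"],                                  # rear left
    "h_SR":  ["BR", "TpBR"],                                  # rear right
    "h_LFE": ["LFE1", "LFE2"],                                # low-frequency effects
}

def source_for_channel(channel):
    """Return the impulse response label assigned to a loudspeaker label."""
    for source, channels in ASSIGNMENT.items():
        if channel in channels:
            return source
    raise KeyError(channel)
```

Because the text notes that other assignments are possible, keeping the mapping in a table like this makes it easy to swap in a different grouping without touching the signal processing.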

The upmix devices 1 and 2 have been described above. A computer can suitably be used to function as the upmix device 1 or 2. This is achieved by storing a program that describes the processing implementing each function of the upmix devices 1 and 2 in a storage unit of the computer, and having the CPU of the computer read and execute the program. The program can be recorded on a computer-readable recording medium.

According to the upmix devices 1 and 2 described above, or a computer caused to function as the upmix device 1 or 2, a plurality of similar indirect sounds are generated from similar impulse responses that resemble the impulse response of the original sound field. An upmixed output acoustic signal having more channels than the input acoustic signal can therefore be obtained, without any mixing work, from an input acoustic signal that carries no side information for upmixing.
In addition, because the maximum absolute value of the correlation function between each similar impulse response and the impulse response is smaller than 1, a blind upmix can be realized whose output acoustic signal suffers little degradation in sound quality even when it is downmixed.

The embodiments above have been described as representative examples, and it will be apparent to those skilled in the art that many changes and substitutions can be made within the spirit and scope of the present invention. The present invention should therefore not be construed as limited by the embodiments described above, and various modifications and changes are possible without departing from the scope of the claims. For example, a plurality of the constituent blocks shown in the configuration diagrams of the embodiments may be combined into one, or a single constituent block may be divided.

Although the embodiments above show examples of upmixing to 22.2ch, the present invention also allows upmixing, by the same method, to other multichannel formats that have more channels than the input acoustic signal, such as 7.1ch.

1, 2 Upmix device
11, 21 Reverberation separator
12, 22 Impulse response estimation unit
13, 23 Impulse response generation unit
14, 24 Indirect sound generation unit
15, 25 Mixer
16, 26 Multiplier
17 Low-pass filter
131 Signal separation unit
132 Multiplication unit
133 Random number generation unit
134 Time shift unit
135 Signal addition unit
136 Switch
137 Switch

Claims

1. An upmix device that generates an output acoustic signal having a larger number of channels than an input acoustic signal, the upmix device comprising:
a reverberation separator that separates the input acoustic signal into a direct sound and an indirect sound;
an impulse response estimation unit that estimates an impulse response of the original sound field in which the input acoustic signal was recorded;
an impulse response generation unit that generates a plurality of similar impulse responses for which the maximum absolute value of the correlation function with the impulse response is smaller than 1;
an indirect sound generation unit that generates a plurality of similar indirect sounds by convolving the direct sound with the plurality of similar impulse responses; and
a mixer that generates the output acoustic signal by performing a linear operation using the direct sound, the indirect sound, and the similar indirect sounds.

2. The upmix device according to claim 1, wherein the impulse response generation unit generates the similar impulse responses by dividing the impulse response into segments of a predetermined duration and reordering the segments using random numbers.

3. The upmix device according to claim 1, further comprising a multiplier that adjusts the volume level of each channel of the output acoustic signal generated by the mixer.

4. A program for causing a computer to function as the upmix device according to any one of claims 1 to 3.