JP6694755B2 - Channel number converter and its program

Info

Publication number: JP6694755B2
Application number: JP2016103594A
Authority: JP
Inventors: 大出　訓史; 訓史大出; 岳大杉本; 一穂小野; 北島　周; 周北島; 陽佐々木; 小森　智康; 智康小森
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2016-05-24
Filing date: 2016-05-24
Publication date: 2020-05-20
Anticipated expiration: 2036-05-24
Also published as: JP2017212547A

Description

The present invention relates to a channel number conversion device and a program that generate, from a multi-channel audio signal, an audio signal for playback suited to the playback environment.

Multi-channel audio broadcasting such as 22.2 ch (Non-Patent Document 1) is currently being put into practical use. In recent years, multi-channel audio systems such as 5.1 ch have also been spreading to homes. However, many home audio systems are expected to be capable of playback only with fewer channels than 22.2 ch. In general, when the audio signal of a program produced with a given number of channels is played back with fewer channels than at production, a channel number conversion process called downmixing is performed. Downmixing multiplies the signal of each channel of the production audio signal by a downmix coefficient and sums the results to calculate an audio signal matching the number of playback channels. Downmix coefficients are in some cases predetermined by a standard (for example, Non-Patent Document 2). For example, the downmix coefficients from 5.1 ch surround (six channels: L, R, C, LFE, Ls, Rs) to 2 ch stereo (two channels: Lt, Rt) are specified in ARIB STD-B32 as follows.

Here, the coefficient k, which sets the level of the surround channels (Ls, Rs), takes the following values.

When no metadata is transmitted, a digital television receiver uses the following value for the coefficient k.

In broadcasting, attaching metadata to each program also makes it possible to specify different downmix coefficients to suit the program content.
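The downmix operation described above (scale each production channel by a coefficient, then sum) can be sketched as follows. The function name `downmix`, the coefficient table, and the sample values are illustrative assumptions; in particular, the numbers are not the values specified in ARIB STD-B32, which are not reproduced in this text.

```python
from typing import Dict, List


def downmix(channels: Dict[str, List[float]],
            coeffs: Dict[str, Dict[str, float]]) -> Dict[str, List[float]]:
    """Return, for every output channel, the weighted sum of input channels."""
    n_samples = len(next(iter(channels.values())))
    out: Dict[str, List[float]] = {}
    for out_name, weights in coeffs.items():
        mixed = [0.0] * n_samples
        for in_name, weight in weights.items():
            for t, sample in enumerate(channels[in_name]):
                mixed[t] += weight * sample
        out[out_name] = mixed
    return out


# Hypothetical 5.1 ch -> 2 ch coefficients (assumed values; LFE is discarded).
k = 0.5  # assumed surround level, standing in for the coefficient k in the text
coeffs = {
    "Lt": {"L": 1.0, "C": 0.5, "Ls": k},
    "Rt": {"R": 1.0, "C": 0.5, "Rs": k},
}

# Two samples per channel, made-up values.
frame = {
    "L": [1.0, 0.0], "R": [0.0, 1.0], "C": [0.5, 0.5],
    "LFE": [0.0, 0.0], "Ls": [0.2, 0.0], "Rs": [0.0, 0.2],
}
stereo = downmix(frame, coeffs)
```

Changing `k` per program corresponds to the metadata-driven selection of downmix coefficients mentioned above.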

In recent years, audio systems used in homes, theaters, and the like have also often adopted object-based audio. Object-based audio is used with a variety of speaker arrangements, and playback environments are diversifying.

Non-Patent Document 1: "Video Coding, Audio Coding and Multiplexing Specifications for Digital Broadcasting, ARIB Standard ARIB STD-B32, Version 3.6", March 25, 2016, Association of Radio Industries and Businesses
Non-Patent Document 2: Recommendation ITU-R BS.775-3, "Multichannel stereophonic sound system with and without accompanying picture", Internet <URL: https://www.itu.int/rec/R-REC-BS.775-3-201208-I/en>

As described above, multi-channel audio broadcasting with more than 5.1 channels, such as 22.2 ch for 4K/8K broadcasting, has been proposed. In the home, however, it is often not possible to install the number of speakers required for 22.2 ch at the prescribed speaker positions, so playback frequently uses fewer speakers than assumed at program production. Downmixing is then required, but the number of speakers and their arrangement depend on each manufacturer's product specifications, and it is not realistic to specify downmix coefficients for every speaker arrangement. In addition, the optimal downmix coefficients may differ by program type, such as documentaries or music programs, so even for the same speaker arrangement it is desirable to specify, for each program, downmix coefficients appropriate to the number and arrangement of speakers. There is therefore a need for a technique that optimally downmixes a multi-channel audio signal according to the playback environment (number and arrangement of speakers) and the program content.

An object of the present invention is therefore to provide a channel number conversion device and a program for it that can solve the above problem.

According to one aspect of the present invention, a channel number conversion device includes: a downmix signal calculation unit that calculates a desired first downmix signal from a multi-channel audio signal using a first downmix coefficient, and calculates, from the first downmix signal, a second downmix signal having the same number of channels as a reference signal using a second downmix coefficient; a difference signal calculation unit that calculates the difference between the second downmix signal and the reference signal; and a downmix coefficient updating unit that updates the first downmix coefficient and the second downmix coefficient so that the difference calculated by the difference signal calculation unit is minimized or becomes equal to or less than a predetermined threshold.

According to one aspect of the present invention, the channel number conversion device may further include a reference signal calculation unit that calculates the reference signal by downmixing the multi-channel audio signal using a predetermined downmix coefficient.

According to one aspect of the present invention, the downmix coefficient updating unit may fix the second downmix coefficient and update only the first downmix coefficient.

According to one aspect of the present invention, the channel number conversion device may further include a downmix coefficient storage unit that stores at least one of an initial value of the first downmix coefficient and an initial value of the second downmix coefficient.

According to one aspect of the present invention, the downmix coefficient storage unit may store the first downmix coefficient with an initial value determined from the positional relationship between the playback position of the audio signal of each channel in the multi-channel audio signal and the playback position of the audio signal of each channel in the first downmix signal.

According to one aspect of the present invention, the downmix coefficient storage unit may store the second downmix coefficient with an initial value determined from the positional relationship between the playback position of the audio signal of each channel in the first downmix signal and the playback position of the audio signal of each channel in the second downmix signal.

According to one aspect of the present invention, the downmix coefficient updating unit may update the first downmix coefficient under a constraint determined by the positional similarity between the playback position of the audio signal of each channel in the multi-channel audio signal and the playback position of the audio signal of each channel in the first downmix signal.

According to one aspect of the present invention, the downmix coefficient updating unit may update the second downmix coefficient under a constraint determined by the positional similarity between the playback position of the audio signal of each channel in the first downmix signal and the playback position of the audio signal of each channel in the second downmix signal.

According to one aspect of the present invention, the downmix coefficient updating unit may update the first downmix coefficient and the second downmix coefficient under the constraint that the downmix coefficient between the channels with the highest positional similarity takes the largest value.

According to one aspect of the present invention, a program causes a computer to function as the channel number conversion device according to any one of the above.

According to the channel number conversion device of the present invention, downmix coefficients suited to the playback environment (number of speakers and speaker arrangement) can be calculated for each program.

FIG. 1 is a block diagram showing an example of a channel number conversion device in the first embodiment of the present invention.
FIG. 2 is a first diagram showing an example of downmix coefficients in the first embodiment of the present invention.
FIG. 3 is a first diagram showing an example of a channel arrangement on a single plane in the first embodiment of the present invention.
FIG. 4 is a second diagram showing an example of a channel arrangement on a single plane in the first embodiment of the present invention.
FIG. 5 is a diagram showing an example of a channel arrangement with an upper layer in the first embodiment of the present invention.
FIG. 6 is a second diagram showing an example of downmix coefficients in the first embodiment of the present invention.
FIG. 7 is a flowchart showing an example of channel number conversion processing in the first embodiment of the present invention.
FIG. 8 is a block diagram showing an example of a channel number conversion device in the second embodiment of the present invention.
FIG. 9 is a flowchart showing an example of channel number conversion processing in the second embodiment of the present invention.

<First Embodiment>
A channel number conversion device according to the first embodiment of the present invention is described below with reference to FIGS. 1 to 7.
FIG. 1 is a block diagram showing an example of the channel number conversion device in the first embodiment of the present invention.
As shown in FIG. 1, the channel number conversion device 10 includes a multi-channel audio signal input unit 11, a reference signal input unit 12, a speaker position information input unit 13, a downmix signal calculation unit 14, a difference signal calculation unit 15, a downmix coefficient updating unit 16, and a downmix coefficient storage unit 17.
The channel number conversion device 10 converts a given multi-channel audio signal (N channels) into a downmix audio signal for playback with a desired number of channels (M channels). The conversion is based on an audio signal with fewer channels than the multi-channel audio signal (hereinafter called the reference signal; S channels), information on the speaker positions of the multi-channel audio signal, and information on the speaker positions of the downmix destination.

The following description uses as an example the case where the given multi-channel audio signal is that of the 4K/8K 22.2 ch audio system, the reference signal is 2 ch stereo, and the playback audio signal is 7.1 ch. The numbers of channels of the multi-channel audio signal, the reference signal, and the downmix signal are not, however, limited to these. The channel number conversion device 10 is implemented by a computer and may be built into, for example, a broadcast receiver such as a television or a media playback device such as a home theater system.
FIG. 1 shows a configuration in which a multi-channel audio signal and a reference signal are input to the channel number conversion device 10 and a downmix signal is output. The downmix signal output by the channel number conversion device 10 is played, for example, through speakers connected to a playback device. The channel number conversion device 10 generates a downmix signal that reproduces, as closely as possible, the impression a listener would have if the input multi-channel audio signal were played with the number and arrangement of speakers for which it was created. To generate such a downmix signal, the channel number conversion device 10 calculates downmix coefficients appropriate to the number and arrangement of speakers in the playback environment.

The multi-channel audio signal input unit 11 receives the multi-channel audio signal. Here, the multi-channel audio signal is, for example, a 22.2 ch multi-channel audio signal transmitted from a broadcasting station.
The reference signal input unit 12 receives the reference signal. The reference signal is an audio signal with the same content as the multi-channel audio signal, produced with fewer channels than the desired downmix signal, and is input separately from the original multi-channel audio signal. Here, the reference signal is, for example, a 2 ch stereo audio signal broadcast simultaneously with the multi-channel audio signal. The reference signal may also be generated by downmixing the multi-channel audio signal with predetermined downmix coefficients.

The speaker position information input unit 13 acquires the position information of each of the speakers defined for the multi-channel audio signal when that signal was created. The speaker position information input unit 13 also acquires the position information of each of the one or more speakers in the playback environment of the downmix signal. The speaker position information is given, for example, as coordinates or polar coordinates.
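As a minimal sketch of how polar speaker positions might be brought into a common form, the snippet below converts azimuth, elevation, and distance into Cartesian coordinates. The axis convention (x toward the front, y toward the left, z upward, azimuth measured counter-clockwise from the front) and the function names are assumptions made for illustration; the source does not specify a convention.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def polar_to_cartesian(azimuth_deg: float, elevation_deg: float,
                       distance: float = 1.0) -> Point:
    """Convert a polar speaker position to (x, y, z) under the assumed axes."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.cos(az)  # toward the front
    y = distance * math.cos(el) * math.sin(az)  # toward the left
    z = distance * math.sin(el)                 # upward
    return (x, y, z)


def speaker_distance(p: Point, q: Point) -> float:
    """Euclidean distance between two speaker positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


front = polar_to_cartesian(0.0, 0.0)
back = polar_to_cartesian(180.0, 0.0)
top = polar_to_cartesian(0.0, 90.0)
```

With both the production layout and the playback layout in one coordinate system, angle or distance differences between channels can be computed directly.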

The downmix signal calculation unit 14 includes a first downmix signal calculation unit 141 and a second downmix signal calculation unit 142. The first downmix signal calculation unit 141 uses the first downmix coefficient (M×N) stored in the downmix coefficient storage unit 17, described later, to calculate a first downmix signal (M) with the desired number of channels M from the multi-channel audio signal (N) with N channels.
The second downmix signal calculation unit 142 uses the second downmix coefficient (S×M) stored in the downmix coefficient storage unit 17 to calculate, from the calculated first downmix signal (M) with M channels, a second downmix signal (S) with the same number of channels S as the reference signal.
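The two-stage calculation can be sketched as two matrix products: an M×N matrix applied to the N-channel input, followed by an S×M matrix applied to the result. The helper names `matmul` and `two_stage_downmix`, the toy sizes (N = 4, M = 3, S = 2), and all coefficient values are assumptions chosen only to keep the arithmetic easy to follow.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (p x q) and b (q x r)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def two_stage_downmix(x: Matrix, a1: Matrix, a2: Matrix) -> Tuple[Matrix, Matrix]:
    """x is N x T (channels x samples); a1 is M x N; a2 is S x M."""
    first = matmul(a1, x)        # first downmix signal, M x T
    second = matmul(a2, first)   # second downmix signal, S x T
    return first, second


# Toy input: 4 channels, 2 samples each (made-up values).
x = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0],
     [2.0, 0.0]]
a1 = [[1.0, 0.0, 0.0, 0.5],   # hypothetical first downmix coefficient (3 x 4)
      [0.0, 1.0, 0.0, 0.5],
      [0.0, 0.0, 1.0, 0.0]]
a2 = [[1.0, 0.0, 0.5],        # hypothetical second downmix coefficient (2 x 3)
      [0.0, 1.0, 0.5]]
first, second = two_stage_downmix(x, a1, a2)
```

Only `first` is intended for playback; `second` exists so that the result can be compared with the S-channel reference signal.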

The difference signal calculation unit 15 calculates the difference between the reference signal and the second downmix signal (S). It also determines whether the calculated difference is at its minimum, or whether it is equal to or less than a predetermined threshold. The difference signal calculation unit 15 need not repeat the calculation until the exact minimum is reached; it may judge that the difference has reached its minimum when updating the first downmix coefficient (M×N) and the second downmix coefficient (S×M) changes the difference by no more than a threshold.
The downmix coefficient updating unit 16 corrects the first downmix coefficient (M×N) and the second downmix coefficient (S×M), or only the first downmix coefficient (M×N), so that the difference calculated by the difference signal calculation unit 15 is minimized or becomes equal to or less than the threshold. The downmix coefficient updating unit 16 then overwrites the first downmix coefficient (M×N) and related values held in the downmix coefficient storage unit 17 with the corrected values. As described later, the downmix coefficient updating unit 16 updates the first downmix coefficient (M×N) based on, for example, the positional relationship between the playback position of the audio signal of each channel in the multi-channel audio signal and the playback position of the audio signal of each channel in the first downmix signal (M).

The downmix coefficient storage unit 17 stores the various data required for the channel number conversion processing. First, it stores the initial value of the first downmix coefficient (M×N) and the initial value of the second downmix coefficient (S×M). The downmix signal calculation unit 14 starts the calculation of the first downmix coefficient (M×N) and the second downmix coefficient (S×M) from these initial values. The initial values of the first downmix coefficient (M×N) and the second downmix coefficient (S×M) may be set, for example, by generating random numbers. Alternatively, they may be set to values that depend on the angle difference or distance difference between the channel positions before and after downmixing. When downmix coefficients are fixed by a standard or the like and the channel number conversion device 10 is used to correct them for each program, the coefficients defined in the standard may be used as the initial values.
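One way to derive initial values from angle differences, as the paragraph above suggests, is sketched below. The specific weighting rule (the cosine of the azimuth difference, clipped at zero and normalized so that each source channel's weights sum to one) is an assumption made for illustration; the source only says that the values depend on the angle or distance difference. The azimuth lists are likewise made up.

```python
import math
from typing import List


def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute angle between two azimuths, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def initial_coefficients(src_az: List[float], dst_az: List[float]) -> List[List[float]]:
    """Return an M x N matrix: row = destination channel, column = source channel."""
    m, n = len(dst_az), len(src_az)
    coeffs = [[0.0] * n for _ in range(m)]
    for j, s in enumerate(src_az):
        # Assumed rule: closer destination channels receive larger weights.
        raw = [max(0.0, math.cos(math.radians(angular_difference(s, d))))
               for d in dst_az]
        total = sum(raw)
        for i in range(m):
            coeffs[i][j] = raw[i] / total if total > 0 else 1.0 / m
    return coeffs


# Hypothetical azimuths in degrees (0 = front, positive = left).
src_az = [0.0, 30.0, -30.0, 90.0, -90.0, 180.0]   # N = 6 source channels
dst_az = [0.0, 30.0, -30.0, 110.0, -110.0]        # M = 5 destination channels
init = initial_coefficients(src_az, dst_az)
```

Starting from such position-aware values, rather than random numbers, biases the search toward solutions in which each source channel is reproduced mainly by nearby speakers.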

FIG. 2 is a first diagram showing an example of downmix coefficients in the first embodiment of the present invention. An outline of the channel number conversion processing of this embodiment is described with reference to FIG. 2.
(M-Ch_1, ..., M-Ch_N) shown in FIG. 2 is the N-channel multi-channel audio signal.
The second term in the matrix expression shown in FIG. 2 is the first downmix coefficient (M×N), with M rows and N columns. The first downmix signal calculation unit 141 uses this first downmix coefficient (M×N) to calculate the first downmix signal (M) with the desired number of channels M from the multi-channel audio signal.
The first term in the matrix expression shown in FIG. 2 is the second downmix coefficient (S×M), with S rows and M columns. The second downmix signal calculation unit 142 uses this second downmix coefficient (S×M) to calculate, from the first downmix signal (M), the second downmix signal (S) with the same number of channels S as the reference signal. In FIG. 2, (Lt, Rt) is the second downmix signal (S). The difference signal calculation unit 15 calculates the difference between the second downmix signal (S), (Lt, Rt), and the reference signal, (L, R), using a measure such as the energy difference between the signals, the mean squared error, or 1 minus the normalized cross-correlation coefficient of the two signals. The method of calculating the difference is not limited to these.
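The three difference measures named above can be sketched as follows. The exact definitions are assumptions: the source does not state how channels are combined, so this sketch sums the energy differences over channels, averages the squared error over all samples, and averages the zero-lag normalized cross-correlation over channels. The function names are hypothetical.

```python
import math
from typing import List

Signal = List[List[float]]  # channels x samples


def energy_difference(y: Signal, ref: Signal) -> float:
    """Sum over channels of |energy(y_c) - energy(ref_c)|."""
    return sum(abs(sum(v * v for v in yc) - sum(v * v for v in rc))
               for yc, rc in zip(y, ref))


def mean_squared_error(y: Signal, ref: Signal) -> float:
    """Squared sample error averaged over all channels and samples."""
    count = sum(len(yc) for yc in y)
    total = sum((a - b) ** 2 for yc, rc in zip(y, ref) for a, b in zip(yc, rc))
    return total / count


def one_minus_ncc(y: Signal, ref: Signal) -> float:
    """1 minus the zero-lag normalized cross-correlation, averaged over channels."""
    acc = 0.0
    for yc, rc in zip(y, ref):
        num = sum(a * b for a, b in zip(yc, rc))
        den = math.sqrt(sum(a * a for a in yc) * sum(b * b for b in rc))
        acc += 1.0 - (num / den if den > 0 else 0.0)
    return acc / len(y)


# Made-up signals: the second channel of y has inverted polarity.
ref = [[1.0, 0.0], [0.0, -1.0]]
y = [[1.0, 0.0], [0.0, 1.0]]
```

The example pair illustrates why the choice of measure matters: a polarity inversion leaves the energy difference at zero, while the mean squared error and the correlation-based measure both register it.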

The downmix coefficient updating unit 16 updates the first downmix coefficient (M×N) and the second downmix coefficient (S×M) based on the difference, calculated by the difference signal calculation unit 15, between the second downmix signal (S) and the reference signal. To update the downmix coefficients so that the difference decreases, the downmix coefficient updating unit 16 can use a genetic algorithm, the steepest descent method, stochastic gradient descent, or the like; any other method that reduces the difference may also be used.
When updating the downmix coefficients, the downmix coefficient updating unit 16 may change the first downmix coefficient (M×N) and the second downmix coefficient (S×M) simultaneously. Alternatively, it may change only the first downmix coefficient (M×N) while keeping the second downmix coefficient (S×M) fixed at predetermined values. It may also change the downmix coefficients in order of channel importance, starting with the audio signals of the most important channels. Channel importance may be defined by playback position, for example by treating channels that correspond to speakers in front of the listener as highly important, or by the content of the audio signal, for example by treating channels that contain a dialog audio signal as highly important. The importance may be attached to the multi-channel audio signal as metadata, input separately, stored in advance, or specified by the user. The audio in a dialog audio signal is not necessarily limited to dialog (conversation); an audio signal consisting mainly of human voice, such as narration, may also be treated as a dialog audio signal.
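For the steepest descent option, the update can be written out explicitly once a difference measure is fixed. The derivation below assumes the mean squared error as the measure and introduces notation that is not in the source: $\mathbf{x}(t)$ is the N-channel input at sample $t$, $A$ is the first downmix coefficient (M×N), $B$ is the second downmix coefficient (S×M), $\mathbf{r}(t)$ is the S-channel reference signal, $T$ is the number of samples, and $\mu$ is a step size.

```latex
\begin{aligned}
\mathbf{y}(t) &= B\,A\,\mathbf{x}(t), \qquad
\mathbf{e}(t) = \mathbf{y}(t) - \mathbf{r}(t),\\[4pt]
E(A,B) &= \frac{1}{S\,T}\sum_{t=1}^{T}\lVert \mathbf{e}(t)\rVert^{2},\\[4pt]
\frac{\partial E}{\partial A} &= \frac{2}{S\,T}\sum_{t=1}^{T} B^{\mathsf T}\,\mathbf{e}(t)\,\mathbf{x}(t)^{\mathsf T},
\qquad
\frac{\partial E}{\partial B} = \frac{2}{S\,T}\sum_{t=1}^{T} \mathbf{e}(t)\,\bigl(A\,\mathbf{x}(t)\bigr)^{\mathsf T},\\[4pt]
A &\leftarrow A - \mu\,\frac{\partial E}{\partial A},
\qquad
B \leftarrow B - \mu\,\frac{\partial E}{\partial B}.
\end{aligned}
```

Keeping $B$ fixed, as the text allows, amounts to applying only the update for $A$.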

The downmix coefficient updating unit 16 outputs the second downmix signal (S) again using the first downmix coefficient (M×N) and the second downmix coefficient (S×M). It repeats the process of updating the downmix coefficients until the difference reaches its minimum or becomes equal to or less than the predetermined threshold. When the difference signal calculation unit 15 determines that the difference is at or below the threshold or at its minimum, the downmix signal calculation unit 14 (second downmix signal calculation unit 142) uses the finally updated first downmix coefficient (M×N) to calculate the desired first downmix signal (M) from the original multi-channel audio signal and outputs it to the playback device.
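The iteration described above can be sketched as a loop that stops when the difference falls to a threshold or stops changing. This sketch assumes the mean squared error as the difference, steepest descent as the update rule, and a fixed second downmix coefficient; the function names, step size, tolerances, and toy data are all hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: Matrix) -> Matrix:
    return [list(row) for row in zip(*a)]


def mse(y: Matrix, ref: Matrix) -> float:
    count = len(y) * len(y[0])
    return sum((p - q) ** 2 for yr, rr in zip(y, ref) for p, q in zip(yr, rr)) / count


def fit_first_coefficients(x: Matrix, ref: Matrix, a: Matrix, b: Matrix,
                           step: float = 0.1, threshold: float = 1e-6,
                           tol: float = 1e-12, max_iter: int = 10000
                           ) -> Tuple[Matrix, float]:
    """Update only the first coefficient `a` (M x N); `b` (S x M) stays fixed."""
    y = matmul(b, matmul(a, x))
    diff = mse(y, ref)
    scale = 2.0 / (len(ref) * len(ref[0]))
    for _ in range(max_iter):
        if diff <= threshold:            # difference small enough: stop
            break
        err = [[p - q for p, q in zip(yr, rr)] for yr, rr in zip(y, ref)]
        # Gradient of the MSE with respect to a: scale * b^T * err * x^T
        grad = matmul(matmul(transpose(b), err), transpose(x))
        a = [[a[i][j] - step * scale * grad[i][j] for j in range(len(a[0]))]
             for i in range(len(a))]
        y = matmul(b, matmul(a, x))
        new_diff = mse(y, ref)
        stalled = abs(diff - new_diff) <= tol   # change too small: treat as minimum
        diff = new_diff
        if stalled:
            break
    return a, diff


# Toy problem: N = 3, M = 2, S = 2, four samples (made-up values).
x = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 1.0]]
b = [[1.0, 0.0], [0.0, 1.0]]                        # fixed second coefficient
ref = matmul([[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]], x)  # reference built from known weights
a_fit, final_diff = fit_first_coefficients(x, ref, [[0.0] * 3, [0.0] * 3], b, step=0.5)
```

The second stopping test mirrors the text's remark that the minimum may be declared once further updates no longer change the difference by more than a threshold.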

FIG. 3 is a first diagram showing an example of a channel arrangement on a single plane in the first embodiment of the present invention.
FIG. 3A shows an example of the arrangement of the channels in the multi-channel audio signal before conversion, here the middle-layer channels of 22.2 ch. In FIG. 3A, for example, channel "FC" is directly in front of the user, and channels "FL", "FLc", "FC", "FRc", and "FR" are located toward the user's front. Channel "SiL" is on the user's left, channel "SiR" is on the user's right, and channel "BC" is behind the user.
FIG. 3B shows an example of the arrangement of the channels in the desired first downmix signal (M) after conversion, here a 7.1 ch arrangement. For example, channel "Cm" is directly in front of the user, and channels "Lm", "Cm", and "Rm" are located toward the user's front. These center and front channels may be assigned high importance.
FIG. 3C shows an example of the arrangement of the channels in the reference signal, here a 2 ch arrangement.
The channels illustrated in FIG. 3B and FIG. 3C correspond to the height of the middle layer of 22.2 ch.

The channel number conversion device 10 of the present embodiment calculates the first downmix signal (M) while preserving the acoustic impression of the original multi-channel audio signal as far as possible. Specifically, while repeating the update processing of the first downmix coefficient (M × N) and the like described above, the downmix coefficient updating unit 16 calculates a first downmix coefficient (M × N) that preserves the acoustic impression of the multi-channel audio signal as far as possible, and updates the first downmix coefficient (M × N) and the like so that the difference between the second downmix signal (S) and the reference signal converges. For this reason, the calculation of the first downmix coefficient (M × N) in the channel number conversion device 10 requires constraints that preserve the characteristics of the multi-channel audio signal. For example, when the reference signal is a mono signal (1 ch) or a stereo signal (2 ch) and the desired number of channels is 5.1 ch or 7.1 ch, the channel positions of the reference signal are completely contained in the channel arrangement of the desired number of channels.
In this case, if no constraint is set, the first downmix coefficient (M × N) for converting the original multi-channel audio signal to the desired number of channels may become identical to the downmix coefficient that converts the original multi-channel audio signal to the same number of channels as the reference signal. A first downmix signal (M) converted with such a first downmix coefficient (M × N) is undesirable for the following reason. Even if a playback device whose speakers are installed according to the channel arrangement illustrated in FIG. 3B outputs it, the sound is emitted only from the speakers corresponding to the channels “Lm” and “Rm”, which are located at the same positions as “Lt” and “Rt” in FIG. 3C, and the result is not the intended sound that preserves the acoustic impression of the multi-channel audio signal. For this reason as well, constraints are needed in the calculation of the first downmix coefficient (M × N) in order to preserve the characteristics of the original multi-channel audio as far as possible.
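The degenerate solution described above can be illustrated with a minimal Python sketch. The three-channel source, the coefficient value 0.7, and the helper name `downmix` are assumptions chosen only for illustration and do not come from the embodiment. The sketch shows that copying the stereo coefficients into the “Lm” and “Rm” rows reproduces the reference exactly while the “Cm” speaker stays silent.

```python
def downmix(coeffs, signal):
    """Multiply a coefficient matrix (rows = output channels) by a signal
    given as a list of input channels, each a list of samples."""
    n_samples = len(signal[0])
    return [
        [sum(c * ch[t] for c, ch in zip(row, signal)) for t in range(n_samples)]
        for row in coeffs
    ]

# Hypothetical 3-channel source (L, C, R) with two samples per channel.
source = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

# Assumed stereo downmix (rows Lt, Rt), used only to build a reference signal.
to_stereo = [[1.0, 0.7, 0.0], [0.0, 0.7, 1.0]]
reference = downmix(to_stereo, source)

# Degenerate first coefficients (rows Lm, Cm, Rm): the stereo rows are copied
# into Lm and Rm, and Cm receives nothing.
degenerate_first = [to_stereo[0], [0.0, 0.0, 0.0], to_stereo[1]]
# Second coefficients (rows Lt, Rt) that pass Lm and Rm straight through.
second = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

first_signal = downmix(degenerate_first, source)
second_signal = downmix(second, first_signal)
```

Here `second_signal` matches `reference`, so the difference is zero, yet `first_signal` carries no sound on “Cm”. A constraint on the first coefficients is what rules out this kind of solution.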

Next, some of the constraints are described using the channel arrangements illustrated in FIG. 3. The channel arrangements illustrated in FIG. 3 cover, for each channel count before and after conversion, the group of channels arranged on substantially the same plane. For example, a 22.2 ch multi-channel audio signal has an upper layer and a lower layer in addition to the middle layer illustrated in FIG. 3A, and when these are included, conversion from the upper and lower layers, which lie at different heights, must also be considered. FIG. 3 does not consider those conversions. The description here covers the constraints on conversion between channels arranged in substantially the same plane.

(Constraints based on position similarity)
A constraint may be defined, for example, by the similarity between a channel position of the original multi-channel audio and a channel position of the desired downmix destination. The position similarity may be defined, for example, by the change in distance and angle between the channel positions before and after conversion, as seen from the user's listening position (the center of the circle in FIG. 3). For example, the position similarity between “FC” in FIG. 3A and “Cm” in FIG. 3B after conversion is high (both are directly in front of the user, with the same distance and angle before and after conversion). In such a case, a constraint may specify that the downmix coefficient for converting the audio signal assigned to channel “FC” into the audio signal assigned to channel “Cm” is set to, for example, “1.0”.
Alternatively, a constraint may specify that this downmix coefficient is the largest. When the constraint fixes the value at “1.0”, the amount of computation needed to calculate the downmix coefficients can be reduced.
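As a minimal sketch of how such a position similarity could be evaluated, the following Python computes the change in distance and the opening angle between two channel positions as seen from the listening position. The coordinate convention, the azimuth values, and the names `on_circle` and `position_change` are assumptions for illustration. The description does not prescribe a particular formula.

```python
import math

def on_circle(azimuth_deg, radius=1.0):
    """Place a channel on a circle around the listener.
    Assumed convention: 0 degrees is straight ahead (+y), positive is to the right (+x)."""
    a = math.radians(azimuth_deg)
    return (radius * math.sin(a), radius * math.cos(a))

def position_change(src, dst, listener=(0.0, 0.0)):
    """Return (change in distance, opening angle in degrees) between a source
    channel position and a destination channel position, seen from the listener."""
    sx, sy = src[0] - listener[0], src[1] - listener[1]
    dx, dy = dst[0] - listener[0], dst[1] - listener[1]
    d_src = math.hypot(sx, sy)
    d_dst = math.hypot(dx, dy)
    cos_angle = (sx * dx + sy * dy) / (d_src * d_dst)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return abs(d_dst - d_src), math.degrees(math.acos(cos_angle))

# Hypothetical azimuths: FC and Cm straight ahead, Lssm far to the left side.
fc, cm, lssm = on_circle(0.0), on_circle(0.0), on_circle(-110.0)
high_similarity = position_change(fc, cm)    # no change in distance or angle
low_similarity = position_change(fc, lssm)   # same distance, large opening angle
```

A small distance change together with a small opening angle would correspond to a high similarity. How the two quantities are combined into a single score is left open here.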

(When the distance from the user does not change before and after downmix)
In this example, “FC”, one of the playback positions of the channel audio signals included in the multi-channel audio signal, may first be associated in advance with “Cm”, one of the playback positions of the channel audio signals included in the downmix signal. A constraint may then set the downmix coefficient from “FC” to “Cm” to, for example, “1.0” and the downmix coefficients from “FC” to the other channels to “0”.
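The one-to-one association can be sketched as follows. The dictionary layout (destination channel mapped to per-source coefficients) and the function name `apply_one_to_one` are assumptions for illustration. The returned set marks entries that a later update step would leave untouched, which is one way the computation could be reduced.

```python
def apply_one_to_one(coeffs, src, dst, value=1.0):
    """Fix the coefficient from `src` to `dst` at `value` and set the
    coefficient from `src` to every other destination to 0.
    `coeffs` maps destination name -> {source name: coefficient}.
    Returns the set of (destination, source) entries that are now fixed."""
    fixed = set()
    for destination in coeffs:
        coeffs[destination][src] = value if destination == dst else 0.0
        fixed.add((destination, src))
    return fixed

# Hypothetical starting coefficients for two sources and three destinations.
coeffs = {
    "Lm": {"FC": 0.3, "FL": 0.9},
    "Cm": {"FC": 0.5, "FL": 0.1},
    "Rm": {"FC": 0.3, "FL": 0.0},
}
fixed_entries = apply_one_to_one(coeffs, "FC", "Cm")
```

Only the “FC” column is changed. The coefficients for “FL” remain free for the update processing.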

Further, for example, the position similarity between “FC” in FIG. 3A and “Lm” and “Rm” in FIG. 3B after conversion is moderately high (all are in front of the user, the distance from the user does not change before and after conversion, and the angle changes by, for example, slightly more than 20 degrees). In such a case, a constraint may specify, according to the position similarity, that the downmix coefficients for converting the audio signal assigned to channel “FC” into the audio signals assigned to channels “Lm” and “Rm” are set to, for example, the same value of 1.0 or less.

Similarly, for example, the position similarity between “FC” in FIG. 3A and “Lssm” and “Rssm” in FIG. 3B after conversion is low (the distance from the user does not change before and after conversion, but the angle differs by 110 degrees or more). In such a case, because the position similarity is low, a constraint may specify that the downmix coefficients for converting the audio signal assigned to channel “FC” into the audio signals assigned to channels “Lssm” and “Rssm” are set to, for example, “0”.

Further, for example, the position similarity between “BL” in FIG. 3A and “Lrsm” in FIG. 3B after conversion is about the same as the position similarity between “BR” in FIG. 3A and “Rrsm” in FIG. 3B after conversion (each is a conversion from a channel diagonally behind the user on the left or right side to a channel diagonally behind the user on the same side). In such a case, because the position similarities are about the same, a constraint may specify that the same value is set for the downmix coefficient that converts the audio signal assigned to channel “BL” into the audio signal assigned to channel “Lrsm” and for the downmix coefficient that converts the audio signal assigned to channel “BR” into the audio signal assigned to channel “Rrsm”.

Further, for example, for the downmix coefficients that convert the audio signal assigned to “BL” in FIG. 3A into the audio signals assigned to “Lssm” and “Lrsm” in FIG. 3B after conversion, all of these channels are at the same distance from the user. However, the position of “BL” before conversion (its angle as seen from the user) is closer to “Lrsm”, and “Lssm” is farther away than “Lrsm”. On this basis, a constraint may specify that the downmix coefficient to “Lrsm” is the largest. A constraint may additionally specify that the downmix coefficient to “Lssm” is the second largest. Further, taking the line connecting the center of the circle and the position of “BL” as a reference, a constraint may specify, based on the opening angles formed with the lines connecting the center of the circle to “Lssm” and to “Lrsm”, that the ratio of the two downmix coefficients is the reciprocal of the ratio of the opening angles.
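The reciprocal-ratio rule can be sketched as below. The opening angles of 15 and 45 degrees, the normalization of the two coefficients to a fixed sum, and the name `split_by_inverse_angle` are assumptions for illustration. The text above only fixes the ratio, not the absolute values.

```python
def split_by_inverse_angle(angle_a_deg, angle_b_deg, total=1.0):
    """Split `total` between two destinations A and B so that
    coeff_a / coeff_b == angle_b / angle_a, i.e. the coefficient ratio is
    the reciprocal of the opening-angle ratio."""
    if angle_a_deg <= 0 or angle_b_deg <= 0:
        raise ValueError("opening angles must be positive")
    angle_sum = angle_a_deg + angle_b_deg
    coeff_a = total * angle_b_deg / angle_sum
    coeff_b = total * angle_a_deg / angle_sum
    return coeff_a, coeff_b

# Hypothetical opening angles seen from the listener:
# BL to Lrsm is 15 degrees, BL to Lssm is 45 degrees.
to_lrsm, to_lssm = split_by_inverse_angle(15.0, 45.0)
```

With these assumed angles the nearer channel “Lrsm” receives three times the coefficient of “Lssm”, which also satisfies the ordering constraint (largest to “Lrsm”, second largest to “Lssm”).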

(When a speaker cannot be placed in the ideal position)
FIG. 3 illustrates a case in which the distance from the user does not change before and after the downmix. Next, with reference to FIG. 4, the constraints are described for an example in which the speaker corresponding to a certain channel cannot be placed in its ideal position because of, for instance, the layout of the room in which the user installs the speakers.

FIG. 4 is a second diagram showing an example of a channel arrangement on the same plane in the first embodiment of the present invention.
FIG. 4A shows an example of the channel arrangement of the multi-channel audio signal before conversion. FIG. 4B shows an example of the channel arrangement of the desired first downmix signal (M) after conversion. In FIG. 4B, the speakers corresponding to channels “Lssm” and “Lrsm” are displaced from the prescribed “Lssm” and “Lrsm” positions. In this example, the “Lssm” speaker is installed at a distance v from the position of channel “BL” in the multi-channel audio signal before conversion, and the “Lrsm” speaker is installed at a distance u from the position of channel “BL” (v > u). Further, as illustrated, every other channel is assumed to be farther from the “Lssm” speaker position than the distance y from channel “SiL” to “Lssm”. In this case, a constraint may specify that the downmix coefficient from “SiL” to “Lssm” is the largest. Further, a constraint may specify that the downmix coefficient from “BL” to “Lrsm” is the largest and that the downmix coefficient from “BL” to “Lssm” is the second largest. Further, based on the distances u and v, a constraint may specify that the ratio of the downmix coefficient from “BL” to “Lssm” to the downmix coefficient from “BL” to “Lrsm” is v : u.

Further, in FIG. 4B, the speaker corresponding to channel “Cm” is displaced sideways by a distance w from the prescribed “Cm” position. An example constraint was described above in which “FC” and “Cm” are associated in advance, the downmix coefficient from “FC” to “Cm” is set to “1.0”, and the downmix coefficients to the other channels are set to “0”. In this case, that constraint may be applied only when the position similarity (here, the difference in distance) between the position of “FC” before conversion and the position of “Cm” after conversion, including the displacement w, is within a predetermined range.
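The check on the displacement w can be sketched in a few lines. The coordinates, the tolerance value, and the name `one_to_one_applicable` are assumptions for illustration. The description only states that the similarity must be within a predetermined range.

```python
import math

def one_to_one_applicable(source_pos, installed_pos, max_offset):
    """Return True when the installed speaker lies within `max_offset` of the
    source channel position, so the fixed 1.0 / 0 mapping may still be used."""
    return math.dist(source_pos, installed_pos) <= max_offset

# Hypothetical positions: FC straight ahead, Cm speaker shifted sideways by w.
fc_position = (0.0, 1.0)
w = 0.1
cm_installed = (w, 1.0)
use_fixed_mapping = one_to_one_applicable(fc_position, cm_installed, max_offset=0.2)
```

When the check fails, the coefficients from “FC” would be left free and determined by the other constraints and the update processing.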

The coordinate information of the speaker positions in the home may be entered into the channel number conversion device 10 by the user. Alternatively, a sound reproduction device may obtain the coordinate information by using its function for detecting the position of each speaker and input that information to the channel number conversion device 10.

(Example of channel arrangement in multi-channel audio signal before conversion)
Next, the constraints on the downmix coefficients for conversion from the upper layer to the middle layer are described with reference to FIG. 5.
FIG. 5 is a diagram showing an example of a channel arrangement that includes an upper layer in the first embodiment of the present invention.
FIG. 5A shows an example of the channel arrangement of the multi-channel audio signal before conversion, consisting of two layers, an upper layer and a middle layer. In FIG. 5A, channel “TpFL” is located at the front left of the user in the upper layer, channel “TpFR” at the front right of the user in the upper layer, channel “TpBL” at the rear left of the user in the upper layer, and channel “TpBR” at the rear right of the user in the upper layer. FIG. 5B shows an example of the channel arrangement of the desired first downmix signal (M) after conversion. FIG. 5C shows an example of the channel arrangement of the reference signal.

For example, “TpFL” in FIG. 5A and “L” in FIG. 5B after conversion are both diagonally front left of the user, but the opening angle as seen from the user changes slightly. There is also the difference that “TpFL” is in the upper layer while “L” is in the middle layer. Accordingly, a constraint may specify that the downmix coefficient for converting the audio signal assigned to channel “TpFL” into the audio signal assigned to channel “L” is set to, for example, “1.0” or a value of 1.0 or less. When the constraint fixes the value at “1.0”, the amount of computation needed to calculate the downmix coefficients can be reduced.

In this example, “TpFL” in the upper layer may also be associated in advance with “L”, which lies in the middle layer near the position corresponding to “TpFL”, and a constraint may set the downmix coefficient from “TpFL” to “L” to, for example, “1.0” and the downmix coefficients from “TpFL” to the other channels to “0”.

Further, for example, the position similarity between “TpFL” in FIG. 5A and “Ls” and “Rs” in FIG. 5B after conversion is low (the distance from the user becomes greater after conversion, the angle differs greatly, and there is also the difference between the upper and middle layers). In such a case, a constraint may specify that the downmix coefficients for converting the audio signal assigned to channel “TpFL” into the audio signals assigned to channels “Ls” and “Rs” are set to, for example, “0”.

Further, for example, the position similarity between “TpFL” in FIG. 5A and “L” in FIG. 5B after conversion is about the same as the position similarity between “TpFR” in FIG. 5A and “R” in FIG. 5B after conversion. In such a case, as with the downmix coefficients on the same plane, a constraint may specify that the same value is set for the downmix coefficient that converts the audio signal assigned to channel “TpFL” into the audio signal assigned to channel “L” and for the downmix coefficient that converts the audio signal assigned to channel “TpFR” into the audio signal assigned to channel “R”.

Further, for example, based on the distance between “TpFL” in FIG. 5A and “L” in FIG. 5B and the distance between “TpFL” in FIG. 5A and “C” in FIG. 5B, “TpFL” is closer to “L”. A constraint may therefore specify that the downmix coefficient that converts the audio signal assigned to “TpFL” into the audio signal assigned to “L” is the largest.

In this way, the distance or opening angle between each channel position of the source multi-channel audio signal and each channel position of the downmix destination is calculated, and the constraint requires that the downmix coefficient to the nearest channel position be the largest. This makes it possible to calculate downmix coefficients that preserve the characteristics of the original multi-channel audio signal.
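The nearest-channel rule summarized above can be sketched as follows. The three-dimensional coordinates and the function names are assumptions for illustration, and Euclidean distance is used here although the text equally allows the opening angle.

```python
import math

def nearest_destination(src_pos, dst_positions):
    """Return the name of the destination channel closest to `src_pos`.
    `dst_positions` maps destination name -> (x, y, z)."""
    return min(dst_positions, key=lambda name: math.dist(src_pos, dst_positions[name]))

def satisfies_nearest_max(src_pos, dst_positions, coeffs_for_src):
    """Check that the coefficient to the nearest destination is the largest.
    `coeffs_for_src` maps destination name -> coefficient from this source."""
    nearest = nearest_destination(src_pos, dst_positions)
    return all(coeffs_for_src[nearest] >= c for c in coeffs_for_src.values())

# Hypothetical coordinates: TpFL sits above L in the upper layer.
tpfl = (-0.5, 0.87, 0.6)
destinations = {"L": (-0.5, 0.87, 0.0), "C": (0.0, 1.0, 0.0), "Ls": (-0.94, -0.34, 0.0)}

closest = nearest_destination(tpfl, destinations)
ok = satisfies_nearest_max(tpfl, destinations, {"L": 0.8, "C": 0.3, "Ls": 0.0})
```

A check of this kind could be used to reject candidate coefficients during the update processing, or the ordering could be enforced directly when candidates are generated.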

(Constraints based on ease of listening)
Returning to FIGS. 3 and 5, other examples of constraints are described. A constraint may be defined from the viewpoint of how easy the sound is to hear. For example, when rear or side channels are downmixed into front channels, audio such as a dialogue audio signal assigned to the front channels may become difficult to hear. To maintain ease of listening in such a case, a constraint may specify that the downmix coefficient from an audio signal assigned to a channel located at the rear, the side, or both to an audio signal assigned to a front channel is multiplied by a correction value smaller than “1.0”.

Similarly, when channels in the upper layer, the lower layer, or both are downmixed into channels in the middle layer, the sound of the upper and lower layers may make the sound of the middle layer difficult to hear. Accordingly, a constraint may be added so that a correction value smaller than 1.0 is applied to the downmix coefficients from channels in the upper layer, the lower layer, or both to channels in the middle layer.

Further, channel arrangements are usually left-right symmetric, and if the loudness balance between the sounds heard from the left and right changes, the impression may differ greatly from that obtained when the multi-channel audio signal itself is played back. Accordingly, a constraint may be added so that, when calculating the downmix coefficients from channels arranged at left-right symmetric positions in the original multi-channel audio signal to the corresponding channels arranged at left-right symmetric positions, the same value is used for the left-side and right-side coefficients.

In some cases it is desirable that the characteristics of the downmix signal do not completely match those of the original multi-channel audio, as in the example where emphasizing the dialogue audio signal from the front makes it easier to hear. Applying constraints based on ease of listening secures ease of listening for the user.
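The correction value and the left-right symmetry constraint can be sketched together. The correction value 0.7, the averaging used to equalize a symmetric pair, and both function names are assumptions for illustration. The description requires only a value below 1.0 and equal left and right coefficients.

```python
def scale_by_correction(coeff, correction):
    """Apply a correction value below 1.0, for example to a coefficient from an
    upper-layer, rear or side channel."""
    if not 0.0 <= correction < 1.0:
        raise ValueError("correction value must be in [0, 1)")
    return correction * coeff

def symmetrize(coeffs, pairs):
    """Force each left/right pair of coefficients to a common value.
    `coeffs` maps (destination, source) -> coefficient.
    `pairs` lists ((dst_left, src_left), (dst_right, src_right)).
    The common value is the mean of the two (an assumed choice)."""
    for left, right in pairs:
        mean = 0.5 * (coeffs[left] + coeffs[right])
        coeffs[left] = mean
        coeffs[right] = mean
    return coeffs

# Hypothetical coefficients before the constraints are applied.
coeffs = {("L", "FL"): 0.9, ("R", "FR"): 0.7}
symmetrize(coeffs, [(("L", "FL"), ("R", "FR"))])
upper_to_middle = scale_by_correction(1.0, 0.7)
```

Averaging is only one way to reach equal values. Sharing a single free parameter between the two entries during the update would achieve the same constraint.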

(Constraints based on importance)
Next, constraints based on importance are described. A constraint may be defined from the viewpoint of importance. For example, in a news program, the dialogue audio signal of an announcer or the like is the most important. In such a case, a high importance may be set for the channels located at the front (for example, “L”, “C”, and “R” in FIG. 5), on the basis that these channels carry the dialogue audio signal. The importance is input, for example, as metadata of the multi-channel audio signal. Alternatively, it may be set in the downmix coefficient updating unit 16 through user input.
In this case, for example, a constraint may be added that sets the largest values for the downmix coefficients from “FL” to “L”, from “FC” to “C”, and from “FR” to “R”.
Applying a constraint based on importance makes it possible to downmix in a way that emphasizes the impression of a particular audio signal.

The constraints described above may be applied not only when calculating the first downmix coefficient (M × N) but also when calculating the second downmix coefficient (S × M). Further, where a constraint predetermines that a downmix coefficient is “1.0”, the corresponding element of the initial value of the first downmix coefficient (M × N) or of the second downmix coefficient (S × M) may already be set to “1.0” at the initial-value stage. Even where a constraint does not fix a value such as “1.0”, each element of the initial values of the first downmix coefficient (M × N) and the second downmix coefficient (S × M) may be set to a predetermined value that takes into account the constraints based on position similarity and the like described above.

Next, an example of the first downmix coefficient (M × N) and the second downmix coefficient (S × M) calculated under these constraints is described with reference to FIG. 6.
FIG. 6 is a second diagram showing an example of the downmix coefficients in the first embodiment of the present invention.
The first downmix coefficient (M × N) and the second downmix coefficient (S × M) illustrated in FIG. 6 were calculated by the downmix coefficient updating unit 16 so as to satisfy the constraints.
The upper part of FIG. 6 shows an example of the first downmix coefficient (M × N) from each channel signal included in the multi-channel audio signal illustrated in FIG. 5A to each channel signal included in the desired first downmix signal (M) illustrated in FIG. 5B. For example, “L” and “R”, and “FL” and “FR”, are arranged at left-right symmetric positions. Accordingly, the same value “C1” is set for the downmix coefficient that converts the audio signal assigned to “FL” into the audio signal assigned to channel “L” and for the downmix coefficient that converts the audio signal assigned to “FR” into the audio signal assigned to channel “R”. In addition, “FL” and “L” are at almost the same position, as are “FR” and “R”. The value of “C1” may therefore be, for example, “1.0”. Further, for example, the factor “k1” in the downmix coefficient “k1C1”, which converts the audio signal assigned to “TpFL” into the audio signal assigned to channel “L”, is an example of the correction value by which a downmix coefficient from the upper layer to a middle-layer channel is multiplied. Here, k1 is a value smaller than 1. Comparing the distance between “SiL” and “L” with the distance between “SiL” and “Ls”, the distance between “SiL” and “Ls” is shorter. Accordingly, a larger value is set for the downmix coefficient from “SiL” to “Ls” (C4 ≥ C3). Similarly, comparing the distance between “BL” and “Ls” with the distance between “BR” and “Ls”, the distance between “BL” and “Ls” is shorter. Accordingly, of “BL” and “BR”, the downmix coefficient from “BL” to “Ls” is set to the larger value (C5 ≥ C6). Further, the downmix coefficients that convert the audio signals assigned to the upper-layer channels “TpBL” and “TpBR” into the audio signal assigned to channel “Ls” include the correction value “k1” used for downmix coefficients from the upper layer to middle-layer channels.

The lower part of FIG. 6 shows an example of the second downmix coefficient (S × M) from each channel signal of the desired first downmix signal (M) illustrated in FIG. 5B to the reference signal illustrated in FIG. 5C. The constraints described above can also be applied to the second downmix coefficient (S × M). For example, k2 is a correction value used for downmix coefficients from rear channels to front channels. Further, because “Lt” is closer to “L” than to “C”, a larger value is set for the downmix coefficient from “L” to “Lt” (Ct1 > Ct2).

FIG. 7 is a flowchart showing an example of the channel number conversion processing in the first embodiment of the present invention.
The flow of the channel number conversion processing of the present embodiment is described with reference to FIG. 7.
As a premise, the position information (coordinate information) of each channel of the multi-channel audio signal, the number of speakers in the playback environment of the desired downmix signal, and the position information of each speaker have been input to the channel number conversion device 10 in advance, and the speaker position information input unit 13 has accepted this input. The speaker position information input unit 13 has output the position information of each channel of the multi-channel audio signal and the position information of each speaker in the playback environment to the downmix coefficient updating unit 16. Various constraints for the downmix coefficient calculation have been set in the downmix coefficient updating unit 16. The downmix coefficient storage unit 17 stores the initial value of the first downmix coefficient (M × N) and the initial value of the second downmix coefficient (S × M).

First, in step S11, the reference signal input unit 12 receives a reference signal and outputs it to the difference signal calculation unit 15. In parallel with step S11, in step S12, the multi-channel audio signal input unit 11 receives the multi-channel audio signal and outputs it to the downmix signal calculation unit 14.
Next, in step S13, the first downmix signal calculation unit 141 of the downmix signal calculation unit 14 calculates the first downmix signal (M). Specifically, the first downmix signal calculation unit 141 reads the initial value of the first downmix coefficient (M × N) from the downmix coefficient storage unit 17 and downmixes the multi-channel audio signal with this initial value to calculate the first downmix signal (M). The first downmix signal calculation unit 141 then outputs the first downmix signal (M) to the second downmix signal calculation unit 142.
Next, in step S14, the second downmix signal calculation unit 142 calculates the second downmix signal (S). Specifically, the second downmix signal calculation unit 142 reads the initial value of the second downmix coefficient (S × M) from the downmix coefficient storage unit 17 and downmixes the first downmix signal (M) with this initial value to calculate the second downmix signal (S). The second downmix signal calculation unit 142 outputs the second downmix signal (S) to the difference signal calculation unit 15.
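Steps S13 and S14 amount to two successive matrix multiplications. The following is a minimal sketch under stated assumptions: the function name downmix, the channel counts (N = 3, M = 2, S = 1), and all coefficient values are hypothetical and chosen only for illustration; they are not values from this specification.

```python
from typing import List

Matrix = List[List[float]]  # rows x columns


def downmix(coeff: Matrix, signal: Matrix) -> Matrix:
    """Multiply an (out_ch x in_ch) coefficient matrix by an
    (in_ch x samples) signal, giving an (out_ch x samples) signal."""
    in_ch = len(signal)
    if any(len(row) != in_ch for row in coeff):
        raise ValueError("coefficient columns must match input channels")
    n_samples = len(signal[0])
    return [
        [sum(row[c] * signal[c][t] for c in range(in_ch)) for t in range(n_samples)]
        for row in coeff
    ]


# Hypothetical sizes: N = 3 source channels, M = 2 playback channels, S = 1.
x = [[1.0, 2.0], [0.0, 4.0], [2.0, 2.0]]            # N x samples
first_coeff = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]    # M x N (illustrative values)
second_coeff = [[0.5, 0.5]]                         # S x M (illustrative values)

first_downmix = downmix(first_coeff, x)                 # step S13
second_downmix = downmix(second_coeff, first_downmix)   # step S14
```

In this sketch the first downmix signal is what would eventually be sent to the reproduction device, while the second downmix signal exists only to be compared with the reference signal.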

Next, in step S15, the difference signal calculation unit 15 calculates the difference between the second downmix signal (S) and the reference signal. The difference may be calculated as, for example, the energy difference between the two signals, the mean squared error, or 1 minus the normalized cross-correlation coefficient.
Next, in step S16, the difference signal calculation unit 15 determines whether the difference is less than or equal to a predetermined threshold. Alternatively, the difference signal calculation unit 15 may determine whether the difference has reached its minimum. When the difference is less than or equal to the threshold (or has reached its minimum), the first downmix signal calculation unit 141 outputs to the reproduction device the first downmix signal (M) that it generated in step S13 by downmixing the multi-channel audio signal with the first downmix coefficient (M × N).
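The three difference measures named for step S15 and the threshold test of step S16 could be sketched as follows. This is an illustration only: the specification does not define the measures in detail, so pooling all channels into one vector, evaluating the cross-correlation at zero lag without mean removal, and all function names are assumptions made here.

```python
import math
from typing import List

Signal = List[List[float]]  # channels x samples


def _flatten(sig: Signal) -> List[float]:
    # Assumption: all channels are pooled into a single vector.
    return [v for channel in sig for v in channel]


def energy_difference(a: Signal, b: Signal) -> float:
    """Absolute difference between the total energies of two signals."""
    energy_a = sum(v * v for v in _flatten(a))
    energy_b = sum(v * v for v in _flatten(b))
    return abs(energy_a - energy_b)


def mean_squared_error(a: Signal, b: Signal) -> float:
    fa, fb = _flatten(a), _flatten(b)
    return sum((p - q) ** 2 for p, q in zip(fa, fb)) / len(fa)


def one_minus_ncc(a: Signal, b: Signal) -> float:
    """1 minus the zero-lag normalized cross-correlation coefficient."""
    fa, fb = _flatten(a), _flatten(b)
    dot = sum(p * q for p, q in zip(fa, fb))
    norm = math.sqrt(sum(p * p for p in fa) * sum(q * q for q in fb))
    return 1.0 - dot / norm if norm > 0.0 else 1.0


def is_converged(difference: float, threshold: float) -> bool:
    """Step S16: the difference is less than or equal to the threshold."""
    return difference <= threshold


second_downmix = [[1.0, 2.0], [3.0, 4.0]]   # illustrative values
reference = [[1.0, 2.0], [3.0, 2.0]]        # illustrative values
diff = mean_squared_error(second_downmix, reference)
```

All three measures are zero (up to rounding) when the two signals are identical, so any of them can drive the same threshold test.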

When the difference is larger than the threshold (or has not yet reached its minimum), the difference signal calculation unit 15 outputs the calculated difference to the downmix coefficient updating unit 16.
Next, in step S17, the downmix coefficient updating unit 16 updates the downmix coefficients. The downmix coefficient updating unit 16 calculates a first downmix coefficient (M × N) and a second downmix coefficient (S × M) that reduce the difference while satisfying the constraint conditions described with reference to FIGS. 3 to 6.
Alternatively, when the second downmix coefficient (S × M) is fixed, the downmix coefficient updating unit 16 calculates only the first downmix coefficient (M × N). A genetic algorithm, the steepest descent method, stochastic gradient descent, or the like may be used to calculate the first downmix coefficient (M × N) and the like. After calculating the first downmix coefficient (M × N) and the like, the downmix coefficient updating unit 16 records the newly calculated values in the downmix coefficient storage unit 17. The processing from step S13 is then repeated until the difference becomes less than or equal to the threshold.
In the second and subsequent executions of steps S13 and S14, the initial values of the first downmix coefficient (M × N) and the second downmix coefficient (S × M) stored in the downmix coefficient storage unit 17 are not used. Instead, the first downmix coefficient (M × N) and the second downmix coefficient (S × M) that the downmix coefficient updating unit 16 calculated in step S17 and recorded in the downmix coefficient storage unit 17 are used.
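The loop over steps S13 to S17 can be sketched for the case in which the second downmix coefficient is fixed and only the first is updated by steepest descent on the mean squared error. This is a simplified illustration, not the method of the specification: the constraint conditions of FIGS. 3 to 6 are deliberately omitted, and the function names, step size, threshold, and all signal and coefficient values are assumptions.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def mse(a: Matrix, b: Matrix) -> float:
    n = len(a) * len(a[0])
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n


def optimize_first_coeff(x: Matrix, ref: Matrix, first: Matrix, second: Matrix,
                         threshold: float = 1e-6, step: float = 0.1,
                         max_iter: int = 5000) -> Tuple[Matrix, float]:
    """Repeat S13-S17 with the second coefficient fixed (no constraints).
    Returns the coefficient and the last difference that was evaluated."""
    n = len(ref) * len(ref[0])
    diff = float("inf")
    for _ in range(max_iter):
        first_dm = matmul(first, x)             # step S13
        second_dm = matmul(second, first_dm)    # step S14
        diff = mse(second_dm, ref)              # step S15
        if diff <= threshold:                   # step S16
            break
        # Step S17: gradient of the MSE with respect to the first coefficient,
        # (2 / n) * second^T * (second_dm - ref) * x^T.
        err = [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(second_dm, ref)]
        grad = matmul(matmul(transpose(second), err), transpose(x))
        first = [[a - step * 2.0 / n * g for a, g in zip(ra, rg)]
                 for ra, rg in zip(first, grad)]
    return first, diff


# Illustrative data: N = 3, M = 2, S = 1, four samples.
x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
ref = [[1.0, 0.7, 1.0, 2.7]]                 # equals [1, 0.7, 1] applied to x
first_init = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
second_fixed = [[1.0, 1.0]]
first_opt, final_diff = optimize_first_coeff(x, ref, first_init, second_fixed)
```

A practical implementation would additionally project or penalize the coefficients after each update so that the speaker-position constraints remain satisfied; that part depends on details not restated here.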

The basic audio format in terrestrial digital broadcasting is stereo 2ch, while BS digital broadcasting uses stereo 2ch or 5.1ch surround. For 4K/8K broadcasting, so-called simulcasting of a stereo 2ch audio signal (reference signal) together with the 22.2ch multi-channel audio is being considered.
Therefore, in the present embodiment, a first downmix signal (M) corresponding to the speaker arrangement of the reproduction environment, and a second downmix signal (S) having the same number of channels as the reference signal, are created from the multi-channel audio signal using the initial values of the downmix coefficients. The downmix coefficients are then optimized so that the difference between the stereo 2ch audio signal and the second downmix signal (S) is minimized.
At this time, by imposing constraint conditions based on the speaker positions on the conversion from the original multi-channel audio signal to the desired first downmix signal (M), a first downmix coefficient (M × N) can be calculated that yields a first downmix signal (M) preserving the impression of the original multi-channel audio as far as possible. In addition, by referring to the broadcast stereo 2ch audio signal, which reflects the intention of the program producer, a first downmix coefficient (M × N) can be calculated that yields a first downmix signal (M) more in line with that intention.
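The optimization of the first embodiment can be written compactly as follows. The notation is introduced here only for illustration and does not appear in the specification: x(t) is the N-channel multi-channel audio signal, A is the first downmix coefficient (M × N), B is the second downmix coefficient (S × M), r(t) is the S-channel reference signal, d is whichever difference measure is chosen in step S15, and C denotes the set of coefficient pairs that satisfy the speaker-position constraint conditions.

```latex
\begin{aligned}
y_M(t) &= A\,x(t), \qquad y_S(t) = B\,y_M(t) = B A\,x(t),\\
(\hat{A}, \hat{B}) &= \operatorname*{arg\,min}_{(A,B)\,\in\,\mathcal{C}} \; d\bigl(y_S,\, r\bigr).
\end{aligned}
```

Only the first downmix signal obtained with the optimized A is sent to the reproduction device; B serves to make the result comparable with the reference signal.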

<Second embodiment>
Hereinafter, a channel number conversion device according to a second embodiment of the present invention will be described with reference to FIGS. 8 and 9.
FIG. 8 is a block diagram showing an example of the channel number conversion device according to the second embodiment of the present invention.
As shown in FIG. 8, the channel number conversion device 10a includes a multi-channel audio signal input unit 11, a speaker position information input unit 13, a downmix signal calculation unit 14, a difference signal calculation unit 15, a downmix coefficient updating unit 16, a downmix coefficient storage unit 17, and a reference signal calculation unit 18. That is, the channel number conversion device 10a according to the second embodiment includes the reference signal calculation unit 18 in place of the reference signal input unit 12 of the first embodiment. The other components are the same as in the first embodiment.

The reference signal calculation unit 18 downmixes the multi-channel audio signal using a predetermined downmix coefficient to calculate a reference signal, for example a 2ch stereo signal.
The first embodiment assumes that a stereo 2ch audio signal (reference signal) is broadcast together with the multi-channel audio signal. However, a reference signal corresponding to the multi-channel audio signal is not always available. For example, downmix coefficients for producing the reference signal may be transmitted as metadata attached to the multi-channel audio signal. Therefore, in the second embodiment, the reference signal calculation unit 18 calculates the reference signal from the input multi-channel audio signal.

Next, the flow of the channel number conversion processing of this embodiment will be described with reference to FIG. 9.
FIG. 9 is a flowchart showing an example of the channel number conversion processing in the second embodiment of the present invention.
As a premise, it is assumed that a downmix coefficient for downmixing the multi-channel audio signal into a stereo 2ch audio signal (reference signal), referred to as the reference downmix coefficient, is set in the reference signal calculation unit 18 in advance. The other preconditions are the same as in the first embodiment. Processing that is the same as in FIG. 7 is described only briefly.
First, in step S12, the multi-channel audio signal input unit 11 receives the multi-channel audio signal and outputs it to the reference signal calculation unit 18 and the downmix signal calculation unit 14.
In step S121, the reference signal calculation unit 18 downmixes the multi-channel audio signal with the reference downmix coefficient to calculate the reference signal. The reference signal calculation unit 18 outputs the calculated reference signal to the difference signal calculation unit 15. The subsequent processing is the same as in the first embodiment.
That is, in step S13, the first downmix signal calculation unit 141 calculates the first downmix signal (M), and in step S14, the second downmix signal calculation unit 142 calculates the second downmix signal (S).
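Step S121 can be sketched as a single matrix multiplication of the reference downmix coefficient with the multi-channel audio signal. The example below is only an illustration: the function name compute_reference, the three-channel input, and the coefficient values are hypothetical and are not taken from any standard or from this specification.

```python
from typing import List

Matrix = List[List[float]]


def compute_reference(ref_coeff: Matrix, x: Matrix) -> Matrix:
    """Step S121: reference = ref_coeff (S x N) times x (N x samples)."""
    if any(len(row) != len(x) for row in ref_coeff):
        raise ValueError("reference coefficient columns must match input channels")
    n_samples = len(x[0])
    return [
        [sum(c * channel[t] for c, channel in zip(row, x)) for t in range(n_samples)]
        for row in ref_coeff
    ]


# Hypothetical 3-channel input and illustrative 2ch reference downmix coefficient.
x = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]            # N x samples
ref_coeff = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]      # S x N, e.g. received as metadata
reference = compute_reference(ref_coeff, x)
```

The resulting reference then takes the place of the broadcast stereo signal in the comparison of step S15.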

Next, in step S15, the difference signal calculation unit 15 calculates the difference between the second downmix signal (S) calculated by the second downmix signal calculation unit 142 and the reference signal calculated by the reference signal calculation unit 18. Next, in step S16, the difference signal calculation unit 15 determines whether the difference is less than or equal to a predetermined threshold, and if it is, the first downmix signal calculation unit 141 outputs the first downmix signal (M) to the reproduction device.

If the difference is larger than the threshold, the downmix coefficient updating unit 16 updates the downmix coefficients in step S17, and the processing from step S13 is repeated until the difference becomes smaller than the threshold.

According to this embodiment, even when stereo 2ch is not broadcast simultaneously, a signal downmixed with downmix coefficients optimized for conversion from the multi-channel audio format to an existing channel count such as stereo 2ch is used in place of the reference signal (for example, the optimized downmix coefficients may be attached to the multi-channel audio signal as metadata). As a result, by the same procedure as in the first embodiment, a first downmix coefficient (M × N) suited to the number and arrangement of speakers of the reproduction device can be calculated for each program, while preserving as far as possible the impression of the sound as reproduced with the number and arrangement of speakers used when the multi-channel audio signal was produced.

The channel number conversion devices 10 and 10a described above each contain a computer system. The operation of the channel number conversion device 10 and the like is stored in the form of a program on a computer-readable recording medium, and the processing described above is performed when the computer system reads and executes this program. The computer system referred to here includes a CPU, various memories, an OS, and hardware such as peripheral devices.

The "computer system" also includes a website provision environment (or display environment) when a WWW system is used.
The "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage device such as a hard disk built into a computer system. It also includes media that hold a program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold the program for a certain period, such as the volatile memory inside a computer system serving as the server or client in that case. The program may implement only some of the functions described above, or may implement them in combination with a program already recorded in the computer system.

In addition, the components of the embodiments described above may be replaced with well-known components as appropriate without departing from the spirit of the present invention. The technical scope of the present invention is not limited to the embodiments described above, and various modifications can be made without departing from the spirit of the present invention.
The first downmix coefficient (M × N) and its initial value are examples of the first downmix coefficient. The second downmix coefficient (S × M) and its initial value are examples of the second downmix coefficient. The first downmix signal (M) is an example of the first downmix signal, and the second downmix signal (S) is an example of the second downmix signal.

10, 10a ... Channel number conversion device
11 ... Multi-channel audio signal input unit
12 ... Reference signal input unit
13 ... Speaker position information input unit
14 ... Downmix signal calculation unit
141 ... First downmix signal calculation unit
142 ... Second downmix signal calculation unit
15 ... Difference signal calculation unit
16 ... Downmix coefficient updating unit
17 ... Downmix coefficient storage unit
18 ... Reference signal calculation unit

Claims

A channel number conversion device comprising:
a downmix signal calculation unit that calculates a desired first downmix signal from a multi-channel audio signal using a first downmix coefficient, and calculates, from the first downmix signal using a second downmix coefficient, a second downmix signal having the same number of channels as a reference signal;
a difference signal calculation unit that calculates a difference between the second downmix signal and the reference signal; and
a downmix coefficient updating unit that updates the first downmix coefficient and the second downmix coefficient so that the difference calculated by the difference signal calculation unit becomes a minimum or becomes less than or equal to a predetermined threshold value.

The channel number conversion device according to claim 1, further comprising a reference signal calculation unit that downmixes the multi-channel audio signal using a predetermined downmix coefficient to calculate the reference signal.

The channel number conversion device according to claim 1 or 2, wherein the downmix coefficient updating unit fixes the second downmix coefficient and updates only the first downmix coefficient.

The channel number conversion device according to any one of claims 1 to 3, further comprising a downmix coefficient storage unit that stores at least one of an initial value of the first downmix coefficient and an initial value of the second downmix coefficient.

The channel number conversion device according to claim 4, wherein the downmix coefficient storage unit stores the first downmix coefficient with an initial value determined based on the positional relationship between the reproduction position of the audio signal of each channel included in the multi-channel audio signal and the reproduction position of the audio signal of each channel included in the first downmix signal.

The channel number conversion device according to claim 4 or 5, wherein the downmix coefficient storage unit stores the second downmix coefficient with an initial value determined based on the positional relationship between the reproduction position of the audio signal of each channel included in the first downmix signal and the reproduction position of the audio signal of each channel included in the second downmix signal.

The channel number conversion device according to any one of claims 1 to 6, wherein the downmix coefficient updating unit updates the first downmix coefficient based on a constraint condition determined by the positional similarity between the reproduction position of the audio signal of each channel included in the multi-channel audio signal and the reproduction position of the audio signal of each channel included in the first downmix signal.

The channel number conversion device according to claim 7, wherein the downmix coefficient updating unit updates the first downmix coefficient under the constraint that the value of the downmix coefficient between the channels having the highest positional similarity is the maximum.

The channel number conversion device according to claim 1, claim 2, or any one of claims 4 to 8 that does not depend on claim 3, wherein the downmix coefficient updating unit updates the second downmix coefficient based on a constraint condition determined by the positional similarity between the reproduction position of the audio signal of each channel included in the first downmix signal and the reproduction position of the audio signal of each channel included in the second downmix signal.

The channel number conversion device according to claim 9, wherein the downmix coefficient updating unit updates the second downmix coefficient under the constraint that the value of the downmix coefficient between the channels having the highest positional similarity is the maximum.

A program for causing a computer to function as the channel number conversion device according to any one of claims 1 to 10.