JP5333043B2

JP5333043B2 - Mixing apparatus and mixing method

Info

Publication number: JP5333043B2
Application number: JP2009192064A
Authority: JP
Inventors: 治吉野
Original assignee: JVCKenwood Corp
Current assignee: JVCKenwood Corp
Priority date: 2009-08-21
Filing date: 2009-08-21
Publication date: 2013-11-06
Anticipated expiration: 2029-08-21
Also published as: JP2011044924A

Abstract

PROBLEM TO BE SOLVED: To perform stable mixing processing, without needlessly enlarging the upper-limit accumulation amount of a buffer, when packet arrival jitter is apparent between a plurality of channels.

SOLUTION: A mixing device 130 includes a packet acquisition unit that acquires, in units of packets, the sound signals of a plurality of channels transmitted through a communication network 120; a packet loss determination unit 218 that regards a packet as missing when the delay in its arrival time exceeds a prescribed time; a packet complementing unit 220 that complements the packets regarded as missing; a mixing buffer 216 that temporarily holds the packets of the channels; and a mixing processing unit 222 that adds the packets of the channels to generate one packet. The prescribed time is made different for each channel according to the priority of the channels.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a mixing device and a mixing method that add the sound signals of a plurality of channels to generate a new sound signal.

In recent years, techniques for transmitting sound signals such as music and voice over a communication network that uses IP (Internet Protocol), that is, an IP network, have been adopted in a variety of services. With such transmission, a sound signal is, for example, first divided into packets of a predetermined data amount; each packet travels through the communication network individually, and the original sound signal is decoded at the destination.

In such a communication network the transmission path is not fixed, so the arrival times of packets are not uniform, and jitter (fluctuation of the sound signal) can arise in the arrival times, particularly when packets pass through network equipment such as routers. When a sound signal is decoded from packets, a packet that has not arrived by the time its sound should be output is regarded as lost, which causes dropouts and sounds unnatural to the listener. A technique is therefore known in which the packet receiving device is provided with a buffer that temporarily holds the packets of a single sound signal, absorbing the temporal fluctuation of packets passing through the communication network and preventing dropouts (for example, Patent Document 1).

A technique has also been disclosed that determines whether the fluctuation of arriving packets of a single sound signal can be absorbed by the buffer and, when it cannot, determines the importance of each arrived packet, discards voice packets of low importance, and reproduces voice packets of high importance (Patent Document 2).

Patent Document 1: JP 2007-259471 A
Patent Document 2: JP 2007-258928 A

Sound signals are used not only singly (as a one-channel sound signal), as described above, but also in so-called mixing, in which the sound signals of a plurality of channels that differ in sound source, sound quality, and so on are added to generate a new sound signal. If the sound signals of the channels can be acquired in a synchronized state, the desired sound signal is obtained simply by mixing the acquired signals as they are. Dedicated cables have conventionally been used to transmit the sound signals of such a plurality of channels. With dedicated cables, however, the number of cables has to grow with the number of channels and of transmitting and receiving devices, and reconfiguration is laborious, for example because installation work is required.

Distributing the sound signals over the communication network described above is therefore considered. A communication network requires no physical change to the wiring when the number of channels increases or the transmitting and receiving devices are reconfigured, and it is flexible in that video signals and control information can be carried in addition to sound signals. However, when packets based on the sound signals of a plurality of channels are acquired through a communication network, packets of the channels that do not arrive at uniform times must be aligned on the time axis before mixing, which means waiting for packets that have not yet arrived. To keep the delay of the final sound signal small, little time can then be spent on the mixing process itself.

Furthermore, if the next packet of another channel arrives before the packet of a given channel has arrived, the earlier packet of that other channel is lost. The prior art described above can absorb the fluctuation between successive packets of a one-channel sound signal, but it does not address any means of absorbing the fluctuation in packet arrival times between a plurality of channels.

In view of these problems, an object of the present invention is to provide a mixing device and a mixing method capable of performing stable mixing processing, without needlessly enlarging the upper-limit accumulation amount of a buffer, when packet fluctuation is apparent between a plurality of channels.

To solve the above problems, a mixing device that adds the sound signals of a plurality of channels transmitted through a communication network to generate a new sound signal includes: a packet acquisition unit that acquires, in units of packets, the sound signals of the plurality of channels transmitted through the communication network; a packet loss determination unit that compares the delay in arrival time of a packet in each of the acquired channels, relative to the packet of the other channel acquired earliest, with a predetermined time, and regards the packet as lost when the delay exceeds the predetermined time; a packet complementing unit that complements the packet regarded as lost; a mixing buffer that temporarily holds the packets of the plurality of channels in order to absorb delays in packet arrival time; and a mixing processing unit that adds the packets of the plurality of channels, whose time axes have been aligned by the mixing buffer, to generate one packet. The predetermined time is made different for each channel according to the priorities of the plurality of channels.

Providing a mixing buffer for each channel in this way absorbs the fluctuation in packet arrival time between channels and makes it possible to avoid packet loss. On the other hand, providing the mixing buffer can introduce a delay corresponding to its upper-limit accumulation amount before the mixing process is executed. Even then, the mixing process must be completed in a short time in order to limit the delay of the final sound signal, and this raises the likelihood of audible noise such as dropouts. Yet if the packet loss determination is uniformly advanced for all channels in order to leave a margin for the mixing process, packets of high-priority channels are also judged lost early, which itself causes dropouts. Here, the time after which a packet is regarded as lost is varied according to priority: for a high-priority channel, packets are kept from being lost as far as possible even at the cost of a shorter mixing processing time, while for a low-priority channel the packet loss determination is made early so that the mixing process can run with a margin. Stable mixing processing is thereby achieved.

The predetermined time may be set shorter for a low-priority channel than for a high-priority channel. Setting the predetermined time used to judge the delay shorter for a low-priority channel, and thus advancing its packet loss determination, allows the mixing of the high-priority channel to be executed early regardless of packet delay on the low-priority channel. Such stable mixing processing makes it possible to output at least the sound signal of the high-priority channel reliably and stably.
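The priority-dependent setting described above can be sketched as follows. This is only an illustration: the integer priority scale (larger means more important), the function name `assign_timeouts`, and the millisecond values are assumptions, not taken from the source.

```python
def assign_timeouts(priorities, base_ms=40.0, step_ms=10.0):
    """Map each channel to a loss-decision timeout in milliseconds.

    priorities: dict of channel name -> priority (assumed: larger = more important).
    The lowest priority present gets base_ms; each higher priority rank adds
    step_ms, so low-priority channels are judged lost sooner.
    base_ms and step_ms are illustrative values only.
    """
    ranks = sorted(set(priorities.values()))
    return {ch: base_ms + step_ms * ranks.index(p)
            for ch, p in priorities.items()}


# Channel A is the high-priority channel in this hypothetical setup.
timeouts = assign_timeouts({"A": 2, "B": 1, "C": 1})
```

With these assumed values, channels B and C share the shorter timeout, so a late packet on either of them stops blocking the mix before a late packet on channel A would.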

The packet complementing unit may complement a packet regarded as lost by any of the following: replacement with a silent packet or a dummy packet, repetition of a past packet, or generation of an interpolated packet and replacement with it. With this configuration, a packet regarded as lost is complemented with another reproducible packet, and normal reproduction can continue.

To solve the above problems, in a mixing method that adds the sound signals of a plurality of channels transmitted through a communication network to generate a new sound signal: a predetermined time indicating the allowable delay in arrival time of a packet in each channel relative to the packets of the other channels is set differently for each channel according to the priorities of the channels; the sound signals of the plurality of channels transmitted through the communication network are acquired in units of packets; the delay in arrival time of a packet in each acquired channel, relative to the packet of the other channel acquired earliest, is compared with the predetermined time set for that channel, and the packet is regarded as lost when the delay exceeds the predetermined time; the packet regarded as lost is complemented; the packets of the channels are temporarily held in a mixing buffer in order to absorb delays in packet arrival time; and the packets of the channels, whose time axes have been aligned by being held in the mixing buffer, are added to generate one packet.

The components based on the technical idea of the mixing device described above, and their descriptions, are also applicable to the mixing method.

As described above, the present invention makes it possible to perform stable mixing processing, without needlessly enlarging the upper-limit accumulation amount of a buffer, when packet fluctuation is apparent between a plurality of channels.

FIG. 1 is an explanatory diagram showing the schematic connection relationship of the sound signal output system.
FIG. 2 is a functional block diagram showing the schematic configuration of the mixing device.
FIG. 3 is an explanatory diagram for explaining the operation of the mixing buffer.
FIG. 4 is an explanatory diagram for explaining the operation of the packet loss determination unit.
FIG. 5 is an explanatory diagram for explaining the operation of the packet loss determination unit when the predetermined time differs for each channel.
FIG. 6 is a flowchart showing the processing flow of the mixing method.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. The dimensions, materials, and other specific numerical values shown in the embodiments are merely examples to facilitate understanding of the invention and, unless otherwise stated, do not limit the present invention. In this specification and the drawings, elements having substantially the same function and configuration are denoted by the same reference numerals and their description is not repeated, and elements not directly related to the present invention are omitted from the drawings.

(Sound signal output system 100)
FIG. 1 is an explanatory diagram showing the schematic connection relationship of the sound signal output system 100 according to the present embodiment. The sound signal output system 100 includes a distribution server 110, a communication network 120, a mixing device 130, and a sound output device 140.

The distribution server 110 outputs the sound signal desired by the user of the mixing device 130 to the communication network 120 using so-called VoIP (Voice over Internet Protocol), in which sound signals such as music and voice are compressed with any of various existing encoding schemes, packetized, and transmitted to the communication network 120. In this embodiment the source of the sound signal is the distribution server 110 for ease of understanding, but it is not limited to a server; various electronic devices connectable to the communication network 120 can be used. Likewise, VoIP is used as the transmission protocol over the communication network 120, but this is not limiting, and various protocols capable of transmitting sound signals (audio data) in units of packets can be used. In this embodiment, such distribution servers 110 output the sound signals of a plurality of channels, either singly or with several servers operating in cooperation.

The communication network 120 connects arbitrary devices for communication and, when IP is used for example, conveys information in units of packets each consisting of an IP header and a payload. Because the packet transmission path is not fixed in the communication network 120, however, the times at which packets output from the distribution server 110 reach the mixing device 130 are not uniform, and jitter (fluctuation of the sound signal) can arise in the arrival times, particularly when packets pass through network equipment such as routers. Moreover, the communication network 120 does not even guarantee that packets arrive, so packet loss must also be taken into account.

The mixing device 130 acquires the sound signals of a plurality of channels in units of packets through the communication network 120, adds the sound signals of those channels (mixing process) to generate a sound signal for each of one or more output lines, and outputs it to the sound output device 140. The mixing device 130 can be configured as a standalone device that executes the mixing process, but it can also be implemented in various electric devices connectable to the communication network 120, such as a BD (Blu-ray Disc) recorder/player, DVD recorder/player, CD player, HDD (hard disk drive) recorder/player, personal computer, notebook personal computer, PDA (Personal Digital Assistant), mobile phone, PHS (Personal Handy-phone System) terminal, digital camera, digital video camera, television, or game machine.

The sound output device 140 consists of a dynamic speaker, condenser speaker, bone conduction speaker, headphones, or the like, and converts the electrical sound signal output from the mixing device 130 into physical vibration to output sound (voice).

In the sound signal output system 100, the sound signals of a plurality of channels transmitted from one or more distribution servers 110 are mixed by the mixing device 130, and the sound signals for one or more output lines are output from the sound output device 140. In the mixing device 130, a buffer is provided for each of the channels to absorb the fluctuation in packet arrival time between channels that results from passing through the communication network 120. However, when the total buffer capacity is divided evenly among the channels, the amount of data each channel's buffer can hold is limited, and a packet that has not arrived after a predetermined time has to be regarded as lost. If the timing at which a packet is judged lost is set uniformly for every channel regardless of channel priority, a delay on a low-priority channel can stall the mixing process and cause dropouts on a high-priority channel that should be mixed together with it. This embodiment therefore does not merely provide a buffer for each channel; it also performs packet processing according to channel priority.

In this way, the mixing device 130 of the present embodiment can perform stable mixing processing, without needlessly enlarging the upper-limit accumulation amount of the buffer, when packet fluctuation is apparent between a plurality of channels. The specific configuration of the mixing device 130 for achieving this object is described below, and its processing flow is detailed afterwards.

(Mixing device 130)
FIG. 2 is a functional block diagram showing the schematic configuration of the mixing device 130. The mixing device 130 includes a communication unit 210, a control unit 212, an operation unit 214, a mixing buffer 216, a packet loss determination unit 218, a packet complementing unit 220, a mixing processing unit 222, and an output buffer 224.

The communication unit 210 is connected to the communication network 120 by wire or wirelessly, for example by Bluetooth (registered trademark) or IEEE 802.11a/b/g/n, and establishes VoIP communication with the distribution server 110. The communication unit 210 also functions as the packet acquisition unit and acquires, in units of packets, the sound signals of the plurality of channels transmitted through the communication network 120.

The control unit 212 controls the mixing device 130 as a whole by means of a semiconductor integrated circuit that includes a central processing unit (CPU), a ROM storing programs and the like, and a RAM serving as a work area. In this embodiment in particular, the control unit 212 sets the predetermined time, described later, for each mixing buffer according to the priority of each channel acquired by the communication unit 210. This configuration is described in detail later.

The operation unit 214 consists of switches such as a keyboard, a cross key, and a joystick and, where a display is provided, a touch panel or the like mounted on the display surface, and accepts the user's operation input to the mixing device 130.

The mixing buffer 216 temporarily holds the packets of the plurality of channels acquired by the mixing device 130 at the stage preceding the mixing process.

FIG. 3 is an explanatory diagram for explaining the operation of the mixing buffer 216. In FIG. 3 the horizontal axis is the time axis, and packets further to the left arrive earlier. For ease of understanding only two channels, A and B, are described here, but three or more channels can of course be handled as well.

For example, as shown in FIG. 3(a), suppose that packets A_1, A_2, A_3, ..., A_n and B_1, B_2, B_3, ..., B_n of the sound signals on channels A and B are acquired without the mixing buffer 216 (packets that arrive earlier have lower numbers). To mix the sound signals of channels A and B, the time axes of the packets of each channel must be aligned before the mixing process is executed. Here, the mixing process is performed at the point (1) when packet A_1 of channel A and packet B_1 of channel B are both available.

However, as also shown in FIG. 3(a), if packet A_2 of channel A has arrived (2) but packet B_2 of channel B has not, and packet A_3 of channel A is acquired before packet B_2, then either packet A_2 or packet A_3 must be discarded (4).

Because this embodiment includes the mixing buffer 216, both packet A_2 and packet A_3 can be held in the mixing buffer 216 even in the above situation, as shown in FIG. 3(b) (5)(6), and when packet B_2 arrives, packet A_2 in the mixing buffer 216 and packet B_2 can be mixed without any packet being lost (7).

With this configuration, the mixing buffer 216 absorbs not only the fluctuation in arrival time within each channel that results from passing through the communication network 120, but also the delay in packet arrival time between channels, so that packet loss is avoided and dropouts are prevented.
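The buffering behaviour of FIG. 3(b) can be sketched as one FIFO queue per channel, with a mix becoming possible only when every channel has a packet at its head. The class name `MixingBuffer`, its methods, and the capacity of four packets are hypothetical; the source does not specify a data structure.

```python
from collections import deque


class MixingBuffer:
    """Per-channel FIFO queues (hypothetical structure, not from the source)."""

    def __init__(self, channels, capacity):
        self.capacity = capacity  # upper-limit accumulation amount, in packets
        self.queues = {ch: deque() for ch in channels}

    def push(self, channel, packet):
        """Hold a packet; return False if the channel's queue is already full."""
        queue = self.queues[channel]
        if len(queue) >= self.capacity:
            return False
        queue.append(packet)
        return True

    def pop_aligned(self):
        """Return one packet per channel if every queue is non-empty, else None."""
        if all(self.queues.values()):
            return {ch: q.popleft() for ch, q in self.queues.items()}
        return None


buf = MixingBuffer(["A", "B"], capacity=4)
buf.push("A", "A_1")
buf.push("B", "B_1")
first = buf.pop_aligned()    # A_1 and B_1 are available together
buf.push("A", "A_2")
buf.push("A", "A_3")         # B_2 is late; both A packets are held
waiting = buf.pop_aligned()  # nothing to mix yet
buf.push("B", "B_2")
second = buf.pop_aligned()   # A_2 and B_2 are mixed; A_3 stays queued
```

In this sketch A_3 simply waits behind A_2 instead of forcing one of them to be discarded, which is the situation the unbuffered case of FIG. 3(a) cannot handle.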

The larger the capacity of the mixing buffer 216 the better, but resources are limited, and as the time-axis delay introduced by the mixing buffer 216 grows, the timing at which the mixed sound signal is output is delayed accordingly. The upper-limit accumulation amount of the mixing buffer 216 is therefore set by balancing it against the delay time that the sound signal output system 100 can tolerate.
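As a rough worked example of this trade-off, assume (hypothetically) that every packet carries a fixed duration of audio. The buffering delay is then the number of held packets times that duration, and the delay budget bounds the capacity. The 20 ms packet duration and 100 ms budget below are illustrative values, not figures from the source.

```python
def buffer_delay_ms(capacity_packets, packet_ms):
    """Worst-case delay added by a full per-channel buffer."""
    return capacity_packets * packet_ms


def max_capacity(delay_budget_ms, packet_ms):
    """Largest per-channel capacity (in packets) that fits the delay budget."""
    return int(delay_budget_ms // packet_ms)


# Assumed figures: 20 ms of audio per packet, 100 ms tolerated delay.
capacity = max_capacity(100, 20)
delay = buffer_delay_ms(capacity, 20)
```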

Consequently, even with the mixing buffer 216 described above, the packet fluctuation caused by passing through the communication network 120 cannot be absorbed completely. The delay that can be tolerated before the mixing process is executed corresponds to the upper-limit accumulation amount provided for each channel, so when the arrival of a packet is delayed beyond that amount, the mixing process must proceed without waiting for the packet. A means of determining whether the arrival of a packet has exceeded the upper-limit accumulation amount is therefore required.

The packet loss determination unit 218 compares, for each channel, the delay in arrival time of a packet in the acquired channels, relative to the packet of the other channel acquired earliest, with the predetermined time set by the control unit 212, and regards the packet as lost when the delay exceeds the predetermined time. The delay in arrival time between packets may be calculated from the difference between the time stamps carried in the packets, or by counting the time elapsed since the packet of the earliest-acquired channel arrived.
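A minimal sketch of the timer-based variant of this determination is given below. The function names, the use of `None` for a packet that has not yet arrived, and the millisecond values are assumptions made for illustration.

```python
def is_lost(first_arrival_ms, now_ms, timeout_ms):
    """True when the wait since the earliest arrival exceeds the timeout."""
    return (now_ms - first_arrival_ms) > timeout_ms


def lost_channels(arrivals_ms, now_ms, timeouts_ms):
    """List channels whose packet for the current slot is judged lost.

    arrivals_ms: channel -> arrival time of its packet, or None if not arrived.
    timeouts_ms: channel -> predetermined time for that channel.
    """
    arrived = [t for t in arrivals_ms.values() if t is not None]
    if not arrived:
        return []  # nothing has arrived yet, so there is no reference time
    first = min(arrived)
    return [ch for ch, t in arrivals_ms.items()
            if t is None and is_lost(first, now_ms, timeouts_ms[ch])]


# A_1 arrived at t = 0 ms; B and C are still missing. B has the shortest timeout.
arrivals = {"A": 0.0, "B": None, "C": None}
timeouts = {"A": 50.0, "B": 30.0, "C": 50.0}
at_40 = lost_channels(arrivals, 40.0, timeouts)
at_60 = lost_channels(arrivals, 60.0, timeouts)
```

The time-stamp variant mentioned in the text would replace `now_ms - first_arrival_ms` with a difference of packet time stamps; it is not shown here.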

FIG. 4 is an explanatory diagram for explaining the operation of the packet loss determination unit 218. In FIG. 4, too, the horizontal axis is the time axis. Suppose, for example, that packets A_1, A_2, ..., A_n and B_1, B_2, ..., B_n of the sound signals on channels A and B are acquired with the timing shown in FIG. 4(a). Here, the time difference between packets that should be output at the same timing, that is, between packets A_1 and B_1 or between packets A_2 and B_2 (indicated by the outlined double-headed arrows in FIG. 4(a)), is the delay in arrival time.

Suppose now that the upper-limit accumulation amount of the mixing buffer 216 allotted to each channel can hold only packets A_1, A_2, ..., A_n, and that, as shown in FIG. 4(b), the packets of channel B are greatly delayed relative to those of channel A, so that, for example, packets A_1, A_2, ..., A_n of channel A have arrived but packet B_1 of channel B has not. If a new packet A_{n+1} of channel A then arrives while the arrival of packet B_1 is awaited, the data area of the mixing buffer 216 for channel A overflows. The packet loss determination unit 218 therefore regards packet B_1, whose arrival is delayed relative to packet A_1 by more than the predetermined time, as a lost packet and lets the mixing of packets A_1 and B_1 proceed. Packet B_1 does not exist at this point, so some data standing in for packet B_1 is needed to carry out the mixing process.

The packet complementing unit 220 complements a packet regarded as missing with some data that allows mixing and playback. Consequently, the true packet, which may still arrive later, is discarded without being used in that mixing process. Various complementing processes can be applied: replacement with a silent packet or a predetermined dummy packet, repetition of one or more past packets, or generation of an interpolation packet from the preceding and following packets and replacement with that interpolation packet. With this configuration, normal playback can continue.
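The three complementing options listed above can be sketched as follows. This is only an illustration under stated assumptions: a packet is modeled as a list of decoded integer samples, the function name conceal and its mode strings are hypothetical, and sample-by-sample averaging is just one simple way to form an interpolation packet; the specification does not prescribe any of these details.

```python
from typing import List, Optional

Packet = List[int]  # assumed representation: one packet of decoded integer samples


def conceal(prev: Optional[Packet], nxt: Optional[Packet],
            size: int, mode: str = "silence") -> Packet:
    """Return stand-in data for a packet regarded as missing (hypothetical helper)."""
    if mode == "repeat" and prev is not None:
        # Repeat the most recent past packet.
        return list(prev)
    if mode == "interpolate" and prev is not None and nxt is not None:
        # Crude interpolation: average the preceding and following packets.
        return [(a + b) // 2 for a, b in zip(prev, nxt)]
    # Default, and fallback when neighbours are unavailable: a silent packet.
    return [0] * size
```

Falling back to silence when the neighbouring packets are not available keeps the helper total, so the mixing step always receives a packet of the expected size.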

Accordingly, as shown in FIG. 4(b), when the arrival of any packet, here packet B_1, is delayed relative to the other channel A by more than the predetermined time, a dummy packet for packet B_1, for example, is generated and held in the mixing buffer 216 as packet B_1. Once packets A_1 and B_1 are both available, the mixing process can be performed on the two.

In the state shown in FIG. 4(b), however, a delay corresponding to the upper-limit accumulation amount of the mixing buffer 216 occurs before the mixing process is executed. The final sound signal after mixing is therefore output with a lag large enough to tolerate that delay. To keep the delay of the final sound signal as small as possible, the mixing process itself must be completed in a short time. That, however, raises the likelihood of audible noise such as sound dropouts.

For example, in the mixing device 130 of this embodiment, multiple processes for each of the plurality of channels, such as packet transmission and reception, packet loss determination, packet complementing, and mixing, are executed by the control unit 212, whose processing capacity is limited. If any one process incurs a wait, the other processes are affected, and the mixing process, which must be executed in a short time, can no longer finish in time. The mixing processing unit 222, described later, may then be unable to deliver mixed packets to the output buffer 224 at the proper timing. This causes sound dropouts.

If, to give the mixing process more margin, the timing at which packet loss is determined were advanced uniformly for all channels, packets on high-priority channels would also be judged missing early. This would cause dropouts in the main sound signal and make it sound unnatural to the user listening to it.

In this embodiment, the time after which a packet is regarded as missing is therefore made different for each channel according to priority. That is, for a high-priority channel, packets are dropped as rarely as possible even at the cost of a shorter mixing processing time, whereas for a low-priority channel, packet loss is determined early so that the mixing process can run with margin. This makes stable mixing possible.

First, the control unit 212 sets in advance, according to the priority of each channel, a predetermined time that indicates the allowable delay in arrival time of a packet on that channel relative to packets on the other channels.

Specifically, a different predetermined time is set in advance for each data area of the mixing buffer 216 corresponding to the plurality of channels, and each channel is assigned to a data area of the mixing buffer 216 by referring to the priority indicated in, for example, the IP header of its packets. For example, sound signal packets of a high-priority channel such as an announcement are associated with a data area having a given predetermined time, and sound signal packets of a low-priority channel such as BGM (background music) are associated with a data area having a predetermined time shorter than that.

As another example, the channels may first be assigned to the data areas of the mixing buffer 216, after which a predetermined time corresponding to the priority is set for each data area by referring to a table or the like.

More specifically, the predetermined time is set shorter for a low-priority channel than for a high-priority channel. By shortening the predetermined time used to judge delay on the low-priority channel and thereby advancing its packet loss determination, the mixing process for the high-priority channel can be executed early regardless of packet delay on the low-priority channel. Such stable mixing makes it possible to output at least the sound signal of the high-priority channel reliably and stably. Although a configuration that shortens the predetermined time for the low-priority channel has been described here, the same object can of course be achieved by lengthening the predetermined time for the high-priority channel.
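A minimal sketch of this priority-dependent loss determination is given below. The table ALLOWED_DELAY_MS, its millisecond values, the Channel class, and the function is_lost are all hypothetical names and numbers chosen for illustration; the specification does not give concrete times or data structures.

```python
from dataclasses import dataclass

# Hypothetical table: priority -> predetermined time (allowable arrival delay), in ms.
# The low-priority value is deliberately shorter than the high-priority value.
ALLOWED_DELAY_MS = {"high": 200.0, "low": 60.0}


@dataclass
class Channel:
    name: str
    priority: str  # "high" or "low" (assumed labels)


def is_lost(channel: Channel, first_arrival_ms: float, now_ms: float) -> bool:
    """True if this channel's packet, still absent at now_ms, should be regarded as missing.

    first_arrival_ms is when the first-acquired packet of another channel
    for the same output timing arrived.
    """
    delay = now_ms - first_arrival_ms
    return delay > ALLOWED_DELAY_MS[channel.priority]
```

With these assumed values, 100 ms after the first packet arrived a BGM channel would already be treated as lost, while an announcement channel would still be waited for.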

FIG. 5 is an explanatory diagram illustrating the operation of the packet loss determination unit 218 when the predetermined time differs for each channel. When channel A has a higher priority than channel B, the device waits for the packet of channel A until packets B_1, B_2, ..., B_n of channel B fill the mixing buffer 216 to its upper-limit accumulation amount (that is, up to the longest predetermined time). As shown in FIG. 5(a), the packet loss determination unit 218 therefore judges a channel A packet to be missing only after its arrival has been delayed beyond the long predetermined time corresponding to the upper-limit accumulation amount of the mixing buffer 216. The packet complementing unit 220 then complements the packet judged missing with a playable packet.

For channel B, which has a lower priority than channel A, the predetermined time is set shorter than for channel A. As shown in FIG. 5(b), even when the packets of channel A have not reached the upper-limit accumulation amount of the mixing buffer 216, for example when only packets A_1 and A_2 are held in the mixing buffer 216, the arrival delay of the channel B packet relative to the channel A packet exceeds channel B's short predetermined time, so the packet loss determination unit 218 judges the channel B packet to be missing. Here too, the packet complementing unit 220 complements the packet judged missing with a playable packet.

Thus, for a low-priority channel such as channel B, a playable packet B_1 is generated in a shorter time, as shown in FIG. 5(b), without waiting for the time corresponding to the upper-limit accumulation amount of the mixing buffer 216, and packet A_1 of the high-priority channel can be passed promptly to the mixing processing unit 222 together with packet B_1. Because packets A_1 and B_1 can be mixed early, the mixing process gains margin, which reduces the frequency of sound dropouts and improves perceived sound quality.
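The timing benefit described above can be illustrated with a small calculation. The sketch assumes that times are measured in milliseconds from the first packet acquired for a given output timing, and that a channel becomes ready either when its real packet arrives or when its predetermined time expires and a stand-in is generated. The function name mix_ready_time and all numbers are hypothetical.

```python
from typing import Dict, Optional


def mix_ready_time(arrival_ms: Dict[str, Optional[float]],
                   allowed_ms: Dict[str, float],
                   first_arrival_ms: float = 0.0) -> float:
    """Earliest time at which every channel has a real or complemented packet."""
    ready = []
    for ch, limit in allowed_ms.items():
        deadline = first_arrival_ms + limit
        t = arrival_ms.get(ch)  # None means the packet never arrives
        # Use the real packet if it arrives by the deadline, else the stand-in.
        ready.append(deadline if t is None or t > deadline else t)
    return max(ready)


# Channel A (high priority) arrives at once; channel B (low priority) is very late.
arrivals = {"A": 0.0, "B": None}
uniform = mix_ready_time(arrivals, {"A": 200.0, "B": 200.0})
per_channel = mix_ready_time(arrivals, {"A": 200.0, "B": 60.0})
```

Under these assumed values the mix for packets A_1 and B_1 can start after 60 ms instead of 200 ms, and the difference is the extra margin available to the mixing process.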

Examples of channels with different priorities include a channel carrying speech such as announcements and a channel carrying BGM or the like. With the configuration described with reference to FIG. 5, a high-priority channel such as an announcement is allowed to wait for packets for the predetermined time corresponding to the upper-limit accumulation amount of the mixing buffer 216, so packet loss can be avoided. A low-priority channel such as BGM, by contrast, undergoes early packet loss determination and is passed to the mixing processing unit 222 early, so the mixing process can run with margin.

In this case, BGM, being a low-priority channel, is more likely to have its packets complemented by the packet complementing unit 220, that is, to have its true packets discarded. However, BGM is often played at a lower volume than a high-priority announcement and has less need to be heard continuously, so this does not affect the result of mixing the announcement and the BGM.

Various other combinations of channels with different priorities can be adopted, for example highly urgent announcements such as earthquake early warnings, or time signals, with BGM; vocals and accompaniment in karaoke; or store-specific advertising music with BGM. Also, as described later, when there are multiple output lines from the output buffer 224, one output line may output the result of mixing both the announcement and BGM channels while another output line outputs the result of mixing only the BGM.

When the mixing buffer 216 has aligned the time axes of the packets of each channel (including packets complemented by the packet complementing unit 220), the mixing processing unit 222 adds the packets of the channels to generate one packet. Whether the packets of every channel are present may be checked by the packets' timestamps or by whether a packet exists at a predetermined position in the mixing buffer 216.
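The addition step can be sketched as a sample-by-sample sum of time-aligned packets. The sketch assumes decoded 16-bit PCM samples held in Python lists; the clipping to the 16-bit range is an added assumption to keep the result representable and is not stated in the specification. The function name mix is hypothetical.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix(packets: Sequence[List[int]]) -> List[int]:
    """Add time-aligned packets sample by sample into one packet."""
    if not packets:
        raise ValueError("at least one packet is required")
    length = len(packets[0])
    if any(len(p) != length for p in packets):
        raise ValueError("all packets must hold the same number of samples")
    # Sum across channels, then clip to the 16-bit range (assumed sample format).
    return [max(INT16_MIN, min(INT16_MAX, sum(samples)))
            for samples in zip(*packets)]
```

A complemented silent packet contributes zeros to each sum, so the remaining channels pass through unchanged.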

The output buffer 224 is provided in correspondence with the mixing processing unit 222; it temporarily holds the packets mixed by the mixing processing unit 222 and absorbs fluctuation between output lines. Fluctuation between output lines can arise from differences in the start-up timing of the distribution server 110, differences in the timing of command issuance or packet acquisition in the mixing device 130, minute deviations of the reference clock, differences in the time consumed by the mixing process for each channel, and so on.

When the packet output timing for each output line arrives, the output buffer 224 decodes the packet and outputs it to the corresponding sound output device 140. Thus, even when there are multiple sound signal output lines, the fluctuation of each output line is absorbed and the real-time nature of the sound signal is maintained.

With the mixing device 130 described above, stable mixing can be performed without needlessly enlarging the upper-limit accumulation amount of the buffer, even when packet fluctuation between the channels is pronounced.

(Mixing method)
Next, a mixing method is described that uses the mixing device 130 described above to add the sound signals of a plurality of channels transmitted through the communication network 120 and generate a new sound signal.

FIG. 6 is a flowchart showing the flow of processing of the mixing method. Before the processing shown in FIG. 6 starts, the control unit 212 sets in advance the predetermined time, which indicates the allowable delay in arrival time of a packet on each channel relative to packets on the other channels, to a different value for each channel according to the priorities of the channels.

When the user requests that the mixing device 130 start playing the sound signal (YES in S302), the distribution server 110 transmits the sound signal packets of the plurality of channels through the communication network 120, and the packet acquisition unit (communication unit 210) acquires those packets (S304).

Next, the packet loss determination unit 218 compares the arrival delay of a packet on each channel, relative to the packet of another channel that was acquired first, with the predetermined time set in advance for that channel (S306). If the arrival delay of the packet exceeds the predetermined time (YES in S308), the packet is regarded as missing, and the packet complementing unit 220 complements the packet regarded as missing (S310).

The mixing buffer 216 temporarily holds the packets acquired by the packet acquisition unit or complemented by the packet complementing unit 220 (S312), and the mixing processing unit 222 adds (mixes) the packets of the plurality of channels, whose time axes have been aligned by being held in the mixing buffer 216, to generate one packet (S314).

The output buffer 224 temporarily holds the packet mixed by the mixing processing unit 222 (S316) and, after decoding, outputs it in time synchronization with the other output lines (S318). This mixing processing is repeated until the user requests that playback of the sound signal stop (YES in S320).
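Steps S306 to S314 for a single output timing can be sketched as one function. This is an illustrative reading of the flowchart, not the claimed implementation: the function name process_slot, the dictionary layout, silence as the complementing method, and the millisecond values are all assumptions, and decoding and output (S316 to S318) are omitted.

```python
from typing import Dict, List, Optional


def process_slot(arrived: Dict[str, List[int]],
                 allowed_ms: Dict[str, float],
                 first_arrival_ms: float,
                 now_ms: float,
                 size: int) -> Optional[List[int]]:
    """One pass over a single output timing.

    Returns the mixed packet, or None if some channel is absent but still
    within its predetermined time, in which case the caller keeps waiting.
    """
    slot: Dict[str, List[int]] = {}
    for ch, limit in allowed_ms.items():
        if ch in arrived:
            slot[ch] = arrived[ch]                   # S312: hold the real packet
        elif now_ms - first_arrival_ms > limit:      # S306/S308: delay exceeds limit
            slot[ch] = [0] * size                    # S310: complement with silence
        else:
            return None                              # not yet lost: wait
    return [sum(s) for s in zip(*slot.values())]     # S314: add the packets
```

A caller would invoke this repeatedly as time advances and packets arrive, handing each returned packet on to the output stage.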

This mixing method likewise makes it possible to perform stable mixing without needlessly enlarging the upper-limit accumulation amount of the buffer, even when packet fluctuation between the channels is pronounced.

A preferred embodiment of the present invention has been described above with reference to the accompanying drawings, but the present invention is, needless to say, not limited to this embodiment. It is clear that a person skilled in the art could conceive of various changes and modifications within the scope described in the claims, and these are naturally understood to fall within the technical scope of the present invention.

Note that the steps of the mixing method in this specification need not be processed in time series in the order given in the flowchart and may include parallel processing or processing by subroutines.

The present invention can be used for a mixing device and a mixing method that add sound signals of a plurality of channels to generate a new sound signal.

DESCRIPTION OF SYMBOLS
100 ... Sound signal output system
110 ... Distribution server
120 ... Communication network
130 ... Mixing device
140 ... Sound output device
210 ... Communication unit (packet acquisition unit)
212 ... Control unit
214 ... Operation unit
216 ... Mixing buffer
218 ... Packet loss determination unit
220 ... Packet complementing unit
222 ... Mixing processing unit
224 ... Output buffer

Claims

A mixing device that adds sound signals of a plurality of channels transmitted through a communication network to generate a new sound signal, comprising:
a packet acquisition unit that acquires, in units of packets, the sound signals of the plurality of channels transmitted through the communication network;
a packet loss determination unit that compares the delay in arrival time of an acquired packet on each of the plurality of channels, relative to the packet of another channel acquired first, with a predetermined time, and regards the packet as missing if the delay in arrival time of the packet exceeds the predetermined time;
a packet complementing unit that complements the packet regarded as missing;
a mixing buffer that temporarily holds the packets of the plurality of channels to absorb the delays in arrival time of the packets; and
a mixing processing unit that adds the packets of the plurality of channels whose time axes have been aligned by the mixing buffer to generate one packet,
wherein the predetermined time is made different for each of the channels according to the priorities of the plurality of channels.

The mixing device according to claim 1, wherein the predetermined time is set shorter for a low-priority channel than for a high-priority channel.

The mixing device according to claim 1, wherein the packet complementing unit complements the packet regarded as missing by any one of replacement with a silent packet or a dummy packet, repetition of a past packet, or generation of an interpolation packet and replacement with that interpolation packet.

A mixing method for adding sound signals of a plurality of channels transmitted through a communication network to generate a new sound signal, the method comprising:
setting a predetermined time, which indicates an allowable delay in arrival time of a packet on each of the plurality of channels relative to packets on the other channels, to a different value for each channel according to the priorities of the plurality of channels;
acquiring, in units of packets, the sound signals of the plurality of channels transmitted through the communication network;
comparing the delay in arrival time of an acquired packet on each of the plurality of channels, relative to the packet of another channel acquired first, with the predetermined time set for that channel;
regarding the packet as missing when the delay in arrival time of the packet exceeds the predetermined time;
complementing the packet regarded as missing;
temporarily holding the packets of the plurality of channels in a mixing buffer to absorb the delays in arrival time of the packets; and
adding the packets of the plurality of channels whose time axes have been aligned by being held in the mixing buffer to generate one packet.