JP2023179125A - Digital audio signal synchronization device and program

Info

Publication number: JP2023179125A
Application number: JP2022092224A
Authority: JP
Inventors: 弘樹久保; Hiroki Kubo; 訓史大出; Norifumi Oide; 敏行西口; Toshiyuki Nishiguchi; 岳大杉本; Takehiro Sugimoto; 靖茂中山; Yasushige Nakayama; 洋幸大久保; Hiroyuki Okubo
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2022-06-07
Filing date: 2022-06-07
Publication date: 2023-12-19

Abstract

To provide a digital audio signal synchronization device and program that, in a large facility combining multiple systems operating on different synchronization systems, can pull a received digital audio signal into the receiving system's synchronization while preserving its data bit for bit, even when that signal comes from a different synchronization system and contains non-PCM data.

SOLUTION: An audio signal synchronization device 1 is provided with: a buffer section 2 that temporarily stores input digital audio signals and performs clearing processing as necessary; a sample number adjustment section 3 that thins out or inserts samples of the digital audio signals to accommodate the difference in clock speed between the input digital audio signals and the synchronization signal; and an audio signal playback section 4 that plays back the digital audio signals read from the buffer section 2 in sync with the synchronization signal.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a digital audio signal synchronization device and program that, in a large-scale facility composed of a combination of multiple systems operating on different synchronization systems, can pull a digital audio signal into the synchronization of the receiving system while the data of that signal is preserved bit for bit, even when the system receives a digital audio signal of a different synchronization system.

First, the abbreviations used in this specification are explained.
"ADM" stands for Audio Definition Model, a standardized metadata model for describing the technical characteristics of audio. "S-ADM" stands for Serial Audio Definition Model.
"AES" stands for Audio Engineering Society; standards established by the society are prefixed with "AES".
"IEC" stands for International Electrotechnical Commission; standards established by the commission are likewise prefixed with "IEC".
"MADI" stands for Multichannel Audio Digital Interface, an interface standard (AES10) that enables long-distance multichannel audio transmission with sample-level accuracy. "MADI-X" (MADI-Extended), an extension introduced in 2001, supports 64 channels and a sampling rate of 48 kHz with ±1% varipitch.
"PCM" stands for Pulse Code Modulation.
"SDI" stands for Serial Digital Interface.
"SMPTE" stands for the Society of Motion Picture and Television Engineers and is also the prefix given to standards established by the society.
"ST" stands for Standard.

When object-based audio is used in broadcasting or streaming services, the audio signal and the audio metadata must be transmitted together. One such transmission method, which has been internationally standardized (Non-Patent Document 1), stores serialized, internationally standardized audio metadata (S-ADM) in AES3, which is widely used as a professional digital audio signal. This allows the audio signal and the audio metadata to be transmitted in sample-accurate synchronization.

This transmission method is based on a standard (Non-Patent Document 2) for storing and transmitting non-PCM data, such as encoded audio signals, in place of PCM in the area of AES3 that carries the audio signal. The stored data can be recorded, played back, and transmitted in the same way as an audio signal, which has the advantage that no special recorders, players, or cables are needed for non-PCM data. However, for the stored non-PCM data to be decoded correctly, it must be preserved bit for bit (with nothing missing) across transmission.

Meanwhile, a large-scale facility such as a broadcasting station is built from a combination of many systems, for example from a remote relay site to an in-house studio, or from a sub-control room to a master control room. When a PCM digital audio signal is passed between systems operating on different synchronization systems, a sampling rate converter is generally inserted to pull the signal into the synchronization of the receiving system. Because the sampling rate of audio signals used in a broadcasting station is basically unified to a fixed value (such as 48 kHz), the sampling rate itself does not need converting; in many cases the only purpose is to pull the audio signal into the receiving system's synchronization.

Non-Patent Document 1: SMPTE ST 2116, Format for Non-PCM Audio and Data in AES3 - Carriage of Metadata of Serial ADM (Audio Definition Model)

Non-Patent Document 2: SMPTE ST 337, Format for Non-PCM Audio and Data in an AES3 Serial Digital Audio Interface

When non-PCM data, such as S-ADM (the audio metadata for object-based audio) or encoded audio signals, is transmitted within a large-scale facility such as a broadcasting station, the audio signal must be pulled into the receiving side's synchronization when it is passed between studios, just as with PCM. However, a conventional sampling rate converter first restores (oversamples) the received digital audio signal to the waveform of an analog audio signal (or of a digital audio signal with a sampling rate high enough to be regarded as equivalent to analog), and then resamples that waveform at the appropriate sampling rate derived from the receiving system's synchronization signal to pull it into the new synchronization. As a result, the bit sequence of the digital audio signal is not preserved even when the sampling rate is the same before and after the transfer (input and output), for example between studios, and any non-PCM data such as S-ADM carried in the signal suffers data loss.

The simplest way to pull a digital audio signal into a different synchronization while preserving its bit sequence would be to add a delay to the original signal so that its timing lines up with the target synchronization signal. In general, however, the synchronization signal sources used to generate digital audio signals differ slightly from one another even when set to the same clock frequency, so adding a delay alone cannot achieve exact synchronization and causes noise.

Alternatively, the original non-PCM data could be temporarily decoded, re-encoded, and stored in AES3 again, which would pull data with the same content into the receiving system's synchronization even though the bit sequence itself is not preserved. This is not practical, however, both because every synchronization device in the facility would need decoding and encoding capability, raising equipment cost, and because repeated encoding and decoding degrades quality.

In view of these circumstances, an object of the present invention is to provide a digital audio signal synchronization device and program that, in a large-scale facility composed of a combination of multiple systems operating on different synchronization systems, can pull a digital audio signal into the synchronization of the receiving system while the data of that signal is preserved bit for bit, even when the system receives a digital audio signal of a different synchronization system that contains non-PCM data.

A digital audio signal synchronization device according to the present invention comprises: a buffer section that temporarily stores an input digital audio signal and performs clearing processing as necessary; a sample number adjustment section that thins out or inserts samples of the digital audio signal to accommodate the difference in clock speed between the input digital audio signal and a synchronization signal; and an audio signal playback section that plays back the digital audio signal read from the buffer section in synchronization with the synchronization signal.

The buffer section may temporarily store non-PCM data stored in the digital audio signal.

The buffer section may detect data bursts of the non-PCM data stored in the digital audio signal based on the data of the digital audio signal, and may clear the buffer section or reset the read/write interval during a period in which no non-PCM data is being input.

The sample number adjustment section may detect an overflow or underflow of the buffer section caused by the difference in clock speed between the input signal and the synchronization signal, and may adjust the number of samples of the digital audio signal in a zero-data interval between data bursts of the non-PCM data by thinning out zero data when an overflow is detected and inserting zero data when an underflow is detected.

The digital audio signal synchronization device may further comprise: a non-PCM data determination section that identifies non-PCM data so that the non-PCM data and PCM data stored in the digital audio signal can be distinguished and processed separately; a sampling rate conversion section that synchronizes the PCM data, once distinguished from the non-PCM data, with the synchronization signal by applying sampling rate conversion; and a delay section that aligns the timing of the PCM data and the non-PCM data by adding, to the sampling-rate-converted PCM data, a delay equal to the delay caused by temporarily storing the non-PCM data in the buffer section.

The present invention may also be a digital audio signal synchronization program, that is, a program for controlling a computer that causes the computer to function as the digital audio signal synchronization device.

According to the present invention, in a large-scale facility composed of a combination of multiple systems operating on different synchronization systems, even when one system receives a digital audio signal of a different synchronization system that contains non-PCM data, the signal can be pulled into the synchronization of the receiving system while data loss is prevented.

FIG. 1 is a diagram showing a configuration example of the audio signal synchronization device 1 of the present invention.
FIG. 2 is a diagram showing one implementation example of the audio signal synchronization device 1 of the present invention.
FIG. 3 is a diagram showing another implementation example of the audio signal synchronization device 1 of the present invention.
FIG. 4 is a processing flowchart of the audio signal synchronization device 1 of the present invention.
FIG. 5 is a diagram showing a configuration example of Example 1.
FIG. 6 is a processing flowchart of Example 1.
FIG. 7 is a diagram showing an example of pointer processing.

Embodiments of the present invention are described in detail below with reference to the drawings.
FIG. 1 is a block diagram showing a configuration example of an audio signal synchronization device 1 according to one embodiment of the present invention. The audio signal synchronization device 1 shown in FIG. 1 comprises a buffer section 2, a sample number adjustment section 3, and an audio signal playback section 4.

The audio signal synchronization device 1 temporarily stores an input digital audio signal containing non-PCM data in the buffer section 2, uses the sample number adjustment section 3 to correct the sample-count offset caused by the slight difference between the input and output clock frequencies, and then plays back the digital audio signal in the audio signal playback section 4 in synchronization with the synchronization signal.

The audio signal synchronization device 1 is typically implemented as the input stage of the system on the receiving side of the digital audio signal. In this case, because reception of the digital audio signal by the receiving system is an input to that system, "reception" at the receiving system may be regarded as synonymous with "input". Likewise, "transmission" at the sending system may be regarded as synonymous with "output". Note, however, that a "system" here may be a single device or a combination of multiple devices.

FIG. 2 shows one implementation example of the audio signal synchronization device 1, and FIG. 3 shows another. FIG. 2 shows a facility composed of studio A and studio B, which operate on different synchronization systems; FIG. 3 shows only the studio B side of that facility. In the example of FIG. 2, the audio signal synchronization device 1 is implemented as the input stage of studio B, the system that receives the digital audio signal; in the example of FIG. 3, it is implemented as the input stage of the audio console in studio B. The synchronization signal generated by synchronization signal generator A in studio A and the one generated by synchronization signal generator B in studio B are set to the same clock frequency, but a slight error may exist between them.

FIG. 4 is an example of a flowchart showing the processing procedure of the audio signal synchronization device 1 according to one embodiment of the present invention. Each step is explained below with reference to the flow in FIG. 4.

The audio signal synchronization device 1 first stores the input digital audio signal, which belongs to a different synchronization system, in the buffer section 2 (step 101 in FIG. 4). Because data is written to and read from the buffer, the processing as a whole incurs a fixed delay. The buffer size may be adjusted automatically by the device to the minimum required for subsequent processing, set manually by the user to leave processing headroom, or, when the input audio signal is intended to accompany video, set to the same duration as one frame of the video signal.

Next, the non-PCM data stored in the buffer section 2 is checked to determine whether it contains a silent section that continues for a certain time (step 102). If no such silent section (for example several seconds, which is long relative to the transmission period of the non-PCM data) is detected ("No" in step 102), processing proceeds to the sample number adjustment section 3, described later. If a silent section of at least that length is detected ("Yes" in step 102), transmission of the non-PCM data is judged to have been interrupted, the data accumulated in the buffer is cleared (step 103), and storage of newly input data begins. Clearing the data accumulated in the buffer section 2 whenever possible forestalls the buffer overflow or underflow that could otherwise result from the accumulated effect of the slight clock frequency difference between the input audio signal and the target synchronization signal. Even when the non-PCM data has not been interrupted, the buffer may be cleared forcibly by a deliberate user operation if the user can judge that the data currently in the buffer is unnecessary.

Instead of clearing all the data accumulated in the buffer section 2, the write and read positions in the buffer may be adjusted as a substitute for clearing. For example, when writing is faster than reading, the write pointer can be rewound within a silent section; conversely, when reading is faster than writing, null data held separately from the buffer can be read out temporarily within a silent section. Either way, the difference between the write pointer and the read pointer returns to its value at the start of processing, which forestalls buffer overflow and underflow. The buffer in this case is preferably one that is never exhausted by continuous reading and writing, such as a ring buffer whose head and tail are joined.
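The pointer adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the class name `RingBuffer`, the `nominal_gap` parameter, and the `reset_gap` method are hypothetical names introduced here. The sketch assumes `reset_gap` is called only inside a silent section, so the most recently written samples are zero padding, and it approximates the read-side case by padding zeros into the buffer rather than reading null data held outside it.

```python
class RingBuffer:
    """Fixed-size ring buffer with independent write and read pointers."""

    def __init__(self, capacity, nominal_gap):
        self.data = [0] * capacity
        self.capacity = capacity
        self.nominal_gap = nominal_gap          # write/read distance at start
        self.read_pos = 0
        self.write_pos = nominal_gap % capacity

    def gap(self):
        """Current distance from the read pointer to the write pointer."""
        return (self.write_pos - self.read_pos) % self.capacity

    def write(self, sample):
        self.data[self.write_pos] = sample
        self.write_pos = (self.write_pos + 1) % self.capacity

    def read(self):
        sample = self.data[self.read_pos]
        self.read_pos = (self.read_pos + 1) % self.capacity
        return sample

    def reset_gap(self):
        """Restore the start-up pointer distance (assumed: called in silence)."""
        if self.gap() > self.nominal_gap:
            # Writer ran ahead: rewind the write pointer, discarding the
            # surplus zero samples written most recently.
            self.write_pos = (self.read_pos + self.nominal_gap) % self.capacity
        else:
            # Reader ran ahead: pad with zero (null) samples.
            while self.gap() < self.nominal_gap:
                self.write(0)
```

Because only zero padding is discarded or added, the samples that carry data bursts pass through unchanged.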

The non-PCM data transmitted according to Non-Patent Document 2 consists of encoded audio signals or metadata, so its data volume is generally smaller than that of the corresponding PCM data. Non-PCM data is therefore sent by so-called burst transmission: it is stored in a run of consecutive digital audio samples, and the period until the next data burst is padded with zero data. The standard of Non-Patent Document 2 limits the period from one data burst to the next to a maximum of 4096 samples. The audio signal synchronization device may therefore use 4096 samples directly as the threshold for detecting a silent section; when it is known in advance that data bursts arrive at a fixed period, it may use a length matched to that period; or it may identify the silent section by checking the data-burst byte count (called Pd), which the standard of Non-Patent Document 2 places in the fourth audio sample of the data burst.
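A silent-section check based on the 4096-sample limit might look like the sketch below. The function name `has_silent_section` and its parameter `min_zero_run` are hypothetical; the sketch simply treats a run of zero samples at least as long as the threshold as evidence that no data burst is arriving, and a real device would apply this to the stream incrementally rather than to a list.

```python
def has_silent_section(samples, min_zero_run=4096):
    """Return True if `samples` contains at least `min_zero_run`
    consecutive zero samples."""
    run = 0
    for s in samples:
        run = run + 1 if s == 0 else 0   # extend or restart the zero run
        if run >= min_zero_run:
            return True
    return False
```

When bursts are known to arrive at a shorter fixed period, the same function can be called with a smaller `min_zero_run` matched to that period.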

If continuous silence of the required length is not detected in step 102, the sample number adjustment section 3 next compares the clock speeds of the input signal and the synchronization signal (output signal) (step 104). The clock speeds may be compared directly, or the speed difference may be inferred from changes in the relative positions of the buffer's write and read pointers. The audio signal synchronization device of the present invention assumes that the input signal and the synchronization signal are set to the same sampling rate, and detects the slight error in clock speed together with any buffer overflow or underflow caused by accumulation of that error ("overflow" or "underflow" in step 105). If the buffer section 2 has been clearing the buffer regularly, overflow and underflow can be judged to have been forestalled. If the buffer could not be cleared, for instance because non-PCM data kept arriving without interruption for a long time, processing proceeds to sample number adjustment (steps 106 to 108) to deal with the overflow or underflow.

Specifically, the digital audio sample that marks the start of a data burst is detected, and zero data is thinned out of, or inserted into, the padding interval immediately before it. In Non-Patent Document 2, special data words called Pa and Pb are stored in the first and second samples of a data burst to mark its start. The size of Pa and Pb depends on the bit depth of the digital audio signal; for example, Pa is 0xF872 at a bit depth of 16 bits and 0x96F872 at 24 bits. The standard also requires that at least four samples of zero padding immediately precede Pa and Pb. With Pa and Pb alone, a PCM audio signal or encoded data could by chance contain the same two-sample bit pattern and be misdetected as a data burst; detecting a data burst from six samples (0, 0, 0, 0, Pa, Pb), including the four preceding samples, reduces the chance of false detection to a level that poses no practical problem.
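The six-sample detection rule (0, 0, 0, 0, Pa, Pb) can be sketched as below. The function name `find_burst_starts` is hypothetical. The 16-bit Pa value comes from the text above; the Pb value 0x4E1F is not given in this document and is taken here from the published SMPTE ST 337 preamble definition, so it should be checked against the standard before use.

```python
PA_16 = 0xF872   # from the text above (16-bit mode)
PB_16 = 0x4E1F   # assumption: 16-bit Pb as published in SMPTE ST 337

def find_burst_starts(samples, pa=PA_16, pb=PB_16, guard=4):
    """Return indices of Pa for every (0 x guard, Pa, Pb) pattern found."""
    starts = []
    for i in range(guard, len(samples) - 1):
        if (samples[i] == pa and samples[i + 1] == pb
                and all(s == 0 for s in samples[i - guard:i])):
            starts.append(i)
    return starts
```

A Pa, Pb pair that is not preceded by four zero samples is ignored, which is what keeps chance matches inside PCM or payload data from being treated as burst starts.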

When it is judged that neither overflow nor underflow will occur ("otherwise" in step 105), for example because the buffer section 2 has been clearing the buffer regularly or because the clock speeds of the input signal and the synchronization signal match with very high accuracy, the number of samples is not adjusted.

Two policies are possible here. Even after some error has accumulated, the device may tolerate it as long as no overflow or underflow occurs, performing no thinning or insertion of zero data and waiting for an opportunity to clear the buffer (buffer clear priority mode). Alternatively, it may thin out or insert zero data each time the accumulated error exceeds one audio sample (sample number adjustment priority mode).
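One way to picture the difference between the two modes is as a threshold on the accumulated pointer error, as in this sketch. The function `drift_action`, the mode strings, and the `margin` parameter (how many samples of error the buffer can absorb before overflow or underflow is imminent) are all hypothetical names introduced for illustration; the text does not prescribe how the decision is computed.

```python
def drift_action(gap, nominal_gap, mode, margin):
    """Decide what to do about the accumulated write/read pointer error.

    gap         : current write-minus-read distance in samples
    nominal_gap : distance at the start of processing
    mode        : "sample_adjust" or "buffer_clear" (hypothetical labels)
    margin      : error, in samples, at which overflow/underflow is imminent
    Returns "thin", "insert", or "none".
    """
    error = gap - nominal_gap
    # Sample number adjustment priority: react to a one-sample error.
    # Buffer clear priority: tolerate error until the margin is reached.
    threshold = 1 if mode == "sample_adjust" else margin
    if error >= threshold:
        return "thin"      # writer ahead: overflow tendency
    if error <= -threshold:
        return "insert"    # reader ahead: underflow tendency
    return "none"
```

With the same one-sample error, `"sample_adjust"` requests an adjustment immediately, while `"buffer_clear"` keeps waiting for a chance to clear the buffer.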

In sample number adjustment priority mode, because zero data is thinned out or inserted each time the accumulated error exceeds one audio sample, the number of zero-data samples thinned out or inserted is necessarily one.

In buffer clear priority mode, by contrast, error is tolerated as long as no overflow or underflow occurs, so thinning and insertion are basically not performed. However, if no opportunity to clear the buffer arises for a long time, for example because the program is extremely long, and the error accumulates to the point where an overflow or underflow is imminent, thinning or insertion, or a forced buffer clear, becomes unavoidable. When thinning or inserting, the device may adjust one sample as in sample number adjustment priority mode, or adjust several samples at once to reduce the number of adjustments. Because the only zero data guaranteed to be available for periodic adjustment is the four samples preceding Pa and Pb, the limit is four samples, so in buffer clear priority mode the number of zero-data samples thinned out or inserted is between 1 and 4.
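The thinning and insertion of zero data just before a data burst can be sketched as follows. The function `adjust_padding` and its `delta` convention (negative to thin out, positive to insert) are hypothetical; the four-sample limit follows the text above, and the sketch refuses to drop anything that is not zero padding so that burst data is never altered.

```python
def adjust_padding(samples, burst_start, delta, max_step=4):
    """Return a copy of `samples` with |delta| zero samples removed
    (delta < 0) or inserted (delta > 0) immediately before `burst_start`."""
    if not -max_step <= delta <= max_step:
        raise ValueError("adjustment limited to max_step samples")
    if delta < 0:
        n = -delta
        if burst_start < n:
            raise ValueError("not enough samples before the burst")
        if any(s != 0 for s in samples[burst_start - n:burst_start]):
            raise ValueError("refusing to drop non-zero samples")
        return samples[:burst_start - n] + samples[burst_start:]
    return samples[:burst_start] + [0] * delta + samples[burst_start:]
```

For example, `adjust_padding(stream, burst_start, -1)` absorbs one sample of overflow, and `adjust_padding(stream, burst_start, 2)` absorbs two samples of underflow.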

In buffer clear priority mode, the digital audio signal input to the device is preserved completely until the buffer is cleared or until accumulated error makes thinning or insertion of zero data necessary, so greater stability can be expected in the operation of systems that exchange audio signals through the device. In sample number adjustment priority mode, on the other hand, the absolute timing of PCM data and non-PCM data is aligned at each data burst, for example when PCM and non-PCM data are handled separately as described later, so higher timing accuracy can be expected.

Finally, the audio signal playback section 4 plays back the non-PCM data, whose sample count has either been adjusted or been judged not to need adjustment, in synchronization with the synchronization signal (step 110). This series of steps is then repeated until the audio signal synchronization device 1 is stopped.

AES3 and MADI (AES10), an extension of AES3, can carry the data of multiple channels together in a single audio stream, so PCM data and non-PCM data may arrive mixed. In that case, the series of steps above may also be applied to the PCM data, or the PCM channels and non-PCM channels may be separated beforehand and processed differently. In the latter case, for example, the non-PCM data undergoes the series of steps above, the PCM data undergoes conventional sampling rate conversion and is then given a delay equal to that introduced by the non-PCM processing, and the processed PCM and non-PCM data are recombined and output as a single audio stream.
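The channel separation and the compensating delay on the PCM path can be illustrated with the sketch below. `split_frame` and `DelayLine` are hypothetical names, channel numbers are 1-based, and the sampling rate conversion itself is deliberately left out; the sketch only shows how PCM samples could be held back by the same number of samples that the non-PCM buffer introduces.

```python
from collections import deque

def split_frame(frame, non_pcm_channels):
    """Split one multichannel frame into PCM and non-PCM channel maps."""
    pcm, non_pcm = {}, {}
    for ch, sample in enumerate(frame, start=1):   # 1-based channel numbers
        (non_pcm if ch in non_pcm_channels else pcm)[ch] = sample
    return pcm, non_pcm

class DelayLine:
    """Delay a sample stream by a fixed number of samples."""

    def __init__(self, delay_samples):
        self.fifo = deque([0] * delay_samples)

    def process(self, sample):
        self.fifo.append(sample)
        return self.fifo.popleft()
```

Setting `delay_samples` to the latency of the non-PCM buffer keeps the two paths aligned when they are recombined into one stream.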

With PCM data, forcibly inserting or thinning out samples can produce audible noise even if only a single sample is affected. When the buffer section 2 cannot clear the buffer and samples must be inserted or thinned out, the PCM data is instead pulled into synchronization by conventional sampling rate conversion processing. The bit sequence is then not preserved, but the likelihood of audible problems can be kept to the same level as with a conventional sampling rate converter.

Examples of the present invention will be described below.
(Example 1)
As Example 1 of the present invention, a case in which a stream of object-based audio content is input via MADI will be described. Of the 64 MADI channels, channels 1 to 60 carry PCM data and channels 61 to 64 carry S-ADM (non-PCM data), and data bursts are assumed to be transmitted every 4096 samples, the maximum allowed by the standard. The amount of delay in the buffer section 2 is set to 1 ÷ 30 ≈ 33 ms (milliseconds), in accordance with the frame rate (60 Hz) of interlaced video.

Many professional audio synchronization signal generators (master clock generators) use a rubidium oscillator and output a highly accurate synchronization signal. For example, Grade 1 of AES11, a type of synchronization signal for digital audio signals that is specified in IEC 60958-4, requires frequency accuracy within ±1 ppm (parts per million). Example 1 assumes the largest deviation within the standard, so the clock frequencies of the input and output signals are taken to be 48 kHz + 1 ppm and 48 kHz - 1 ppm.

FIG. 5 is a block diagram showing a configuration example of Example 1, and FIG. 6 shows an example of the processing flow of Example 1. As shown in FIG. 5, the audio signal synchronization device 1 according to Example 1 includes, in addition to the buffer section 2, the sample number adjustment section 3, and the audio signal reproduction section 4, a non-PCM data determination section 5, a delay section 6, and a sampling rate converter section 7.

The non-PCM data determination section 5 first determines, for the input MADI signal, whether each channel is a PCM channel or a non-PCM channel (step 201 in FIG. 6). The data of channels 61 to 64, which carry non-PCM data, is sent to the buffer section 2, and the data of channels 1 to 60, which carry PCM data, is sent to the delay section 6.

Here, the channels carrying non-PCM data may be identified automatically by the device, which switches its processing depending on whether it finds the six-sample pattern (0, 0, 0, 0, Pa, Pb) made up of Pa and Pb, which mark the start of a data burst, and the four samples immediately before them. Alternatively, when PCM data and non-PCM data are transmitted together, as in a stream of object-based audio content, the channels carrying S-ADM (non-PCM data) are quite likely to be fixed in advance, so the user may specify beforehand that only channels 61 to 64 are to be treated as non-PCM data.
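The automatic determination based on the (0, 0, 0, 0, Pa, Pb) pattern can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, and the default sync-word values are the 16-bit values commonly cited for SMPTE ST 337, which is an assumption because the text does not state them.

```python
from typing import List, Sequence

# Assumed 16-bit sync words (commonly cited for SMPTE ST 337);
# the patent text itself gives no numeric values for Pa and Pb.
PA_16 = 0xF872
PB_16 = 0x4E1F


def find_burst_starts(samples: Sequence[int],
                      pa: int = PA_16, pb: int = PB_16) -> List[int]:
    """Return the index of Pa for every (0, 0, 0, 0, Pa, Pb) pattern found."""
    pattern = (0, 0, 0, 0, pa, pb)
    starts = []
    for i in range(len(samples) - len(pattern) + 1):
        if tuple(samples[i:i + len(pattern)]) == pattern:
            starts.append(i + 4)  # Pa sits after the four leading zeros
    return starts


def is_non_pcm_channel(samples: Sequence[int],
                       pa: int = PA_16, pb: int = PB_16) -> bool:
    """Treat a channel as non-PCM if at least one burst preamble is present."""
    return bool(find_burst_starts(samples, pa, pb))
```

A user-specified channel list, as in the alternative described above, would simply bypass this check.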

The non-PCM data of channels 61 to 64 is written into the buffer in the buffer section 2 and waits to be read out (step 202).
The clock frequencies of the input and output signals are 48 kHz + 1 ppm and 48 kHz - 1 ppm, so they differ by 2 ppm of 48 kHz. In one hour the deviation therefore reaches at most 3600 (seconds) × 48000 (samples/second) × 2 × 10^-6 ≈ 346 samples (7.2 ms), and it takes about 4.6 hours for the accumulated error to exceed the preset delay of 33 ms. Given the length of a typical broadcast program, the program is expected to end before the error accumulates that far, at which point the object-based audio stream, including the S-ADM (non-PCM data), is interrupted once and the buffer can be cleared.
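The drift figures above can be reproduced with a short calculation. The sketch below assumes the 2 ppm offset, 48 kHz nominal rate, and 1/30 s buffer delay given in the text; the function names are illustrative only.

```python
NOMINAL_RATE_HZ = 48_000
PPM = 1e-6


def drift_samples(seconds: float, offset_ppm: float = 2.0,
                  rate_hz: int = NOMINAL_RATE_HZ) -> float:
    """Samples of drift accumulated between the two clocks after `seconds`."""
    return seconds * rate_hz * offset_ppm * PPM


def hours_until_exceeds(buffer_delay_s: float, offset_ppm: float = 2.0,
                        rate_hz: int = NOMINAL_RATE_HZ) -> float:
    """Hours until the accumulated drift equals the buffer delay."""
    buffer_samples = buffer_delay_s * rate_hz
    return buffer_samples / drift_samples(3600, offset_ppm, rate_hz)


per_hour = drift_samples(3600)                      # about 345.6 samples
per_hour_ms = per_hour / NOMINAL_RATE_HZ * 1000     # about 7.2 ms
hours = hours_until_exceeds(1 / 30)                 # about 4.6 hours
```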

Given typical program lengths, the buffer is thus expected to be cleared at appropriate times for object-based audio, and cases in which 0 data must be thinned out or inserted to prevent underflow or overflow should be rare. Buffer clear priority mode may therefore be used to give priority to preserving the original data completely, or sample number adjustment priority mode may be used to give priority to correcting the deviation continually so that the sample offset stays minimal.

Note that in buffer clear priority mode, a time offset may arise between the PCM data and the S-ADM (non-PCM data) as the error accumulates. S-ADM can describe, for example, a gain applied to the PCM data that changes dynamically over time, so a time offset can cause an error in the gain applied to the PCM data. As shown above, however, the offset is only about 7 ms per hour, which is sufficiently small. Some commercially available professional digital audio consoles read the gain of the faders operated by the audio engineer at a period of 20 ms, several times longer than 7 ms, so the offset is not expected to be a problem in practice.

In buffer clear priority mode, no thinning or insertion of 0 data is performed, and the samples (non-PCM data) read from the buffer are reproduced as they are in synchronization with the input synchronization signal. In sample number adjustment priority mode, a one-sample deviation arises every 1 ÷ (48000 × 2 × 10^-6) ≈ 10.42 seconds. To correct it, one sample of 0 data is thinned out or inserted immediately before one out of every 10.42 ÷ (4096 ÷ 48000) ≈ 122 data bursts.
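The adjustment interval in sample number adjustment priority mode follows directly from the clock offset and the burst length. The following sketch reproduces the two figures under the values stated for Example 1; the function name is hypothetical.

```python
def adjustment_schedule(rate_hz: int = 48_000, offset_ppm: float = 2.0,
                        burst_samples: int = 4096):
    """Return (seconds per one-sample slip, data bursts per adjustment)."""
    seconds_per_slip = 1.0 / (rate_hz * offset_ppm * 1e-6)
    burst_period_s = burst_samples / rate_hz
    return seconds_per_slip, seconds_per_slip / burst_period_s


slip_s, bursts = adjustment_schedule()   # about 10.42 s, about 122 bursts
```

With the 800-sample bursts assumed in Example 2, the same formula gives one adjustment roughly every 625 bursts.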

The PCM data of channels 1 to 60 is processed separately from the non-PCM data.
First, delay processing is applied to the PCM data as well, to match the delay introduced by buffering the non-PCM data (step 203). The delay value is determined according to the amount of delay set automatically or manually in the buffer section 2 for the non-PCM data.
Conventional sampling rate conversion processing is then applied (step 204). Comparing the PCM data of channels 1 to 60 at the input and output of the device of Example 1, bit-level accuracy is lost, but the data can be pulled into the input synchronization signal while keeping audible differences small. In addition, when multi-channel input such as MADI is expected, storing only the non-PCM data in the buffer section 2, as in Example 1, has the advantage of reducing the amount of memory the buffer section 2 requires.
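The delay applied to the PCM channels in step 203 can be pictured as a fixed-length FIFO whose length matches the buffer delay of the non-PCM path. The sketch below is a minimal illustration under that assumption; the class and function names are hypothetical, and the sampling rate conversion of step 204 is not modeled.

```python
from collections import deque


def delay_in_samples(delay_s: float, rate_hz: int = 48_000) -> int:
    """Convert a delay in seconds to a whole number of samples."""
    return round(delay_s * rate_hz)


class DelayLine:
    """Fixed delay: each input sample comes out `delay_samples` calls later."""

    def __init__(self, delay_samples: int):
        # Pre-fill with zeros so the first outputs are silence.
        self._fifo = deque([0] * delay_samples)

    def process(self, sample: int) -> int:
        self._fifo.append(sample)
        return self._fifo.popleft()


line = DelayLine(3)
delayed = [line.process(x) for x in [1, 2, 3, 4, 5]]
```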

Through the above processing, both the non-PCM data and the PCM data are synchronized with the input synchronization signal. Finally, these signals are combined (step 205) and output again as a MADI signal with channels 1 to 64.
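The separation in step 201 and the recombination in step 205 amount to routing by channel number. A minimal sketch, assuming one sample per channel per frame and 1-based channel numbers (the helper names are hypothetical):

```python
from typing import Dict, List, Set, Tuple


def split_channels(frame: List[int],
                   non_pcm: Set[int]) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Split one frame into PCM and non-PCM samples keyed by channel number."""
    pcm_part = {ch: s for ch, s in enumerate(frame, start=1) if ch not in non_pcm}
    non_pcm_part = {ch: s for ch, s in enumerate(frame, start=1) if ch in non_pcm}
    return pcm_part, non_pcm_part


def merge_channels(pcm_part: Dict[int, int],
                   non_pcm_part: Dict[int, int]) -> List[int]:
    """Recombine both paths into one frame ordered by channel number."""
    merged = {**pcm_part, **non_pcm_part}
    return [merged[ch] for ch in sorted(merged)]


frame = list(range(1, 65))                       # dummy 64-channel frame
pcm_part, s_adm_part = split_channels(frame, {61, 62, 63, 64})
restored = merge_channels(pcm_part, s_adm_part)
```

In the actual device each path is processed between the split and the merge; here the frame is passed through unchanged only to show the routing.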

(Example 2)
As Example 2 of the present invention, a case in which a stream of object-based audio content accompanied by video is transmitted as an SDI signal will be described. A frame synchronizer, which is used to pull an SDI signal into a different synchronization system, separates (demultiplexes) the audio signal embedded in the input video signal, performs synchronization pull-in on the video and audio separately, and then recombines (multiplexes) them for output. The synchronization pull-in of the audio signal is generally done by conventional sampling rate conversion processing. Example 2 assumes that this sampling rate conversion processing is replaced with the processing of the device of the present invention.

Because the audio signal is intended to accompany video, it is assumed that SDI channels 1 to 15 carry PCM data, channel 16 carries S-ADM (non-PCM data), and data bursts are transmitted every 48000 ÷ 60 = 800 samples, in accordance with the video frame rate (60 Hz). The amount of delay in the buffer section 2 and the clock speeds of the input signal and the synchronization signal are the same as in Example 1, and a ring buffer is used for the buffer section 2. Data clearing in the buffer section 2 is replaced by pointer adjustment.

The block diagram and flow diagram are the same as in Example 1. The input 16-channel audio signal is first handled by the non-PCM data determination section 5, which sends the data of channel 16, carrying non-PCM data, to the buffer section 2 and the data of channels 1 to 15, carrying PCM data, to the delay section 6.

The non-PCM data of channel 16 is written into the buffer in the buffer section 2 and waits to be read out. Since the clock frequency difference between the input signal and the synchronization signal is the same as in Example 1, data clearing in the buffer section 2 is considered possible.

Here, instead of actually clearing the buffer section 2, the write pointer and read pointer of the ring buffer can be adjusted to the same effect. FIG. 7 shows an example of this pointer adjustment. The delay in the buffer section 2 is set to 1 ÷ 60 ≈ 16.7 ms, and at the start of operation the read pointer lags the write pointer by 800 samples. If the clock speeds of the input audio signal and the synchronization signal are exactly the same, the difference (interval) between the read pointer and the write pointer stays at 800 samples after operation starts, but if the clock speeds differ, this difference gradually grows or shrinks. When the difference is growing, the write pointer is changed (rewound) to the read pointer + 800 (the interval) at the timing when the ring buffer becomes clearable, which resets the interval to its initial value without clearing the ring buffer. Conversely, when the difference is shrinking, an auxiliary buffer holding one interval's worth of 0 (Null) data is prepared separately. At the timing when the ring buffer becomes clearable, the read pointer is moved to the auxiliary buffer, and once the auxiliary buffer has been read out, the read pointer is set to the value the write pointer had at the timing when the ring buffer became clearable. This also resets the interval to its initial value without clearing the ring buffer. For simplicity, the write pointer may instead be changed to the first index of the ring buffer at the timing when the ring buffer becomes clearable.
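The pointer adjustment described above can be sketched as a small ring buffer class. This is an illustrative sketch, not the device's implementation: the class and method names are hypothetical, the auxiliary buffer of 0 data is modeled as a counter of pending null reads, and `reset_interval` is assumed to be called only at a timing when the buffer is clearable (no data burst in flight).

```python
class RingBuffer:
    """Ring buffer whose read pointer lags the write pointer by `interval`."""

    def __init__(self, capacity: int, interval: int):
        self.buf = [0] * capacity
        self.capacity = capacity
        self.interval = interval
        self.read = 0
        self.write = interval % capacity
        self.null_pending = 0   # models the auxiliary buffer of 0 data

    def gap(self) -> int:
        """Current distance from the read pointer to the write pointer."""
        return (self.write - self.read) % self.capacity

    def push(self, sample: int) -> None:
        self.buf[self.write] = sample
        self.write = (self.write + 1) % self.capacity

    def pop(self) -> int:
        if self.null_pending > 0:
            # Reading from the auxiliary buffer: output 0, keep read pointer.
            self.null_pending -= 1
            return 0
        sample = self.buf[self.read]
        self.read = (self.read + 1) % self.capacity
        return sample

    def reset_interval(self) -> None:
        """Restore the initial interval without clearing the buffer."""
        g = self.gap()
        if g > self.interval:
            # Gap has grown: rewind the write pointer to read + interval.
            self.write = (self.read + self.interval) % self.capacity
        elif g < self.interval:
            # Gap has shrunk: read one interval of 0 data first, then resume
            # from where the write pointer is at this moment.
            self.null_pending = self.interval
            self.read = self.write
```

While the pending null reads are consumed, the writer advances by about one interval, so the gap returns to its initial value by the time normal reading resumes.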

As in Example 1, thinning or insertion of 0 data is performed according to whether buffer clear priority mode or sample number adjustment priority mode is selected, and the samples (non-PCM data) read from the buffer are then reproduced in synchronization with the separately input synchronization signal.

As in Example 1, the PCM data of channels 1 to 15 is separated from the non-PCM data and subjected to delay processing and sampling rate conversion processing.

Through the above processing, both the non-PCM data and the PCM data are synchronized with the input synchronization signal. Finally, these signals are combined into an audio signal with channels 1 to 16, which can then be multiplexed into the SDI signal and output.

According to this embodiment, in a large-scale facility made up of a combination of multiple systems operating in different synchronization systems, even when one system receives a digital audio signal that belongs to a different synchronization system and contains non-PCM data, a certain amount of the digital audio signal, including the channels carrying non-PCM data, is buffered, and the buffered data is output in synchronization with a separately input synchronization signal. The signal can thereby be pulled into the synchronization of the receiving system while preventing data loss.

In addition, the timing at which the buffer can be cleared and the timing at which the number of samples can be adjusted are detected from the data of the digital audio signal carrying the non-PCM data, without decoding the non-PCM data. This makes it possible to achieve low-delay processing without requiring the processing capacity for decoding and encoding.

Although the embodiments described above have been presented as representative examples, it will be apparent to those skilled in the art that many modifications and changes can be made within the spirit and scope of the present invention. The present invention should therefore not be construed as being limited to the embodiments described above, and various modifications and changes are possible without departing from the scope of the claims. For example, the processing shown in FIG. 4 can be implemented as a function of the input section of the studio shown in FIG. 2 or of the digital audio console shown in FIG. 3, or as a function of a digital audio signal input board (not shown) of a computer.

Furthermore, the effects described in this embodiment merely list the most preferable effects arising from the present invention, and the effects of the present invention are not limited to those described in this embodiment.

A computer can also be suitably used to function as the audio signal synchronization device 1 described above. A program describing the processing that realizes each function of the audio signal synchronization device 1 is stored in a storage section of the computer, and the audio signal synchronization device 1 is realized by having the CPU of the computer read and execute this program.

Such a program may be recorded on a computer-readable medium, from which it can be installed on a computer. The computer-readable medium on which the program is recorded may be a non-transitory recording medium. The non-transitory recording medium is not particularly limited and may be, for example, a recording medium such as a CD-ROM or a DVD-ROM.

1 Audio signal synchronization device
2 Buffer section
3 Sample number adjustment section
4 Audio signal reproduction section
5 Non-PCM data determination section
6 Delay section
7 Sampling rate converter section

Claims

A digital audio signal synchronization device comprising:
a buffer section that temporarily stores an input digital audio signal and performs clearing processing as necessary;
a sample number adjustment section that thins out or inserts samples of the digital audio signal in order to absorb the difference in clock speed between the input digital audio signal and a synchronization signal; and
an audio signal reproduction section that reproduces the digital audio signal read from the buffer section in synchronization with the synchronization signal.

The digital audio signal synchronization device according to claim 1, wherein the buffer section temporarily stores non-PCM data stored in the digital audio signal.

The digital audio signal synchronization device according to claim 2, wherein the buffer section detects data bursts of the non-PCM data stored in the digital audio signal based on the data of the digital audio signal, and, during a period in which no non-PCM data is input, performs clearing processing on the buffer section to reset the interval between reading and writing.

The digital audio signal synchronization device according to claim 2, wherein the sample number adjustment section detects an overflow or underflow of the buffer section caused by the difference in clock speed between the input signal and the synchronization signal, and adjusts the number of samples of the digital audio signal, within a 0 data interval between data bursts of the non-PCM data, by thinning out 0 data when the overflow is detected and inserting 0 data when the underflow is detected.

The digital audio signal synchronization device according to claim 2, further comprising:
a non-PCM data determination section that identifies non-PCM data in order to distinguish between the non-PCM data and PCM data stored in the digital audio signal and to separate their subsequent processing;
a sampling rate converter section that synchronizes the PCM data separated from the non-PCM data with the synchronization signal by performing sampling rate conversion processing; and
a delay section that aligns the timing of the PCM data and the non-PCM data by adding, to the PCM data subjected to the sampling rate conversion processing, a delay equivalent to the delay caused by temporary storage of the non-PCM data in the buffer section.

A digital audio signal synchronization program for controlling a computer, the program for causing the computer to function as the digital audio signal synchronization device according to any one of claims 1 to 5.