JP2006139827A

JP2006139827A - Device for recording three-dimensional sound field information, and program

Info

Publication number: JP2006139827A
Application number: JP2004326803A
Authority: JP
Inventors: Takayuki Sugawara; 隆幸菅原; Sadahiro Yasura; 定浩安良; Takao Yamabe; 孝朗山辺; Katsumi Hasegawa; 勝巳長谷川
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2004-11-10
Filing date: 2004-11-10
Publication date: 2006-06-01

Abstract

PROBLEM TO BE SOLVED: To provide a three-dimensional sound field information recording device and program that allow both normal audio and a three-dimensional sound field to be reproduced. SOLUTION: In a format based on the DVD video standard, binaural audio data is packed as a differential data pack (D_PACK) 38 and MPEG-multiplexed. Binaural three-dimensional sound field audio can then be reproduced by using the differential pack 38, and when the pack is not used, standard normal audio is output as in the DVD video standard. Further, by describing the binaural three-dimensional sound field audio information in a form based on the DVD video standard and linked with the DVD-video zone 40 and the DVD-others zone 41, the binaural three-dimensional sound field audio and the normal audio can be recorded compatibly with the DVD video standard. COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a three-dimensional sound field information recording apparatus and program that suitably record and reproduce three-dimensional sound field information while maintaining compatibility with two-dimensional sound field audio information.

Several techniques relating to three-dimensional sound fields have been proposed, that is, techniques that localize and reproduce sound so that it seems to come from a loudspeaker at a position where no loudspeaker exists. Techniques relating to standards such as DVD video and DVD audio have also been proposed.

For example, in three-dimensional virtual reality systems and the like, sound image localization devices are used as a means of heightening the sense of presence of the virtual experience. In this type of system, a stereophonic sound field that conveys a sense of direction and distance to the listener is generated by, for example, producing multi-channel signals with time, amplitude, and frequency-characteristic differences from a monaural sound source based on the binaural method.

That is, specific frequency components of the audio input signal are attenuated by, for example, a notch filter to give a sense of vertical direction; the signal is converted by a delay circuit into left- and right-channel signals having a time difference; and FIR (finite impulse response) filters impart the acoustic transfer characteristics from the virtual sound source position. The FIR filter coefficients are supplied from an HRTF database that stores head-related transfer functions (HRTF: Head Related Transfer Function) measured in advance with a dummy head.
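
The delay-plus-FIR part of this conventional chain can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the two coefficient lists are placeholders rather than measured HRTFs, and the notch filter stage is left out.

```python
def fir_filter(signal, coeffs):
    """Direct-form FIR filter: y[n] = sum over k of coeffs[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out


def delay(signal, samples):
    """Delay a signal by a whole number of samples, keeping its length."""
    if samples <= 0:
        return list(signal)
    return ([0.0] * samples + list(signal))[:len(signal)]


def localize(mono, hrtf_left, hrtf_right, itd_samples):
    """Return (left, right) channels for a virtual source on the listener's left.

    The right channel is delayed by itd_samples before its FIR filter is applied.
    """
    left = fir_filter(mono, hrtf_left)
    right = fir_filter(delay(mono, itd_samples), hrtf_right)
    return left, right


# Placeholder coefficients for illustration only, NOT measured HRTF data.
HRTF_LEFT = [1.0, 0.5]
HRTF_RIGHT = [0.6, 0.3]

left, right = localize([1.0, 0.0, 0.0, 0.0], HRTF_LEFT, HRTF_RIGHT, itd_samples=2)
```

In a real device the coefficient lists would be looked up in the HRTF database for the requested virtual source position.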

In such a conventional sound image localization device, it is impossible to store HRTFs for all virtual sound source positions, so normally only the transfer characteristics from positions at a predetermined distance from the listener are measured and stored. As a result, at other distances the sound images perceived by the two ears do not coincide and localization is poor. To solve this problem, Patent Document 1 discloses a technique in which, when a position at a distance from the listener different from the predetermined distance is designated as the virtual sound source position, the transmission direction of the right channel and that of the left channel from the virtual sound source position to each ear of the listener are calculated from the transmission distance and transmission direction specified by the designated virtual sound source position and from the distance between the listener's ears, and the acoustic transfer characteristics of the left- and right-channel filters are determined from these transmission directions.

Patent Document 2 discloses an example of a method for describing proprietary data in a manner compatible with the DVD video and DVD audio formats.
Patent Document 1: JP-A-10-174200
Patent Document 2: JP-A-11-178090

Conventionally, no apparatus has been available that provides a format allowing three-dimensional sound field information to be recorded and reproduced compatibly with ordinary normal audio information, and that can record and reproduce sound using spatial localization techniques compatibly with conventional reproduction methods including stereo reproduction, such as the existing DVD video standard and DVD audio. An apparatus capable of reproducing both normal audio and a three-dimensional sound field has therefore been desired. Normal audio here is defined as anything other than three-dimensional sound field audio, for example ordinary stereo audio.

Accordingly, an object of the present invention is to provide a three-dimensional sound field information recording apparatus and program that allow both normal audio and a three-dimensional sound field to be reproduced compatibly with conventional reproduction methods including stereo reproduction.

The three-dimensional sound field information recording apparatus of the present invention comprises: means for recording an audio object of two-dimensional sound field audio; means for recording, in a management information area, management information describing information used for special reproduction of the audio object; and means for recording information relating to three-dimensional sound field information in a user data area of the audio object for each predetermined unit of each frame of the two-dimensional sound field audio.

Alternatively, the three-dimensional sound field information recording apparatus of the present invention comprises: means for recording an audio object of two-dimensional sound field audio; means for recording, in a management information area, management information describing information used for special reproduction of the audio object; and means for multiplexing and recording information relating to three-dimensional sound field information, for each predetermined unit of each frame of the two-dimensional sound field audio, as stream data separate from the two-dimensional sound field audio of the audio object.

Alternatively, the three-dimensional sound field information recording apparatus of the present invention comprises: means for recording an audio object of two-dimensional sound field audio; means for recording, in a management information area, management information describing information used for special reproduction of the audio object; and means for recording information relating to three-dimensional sound field information, for each predetermined unit of each frame of the two-dimensional sound field audio, in a management information area for the three-dimensional sound field that is separate from the aforementioned management information area.

Further, in the three-dimensional sound field information recording apparatus of the present invention, the information relating to the three-dimensional sound field information is recorded after difference information between the two-dimensional sound field audio information and binaural audio information has been encoded using at least one of differential encoding, predictive encoding, and encoding using an orthogonal transform.
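
As a rough illustration of the predictive encoding option, the sketch below applies first-order prediction (each sample is predicted by the previous one) to an integer difference signal. The text does not specify a predictor, so the choice of predictor, the function names, and the sample values are all assumptions made for this example.

```python
def dpcm_encode(samples):
    """Encode each sample as its difference from the previous sample."""
    residuals = []
    prediction = 0
    for s in samples:
        residuals.append(s - prediction)
        prediction = s  # first-order predictor: the next sample is assumed close to this one
    return residuals


def dpcm_decode(residuals):
    """Invert dpcm_encode by accumulating the residuals."""
    samples = []
    prediction = 0
    for r in residuals:
        s = prediction + r
        samples.append(s)
        prediction = s
    return samples


# Hypothetical difference signal (binaural minus two-dimensional audio).
difference_signal = [100, 102, 105, 104, 101]
residuals = dpcm_encode(difference_signal)
restored = dpcm_decode(residuals)
```

After the first sample the residuals are small numbers, which is what makes a later entropy coding stage effective.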

The three-dimensional sound field information recording program of the present invention causes a computer to execute the steps of: recording an audio object of two-dimensional sound field audio on a recording medium; recording, in a management information area of the recording medium, management information describing information used for special reproduction of the audio object; and recording information relating to three-dimensional sound field information, for each predetermined unit of each frame of the two-dimensional sound field audio, in a user data area of the audio object on the recording medium.

Alternatively, the three-dimensional sound field information recording program of the present invention causes a computer to execute the steps of: recording an audio object of two-dimensional sound field audio on a recording medium; recording, in a management information area of the recording medium, management information describing information used for special reproduction of the audio object; and multiplexing information relating to three-dimensional sound field information, for each predetermined unit of each frame of the two-dimensional sound field audio, as stream data separate from the two-dimensional sound field audio of the audio object and recording it on the recording medium.

Alternatively, the three-dimensional sound field information recording program of the present invention causes a computer to execute the steps of: recording an audio object of two-dimensional sound field audio on a recording medium; recording, in a management information area of the recording medium, management information describing information used for special reproduction of the audio object; and recording information relating to three-dimensional sound field information, for each predetermined unit of each frame of the two-dimensional sound field audio, in a management information area for the three-dimensional sound field that is separate from the aforementioned management information area on the recording medium.

Further, in the three-dimensional sound field information recording program of the present invention, the information relating to the three-dimensional sound field information is recorded after difference information between the two-dimensional sound field audio information and binaural audio information has been encoded using at least one of differential encoding, predictive encoding, and encoding using an orthogonal transform.

Because the three-dimensional sound field information recording apparatus of the present invention has means for recording information relating to three-dimensional sound field information in the user data area of the audio object for each predetermined unit of each frame of the two-dimensional sound field audio, binaural three-dimensional sound field audio and two-dimensional sound field audio can be recorded compatibly with the DVD video standard, and three-dimensional sound field information can be recorded and reproduced while maintaining compatibility with two-dimensional sound field information.

Likewise, because the three-dimensional sound field information recording apparatus of the present invention has means for multiplexing and recording information relating to three-dimensional sound field information, for each predetermined unit of each frame of the two-dimensional sound field audio, as stream data separate from the two-dimensional sound field audio of the audio object, binaural three-dimensional sound field audio and two-dimensional sound field audio can be recorded compatibly with the DVD video standard, and three-dimensional sound field information can be recorded and reproduced while maintaining compatibility with two-dimensional sound field information.

Likewise, because the three-dimensional sound field information recording apparatus of the present invention has means for recording information relating to three-dimensional sound field information, for each predetermined unit of each frame of the two-dimensional sound field audio, in a management information area for the three-dimensional sound field that is separate from the aforementioned management information area, binaural three-dimensional sound field audio and two-dimensional sound field audio can be recorded compatibly with the DVD video standard, and three-dimensional sound field information can be recorded and reproduced while maintaining compatibility with two-dimensional sound field information.

According to the three-dimensional sound field information recording program of the present invention, information relating to three-dimensional sound field information is recorded in the user data area of the audio object on the recording medium for each predetermined unit of each frame of the two-dimensional sound field audio, so that binaural three-dimensional sound field audio and two-dimensional sound field audio can be recorded compatibly with the DVD video standard, and three-dimensional sound field information can be recorded and reproduced while maintaining compatibility with two-dimensional sound field information.

Likewise, according to the three-dimensional sound field information recording program of the present invention, information relating to three-dimensional sound field information is multiplexed, for each predetermined unit of each frame of the two-dimensional sound field audio, as stream data separate from the two-dimensional sound field audio of the audio object and recorded on the recording medium, so that binaural three-dimensional sound field audio and two-dimensional sound field audio can be recorded compatibly with the DVD video standard, and three-dimensional sound field information can be recorded and reproduced while maintaining compatibility with two-dimensional sound field information.

Likewise, according to the three-dimensional sound field information recording program of the present invention, information relating to three-dimensional sound field information is recorded, for each predetermined unit of each frame of the two-dimensional sound field audio, in a management information area for the three-dimensional sound field that is separate from the aforementioned management information area on the recording medium, so that binaural three-dimensional sound field audio and two-dimensional sound field audio can be recorded compatibly with the DVD video standard, and three-dimensional sound field information can be recorded and reproduced while maintaining compatibility with two-dimensional sound field audio information.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

(Example 1)
A first embodiment of the three-dimensional sound field information recording apparatus and program according to the present invention will be described with reference to FIGS. 1 to 5. FIG. 1 is a block diagram showing the configuration of the three-dimensional sound field information recording apparatus according to the first embodiment; FIG. 2 is a block diagram showing the configuration of an AAC encoding apparatus; FIG. 3 is a diagram showing an example of grouping; FIG. 4 is an explanatory diagram showing the format applied to a DVD in this embodiment; and FIG. 5 is a flowchart showing a three-dimensional sound field information recording program that uses the three-dimensional sound field information recording apparatus shown in FIG. 1.

As shown in FIG. 1, the three-dimensional sound field information recording apparatus of this embodiment comprises: a normal audio microphone 1 that records normal audio information; a binaural audio microphone 2 that performs binaural recording; a normal audio compressor 3 that compresses the normal audio data recorded by the normal audio microphone 1; a buffer 4 that temporarily buffers the normal audio compressed data produced by the normal audio compressor 3; a normal audio decoder 5 that decodes the normal audio compressed data; an arithmetic unit 6 that subtracts the data decoded by the normal audio decoder 5 from the binaural audio data to create differential binaural audio data; a differential binaural audio compressor 7 that compresses the differential binaural audio data; a buffer 8 that temporarily buffers the differential binaural audio compressed data produced by the differential binaural audio compressor 7; a camera 9 that outputs a video signal; a video compressor 10 that compresses the video signal; a buffer 11 that temporarily buffers the video compressed data produced by the video compressor 10; an information multiplexer 12 that multiplexes the buffered video compressed data, normal audio compressed data, and differential binaural audio compressed data; a time stamp generator 13 that supplies time stamps to the information multiplexer 12; a DVD formatter 14 that formats the stream multiplexed by the information multiplexer 12 into a form compliant with the DVD standard; a recorder 15 that records the formatted stream on a recording medium 16; and a control unit 17 that controls each part of the apparatus.

In the three-dimensional sound field information recording apparatus shown in FIG. 1, normal audio information is first recorded by the normal audio microphone 1. Normal audio here is defined as anything other than three-dimensional sound field audio, for example ordinary stereo audio.

At the same time, binaural recording is performed with the binaural audio microphone 2, using a dummy head or the like.

The normal audio data recorded by the normal audio microphone 1 is compressed by the normal audio compressor 3. The compression method may be an MPEG method or another method.

The normal audio compressed data produced by the normal audio compressor 3 is temporarily held in the buffer 4 so that it can be synchronized with the video compressed data and the differential binaural audio compressed data described later. The normal audio compressed data is also decoded by the normal audio decoder 5, and the decoded data is subtracted from the binaural audio data to create differential binaural audio data, which is input to the differential binaural audio compressor 7.
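
The subtraction performed by the arithmetic unit 6, and the corresponding addition a player would need, can be sketched as below. The function names and sample values are hypothetical, samples are modeled as plain integers, and the two signals are assumed to be time-aligned and of equal length.

```python
def make_differential(binaural, decoded_normal):
    """Differential binaural data = binaural samples minus decoded normal audio."""
    if len(binaural) != len(decoded_normal):
        raise ValueError("signals must be time-aligned and of equal length")
    return [b - n for b, n in zip(binaural, decoded_normal)]


def reconstruct_binaural(decoded_normal, differential):
    """Playback side: add the difference back onto the decoded normal audio."""
    if len(decoded_normal) != len(differential):
        raise ValueError("signals must be time-aligned and of equal length")
    return [n + d for n, d in zip(decoded_normal, differential)]


# Hypothetical sample values for illustration.
binaural = [10, -4, 7, 3]
decoded_normal = [9, -5, 7, 1]

differential = make_differential(binaural, decoded_normal)
restored = reconstruct_binaural(decoded_normal, differential)
```

Note that the reference signal is the decoded normal audio rather than the original microphone signal, as the text describes; a player only has the decoded signal, so this presumably keeps the recorder and the player working from the same reference.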

The differential binaural audio compressor 7 compresses the differential binaural audio data. The compression method may be an MPEG method or another method. Methods such as AAC use variable-length coding and achieve very high compression efficiency, so AAC is a suitable algorithm for this embodiment.

The AAC encoding apparatus is briefly described below. It consists of the functional units shown in FIG. 2.

First, the audio signal is taken into a psychoacoustic analyzer 21 and an MDCT (Modified Discrete Cosine Transform) unit 22 in frames of a predetermined number of samples.

The psychoacoustic analyzer 21 applies a fast Fourier transform (FFT) to the input audio signal to obtain its frequency spectrum, computes auditory masking from that spectrum, and calculates the allowable quantization noise power for each preset frequency band together with psychoacoustic parameters. It also determines the transform block length for the MDCT based on those psychoacoustic parameters.

Meanwhile, the MDCT unit 22 applies the MDCT to the input audio signal to convert it into a frequency spectrum and obtains the MDCT coefficients for each spectral component. In doing so, the MDCT unit 22 overlaps successive transform blocks by 50%, converting, for example, 2048 samples into 1024 MDCT coefficients.
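
The halving of the sample count and the 50% overlap can be seen in a direct, unoptimized MDCT. The sketch below is an assumption-laden illustration: it uses the textbook MDCT formula with no window function, whereas a real AAC encoder windows each block first, and the helper names are hypothetical.

```python
import math


def mdct(block):
    """Naive MDCT: a block of 2N samples yields N coefficients (no window applied)."""
    if len(block) % 2:
        raise ValueError("block length must be even")
    n_half = len(block) // 2
    coeffs = []
    for k in range(n_half):
        acc = 0.0
        for n, x in enumerate(block):
            acc += x * math.cos(
                math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5)
            )
        coeffs.append(acc)
    return coeffs


def overlapped_blocks(samples, block_len):
    """Yield blocks of block_len samples, each overlapping the previous by 50%."""
    hop = block_len // 2
    for start in range(0, len(samples) - block_len + 1, hop):
        yield samples[start:start + block_len]


# Small demonstration with 16-sample blocks; the text's long block is 2048 samples.
signal = [math.sin(0.1 * i) for i in range(64)]
blocks = list(overlapped_blocks(signal, 16))
spectra = [mdct(b) for b in blocks]
```

Because each block advances by half its length, every hop of N new samples produces N coefficients, so the overlap does not increase the data rate.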

The MDCT unit 22 also has a block switching function that switches the block length subject to the MDCT between a long transform block (long block) and a short transform block (short block) according to the transform block length information obtained from the psychoacoustic analyzer 21. In general, a longer transform block concentrates the spectrum more and so allows more efficient bit allocation. However, quantization noise in the frequency domain spreads over the entire transform block when the signal is returned to the time domain. If a waveform with a steep rise (attack) following a quiet passage is transformed with a long block and quantized, the quantization noise spreads into the quiet passage and becomes extremely unpleasant to the ear.

That is, the MDCT unit 22 selects the transform block length based on the transform block length information obtained from the psychoacoustic analyzer 21 and, in particular, switches from a long transform block to a plurality of short transform blocks around an attack.

For example, for a stationary signal, the MDCT transform block is a long block of 2048 samples, which is converted into 1024 MDCT coefficients; for a transient signal, it is a short block of 256 samples, which is converted into 128 MDCT coefficients. Short blocks are selected eight in a row, so that the number of output MDCT coefficients is 1024, matching that of a long block.
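
The coefficient bookkeeping in this paragraph can be checked with a few lines. The block lengths come from the text; the function names and the boolean transient flag are hypothetical stand-ins for the decision made by the psychoacoustic analyzer 21.

```python
LONG_BLOCK = 2048            # samples in a long transform block
SHORT_BLOCK = 256            # samples in a short transform block
SHORT_BLOCKS_PER_FRAME = 8   # short blocks are used eight in a row


def frame_layout(is_transient):
    """Return the transform block lengths used for one frame."""
    if is_transient:
        return [SHORT_BLOCK] * SHORT_BLOCKS_PER_FRAME
    return [LONG_BLOCK]


def coefficients_per_frame(layout):
    """Each MDCT block of length L yields L // 2 coefficients."""
    return sum(length // 2 for length in layout)


stationary_count = coefficients_per_frame(frame_layout(is_transient=False))
transient_count = coefficients_per_frame(frame_layout(is_transient=True))
```

Both layouts yield 1024 coefficients per frame, so the later stages see the same amount of data regardless of which block type was chosen.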

Next, based on human auditory characteristics, the quantizer 23 divides the 1024 MDCT coefficients by frequency band into a plurality of scale factor bands, normalizes the MDCT coefficients in each scale factor band, and quantizes them. For short blocks, the 128 MDCT coefficients are divided into a plurality of scale factor bands.

Quantization is performed while controlling the number of quantization steps of each scale factor band so that the quantization noise calculated for that band does not exceed the allowable quantization noise power calculated by the psychoacoustic analyzer 21, and while controlling the overall number of quantization steps so that the number of bits required for quantization stays within a predetermined number of bits per frame.
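
The per-band noise constraint can be illustrated with a much simplified model. In the sketch below a uniform quantizer stands in for the encoder's actual quantizer, the candidate step sizes are arbitrary, and all names and values are hypothetical; it only shows the idea of choosing the coarsest quantization whose noise stays within the allowed power.

```python
def quantize(band, step):
    """Uniform quantization of one scale factor band (simplified stand-in)."""
    return [round(c / step) for c in band]


def dequantize(indices, step):
    return [q * step for q in indices]


def noise_power(band, step):
    """Mean squared error introduced by quantizing the band with this step size."""
    recon = dequantize(quantize(band, step), step)
    return sum((c - r) ** 2 for c, r in zip(band, recon)) / len(band)


def choose_step(band, allowed_noise, candidate_steps):
    """Pick the coarsest candidate step whose noise stays within the allowed power."""
    for step in sorted(candidate_steps, reverse=True):
        if noise_power(band, step) <= allowed_noise:
            return step
    return min(candidate_steps)  # fall back to the finest step available


# Hypothetical band of MDCT coefficients and allowed noise power.
band = [4.0, -2.0, 1.0, 0.5]
step = choose_step(band, allowed_noise=0.1, candidate_steps=[0.25, 0.5, 1.0, 2.0, 4.0])
```

A coarser step means fewer bits for the band, so choosing the coarsest admissible step spends bits only where the masking threshold requires them.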

The number of quantization steps of a scale factor band corresponds to what is obtained by separating the sample data in each frequency band into a waveform and a scaling factor, normalizing so that the maximum amplitude of the waveform is 1.0, and encoding the scaling factor; it is also called the scale factor.

Next, the quantized data is input to a group processor 24, which groups the short blocks to obtain higher coding efficiency. Auxiliary information is shared within each group of blocks, which improves the coding efficiency.

FIG. 3 shows an example of this grouping: eight short blocks are divided into four groups containing 3, 1, 2, and 2 short blocks, respectively.
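
The grouping in this example amounts to splitting the eight consecutive short blocks by a list of group sizes. The sketch below shows that split; the function name is hypothetical and the blocks are represented simply by their indices 0 to 7.

```python
def group_short_blocks(blocks, group_sizes):
    """Split consecutive short blocks into groups of the given sizes."""
    if sum(group_sizes) != len(blocks):
        raise ValueError("group sizes must cover every short block exactly once")
    groups, start = [], 0
    for size in group_sizes:
        groups.append(blocks[start:start + size])
        start += size
    return groups


# Eight short blocks, grouped as 3, 1, 2, 2.
groups = group_short_blocks(list(range(8)), [3, 1, 2, 2])
```

Each inner list would then share one set of auxiliary information instead of carrying it per block.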

Next, a variable-length encoder 25 applies variable-length coding to the coding parameters, such as the quantized MDCT coefficient values and scale factors, that have been processed by the quantizer 23 and the group processor 24, thereby reducing redundancy, and outputs the result to a bit number determination unit 26.

The bit number determination unit 26 determines whether the number of bits of the encoded data for one frame falls within a preset range. If the condition is satisfied, the encoded data is output as is to a bit stream generator 27; if not, the determination result is output to a processing control unit 28.

Based on the determination result, the processing control unit 28 causes the quantizer 23, the group processor 24, and the variable-length encoder 25 to execute the above series of processes again, and repeats this until the bit number determination unit 26 determines that the condition is satisfied.
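
The repeat-until-it-fits control described here can be sketched as a simple loop. This is a toy model under stated assumptions: the bit cost function is a hypothetical stand-in for the variable-length encoder, only an upper bit limit is checked, and coarsening a single global step size by a fixed factor is just one possible way of reducing the bit count.

```python
def estimate_bits(quantized):
    """Hypothetical cost model: one sign bit plus the magnitude bits of each value."""
    return sum(1 + abs(q).bit_length() for q in quantized)


def encode_frame(coeffs, max_bits, step=1.0, growth=2.0, max_iterations=32):
    """Re-quantize with a coarser step until the frame fits the bit budget."""
    for _ in range(max_iterations):
        quantized = [round(c / step) for c in coeffs]
        bits = estimate_bits(quantized)
        if bits <= max_bits:      # role of the bit number determination unit
            return quantized, step, bits
        step *= growth            # role of the processing control unit: try again
    raise RuntimeError("bit budget not met within the iteration limit")


# Hypothetical coefficients and budget.
quantized, final_step, used_bits = encode_frame([100.0, -37.0, 12.0, 3.0], max_bits=16)
```

In this example the loop runs three times, with step sizes 1, 2, and 4, before the frame fits within 16 bits.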

The encoded data that satisfies the condition in the bit number determination unit 26 is output to the bit stream generator 27 and transmitted as a bit stream multiplexed with coding parameters such as block information. This concludes the description of the AAC encoding apparatus.

Returning to the configuration of the three-dimensional sound field information recording apparatus shown in FIG. 1, a video signal is input from the camera 9 and compressed by the video compressor 10, using a compression method such as MPEG.

The video compressed data is temporarily held in the buffer 11 so that it can be synchronized with the normal audio compressed data and the differential binaural audio compressed data.

The information multiplexer 12 then multiplexes the buffered video compressed data, normal audio compressed data, and differential binaural audio compressed data while keeping them synchronized. Multiplexing uses the program stream format of the MPEG system layer: each elementary stream is divided into packs, and presentation time stamps are attached so that the streams can be synchronized at playback. The time stamps are based on 27 MHz or 90 kHz counter information supplied from the time stamp generator 13 to the information multiplexer 12. Because this mechanism can be realized with the MPEG multiplexing standard, a detailed description is omitted.
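
As a small worked example of time stamping on a 90 kHz counter, the sketch below computes a presentation time stamp for the n-th audio frame. The frame length of 1024 samples and the 48 kHz sampling rate are assumptions chosen for the example, not values given in the text, and the function names are hypothetical.

```python
PTS_CLOCK_HZ = 90_000          # 90 kHz time stamp counter
SYSTEM_CLOCK_HZ = 27_000_000   # 27 MHz counter


def audio_frame_pts(frame_index, samples_per_frame, sample_rate_hz):
    """Presentation time stamp, in 90 kHz ticks, of the start of an audio frame."""
    return frame_index * samples_per_frame * PTS_CLOCK_HZ // sample_rate_hz


def pts_to_system_clock(pts):
    """Convert 90 kHz ticks to 27 MHz ticks (the ratio is exactly 300)."""
    return pts * (SYSTEM_CLOCK_HZ // PTS_CLOCK_HZ)


# Assumed parameters: 1024-sample frames at 48 kHz.
second_frame_pts = audio_frame_pts(1, samples_per_frame=1024, sample_rate_hz=48_000)
second_frame_27mhz = pts_to_system_clock(second_frame_pts)
```

Under these assumptions each audio frame advances the time stamp by 1920 ticks, and the normal audio and differential binaural packs for the same frame would carry the same value so that a player can align them.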

The multiplexed stream is formatted by the DVD formatter 14 into a format conforming to the DVD standard described later, and is recorded on the recording medium 16 by the recorder 15. To create DVD-ROM media, the stream is first recorded on an HDD as DVD master data and is then recorded on DVD media through a manufacturing process.

In this way, taking the difference between the normal audio data and the binaural audio data, which is the three-dimensional sound field data, removes the strongly correlated components, so that the phase difference and reverberation information that expresses the sound field is what gets encoded as the difference. This raises both the encoding efficiency and the recording efficiency.
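The differential scheme can be illustrated with a minimal Python sketch. It is not the patent's implementation: coarse quantization stands in for the lossy compress-then-decode round trip of the normal audio, samples are plain integers, and all function names are hypothetical. The point it shows is that the difference is taken against the decoded normal audio, which is the only version of the normal audio a player has available.

```python
def lossy_round_trip(samples, step=4):
    """Stand-in for compress + decode of normal audio: coarse quantization."""
    return [step * round(s / step) for s in samples]


def make_difference(binaural, normal, step=4):
    """Subtract the decoded normal audio from the binaural audio."""
    decoded_normal = lossy_round_trip(normal, step)
    return [b - n for b, n in zip(binaural, decoded_normal)]


def reconstruct(difference, normal, step=4):
    """Player side: add the difference back onto the decoded normal audio."""
    decoded_normal = lossy_round_trip(normal, step)
    return [d + n for d, n in zip(difference, decoded_normal)]
```

In this sketch the difference itself is stored without loss, so the binaural signal is recovered exactly. In the apparatus described here the difference is also compressed, so the recovered signal would only be approximate.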

Next, with reference to FIG. 4, a description is given of means that uses the DVD video standard to multiplex and record information on the three-dimensional sound field, for each predetermined unit of each frame of normal audio, as stream data separate from the normal audio of the audio object.

Because the differential binaural audio data described above is not part of the DVD standard, it can be recorded as a separate stream in MPEG-multiplexed packs.

In the lowest layer of the format shown in FIG. 4, the differential binaural audio data forms one audio frame layer 31, in which a predetermined number of samples makes up one audio frame. Several of these audio frames are gathered to form a pack of about 2 kB.

Each pack consists of a 14-byte pack header 32 and an audio packet. The audio packet consists of a 9- to 29-byte packet header 33, a 1-byte substream ID 34, 3 bytes of audio frame information 35, and 3 bytes of audio data information 36, followed by 2013 bytes of differential audio data 37. These packs are MPEG-multiplexed as differential packs (D_PACK) 38, that is, as binaural audio data packs, together with the normal audio packs and video packs.
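The byte budget of such a pack can be checked with a short Python sketch. It assumes a pack is exactly 2048 bytes (the text above says about 2 kB); the constant and function names are hypothetical. Under that assumption, the 2013-byte payload figure corresponds to a 14-byte packet header. How the remaining bytes are handled for other header lengths is not described in the text, so the sketch only reports the capacity.

```python
PACK_SIZE = 2048          # assumption: one pack is exactly 2048 bytes
PACK_HEADER = 14          # pack header 32
SUBSTREAM_ID = 1          # substream ID 34
AUDIO_FRAME_INFO = 3      # audio frame information 35
AUDIO_DATA_INFO = 3       # audio data information 36


def payload_capacity(packet_header_len: int) -> int:
    """Bytes left for differential audio data 37 given the packet header 33 length."""
    if not 9 <= packet_header_len <= 29:
        raise ValueError("packet header must be 9 to 29 bytes")
    fixed = (PACK_HEADER + packet_header_len + SUBSTREAM_ID
             + AUDIO_FRAME_INFO + AUDIO_DATA_INFO)
    return PACK_SIZE - fixed
```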

Viewing the format from the top, the recording layer of DVD video forms a Volume space that is divided into the Volume and File structure 39, the DVD-video zone 40, and the DVD-others zone 41. The DVD-video zone 40 contains a video manager (VMG) 42 and video title sets (VTS) 43. The video manager 42 describes video manager information and the like: identification information for the video title sets 43 that follow, the start and end addresses of the various pieces of information, and which video stream playback should start from.

The video title set 43 describes Control Data 44, such as address information and identification information for the audio and video data to be reproduced. The video manager 42 and the Control Data 44 are management information areas that hold information essential for reproduction. Data in these areas is recorded by a DVD formatter or DVD formatting step and reproduced by a DVD format decoder or DVD format decoding step.

These are followed by a video object set (VOBS) 45, which is a set of MPEG streams in which video and audio are multiplexed, and within it by video objects (VOB) 46, which are smaller units of MPEG stream.

Below the video object 46 is a further subdivided unit called a cell (CELL) 47, and below that the video object unit (VOBU) 48, which corresponds approximately to a group of pictures (GOP) of an MPEG stream. The playback time of a video object unit 48 equals the playback time of the video data it contains, which consists of one or more GOPs, and is about 0.4 to 1.0 seconds. A navigation pack (NV_PACK) 49, which describes stream search information and the like, is placed at the head of each video object unit.

The video object unit also contains video packs (V_PACK) 51, in which video compressed data is packed, and audio packs (A_PACK) 50, in which audio compressed data is packed. These packs are MPEG-multiplexed.

Next, the three-dimensional sound field information recording program of this embodiment is described with reference to FIG. 5, which is a flowchart of the program as run on the three-dimensional sound field information recording apparatus shown in FIG. 1. The detailed processing in each step is the same as described for the block diagram of FIG. 1, so only the order of the steps is briefly described here.

First, in step S100, the control unit 17 causes acoustic data from the normal audio microphone 1 and the binaural audio microphone 2, and image data from the camera 9, to be input to the control unit 17 for a predetermined period and stored in the memory of the control unit 17.

Next, in step S110, the control unit 17 causes the normal audio compressor 3 and the video compressor 10 to compress the normal audio data and the video data. In step S120, the control unit 17 causes the normal audio compressed data and the video compressed data to be temporarily held in the buffer 4 and the buffer 11, respectively, so that they can be synchronized with the differential binaural audio compressed data described later.

Next, in step S130, the control unit 17 causes the normal audio decoder 5 to decode the normal audio compressed data.

Next, in step S140, the control unit 17 causes the arithmetic unit 6 to subtract the decoded normal audio data from the binaural audio data, producing differential binaural audio data, and to input the result to the differential binaural audio compressor 7.

Next, in step S150, the control unit 17 causes the differential binaural audio compressor 7 to compress the differential binaural audio data. In step S160, the control unit 17 causes the information multiplexer 12 to multiplex the video compressed data, normal audio compressed data, and differential binaural audio compressed data while keeping them synchronized. Multiplexing follows the synchronization scheme of the MPEG system layer: using the program stream format, each elementary stream is packed, and presentation time stamps are attached so that synchronization can be achieved during reproduction. The time stamps are based on 27 MHz or 90 kHz counter information, and the control unit 17 causes the time stamp generator 13 to input the time stamps to the information multiplexer 12.

Next, in step S170, the control unit 17 causes the DVD formatter 14 to format the stream into the DVD-standard-compliant format shown in FIG. 4.

Next, in step S180, the control unit 17 causes the recorder 15 to record the data on the recording medium 16 in predetermined units. For a DVD, the unit is 2 kB.

Then, in step S190, the control unit 17 determines whether input image and acoustic data from the normal audio microphone 1, the binaural audio microphone 2, and the camera 9 still remains. If it does (YES), the process returns to step S100; if not (NO), the program ends.
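The order of steps S100 to S190 can be sketched as a simple loop in Python. The function and parameter names are hypothetical, the codec and multiplexer are passed in as plain callables, and samples are modeled as lists of integers. The sketch only illustrates the step order and is not the program of FIG. 5.

```python
def record_program(source, compress, decompress, multiplex, write):
    """Run the S100-S190 loop once per captured chunk (illustrative only)."""
    for normal, binaural, video in source:                  # S100; the loop exit is S190
        normal_c = compress(normal)                         # S110: compress normal audio
        video_c = compress(video)                           # S110: compress video
        # S120: both results are simply held in local variables here
        decoded = decompress(normal_c)                      # S130: decode normal audio
        diff = [b - n for b, n in zip(binaural, decoded)]   # S140: binaural minus decoded
        diff_c = compress(diff)                             # S150: compress the difference
        stream = multiplex(video_c, normal_c, diff_c)       # S160: multiplex the three
        write(stream)                                       # S170-S180: format and record
```

Passing identity functions for `compress` and `decompress` and a tuple-building `multiplex` is enough to trace the data flow through the steps.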

Thus, according to the three-dimensional sound field information recording apparatus and program of this embodiment, the binaural audio data is packed as the differential pack 38 described above and MPEG-multiplexed in a format compliant with the DVD video standard. A player that uses the differential pack 38 can reproduce binaural three-dimensional sound field audio, and a player that does not use it can output the normal audio that is standard under the DVD video standard. Three-dimensional sound field information can therefore be recorded and reproduced while remaining compatible with normal audio information.

Next, a three-dimensional sound field information reproducing apparatus that reproduces a three-dimensional sound field from the recording medium 16, on which three-dimensional sound field information has been recorded by the recording apparatus of this embodiment, is described with reference to FIG. 6. FIG. 6 is a block diagram showing the configuration of a three-dimensional sound field information reproducing apparatus that reproduces sound field information recorded on a recording medium by the three-dimensional sound field information recording apparatus shown in FIG. 1.

First, the reproducer 61 reads data from the recording medium 16, and the DVD format decoder 62 extracts the MPEG stream from the DVD format. Although not shown in FIG. 6, the information needed for DVD playback (for example, playlist information and special playback information) is extracted separately, so that interactive playback can be performed via a user interface and a CPU (not shown).

The information separator 63 demultiplexes the extracted MPEG stream into video compressed data, normal audio compressed data, and differential binaural audio compressed data. The video compressed data is input to the video decoder 64, decoded, and temporarily held in the buffer 67. The normal audio compressed data is input to the normal audio decoder 65, decoded, and temporarily held in the buffer 68. The differential binaural audio compressed data is input to the differential binaural audio decoder 66, decoded, added to the normal audio data in the arithmetic unit 70, and temporarily held in the buffer 69.

Meanwhile, the information separator 63 detects the system clock reference (SCR) and the time stamps recorded in the headers of the packed data of each elementary stream. The STC time stamp comparator 71 compares the system time clock (STC), which is synchronized to the SCR as defined by the MPEG multiplexing scheme, with each presentation time stamp. When a presentation time stamp matches the STC time, the corresponding elementary data is output from the buffers 67 to 69.
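The comparison performed by the STC time stamp comparator can be sketched in Python as follows. The sketch assumes each buffer holds (PTS, payload) pairs in increasing PTS order and releases every unit whose PTS is at or before the current STC, a polling-friendly variant of the exact match described above. The function name and the data layout are hypothetical.

```python
from collections import deque


def release_due(buffer: deque, stc: int) -> list:
    """Pop and return every buffered (pts, payload) unit whose PTS is <= STC."""
    released = []
    while buffer and buffer[0][0] <= stc:
        released.append(buffer.popleft())
    return released
```

Calling `release_due` on the video buffer and on the selected audio buffer with the same STC value releases units that share a presentation time together, which is what keeps picture and sound aligned.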

The video data is output to the image display 72. For the audio data, a selection signal indicating the sound source specified by the user through the GUI 74, that is, normal audio or binaural audio, is input to the sound source selector 73, which selects the sound source accordingly and outputs it to the speaker 75 for reproduction. Each unit of the three-dimensional sound field information reproducing apparatus shown in FIG. 6 is controlled by the control unit 76.

Next, a three-dimensional sound field information reproducing program that reproduces a three-dimensional sound field on the reproducing apparatus shown in FIG. 6 is described with reference to the flowchart shown in FIG. 7. The detailed processing in each step is the same as described for the block diagram of FIG. 6, so only the order of the steps is briefly described here.

First, in step S200, the control unit 76 causes the reproducer 61 to read the multiplexed data from the recording medium 16 in predetermined units.

Next, in step S210, the control unit 76 causes the DVD format decoder 62 to decode the DVD format. Decoding the DVD format includes extracting the MPEG stream from the DVD format and, although not shown as a step, separately extracting the information needed for DVD playback (for example, playlist information and special playback information) so that interactive playback can be performed via a user interface and a CPU.

Next, in step S220, the control unit 76 causes the information separator 63 to demultiplex the extracted MPEG stream.

Next, in step S230, the control unit 76 causes the video compressed data and the normal audio compressed data to be decoded. In step S240, the control unit 76 causes the decoded video data and the decoded normal audio data to be temporarily buffered.

Next, in step S250, the control unit 76 causes the differential binaural audio compressed data to be decoded. In step S260, the control unit 76 causes the arithmetic unit 70 to add the decoded differential binaural audio data to the decoded normal audio data.

Next, in step S270, the control unit 76 causes a selection signal indicating the sound source specified by the user through the GUI 74, that is, normal audio or binaural audio, to be input to the sound source selector 73, and causes the sound source selector 73 to select whether the binaural audio data or the normal audio data is reproduced.

Next, in step S280, the control unit 76 causes the image display 72 to display the video data and causes the speaker 75 to reproduce the audio selected in step S270 in synchronization with the video display.

Next, in step S290, the control unit 76 determines whether image and acoustic data to be reproduced still remains on the recording medium 16. If it does (YES), the process returns to step S200; if not (NO), the program ends.

(Example 2)
A second embodiment of the three-dimensional sound field information recording apparatus of the present invention is described with reference to FIG. 8. FIG. 8 is an explanatory diagram showing the format applied to a DVD in this embodiment. In FIG. 8, components identical to those of the format shown in FIG. 4 are given the same reference numerals.

The three-dimensional sound field information recording apparatus of this embodiment has means that uses the DVD video standard to record information on the three-dimensional sound field, for each predetermined unit of each frame of normal audio, in a management information area for the three-dimensional sound field that is separate from the ordinary management information area. While keeping a format that conforms to the DVD video standard, the binaural three-dimensional sound field audio information is recorded not in differential packs (D_PACK) but in the DVD-others zone, a freely usable area that is itself compliant with the DVD standard.

The configuration of the three-dimensional sound field information recording apparatus of this embodiment is the same as that of the first embodiment shown in FIG. 1.

In the DVD-others zone 41, a structure consisting of a video manager (DVMG) 81 and video title sets (VTS) 82 is described. The video manager 81 describes video manager information and the like: identification information for the video title sets 82 that follow, the start and end addresses of the various pieces of information, and which video stream playback should start from.

The video title set 82 contains DVTSI 83, which describes Control Data such as address information and identification information for the audio and video data to be reproduced. The video manager 81 and the DVTSI 83 are management information areas that hold information essential for reproduction. Data in these areas is recorded by a DVD formatter or DVD formatting step and reproduced by a DVD format decoder or DVD format decoding step.

These are followed by a video object set (DVOBS) 84, which is a set of MPEG streams in which video and audio are multiplexed, and within it by video objects (DVOB) 85, which are smaller units of MPEG stream.

Below the video object 85 is a further subdivided unit called a cell (DCELL) 86, and below that the video object unit (DVOBU) 87, which groups together several frames of the frame layer of the binaural three-dimensional sound field audio information.

This structure mirrors that of the two-dimensional video data in the DVD-video zone 40. Giving each video object unit, cell, and video object the same number of frames (the same playback duration) as its counterpart improves accessibility for operations such as search.

Thus, according to the three-dimensional sound field information recording apparatus of this embodiment, the binaural three-dimensional sound field audio information data is described in a format that complies with the DVD video standard and that links the DVD-video zone 40 with the DVD-others zone 41. Binaural three-dimensional sound field audio and normal audio can therefore be recorded in a manner compatible with the DVD video standard, and three-dimensional sound field information can be recorded and reproduced while remaining compatible with normal audio information.

The three-dimensional sound field information recording program that uses the recording apparatus of this embodiment follows the same flowchart as shown in FIG. 5.

The three-dimensional sound field can be reproduced from the recording medium 16, on which three-dimensional sound field information has been recorded by the recording apparatus of this embodiment, using the three-dimensional sound field information reproducing apparatus of the first embodiment shown in FIG. 6. The program that reproduces the three-dimensional sound field information from that recording medium 16 follows the same flowchart as shown in FIG. 7.

(Example 3)
A third embodiment of the three-dimensional sound field information recording apparatus of the present invention is described with reference to FIG. 9. FIG. 9 is an explanatory diagram showing the format applied to a DVD in this embodiment.

The three-dimensional sound field information recording apparatus of this embodiment has means that uses the DVD audio standard to multiplex and record information on the three-dimensional sound field, for each predetermined unit of each frame of normal audio, as stream data separate from the normal audio of the audio object.

The DVD audio format consists of an audio manager (AMG) 88 followed by the areas of a plurality of audio title sets (ATS) 89.

Each audio title set 89 consists of ATS information (ATSI) 90 at its head, followed by one or more audio object sets (AOBS) 91.

The audio manager 88 and the ATS information 90 are management information areas that hold information essential for reproduction. Data in these areas is recorded by a DVD formatter or DVD formatting step and reproduced by a DVD format decoder or DVD format decoding step.

Each audio object set 91 consists of a plurality of audio objects (AOB) 92. Each audio object 92 consists of a plurality of cells (CELL) 93, and each cell 93 in turn consists of a plurality of audio object units (AOBU) 94.

Each audio object unit 94 consists of a plurality of packs, and one pack is 2048 bytes. An audio object unit 94 contains however many packs are needed for a playback time of 0.4 to 1.0 seconds.

Adjacent A packs are arranged so that their audio signals are related to each other. For stereo, for example, an L-channel pack and an R-channel pack are placed next to each other, and multichannel audio is arranged in the same way. These packs are MPEG-multiplexed. The binaural audio data described above is packed as differential packs (D_PACK) 95 and MPEG-multiplexed into this stream.

Thus, according to the three-dimensional sound field information recording apparatus of this embodiment, the binaural audio data is packed as the differential pack 95 described above and MPEG-multiplexed in a format compliant with the DVD video standard. A player that uses the differential pack 95 can reproduce binaural three-dimensional sound field audio, and a player that does not use it can output the normal audio that is standard under the DVD video standard. Three-dimensional sound field information can therefore be recorded and reproduced while remaining compatible with normal audio information.

(Example 4)
A fourth embodiment of the three-dimensional sound field information recording apparatus and program of the present invention is described with reference to FIGS. 10 to 13. FIG. 10 is a block diagram showing the configuration of the three-dimensional sound field information recording apparatus of the fourth embodiment, FIG. 11 is a table explaining the video layer of an MPEG video stream, FIG. 12 is a table explaining the system layer of an MPEG multiplexed transport stream, and FIG. 13 is a flowchart showing a three-dimensional sound field information recording program that uses the recording apparatus shown in FIG. 10.

In FIG. 10, components identical to those of the three-dimensional sound field information recording apparatus shown in FIG. 1 are given the same reference numerals.

As shown in FIG. 10, the three-dimensional sound field information recording apparatus of this embodiment includes: a normal audio microphone 1 that records normal audio information; a binaural audio microphone 2 that performs binaural recording; a normal audio compressor 3 that compresses the normal audio data recorded by the normal audio microphone 1; a buffer 4 that temporarily holds the normal audio compressed data produced by the normal audio compressor 3; a normal audio decoder 5 that decodes the normal audio compressed data; an arithmetic unit 6 that subtracts the data decoded by the normal audio decoder 5 from the binaural audio data to create differential binaural audio data; a differential binaural audio compressor 7 that compresses the differential binaural audio data; a buffer 8 that temporarily holds the differential binaural audio compressed data produced by the differential binaural audio compressor 7; a camera 9 that outputs a video signal; a video compressor 10 that compresses the video signal; a buffer 11 that temporarily holds the video compressed data produced by the video compressor 10; an information multiplexer 12 that multiplexes the buffered video compressed data, normal audio compressed data, and differential binaural audio compressed data; a time stamp generator 13 that inputs time stamps to the information multiplexer 12; a recorder 15 that records the stream multiplexed by the information multiplexer 12 on the recording medium 16; and a control unit 17 that controls each unit of the apparatus.

Blocks that also appear in the three-dimensional sound field information recording apparatus shown in FIG. 1 operate in the same way. Unlike the apparatus of FIG. 1, however, the stream multiplexed by the information multiplexer 12 is sent directly to the recorder 15, without passing through a DVD formatter, and is recorded on the recording medium 16, as shown in FIG. 10.

Next, with reference to FIG. 11, a description is given of means that uses the DVD video standard to record information on the three-dimensional sound field, for each predetermined unit of each frame of normal audio, in the user data area of the audio object.

The MPEG standards provide mechanisms for carrying such data compatibly, in user data areas or in private streams. For example, the MPEG video standard defines user data areas in the picture layer, the GOP layer, and other layers.

The information is recorded in user_data or private_data_byte, which MPEG syntax defines as areas where data unrelated to the video and audio can be embedded, or in data packets such as a private_stream, which the user can define freely.

For example, the picture layer of MPEG1 video is structured as shown in FIG. 11: before the slice layer, user_data_start_code is sent, after which user_data can be recorded in units of 8 bits.

Likewise, in the system layer of a multiplexed transport stream such as MPEG2, setting transport_private_data_flag to 1 as shown in FIG. 12 signals that private_data is present. Private_data of the length set in transport_private_data_length can then be sent, provided that the data does not extend beyond the transport packet.
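As an illustration of the transport stream case, the following minimal sketch builds an adaptation field that carries only private data. It assumes the standard MPEG-2 Systems layout (a 188-byte packet with a 4-byte header, and transport_private_data_flag as bit 0x02 of the adaptation field flags byte) and assumes the adaptation field may fill the whole packet with no payload. The function name is hypothetical.

```python
def build_adaptation_field(private_data: bytes) -> bytes:
    """Build an MPEG-2 TS adaptation field that carries only private data."""
    # 188-byte packet minus 4-byte header leaves 184 bytes, of which 3 are
    # used by adaptation_field_length, the flags byte and
    # transport_private_data_length.
    max_private = 184 - 3
    if len(private_data) > max_private:
        raise ValueError("private data would overflow the transport packet")
    flags = 0x02  # transport_private_data_flag = 1, all other flags 0
    body = bytes([flags, len(private_data)]) + private_data
    # adaptation_field_length counts the bytes that follow it.
    return bytes([len(body)]) + body


field = build_adaptation_field(b"\x01\x02\x03")
```

The length check corresponds to the restriction in the text that the private data must not extend beyond the transport packet.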

The MPEG system layer defines several other mechanisms for recording user-specific data, such as setting stream_id to private_stream to declare a dedicated packet. The binaural three-dimensional sound field audio information of this embodiment can be recorded in any of these areas.
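A dedicated private packet of this kind might look like the following minimal sketch. It assumes the MPEG-2 Systems stream_id value 0xBF (private_stream_2), chosen here only because such packets carry their payload directly after the 16-bit length field; the embodiment does not say which private stream it uses, and the function name is hypothetical.

```python
import struct

PRIVATE_STREAM_2 = 0xBF  # stream_id of private_stream_2 in MPEG-2 Systems


def build_private_pes(payload: bytes) -> bytes:
    """Wrap a payload in a private_stream_2 PES packet (sketch only)."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit the 16-bit PES_packet_length")
    header = b"\x00\x00\x01" + bytes([PRIVATE_STREAM_2])
    return header + struct.pack(">H", len(payload)) + payload


packet = build_private_pes(b"abc")
```

A decoder that does not recognize the private stream can skip the packet by its length field, which is what keeps the normal audio playable on existing players.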

The example using user_data of MPEG1 video is now described in more detail. MPEG defines user_data_start_code as 0x000001B2, placed before the slice layer. After this code is sent, a predetermined, uniquely identifiable code, for example 0x0f0f0f0f2428fdaa, is sent to indicate that the user data area contains the function value used for authentication of the present invention. This code is recorded so that the data can be distinguished when other applications also use user_data; its value has no particular meaning. After this code, the audio frame layer structure shown in FIG. 8 is recorded: for each MPEG picture, the audio frame layer corresponding to that picture's display period.
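The byte layout described above can be sketched as follows. The start code and the example identifier are the values given in the text; the function names, the payload bytes, and the guard against zero bytes are assumptions of this sketch.

```python
USER_DATA_START_CODE = bytes.fromhex("000001B2")
IDENTIFIER = bytes.fromhex("0f0f0f0f2428fdaa")  # example value from the text


def build_user_data(audio_frame_layer: bytes) -> bytes:
    """Return start code + identifier + payload for one picture."""
    # Conservative guard (an assumption of this sketch): two consecutive zero
    # bytes could emulate the prefix of an MPEG start code, so reject them.
    if b"\x00\x00" in audio_frame_layer:
        raise ValueError("payload must be escaped to avoid start-code emulation")
    return USER_DATA_START_CODE + IDENTIFIER + audio_frame_layer


def extract_user_data(stream: bytes):
    """Return the payload after the identifier, or None if it is absent."""
    marker = USER_DATA_START_CODE + IDENTIFIER
    start = stream.find(marker)
    if start < 0:
        return None  # no matching user data found
    begin = start + len(marker)
    end = stream.find(b"\x00\x00\x01", begin)  # next start code ends the data
    return stream[begin:] if end < 0 else stream[begin:end]


payload = b"\x11\x22\x33"  # placeholder for one audio frame layer
stream = b"\x00\x00\x01\x00" + build_user_data(payload) + b"\x00\x00\x01\x01"
found = extract_user_data(stream)
```

A player that does not know the identifier ignores the user data and decodes the picture as usual, while a player that finds it can pass `found` on to the three-dimensional sound field decoder.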

When the picture display period and the audio frame playback period differ in duration, the data may be multiplexed in a form that tolerates an average error of about one packet, and the difference between the presentation time of the start of the video and that of the start of the audio may be recorded at the head of user_data as a count of a 90 kHz or 27 MHz clock, in about 32 bits. Alternatively, the data may simply be played back in order of arrival using the playback-side clock.
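The clock-count field can be illustrated with a short sketch. It assumes a signed, big-endian 32-bit count so that the audio may start either before or after the video; the text only says "about 32 bits", so the signedness, byte order, and function names are assumptions.

```python
import struct


def encode_av_offset(video_start_s, audio_start_s, clock_hz=90_000):
    """Encode (audio start - video start) as a signed 32-bit clock count."""
    ticks = round((audio_start_s - video_start_s) * clock_hz)
    return struct.pack(">i", ticks)


def decode_av_offset(field, clock_hz=90_000):
    """Return the offset in seconds from the 4-byte field."""
    (ticks,) = struct.unpack(">i", field)
    return ticks / clock_hz


field = encode_av_offset(0.0, 0.01)   # audio starts 10 ms after the video
offset = decode_av_offset(field)
```

With the 27 MHz clock, a signed 32-bit count covers offsets of roughly plus or minus 79 seconds, which is far more than the one-packet error the text has in mind.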

Although the binaural three-dimensional sound field audio information of this embodiment has been described as being recorded for each picture, it may instead be recorded about every 0.5 seconds or about every 1 second. This can be achieved by using the user data of the MPEG GOP layer.

Next, the three-dimensional sound field information recording program of this embodiment is described with reference to FIG. 13. FIG. 13 is a flowchart of the three-dimensional sound field information recording program that uses the three-dimensional sound field information recording apparatus shown in FIG. 10.

In the flowchart of FIG. 13, the processing in steps S300 to S360 is the same as that in steps S100 to S160 of the flowchart shown in FIG. 5.

Then, in step S370, the control unit 17 causes the recorder 15 to record the stream multiplexed by the information multiplexer 12 on the recording medium 16.

In step S380, the control unit 17 determines whether input video and audio data from the normal audio microphone 1, the binaural audio microphone 2, and the camera 9 still remains. If it does (YES), processing returns to step S300; if it does not (NO), the program ends.

As described above, the three-dimensional sound field information recording apparatus and program of this embodiment record the information relating to three-dimensional sound field information in the user data area of the audio object. The resulting format outputs normal audio when the user data is not used and three-dimensional sound field audio when it is used, so three-dimensional sound field information can be recorded and reproduced while remaining compatible with normal audio information.

Next, a three-dimensional sound field reproducing apparatus that reproduces a three-dimensional sound field from the recording medium 16, on which three-dimensional sound field information has been recorded by the recording apparatus of this embodiment, is described with reference to FIG. 14. FIG. 14 is a block diagram showing the configuration of a three-dimensional sound field information reproducing apparatus that reproduces the sound field information recorded on a recording medium by the recording apparatus shown in FIG. 10.

First, the reproducer 61 reads the multiplexed data from the recording medium 16 and outputs it to the information demultiplexer 63.

Thereafter, the same processing as in the three-dimensional sound field information reproducing apparatus shown in FIG. 6 and described in Embodiment 1 is performed, and the result is output and reproduced through the speaker 75.

Next, the three-dimensional sound field information reproducing program, with which the reproducing apparatus shown in FIG. 14 reproduces a three-dimensional sound field, is described with reference to the flowchart shown in FIG. 15.

First, in step S400, the control unit 17 causes the reproducer 61 to read the multiplexed data from the recording medium 16 in predetermined units.

Then, in steps S410 to S480, the same processing as in steps S220 to S290 of the flowchart shown in FIG. 7 is performed.

In the description of the three-dimensional sound field information reproducing apparatuses of Embodiments 1 to 4, binaural audio was used as the method of reproducing the three-dimensional sound field. However, it is also conceivable to provide multiple channels (CH) and use an array speaker to create localized sound fields.

That is, the method raises the sound pressure locally near a focal point in space by driving each speaker with a reproduction signal delayed according to the difference between the path from the center of the speaker array to the focal point and the path from that speaker to the focal point.

The principle of the array speaker is explained with reference to FIG. 16. First, the speakers 96 are arranged in an array as shown in FIG. 16, and each is given its own delay circuit. When the delays are set so that the sound focuses near the listening position, the sound pressure component generated at the focal point can be made far higher at the listening position than the direct sound from the speakers 96. By applying this principle continuously in real time, the localization of the sound image can be controlled in step with the position of an object in a stereoscopic moving image.
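The delay rule can be made concrete with a small sketch. It assumes point-source speakers, free-field propagation, and a speed of sound of 343 m/s, and it takes the array center to be the mean of the speaker positions. The function name and the shift that makes every delay non-negative are additions of this sketch, not part of the description.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def focus_delays(speakers, focus):
    """Per-speaker delays (seconds) so that all wavefronts meet at the focus."""
    center = tuple(sum(axis) / len(speakers) for axis in zip(*speakers))
    reference = math.dist(center, focus)
    # Delay from the path difference: a speaker farther from the focus than
    # the array center must fire earlier, so its raw delay is negative.
    raw = [(reference - math.dist(p, focus)) / SPEED_OF_SOUND for p in speakers]
    earliest = min(raw)
    return [d - earliest for d in raw]  # shift so the earliest speaker has delay 0


speakers = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]  # positions in metres
focus = (0.0, 2.0)
delays = focus_delays(speakers, focus)
arrivals = [d + math.dist(p, focus) / SPEED_OF_SOUND
            for d, p in zip(delays, speakers)]
```

For this symmetric layout the two outer speakers fire first and the middle speaker is delayed, so every entry of `arrivals` is the same instant. Moving `focus` over time and recomputing the delays corresponds to the real-time control mentioned in the text.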

For a listening position such as that shown in FIG. 17, several localized sound fields can be generated to form a three-dimensional sound field space. In this case, the multiple channels are placed in differential packs (D_PACK) as shown in FIG. 4 and FIG. 9 and multiplexed and recorded as multiple streams. Alternatively, as shown in FIG. 8, an audio frame is created for each channel in the DVD-others zone and the channels are multiplexed and recorded in a predetermined order. Either way, three-dimensional sound field data can be recorded while remaining compatible with existing DVDs.

The three-dimensional sound field information recording programs of Embodiments 1 and 4 described above may be executed by a computer. The program may be read from a recording medium and loaded into the computer, or transmitted over a communication network and loaded into the computer.

In Embodiments 1 to 4 described above, the final information is recorded on a recording medium. However, it may instead be packetized in a form specific to communication or broadcasting and sent to or received from a broadcast or communication network via a packetizer. Data can be sent over any transmission medium, such as communication or broadcasting, without being recorded on a recording medium; in that case the recording apparatus can serve as a transmitting apparatus, and the reproducing apparatus as a receiving apparatus.

Besides biphonically recorded audio data, the three-dimensional sound field data may be data that produces a special surround effect. It is also possible to create, virtually and by simulation, a sound field close to a biphonic recording from raw sound source data ("rare audio"), such as data recorded in an anechoic chamber, that is needed to create a three-dimensional sound field, using head-related transfer functions and sound field environment data for a hall or similar space.

In the present invention, the information relating to three-dimensional sound field information has been described as biphonically recorded audio data, but it may instead be the data producing a special surround effect mentioned above, or raw sound source data ("rare audio") such as that recorded in an anechoic chamber, which is needed to create a three-dimensional sound field. Because no special effects have been applied to rare audio, it has the advantage that a three-dimensional sound field is easy to create from it.

The compression method for three-dimensional sound field data has been described using MPEG and the like, but other methods, such as DPCM or quantization after an orthogonal transform such as the DCT, may be used. The audio object may also be uncompressed linear PCM or losslessly compressed audio, for example Packed PCM (a lossless compression method) as adopted in DVD audio.

Linear PCM multichannel audio can also be used as the normal audio. In that case, the difference is taken between the two L and R channels of the multichannel audio and the L and R channels of the three-dimensional sound field data, as in the present invention. The remaining channels are recorded as they are.

When there is processing capacity to spare, the channel with the strongest correlation may be selected at each predetermined interval, and the difference from the three-dimensional sound field data taken adaptively against that channel. In that case, the format should carry a few bits, in a header or user area, indicating which channel the difference was taken from.
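One way to picture the adaptive selection is the sketch below, which handles a single block of samples. It assumes that "strongest correlation" means the largest normalized cross-correlation at zero lag, which the text does not specify. The function names and sample values are hypothetical.

```python
import math


def correlation(a, b):
    """Normalized cross-correlation at zero lag (0.0 if either block is silent)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


def pick_reference(channels, target):
    """Choose the channel most correlated with the 3D sound field block.

    Returns the channel index (to be written to the header or user area)
    and the difference signal taken against that channel.
    """
    scores = [correlation(ch, target) for ch in channels]
    best = max(range(len(channels)), key=scores.__getitem__)
    diff = [t - c for t, c in zip(target, channels[best])]
    return best, diff


channels = [[1, -1, 1, -1], [1, 2, 3, 4], [0, 0, 1, 0]]
target = [1, 2, 3, 5]  # one block of three-dimensional sound field data
best, diff = pick_reference(channels, target)
index_bits = max(1, (len(channels) - 1).bit_length())  # size of the index field
```

Here the second channel is chosen and the difference signal is nearly zero, which is the case where differential coding saves the most. For six channels `index_bits` would be 3, consistent with the "few bits" mentioned in the text.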

Rare audio is not limited to data recorded in an anechoic chamber; a multi-microphone hall recording or a master source mixed down to 6 channels may also be used.

A recording medium on which the signal data of the present invention is recorded has the medium-specific effect that it holds a format which, when three-dimensional sound field information is recorded and reproduced, allows that information to be reproduced while remaining compatible with normal audio information.

The term recording medium is not limited to a medium in the narrow sense of one on which data can be recorded; it also includes electromagnetic waves, light, and other carriers used to transmit signal data. Likewise, the information recorded on a recording medium includes the data itself, such as an electronic file, in a state where it is not recorded.

Although the present invention has been described mainly in terms of audio, it is equally effective when audio data exists together with video and the two are multiplexed by MPEG multiplexing, and also when other data such as sub-pictures or control information is present in addition to audio and video.

FIG. 1 is a block diagram showing the configuration of the three-dimensional sound field information recording apparatus of Embodiment 1 of the present invention.
FIG. 2 is a block diagram showing the configuration of an AAC encoding apparatus.
FIG. 3 is a diagram showing an example of grouping.
FIG. 4 is an explanatory diagram showing the format applied to DVD in Embodiment 1 of the present invention.
FIG. 5 is a flowchart showing the three-dimensional sound field information recording program of Embodiment 1 of the present invention.
FIG. 6 is a block diagram showing the configuration of a three-dimensional sound field information reproducing apparatus that reproduces the sound field information recorded on a recording medium by the recording apparatus shown in FIG. 1.
FIG. 7 is a flowchart showing the three-dimensional sound field information reproducing program with which the reproducing apparatus shown in FIG. 6 reproduces a three-dimensional sound field.
FIG. 8 is an explanatory diagram showing the format applied to DVD in Embodiment 2 of the present invention.
FIG. 9 is an explanatory diagram showing the format applied to DVD in Embodiment 3 of the present invention.
FIG. 10 is a block diagram showing the configuration of the three-dimensional sound field information recording apparatus of Embodiment 4 of the present invention.
FIG. 11 is an explanatory table of the video layer of an MPEG video stream (parts 1 to 7).
FIG. 12 is an explanatory table of the system layer of an MPEG multiplexed transport stream.
FIG. 13 is a flowchart showing the three-dimensional sound field information recording program of Embodiment 4 of the present invention.
FIG. 14 is a block diagram showing the configuration of a three-dimensional sound field information reproducing apparatus that reproduces the sound field information recorded on a recording medium by the recording apparatus shown in FIG. 10.
FIG. 15 is a flowchart showing the three-dimensional sound field information reproducing program with which the reproducing apparatus shown in FIG. 14 reproduces a three-dimensional sound field.
FIG. 16 is an explanatory diagram of an array speaker.
FIG. 17 is a system diagram of an array speaker.

Explanation of symbols

1 Normal audio microphone
2 Binaural audio microphone
3 Normal audio compressor
4, 8, 11 Buffer
5 Normal audio decoder
6 Calculator
7 Differential binaural audio compressor
9 Camera
10 Video compressor
12 Information multiplexer
13 Time stamp generator
14 DVD formatter
15 Recorder
16 Recording medium
17, 76 Control unit
31 1 Audio frame layer
32 Pack header
33 Packet header
34 Substream ID
35 Audio frame information
36 Audio data information
37 Differential audio data
38 Differential Pack (D_PACK)
39 Volume and File structure
40 DVD-video zone
41 DVD-others zone
42 Video Manager (VMG)
43 Video Title Set (VTS)
44 Control Data
45 Video Object Set (VOBS)
46 Video Object (VOB)
47 Cell (CELL)
48 Video Object Unit (VOBU)
49 Navigation Pack (NV_PACK)
50 Audio Pack (A_PACK)
51 Video Pack (V_PACK)
81 Video Manager (DVMG)
82 Video Title Set (VTS)
83 DVTSI
84 Video Object Set (DVOBS)
85 Video Object (DVOB)
86 Cell (DCELL)
87 Video Object Unit (DVOBU)
88 Audio Manager (AMG)
89 Audio Title Set (ATS)
90 ATS Information (ATSI)
91 Audio Object Set (AOBS)
92 Audio Object (AOB)
93 Cell (CELL)
94 Audio Object Unit (AOBU)
95 Differential Pack (D_PACK)

Claims

1. A three-dimensional sound field information recording apparatus comprising:
means for recording an audio object of two-dimensional sound field audio;
means for recording, in a management information area, management information describing information used for special reproduction of the audio object; and
means for recording information relating to three-dimensional sound field information in a user data area of the audio object for each predetermined unit of each frame of the two-dimensional sound field audio.

2. A three-dimensional sound field information recording apparatus comprising:
means for recording an audio object of two-dimensional sound field audio;
means for recording, in a management information area, management information describing information used for special reproduction of the audio object; and
means for multiplexing and recording information relating to three-dimensional sound field information, as stream data different from the two-dimensional sound field audio of the audio object, for each predetermined unit of each frame of the two-dimensional sound field audio.

3. A three-dimensional sound field information recording apparatus comprising:
means for recording an audio object of two-dimensional sound field audio;
means for recording, in a management information area, management information describing information used for special reproduction of the audio object; and
means for recording information relating to three-dimensional sound field information, for each predetermined unit of each frame of the two-dimensional sound field audio, in a management information area for a three-dimensional sound field that is separate from the management information area.

4. The three-dimensional sound field information recording apparatus according to claim 1, wherein the information relating to three-dimensional sound field information is recorded after difference information between the two-dimensional sound field audio information and binaural audio information is encoded using at least one of differential encoding, predictive encoding, and encoding using an orthogonal transform.

5. A three-dimensional sound field information recording program for causing a computer to execute:
a step of recording an audio object of two-dimensional sound field audio on a recording medium;
a step of recording, in a management information area of the recording medium, management information describing information used for special reproduction of the audio object; and
a step of recording information relating to three-dimensional sound field information in a user data area of the audio object on the recording medium for each predetermined unit of each frame of the two-dimensional sound field audio.

6. A three-dimensional sound field information recording program for causing a computer to execute:
a step of recording an audio object of two-dimensional sound field audio on a recording medium;
a step of recording, in a management information area of the recording medium, management information describing information used for special reproduction of the audio object; and
a step of multiplexing information relating to three-dimensional sound field information, as stream data different from the two-dimensional sound field audio of the audio object, and recording it on the recording medium for each predetermined unit of each frame of the two-dimensional sound field audio.

7. A three-dimensional sound field information recording program for causing a computer to execute:
a step of recording an audio object of two-dimensional sound field audio on a recording medium;
a step of recording, in a management information area of the recording medium, management information describing information used for special reproduction of the audio object; and
a step of recording information relating to three-dimensional sound field information, for each predetermined unit of each frame of the two-dimensional sound field audio, in a management information area for a three-dimensional sound field on the recording medium that is separate from the management information area.

8. The three-dimensional sound field information recording program according to claim 5, wherein the information relating to three-dimensional sound field information is recorded after difference information between the two-dimensional sound field audio information and binaural audio information is encoded using at least one of differential encoding, predictive encoding, and encoding using an orthogonal transform.