JP2005176094A

JP2005176094A - Data processor, data processing method, program and storage medium

Info

Publication number: JP2005176094A
Application number: JP2003415418A
Authority: JP
Inventors: Toshiyuki Nakagawa; 利之中川
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2003-12-12
Filing date: 2003-12-12
Publication date: 2005-06-30

Abstract

PROBLEM TO BE SOLVED: To appropriately divide scene description data and object description data according to the type and capability of the receiving terminal and the condition of the communication channel when distributing encoded multimedia data composed of a plurality of objects such as moving images, still images, text, and CG.

SOLUTION: A device information / network analysis circuit 104 analyzes the type and processing capability of the receiving terminal and the condition of the communication channel, and a scene / object conversion circuit 101 judges, according to the analysis results, whether to divide the scene description data 102 and the object description data 103 before transmission. When division is required, the scene description data 102 and the object description data 103 are divided appropriately.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

The present invention relates to a data processing apparatus, a data processing method, a program, and a storage medium for distributing multimedia data to receiving terminals connected via a network.

MPEG-1 and MPEG-2 are known international coding standards for compressing and encoding moving images and audio, multiplexing them, transmitting or storing them, and then demultiplexing and decoding them.

ISO/IEC 14496 Part 1 (MPEG-4 Systems), on the other hand, standardizes a method for multiplexing and synchronizing encoded bitstreams of multimedia data that contain multiple objects such as still images, moving images, audio, text, and CG.

Unlike conventional multimedia streams, an MPEG-4 data stream of this kind contains, in addition to still image, moving image, and audio data, BIFS (Binary Format for Scenes) as the information for arranging objects such as text and CG in space and time. BIFS is an extension of VRML (Virtual Reality Modeling Language) that can handle natural video and audio. It is the information that describes an MPEG-4 scene in binary form (scene description information).

The individual objects that make up such a multimedia stream, such as still images, moving images, and audio, are each encoded with the coding best suited to them and then transmitted. The decoding side therefore decodes them individually, arranges them spatially according to the scene description information, synchronizes the time axis of each piece of data with the time axis inside the player, and composes and plays back the scene.

Besides the VRML and BIFS mentioned above, scene structure can be described with HTML (HyperText Markup Language), or with formats written in XML (eXtensible Markup Language) such as SMIL (Synchronized Multimedia Integration Language) and XMT (Extensible MPEG-4 Textual Format).

When transmitting a bitstream of such multimedia data, the sender is required to create and transmit data with an amount of information suited to the capability and type of the receiving terminal and to the condition of the communication line. That is, when the receiving terminal is a portable information terminal with low processing capability, such as a mobile phone or a PDA (Personal Data Assistant), or when the communication line is congested, the sender must in advance compress the data with a high-compression encoding format, reduce the image size, or encode at a lower transmission rate or frame rate.

Several methods have been proposed that optimize the amount of information before encoding and transmission according to the capability of the receiving terminal and the condition of the communication line, for example by controlling the rate of moving images and audio, switching temporal/spatial scalability, converting the image size, or controlling error resilience. For example, Patent Document 1 proposes a method that also optimizes and transmits scene description data, not only moving images.

Patent Document 1: JP 2002-112149 A

However, the method proposed in Patent Document 1 has the following problems: (1) it does not say which objects are given priority when the scene description data is divided and transmitted; (2) it does not divide the object description data; and (3) the content creator cannot specify the display priority of objects.

The present invention has been made in view of the above problems. Its object is to enable the display intended by the content creator, according to the capability of the receiving side, when distributing encoded multimedia data composed of a plurality of media objects such as moving images, audio, still images, text, and CG.

A data processing apparatus according to one aspect of the present invention has the following configuration. That is,
a data processing apparatus that distributes, to an external device, multimedia data including a plurality of media objects and description data, the description data having scene description data that describes the spatio-temporal relationship of the plurality of media objects and object description data that describes the association between the scene description and the media objects and the information necessary for decoding, comprises:
determination means for determining whether the description data needs to be divided, based on an external capability related to distribution of the multimedia data and on the scene description data and the object description data;
division means for dividing, when the determination means determines that division is necessary, each of the object description data and the scene description data in units of the portion related to a media object; and
distribution means for encoding the description data divided by the division means and the plurality of media objects, multiplexing them to generate multimedia data, and distributing the multimedia data to the receiving terminal.

A data processing apparatus according to another aspect of the present invention is
a data processing apparatus that distributes, to an external device, multimedia data including a plurality of media objects and description data describing information related to reproduction of each media object, and comprises:
determination means for determining whether the description data needs to be divided, based on an external capability related to distribution of the multimedia data and on the description data;
division means for dividing the description data, when the determination means determines that division is necessary, by separating the description data portion related to a low-priority media object, based on the priority set for the media object to which the description data relates; and
distribution means for encoding the description data divided by the division means and the plurality of media objects, multiplexing them to generate multimedia data, and distributing the multimedia data to the receiving terminal.

According to the present invention, when encoded multimedia data composed of a plurality of media objects is distributed, the description data that describes the information related to its reproduction can be divided according to the priority of the media objects and transmitted, in response to the external capability related to the distribution (the capability of the destination terminal and/or the transmission capacity of the network, and so on). The load and time that the receiving device spends on reproducing the plurality of media objects can therefore be distributed appropriately. In addition, because reproduction follows the priority of the media objects, display in accordance with the intention of the content creator is possible.
Further, according to the present invention, both the scene description data and the object description data are divided according to the external capability, so the load of reproducing the media objects can be distributed more appropriately.

Hereinafter, a preferred embodiment in which the data processing apparatus of the present invention is applied to an apparatus that distributes multimedia data will be described in detail with reference to the accompanying drawings.

FIG. 1 shows the basic configuration of a multimedia data distribution apparatus 100 (hereinafter simply the distribution apparatus) and a multimedia data reception apparatus 110 (hereinafter simply the reception apparatus) in the present embodiment, together with the flow of data between the circuits. The distribution apparatus 100 shown in FIG. 1 comprises a scene / object conversion circuit 101, a device information / network analysis circuit 104, a scene description data encoding circuit 105, an object description data encoding circuit 106, a media bitstream storage device 107, and a multiplexing circuit 108.

The reception apparatus 110 comprises a demultiplexing circuit 111, a scene description data decoding circuit 112, an object description data decoding circuit 113, a media decoding circuit 114, a scene composition circuit 115, and an output device 116.

In the distribution apparatus 100, the scene description data 102 describes the screen layout and temporal structure presented to the viewer; the MPEG-4 Systems part adopts the BIFS described above as its scene description language. The object description data 103 is the information necessary for decoding, such as the association between the scene description data 102 and each media object that makes up the scene, the encoding method, and the packet structure. The scene description data 102 and the object description data 103 are assumed either to have been created by a scene / object editing circuit (not shown) or to have been read from scene description data and object description data stored in a predetermined storage device.

The scene description data 102 and the object description data 103 are encoded by the scene description data encoding circuit 105 and the object description data encoding circuit 106, respectively, and input to the multiplexing circuit 108. The multiplexing circuit 108 multiplexes the input scene description data, object description data, and encoded bitstreams, and delivers the result as a bitstream to the reception apparatus 110 over the transmission path 109. Encoded bitstreams are prepared in advance in the media bitstream storage device 107, and the multiplexing circuit 108 selects the encoded bitstreams of the required media objects from the media bitstream storage device 107 and multiplexes them.

The encoded bitstreams are, for example, still images encoded with high efficiency (compressed) by the well-known JPEG method, moving image data encoded with high efficiency by the well-known MPEG-2, MPEG-4, or H-263 methods, and audio data encoded with high efficiency by, for example, the well-known CELP (Code Excited Linear Prediction) coding or transform-domain weighted interleave vector quantization (TWINVQ) coding.

In FIG. 1, reference numeral 109 denotes a transmission path, typified by various kinds of networks. In the present embodiment it is the network over which the processed and encoded multimedia data is distributed.

The process by which the distribution apparatus 100 of the present embodiment distributes multimedia data to the reception apparatus 110 will be described with reference to the flowchart of FIG. 2.

The distribution apparatus 100 of the present embodiment receives a multimedia data distribution request sent from the reception apparatus 110 via the network, together with information on the type and processing capability of the reception apparatus (hereinafter, device information). The device information / network analysis circuit 104 obtains the type and processing capability of the reception apparatus from the received device information and analyzes the condition of the network line (step S201). As a method of analyzing the line condition, for example, the transmission capacity of the line is estimated by sending predetermined data to the distribution apparatus 100 and measuring the speed of the response.

A framework such as CC/PP (Composite Capability/Preference Profiles) can be used as the means of sending and receiving the device information. CC/PP is being standardized by the W3C (World Wide Web Consortium) and makes it possible to describe hardware characteristics of a terminal, such as the device name (manufacturer name, model name), screen size, and memory size, in an XML-based meta-information description language. Alternatively, the distribution apparatus may hold a table that associates model names with capabilities and obtain the capability from the model name received as device information.
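The model-name table mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the model names, the capability figures, the dictionary layout, and the fallback rule for unknown models are all hypothetical and do not come from the patent or from CC/PP.

```python
# Hypothetical capability table held by the distribution apparatus.
# Model names and figures are invented for illustration only.
CAPABILITY_TABLE = {
    "example-phone": {"type": "mobile phone", "cpu_mhz": 100, "memory_kb": 1024},
    "example-pda": {"type": "PDA", "cpu_mhz": 200, "memory_kb": 16384},
    "example-pc": {"type": "PC", "cpu_mhz": 2000, "memory_kb": 524288},
}

# Assumed fallback: an unknown model is treated like the least capable entry.
FALLBACK = min(CAPABILITY_TABLE.values(), key=lambda c: c["cpu_mhz"])


def lookup_capability(device_info: dict) -> dict:
    """Return the capability record for the model named in the device information."""
    return CAPABILITY_TABLE.get(device_info.get("model"), FALLBACK)


info = {"maker": "ExampleCorp", "model": "example-pda"}
print(lookup_capability(info)["type"])
```

Falling back to the least capable entry is a conservative choice for this sketch; the text itself does not say how an unknown model should be handled.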

Next, the scene / object conversion circuit 101 reads the scene description data 102 and the object description data 103 that correspond to the multimedia data requested by the reception apparatus 110 (step S202). Then, from the analysis result of step S201 and the scene description data 102 and object description data 103 read in step S202, it judges whether or not to divide the scene description data and the object description data (step S203).

One way to make the division decision in step S203 is to use the encoded size of the scene description data and the object description data. The size of these data reflects the complexity of the description, and the complexity of the description reflects the load of reproducing the media objects. When the scene description data 102 and the object description data 103 are transmitted, they are divided if the transmission capacity of the transmission path 109 is low relative to the encoded size of the description data, or if the processing capability of the reception apparatus 110 is low. More specifically, for example, the following conditions are checked:
(1) "encoded size of the description data (bytes)" ≦ "number of bytes of data the transmission band allows",
(2) "encoded size of the description data (bytes)" ≦ "number of bytes of data the processing capability of the reception apparatus allows".
Description data that satisfies both conditions (1) and (2) is output without division; if either one of them is not satisfied, the data is divided. The CPU clock, memory capacity, and the like can be used as the processing capability of the reception apparatus. As another method, the decision may be made according to the type of reception apparatus (PDA, mobile phone, PC, and so on).

If it is judged in step S203 that the scene description data and the object description data need not be divided, they are transmitted as they are, without division (step S206). If it is judged that division is necessary, steps S204 and S205 divide the scene description data and the object description data until it is judged that no further division is needed. The division of scene description data and object description data is explained below with a concrete example.
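The loop formed by steps S203 to S205 can be sketched as below. This is a simplified model, not the patented procedure itself: it assumes that the encoded size of a piece of description data is the sum of hypothetical per-object sizes, that a single byte budget stands in for conditions (1) and (2), and that a piece holding a single object is never divided further.

```python
def divide_until_fits(objects, budget):
    """Peel off the lowest-priority object until the main piece fits the budget.

    objects: list of (name, priority, encoded_size) tuples; a higher priority
             value means a more important object.
    Returns a list of pieces, highest priority first.
    """
    main = sorted(objects, key=lambda o: o[1], reverse=True)
    separated = []
    # S203: does the main piece still exceed the budget?
    while len(main) > 1 and sum(o[2] for o in main) > budget:
        # S204/S205: separate the description of the lowest-priority object.
        separated.insert(0, [main.pop()])
    return [main] + separated


# Hypothetical sizes for three objects with priorities 3, 2, 1.
pieces = divide_until_fits(
    [("video1", 3, 900), ("video2", 2, 700), ("image1", 1, 500)], budget=1800
)
print([[name for name, _, _ in piece] for piece in pieces])
```

With these figures the total of 2100 bytes exceeds the budget, so the lowest-priority object is separated once and the remaining 1600 bytes pass the check.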

For example, FIG. 4 and FIG. 5 show, respectively, the scene description data and the object description data that realize multimedia data in which a moving image 301, a moving image 302, and a still image 303 are arranged on a screen 300 as shown in FIG. 3. To keep the explanation easy to follow, the scene description data and the object description data, which are normally written in binary, are shown here as text before encoding.

The scene description data 400 shown in FIG. 4 begins with a Group node. Every BIFS scene begins with a node of the kind called SFTopNode, and the Group node is one of the SFTopNode types. As child nodes of this Group node, information on the moving image 301, the moving image 302, and the still image 303 is described in nodes 401, 402, and 403, respectively. Node 401 contains an ID number indicating the location of the corresponding moving image object data 301; likewise, nodes 402 and 403 contain ID numbers indicating the locations of the corresponding moving image object data 302 and still image object data 303. Detailed explanation of the other nodes and fields is omitted here.

The object description data 500 shown in FIG. 5 describes the information necessary for decoding, such as the association between the scene description data and the moving image and still image media objects that make up the scene, the encoding method, and the packet structure. Object description data is needed at the start of a session and when a stream is added, deleted, or changed during a session. At the start of a session, or when a new stream is added to the scene, the command that updates object description data (UPDATE OD) is used.

The main components of the object description data are:
・ ObjectDescriptorID
・ ES (Elementary Stream) Descriptor

ObjectDescriptorID is an ID that identifies an object. As described above, a node in the scene description data that refers to a still image, audio, or moving image stream is assigned an ID number indicating its location, and this ID number is associated with an ObjectDescriptorID. For example, the object description data 501 to 503 are identified by ObjectDescriptorID 1 to 3, and these IDs correspond to the numbers (IDs) specified by "url" in the scene description data 400. The object description data 503, for instance, is identified by ObjectDescriptorID "3", which corresponds to the ImageTexture node defined under the name IMAGE1 and specified by url "3" in the scene description data 400.
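The link between a scene node's "url" and an ObjectDescriptorID can be illustrated with a plain lookup. The dictionaries below are a hypothetical in-memory stand-in for FIG. 4 and FIG. 5: the name IMAGE1 and the ID values come from the text, while MOVIE1 and MOVIE2 are invented names for the two moving-image nodes.

```python
# ObjectDescriptorID -> descriptor contents, using the ID values quoted in the text.
object_descriptors = {
    1: {"ES_ID": 101},
    2: {"ES_ID": 102},
    3: {"ES_ID": 103},
}

# Scene nodes that reference media streams through their "url" field.
# MOVIE1 and MOVIE2 are hypothetical names; IMAGE1 is named in the text.
scene_nodes = [
    {"name": "MOVIE1", "url": 1},
    {"name": "MOVIE2", "url": 2},
    {"name": "IMAGE1", "url": 3},
]


def resolve(node: dict) -> dict:
    """Return the object descriptor whose ObjectDescriptorID equals the node's url."""
    return object_descriptors[node["url"]]


for node in scene_nodes:
    print(node["name"], "-> ES_ID", resolve(node)["ES_ID"])
```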

An ES Descriptor is required for each still image, moving image, and audio stream, and each ES Descriptor is identified by an ES_ID. An ES Descriptor includes, among other things, a decoder configuration descriptor (DecoderConfigDescriptor), which describes the stream type and profile used to determine the kind of stream, the buffer size the decoder needs, and the maximum and average transmission rates of the stream. The DecoderConfigDescriptor is the information the reception apparatus needs in order to decide whether it can decode the elementary stream.
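How a receiver might use the decoder configuration to decide decodability can be sketched as follows. The classes are a hypothetical simplification: the field names are descriptive rather than the exact syntax of the MPEG-4 descriptors, and the `Terminal` record and the comparison rule are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DecoderConfig:
    """Simplified stand-in for the decoder configuration descriptor."""
    stream_type: str   # kind of stream, e.g. "visual" or "audio"
    buffer_size: int   # decoder buffer size required, in bytes
    max_bitrate: int   # maximum transmission rate, in bit/s
    avg_bitrate: int   # average transmission rate, in bit/s


@dataclass
class ESDescriptor:
    es_id: int
    decoder_config: DecoderConfig


@dataclass
class Terminal:
    """Hypothetical description of what the reception apparatus supports."""
    supported_stream_types: set
    max_buffer: int
    max_bitrate: int


def can_decode(es: ESDescriptor, terminal: Terminal) -> bool:
    """Assumed rule: the stream type is supported and the requirements fit."""
    cfg = es.decoder_config
    return (
        cfg.stream_type in terminal.supported_stream_types
        and cfg.buffer_size <= terminal.max_buffer
        and cfg.max_bitrate <= terminal.max_bitrate
    )


es = ESDescriptor(101, DecoderConfig("visual", 4000, 384_000, 256_000))
pda = Terminal({"visual", "audio"}, max_buffer=8000, max_bitrate=512_000)
print(can_decode(es, pda))
```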

The object description data holds the information on the media streams specified in the nodes of the scene description data. For example, the object description data for the moving image object 301 is described in 501, that for the moving image object 302 in 502, and that for the still image 303 in 503.

If it is judged in step S203 that the scene description data and the object description data need not be divided, the scene description data 400 shown in FIG. 4 and the object description data 500 shown in FIG. 5 are input to the scene description data encoding circuit 105 and the object description data encoding circuit 106, respectively, and encoded. The encoded scene description data 400 and object description data 500 are then multiplexed in the multiplexing circuit 108 together with the encoded bitstreams of the related moving image 301, moving image 302, and still image 303 held in the media bitstream storage device 107, and transmitted.

If, on the other hand, it is judged in step S203 that the scene description data and the object description data need to be divided, the scene / object conversion circuit 101 separates a low-priority object from the object description data as separate object description data (step S204). It then separates from the scene description data the node related to the object description data separated in step S204 (step S205).

The method of dividing the scene description data and the object description data in steps S204 and S205 is now described concretely.

When the scene description data 400 and the object description data 500 shown in FIG. 4 and FIG. 5 are divided, the low-priority ObjectDescriptor among the three ObjectDescriptors (501, 502, 503) contained in the object description data 500 is first separated as separate object data. In the example of FIG. 5, the object (503) indicated by "ObjectDescriptorID 3" has the lowest priority, so this ObjectDescriptor is separated as separate object data.

The streamPriority field of the ES Descriptor contained in the ObjectDescriptor can be used to judge this priority. streamPriority is a relative measure of the priority of the ES: an ES with a high streamPriority value is more important than an ES with a lower streamPriority value. In the object description data 500 to be divided, therefore, the object 503, which has the lowest priority, is separated. The user can set the priority freely by changing the streamPriority value. Such editing of the object description data 500 can be done with an ordinary text editing application or the like.

According to the object description data 500 of FIG. 5, the streamPriority values of the ESs represented by ES_ID 101, 102, and 103 are 3, 2, and 1, respectively. The ES indicated by ES_ID 103 is therefore judged to have the lowest priority, the ES indicated by ES_ID 102 the next higher priority, and the ES indicated by ES_ID 101 the highest priority. In step S204, the object description data 500 shown in FIG. 5 is accordingly divided into two pieces of object description data, 600 and 503, as shown in FIG. 6.
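Step S204 for this example can be sketched as a selection of the descriptor with the smallest streamPriority. The list-of-dictionaries layout is a hypothetical representation, and the sketch assumes one ES per ObjectDescriptor, as in the example of FIG. 5; the ID and priority values are the ones quoted in the text.

```python
def split_lowest_priority(descriptors):
    """Separate the descriptor with the lowest streamPriority (step S204).

    Returns (remaining, separated). Data holding fewer than two descriptors
    is returned unchanged, since there is nothing left to separate.
    """
    if len(descriptors) < 2:
        return list(descriptors), []
    lowest = min(descriptors, key=lambda d: d["streamPriority"])
    remaining = [d for d in descriptors if d is not lowest]
    return remaining, [lowest]


# Values quoted for FIG. 5: ES_ID 101, 102, 103 with streamPriority 3, 2, 1.
od_500 = [
    {"ObjectDescriptorID": 1, "ES_ID": 101, "streamPriority": 3},
    {"ObjectDescriptorID": 2, "ES_ID": 102, "streamPriority": 2},
    {"ObjectDescriptorID": 3, "ES_ID": 103, "streamPriority": 1},
]

od_600, od_503 = split_lowest_priority(od_500)
print([d["ObjectDescriptorID"] for d in od_600])
print([d["ObjectDescriptorID"] for d in od_503])
```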

In step S205, the node 403 related to the object description data 503 separated in step S204 is separated from the scene description data 400. The object description data 503 is identified by ObjectDescriptorID 3, which corresponds to the ImageTexture node defined under the name IMAGE1 and specified by url 3 in the scene description data 400.

When a node is separated, it is possible to separate not just the node itself but the grouping node that contains it. In this example, the Transform2D node 403, which groups the ImageTexture node defined under the name IMAGE1, is separated from the scene description data 400. In step S205, the scene description data 400 shown in FIG. 4 is therefore divided into two pieces of scene description data, 700 and 701, as shown in FIG. 7. By contrast, "separating only the node" would mean that only the node corresponding to the object description data is separated and that no division is made in units of the node that groups it. In the example of FIG. 4, only IMAGE1 (the ImageTexture node) would be separated, and the division would not take the Transform2D node 403 that groups IMAGE1 as its unit. In dividing nodes, the node corresponding to a media object may be treated either as a single node or as the node that groups it; in either case the division is made in units of the portion related to the media object. In practice, however, dividing in units of the grouping node is preferable to dividing by single node alone, in view of division efficiency.
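Separation in units of the grouping node can be sketched with a small tree. The nested dictionaries are a hypothetical and deliberately flattened stand-in for FIG. 4: the text confirms only the top-level Group node, the Transform2D node 403, and the ImageTexture node IMAGE1 with url 3, so the MovieTexture children and the omission of intermediate nodes are assumptions of this example.

```python
# Flattened, hypothetical model of scene description data 400.
scene_400 = {
    "type": "Group",
    "children": [
        {"type": "Transform2D", "children": [{"type": "MovieTexture", "url": 1}]},
        {"type": "Transform2D", "children": [{"type": "MovieTexture", "url": 2}]},
        {"type": "Transform2D",
         "children": [{"type": "ImageTexture", "name": "IMAGE1", "url": 3}]},
    ],
}


def references(node: dict, od_id: int) -> bool:
    """True if the node or any of its descendants has url == od_id."""
    if node.get("url") == od_id:
        return True
    return any(references(child, od_id) for child in node.get("children", []))


def split_scene(scene: dict, od_id: int):
    """Separate the grouping nodes that reference od_id (step S205).

    Returns (remaining_scene, separated_nodes); the input scene is not modified.
    """
    kept, separated = [], []
    for child in scene["children"]:
        (separated if references(child, od_id) else kept).append(child)
    return {**scene, "children": kept}, separated


scene_700, scene_701 = split_scene(scene_400, od_id=3)
print(len(scene_700["children"]), scene_701[0]["type"])
```

Moving the whole Transform2D subtree keeps the separated node together with its grouping context, which matches the unit of division the text prefers.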

For the object description data 600 and 503 and the scene description data 700 and 701, each divided into two as shown in FIGS. 6 and 7, it is determined in step S203 whether further division is necessary. If it is determined in step S203 that no further division of the scene description data or the object description data is necessary, the scene description data 700 and 701 and the object description data 600 and 503 are input, as separate access units and in descending order of priority, to the scene description data encoding circuit 105 and the object description data encoding circuit 106, respectively.

The scene description data 700 and 701 encoded by the scene description data encoding circuit 105 and the object description data 600 and 503 encoded by the object description data encoding circuit 106 are each given a time stamp in the multiplexing circuit 108, multiplexed, and sent to the transmission path 109. Here, the divided scene description data is transmitted with an insert command of the scene description commands attached, and the divided object description data is transmitted with an object description data update command attached.

There are two kinds of time stamp: the decoding time stamp and the composition time stamp. The decoding time stamp is the time at which encoded data should be input to the buffer preceding the decoding circuit, and the composition time stamp is the time at which data decoded by the decoding circuit should be output to the memory following the decoding circuit. In this embodiment, for simplicity, the same value is used for the decoding time stamp and the composition time stamp.

The time stamps attached to the access units are earlier for higher-priority units. For example, if time T1 is attached to the scene description data 700 and the object description data 600, and time T2 is attached to the scene description data 701 and the object description data 503, then T1 < T2. An access unit (hereinafter AU) is the unit of processing for the timing control of decoding and composition and for synchronization. In MPEG-4 Video, for example, the encoded data of one VOP (Video Object Plane), that is, one frame, corresponds to one AU. In JPEG, one still image (the JPEG image itself) corresponds to one AU.
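The priority-ordered time stamping can be illustrated with a short Python sketch. It is a hedged example only: the AccessUnit class, the assign_timestamps function, the numeric priority values, and the start and step parameters are assumptions introduced here; the text specifies only that higher-priority units receive earlier time stamps and that the decoding and composition time stamps share one value.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class AccessUnit:
    """Hypothetical record for one access unit of description data."""
    label: str
    priority: int      # assumption: a larger value means a higher priority
    dts: float = 0.0   # decoding time stamp
    cts: float = 0.0   # composition time stamp


def assign_timestamps(units: List[AccessUnit],
                      start: float = 0.0, step: float = 1.0) -> List[AccessUnit]:
    """Give earlier time stamps to higher-priority units; DTS equals CTS."""
    levels = sorted({u.priority for u in units}, reverse=True)
    time_of = {p: start + i * step for i, p in enumerate(levels)}
    stamped = [replace(u, dts=time_of[u.priority], cts=time_of[u.priority])
               for u in units]
    return sorted(stamped, key=lambda u: u.dts)  # transmission order


units = [
    AccessUnit("scene description data 701", priority=1),
    AccessUnit("object description data 503", priority=1),
    AccessUnit("scene description data 700", priority=2),
    AccessUnit("object description data 600", priority=2),
]
stamped = assign_timestamps(units)
```

Units of equal priority share one time stamp, so a piece of scene description data and the object description data that belongs with it become available to the receiver together.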

FIG. 8 shows an example of the screen displayed on the basis of the scene description data 700 and 701 and the object description data 600 and 503 divided into two by the processing described above. On receiving the scene description data 700 and the object description data 600, to which the time stamp T1 is attached, the receiving apparatus 110 composes and displays the scene at time T1 as in the screen 800. Thus, at time T1 the moving image 201 and the moving image 202 are played back. On further receiving the scene description data 701 and the object description data 503, to which the time stamp T2 is attached, the receiving apparatus 110 adds the still image 203 to the composition at time T2. At time T2, therefore, the display has the same screen configuration as in FIG. 2.

Next, the processing performed when it is determined in step S203 that the scene description data and the object description data need to be divided further is described concretely with reference to FIGS. 9 and 10. The determination is made on the basis of the encoded size of each piece of divided description data shown in FIGS. 6 and 7. In this embodiment, the largest of those encoded sizes is used to decide whether further division is necessary. When the object description data 600 and 503 and the scene description data 700 and 701 shown in FIGS. 6 and 7 are divided further, the object description data 503, which was separated from the object description data 500, is not divided any further; instead, the object 502 identified by ObjectDescriptorID 2, which has the lower priority, is separated from the object description data 600 as separate object data.
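The size-based decision can be sketched as follows. This Python fragment is an assumption-laden illustration: the text states only that the largest encoded size among the divided pieces is used for the decision, so the byte limit, the function names, the choice to split the largest piece first, and the toy size model (each piece is a list of per-object byte counts) are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

Piece = TypeVar("Piece")


def needs_further_division(encoded_sizes: Sequence[int], limit_bytes: int) -> bool:
    """Decide on the largest encoded size, as the embodiment describes."""
    return max(encoded_sizes) > limit_bytes


def divide_until_fits(pieces: Sequence[Piece],
                      size_of: Callable[[Piece], int],
                      split: Callable[[Piece], List[Piece]],
                      limit_bytes: int) -> List[Piece]:
    """Repeat the decide-then-divide loop until every piece fits the limit.

    Splitting the largest piece first is an assumption of this sketch.
    """
    result = list(pieces)
    while needs_further_division([size_of(p) for p in result], limit_bytes):
        largest = max(result, key=size_of)
        parts = split(largest)
        if len(parts) < 2:
            break  # the largest piece cannot be divided any further
        i = result.index(largest)
        result[i:i + 1] = parts
    return result


def split_last(piece: List[int]) -> List[List[int]]:
    """Toy splitter: separate the last object of a piece, if there is more than one."""
    return [piece[:-1], piece[-1:]] if len(piece) > 1 else [piece]


# Toy model: each piece lists the encoded byte counts of the objects it describes.
two_pieces = [[40, 30], [20]]
three_pieces = divide_until_fits(two_pieces, sum, split_last, limit_bytes=50)
```

With a limit of 50 bytes the first piece (70 bytes in total) is divided once, after which the largest piece is 40 bytes and the loop stops.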

According to the object description data 600 in FIG. 6, the streamPriority values of the ESs identified by ES_ID 101 and ES_ID 102 are 3 and 2, respectively. The ES identified by ES_ID 102 can therefore be judged to have a lower priority than the ES identified by ES_ID 101. Accordingly, the object 502 identified by ObjectDescriptorID 2, which contains the lower-priority ES, is separated from the object description data 600 as separate object data. In this way, in step S204, the object description data 600 shown in FIG. 6 is divided into two pieces of object description data, 501 and 502, as shown in FIG. 9.
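The selection by streamPriority can be sketched in Python as follows. The ObjectDescriptor class and the separate_lowest_priority function are hypothetical names for illustration, and the ObjectDescriptorID 1 assigned to the ES with ES_ID 101 is an assumption; the ES_ID and streamPriority values are the ones quoted in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ObjectDescriptor:
    """Hypothetical, simplified object descriptor with a single ES."""
    od_id: int
    es_id: int
    stream_priority: int  # a smaller value means a lower priority, as in the text


def separate_lowest_priority(
        ods: List[ObjectDescriptor]) -> Tuple[List[ObjectDescriptor], ObjectDescriptor]:
    """Separate the descriptor whose ES has the lowest streamPriority.

    On a tie, the first such descriptor in the list is chosen (an assumption).
    """
    if len(ods) < 2:
        raise ValueError("a single descriptor cannot be divided further")
    lowest = min(ods, key=lambda od: od.stream_priority)
    kept = [od for od in ods if od is not lowest]
    return kept, lowest


# Object description data 600: ES_ID 101 (streamPriority 3), ES_ID 102 (streamPriority 2).
od_600 = [ObjectDescriptor(1, 101, 3), ObjectDescriptor(2, 102, 2)]
od_501, od_502 = separate_lowest_priority(od_600)
```

Here od_502 is the descriptor with ObjectDescriptorID 2, and od_501 retains the higher-priority descriptor, matching the division of FIG. 6 into FIG. 9.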

In step S205, the node related to the object description data 502 separated in step S204 is separated from the scene description data 700; this is the MovieTexture node in the scene description data 700 of FIG. 7 that is defined under the name MOVIE2 and designated by url 2. As in the division described earlier, it is the Transform2D node grouping the MovieTexture node that is separated from the scene description data 700. Thus, in step S205, the scene description data 700 shown in FIG. 7 is divided into two pieces of scene description data, 1000 and 1001, as shown in FIG. 10.

The scene description data 1000, 1001, and 701 and the object description data 501, 502, and 503, divided into three as shown in FIGS. 9 and 10, are input as separate access units, in descending order of priority, to the scene description data encoding circuit 105 and the object description data encoding circuit 106, respectively.

The scene description data 1000, 1001, and 701 encoded by the scene description data encoding circuit 105 and the object description data 501, 502, and 503 encoded by the object description data encoding circuit 106 are each given a time stamp in the multiplexing circuit 108, multiplexed, and sent to the transmission path 109. The time stamps are attached in descending order of priority: time T1 to the scene description data 1000 and the object description data 501, time T2 to the scene description data 1001 and the object description data 502, and time T3 to the scene description data 701 and the object description data 503, where T1 < T2 < T3.

FIG. 11 shows an example of the screen displayed on the basis of the scene description data and the object description data divided into three as described above. On receiving the scene description data 1000 and the object description data 501, to which the time stamp T1 is attached, the receiving apparatus 110 composes and displays the scene at time T1 as in the screen 1100. Thus, at time T1 the moving image 201 is played back. On further receiving the scene description data 1001 and the object description data 502, to which the time stamp T2 is attached, the receiving apparatus 110 adds the moving image 202 and composes the scene at time T2 as in the screen 800. On further receiving the scene description data 701 and the object description data 503, to which the time stamp T3 is attached, the receiving apparatus 110 adds the still image 203 and composes the scene at time T3 as in the screen 200. At time T3, therefore, the display has the same screen configuration as in FIG. 2.
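The staged composition on the receiving side can be pictured with a small Python sketch. It is illustrative only: the composed_at function and the numeric values chosen for T1, T2, and T3 are assumptions; the text fixes only the ordering T1 < T2 < T3 and which object appears at each time.

```python
from typing import List, Tuple


def composed_at(updates: List[Tuple[float, str]], t: float) -> List[str]:
    """Return the objects present in the scene at time t, in time-stamp order."""
    return [obj for ts, obj in sorted(updates) if ts <= t]


# Illustrative values; only T1 < T2 < T3 comes from the text.
T1, T2, T3 = 1.0, 2.0, 3.0
updates = [
    (T3, "still image 203"),
    (T1, "moving image 201"),
    (T2, "moving image 202"),
]
screen_1100 = composed_at(updates, T1)
screen_800 = composed_at(updates, T2)
screen_200 = composed_at(updates, T3)
```

Each later time stamp adds one object to the scene, so the receiver decodes and composes a smaller amount of description data at any one time.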

For ease of explanation, the scene here consists of two moving images and one still image, but the objects making up a scene are not limited to moving images and still images; audio, CG, text, and the like may also be used. Accordingly, the objects subject to the division of the scene and object description data are not limited to still-image and moving-image objects, and it goes without saying that the division is applicable to any of the objects constituting moving image data, CG, text, and so on.

As described above, according to the embodiment, when encoded multimedia data composed of a plurality of media objects such as moving images, audio, still images, text, and CG is distributed, the scene description data and the object description data can be divided in a manner suited to the type and capability of the receiving terminal, and then encoded and transmitted. This has the effect of spreading out the load and the time that the receiving apparatus spends decoding the scene description data and the object description data and composing the scene.

In addition, the communication line status of the network is analyzed to obtain the transmission capacity of the network, and this is used to determine whether the scene description data and the object description data need to be divided, so multimedia data can be distributed in a form appropriate to the network conditions.
Furthermore, the description data that describes information related to the reproduction of each media object (the object description data in the embodiment) includes a description indicating the priority of the associated media object (streamPriority in the embodiment), so a content creator or the like can easily set the priority of a media object.
Moreover, time stamps are attached to the divided pieces of description data so that they are reproduced in an order according to the priority of the media objects, so reproduction can be carried out as the content creator or the like intended.

[Other Embodiments]
Needless to say, the object of the present invention is also achieved by supplying a system or apparatus with a storage medium on which the program code of software implementing the functions of the embodiment is recorded, and having the computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the storage medium. In this case, the program code read from the storage medium itself implements the functions of the embodiment described above, and the storage medium storing the program code constitutes the present invention.

As the storage medium for supplying the program code, for example, a floppy (registered trademark) disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a magnetic tape, a nonvolatile memory card, or a ROM can be used.

Needless to say, the invention covers not only the case where the functions of the embodiment are implemented by the computer executing the program code it has read, but also the case where an OS (operating system) or the like running on the computer performs part or all of the actual processing on the basis of the instructions of the program code, and the functions of the embodiment are implemented by that processing.

Needless to say, the invention further covers the case where the program code read from the storage medium is written to a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, a CPU or the like provided on that function expansion board or function expansion unit then performs part or all of the actual processing on the basis of the instructions of the program code, and the functions of the embodiment are implemented by that processing.

FIG. 1 is a diagram showing the basic configuration of the distribution apparatus in the embodiment and the flow of data between its circuits.
FIG. 2 is a flowchart for explaining the processing by which the distribution apparatus distributes multimedia data.
FIG. 3 is a diagram showing an example of the screen configuration of multimedia data.
FIG. 4 is a diagram showing an example of scene description data that realizes the screen configuration of FIG. 3.
FIG. 5 is a diagram showing an example of object description data that realizes the screen configuration of FIG. 3.
FIG. 6 is a diagram showing an example of object description data obtained by dividing the object description data of FIG. 5.
FIG. 7 is a diagram showing an example of scene description data obtained by dividing the scene description data of FIG. 4.
FIG. 8 is a diagram showing an example of a screen composed from the scene description data of FIG. 7 and the object description data of FIG. 6.
FIG. 9 is a diagram showing an example of object description data obtained by dividing the object description data of FIG. 6.
FIG. 10 is a diagram showing an example of scene description data obtained by dividing the scene description data of FIG. 7.
FIG. 11 is a diagram showing an example of a screen composed from the scene description data of FIG. 10 and the object description data of FIG. 9.

Claims

A data processing apparatus for delivering multimedia data including a plurality of media objects and description data, the description data having scene description data describing the temporal and spatial relationships of the plurality of media objects and object description data including a description of information necessary for associating the scene description with the media objects and for decoding, the apparatus comprising:
determination means for determining whether the description data needs to be divided, based on an external capability related to delivery of the multimedia data and on the scene description data and the object description data;
dividing means for dividing each of the object description data and the scene description data in units of a part related to a media object when the determination means determines that division is necessary; and
distribution means for encoding each piece of description data divided by the dividing means and the plurality of media objects, multiplexing them to generate multimedia data, and distributing the multimedia data to the receiving terminal.

A data processing apparatus for delivering multimedia data including a plurality of media objects and description data describing information related to reproduction of each media object to an external device,
Determining means for determining the necessity of dividing the description data based on the external capability related to the delivery of the multimedia data and the description data;
When the determination unit determines that division is necessary, based on the priority set for the media object to which the description data relates, the description data portion related to the low-priority media object is separated. Dividing means for dividing the description data;
A data processing apparatus comprising: distribution means for encoding each description data divided by the dividing means and a plurality of media objects, multiplexing them to generate multimedia data, and distributing the multimedia data to the receiving terminal.

The data processing apparatus according to claim 2, wherein the description data includes scene description data describing a spatio-temporal relationship between the plurality of media objects, and object description data including a description of information necessary for associating the scene description with the media objects and for decoding, and
wherein the dividing means separates, from each of the object description data and the scene description data, the part related to a media object having a low priority.

The data processing apparatus according to claim 1, wherein the determination means receives device information from the external device, analyzes the device information to acquire the processing capability of the external device, and determines whether the description data needs to be divided based on the acquired processing capability.

The external device is connected via a network;
The determination means analyzes the communication line status of the network to acquire the transmission capacity of the network, and determines whether or not the description data needs to be divided based on the acquired transmission capacity. 5. The data processing device according to any one of 4 to 4.

The description data includes a description indicating a priority for the associated media object;
The data processing apparatus according to claim 2, wherein the dividing unit determines a description related to the media object to be separated with reference to a description indicating the priority.

4. The data processing apparatus according to claim 3, wherein the priority level is determined according to streamPriority indicating a priority level of a stream included in the object description data.

The data processing apparatus according to claim 1 or 3, wherein the dividing means divides the object description data by separating, from the object description data, a description portion corresponding to a media object having a low priority, and further divides the scene description data in accordance with the divided object description data.

9. An adding means for adding a time stamp to the plurality of description data divided by the dividing means so as to be reproduced in the order according to the priority of the media object. A data processing apparatus according to any one of the above.

The data processing apparatus according to claim 1, wherein the multimedia data is data according to MPEG-4.

11. The data processing apparatus according to claim 1, wherein the scene description data is described by BIFS.

The data processing apparatus according to claim 1, wherein the media object includes a still image, a moving image, and a CG.

The dividing means divides the scene description data by separating a single node corresponding to the media object from the scene description data, or separates a node that groups nodes corresponding to the media object. The data processing apparatus according to claim 1, wherein the data processing apparatus is divided.

A data processing method for delivering, to an external device, multimedia data including a plurality of media objects and description data, the description data having scene description data describing the temporal and spatial relationships of the plurality of media objects and object description data including a description of information necessary for associating the scene description with the media objects and for decoding, the method comprising:
a determination step of determining whether the description data needs to be divided, based on an external capability related to delivery of the multimedia data and on the scene description data and the object description data;
a division step of dividing each of the object description data and the scene description data in units of a part related to a media object when it is determined in the determination step that division is necessary; and
a distribution step of encoding each piece of description data divided in the division step and the plurality of media objects, multiplexing them to generate multimedia data, and distributing the multimedia data to the receiving terminal.

A data processing method for delivering multimedia data including a plurality of media objects and description data describing information related to reproduction of each media object to an external device,
A determination step of determining whether or not the description data needs to be divided based on external capabilities related to delivery of the multimedia data and the description data;
When the determination step determines that division is necessary, based on the priority set for the media object to which the description data is related, by separating the description data portion related to the media object with the lower priority A dividing step of dividing the description data;
A data processing method comprising: a distribution step of encoding each description data divided by the division step and a plurality of media objects, multiplexing them to generate multimedia data, and distributing the multimedia data to the receiving terminal.

A control program for causing a computer to execute the data processing method according to claim 14 or 15.

A storage medium storing a control program for causing a computer to execute the data processing method according to claim 14.