JP2002543688A

JP2002543688A - How to copy media files

Info

Publication number: JP2002543688A
Application number: JP2000614658A
Authority: JP
Inventors: トニー，リチャードキング，; ティモシー，ホルロイドグロエルト，
Original assignee: テレメディアシステムズリミテッド
Priority date: 1999-04-26
Filing date: 2000-04-26
Publication date: 2002-12-17
Also published as: GB9909607D0; EP1097590A1; WO2000065834A1

Abstract

(57) [Abstract] The present invention teaches a method of copying a media file from a server to a client over a network and playing the file at the client, in which spare bandwidth not required to play the media file is used to enhance the copy of the media file on the client. This leads to an entirely new browsing capability: typical browsing actions, namely (i) viewing a sequence of video frames at reduced speed (e.g. slow-motion viewing) or reduced quality, (ii) viewing a sequence of video frames again, or (iii) pausing on a single video frame, result in a video frame or frame sequence of substantially better quality than at normal playback speed. This is particularly useful because a user browsing in these ways is probably interested in fine detail.

Description

DETAILED DESCRIPTION OF THE INVENTION

TECHNICAL FIELD

[0001] The present invention relates to a method of copying continuous media files, such as video and audio files, in which a file stored on a server is encoded so as to provide improved browsing and download features at a device such as a client or another server. The network over which files are copied may include the Internet.

BACKGROUND OF THE INVENTION

[0002] In recent years there has been considerable interest in improving techniques for delivering media files of all kinds, such as video and audio files, to PC clients over low-bandwidth networks, driven not least by the unprecedented growth in the use of video on Web pages. One useful set of video browsing features is found in the Indeo product from Intel. In this system, a single encoded video file is constructed by the author and can be played at the client at many different data rates and quality levels, each suited to a different network bandwidth or to the capability of the playback platform. The "progressive quality" option in Indeo breaks a video download into up to six phases. The first phase delivers frames of the lowest quality, and subsequent phases deliver frames of higher quality. "Quality" in Indeo can be one of three levels: "Good", "Better" or "Best". Video quality is therefore built up over time in three discrete stages: (i) when download of the data encoded for "Good" quality completes, (ii) when download of the data encoded for "Better" quality completes, and (iii) when download of the data encoded for "Best" quality completes, at which point the highest level is reached.

[0003] Reference may also be made to "Very Low Bit-Rate Embedded Coding with 3D Set Partitioning in Hierarchical Trees" by Beong-Jo Kim, Zixiang Xiong and William Pearlman, submitted to IEEE Trans. Circuits & Systems for Video Technology, Special Issue on Image and Video Processing for Emerging Interactive Multimedia Services, Sept. 1998. This paper discloses the application of the SPIHT compression scheme to a wavelet-based transform. The transform produces an encoded bitstream with many spatial resolutions, with a progressive quality ordering within any given spatial resolution.

SUMMARY OF THE INVENTION

[0004] In accordance with the present invention, there is provided a method of copying a media file to a device over a network so that the copy of the media file can be played, in which spare bandwidth not required to play the media file is used to enhance the copy of the media file on the device. Playback and enhancement may each be controlled by a set of parameters: playback parameters and enhancement parameters respectively. In one embodiment, these parameter sets are also known as the playback profile and the enhancement profile.

[0005] The spare bandwidth may become available as a result of (i) playing the media file at reduced speed or quality, (ii) playing a part of the media file, or (iii) pausing playback of the media file. The present invention therefore contemplates an entirely new browsing capability in which typical browsing actions, namely (i) viewing a sequence of video frames at low speed (e.g. slow-motion viewing) or low quality (e.g. a fast scan at low resolution), (ii) viewing a sequence of video frames again, or (iii) pausing to view a single video frame, make spare bandwidth available. That bandwidth is used intelligently to enhance the copy of the media file on the device so that, for example, a video frame or frame sequence of substantially better quality than at normal playback speed is obtained (immediately or eventually). This is particularly useful because a user browsing over a network in these ways is probably interested in fine detail. The invention has particular application to streamed continuous media files.
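As a rough illustration of where the spare bandwidth described above comes from, the following sketch assumes a fixed link capacity and a stream whose bit rate at normal speed is known. The function name and the linear model (the rate needed for playback scales with playback speed and is zero while paused) are assumptions made for illustration only; they are not taken from the specification.

```python
def spare_bandwidth(link_kbps: float, stream_kbps: float, speed: float) -> float:
    """Bandwidth (kbit/s) left over after feeding playback at the given speed.

    speed is relative to normal playback: 1.0 = normal, 0.25 = quarter-speed
    slow motion, 0.0 = paused. Assumes the rate needed for playback scales
    linearly with speed (an illustrative simplification).
    """
    if speed < 0:
        raise ValueError("speed must be non-negative")
    required = stream_kbps * speed
    return max(0.0, link_kbps - required)


# Hypothetical numbers: a 56 kbit/s link carrying a 48 kbit/s stream.
for label, speed in [("normal", 1.0), ("slow motion", 0.25), ("paused", 0.0)]:
    print(f"{label:12s} spare = {spare_bandwidth(56, 48, speed):5.1f} kbit/s")
```

Under these assumptions the spare capacity grows as playback slows, and it is this capacity that would carry the data used to enhance the copy.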

[0006] The term "network" should be construed broadly to cover any kind of data connection between two or more devices. The network typically includes the Internet. A "file" is any coherent set of data, so a "media file" is a coherent set of data representing one or more media samples, such as video frames or audio levels. The term "client" is used in this specification to mean any device that receives data.

[0007] Enhancement may be performed in accordance with a profile that is stored on the server, or is sent to the server by the device before or during copying, or is a combination of values stored on the server and values sent to the server by the device before or during copying. Enhancement generally occurs as a result of refinement data being transmitted over the network.

[0008] The data required for playback may be selected in accordance with a profile that is stored on the server, or is sent to the server before or during copying, or is a combination of values stored on the server and values sent to the server by the device before or during copying.

[0009] In one embodiment, the copy of the media file on the client acts as a temporary cache, and all or part of the media file is deleted during or at the end of playback. A local cache or local buffer may be used to store data that is transmitted using the spare bandwidth and is decoded only when it is needed to enhance the copy of the media file on the device.

[0010] In one embodiment, the media file encodes video; the image data corresponding to each video frame or sequence of frames is generated using a wavelet transform and processed with SPIHT compression; and the resulting bitstream comprises several discrete bitstream layers, each of which allows the image data to be displayed at a different spatial resolution.

[0011] In another embodiment, data is also provided to the device for purposes other than playback, including (i) analyzing the media data, (ii) recompressing the media data into a new format, and (iii) transferring the media data to another medium, such as paper, film, magnetic tape, disk, CD-ROM or another digital storage medium.

[0012] Typically, the media file encodes video, audio data, or combined video and audio data.

[0013] In one aspect, there is provided a media file copied using any of the methods defined above.

[0014] In another aspect, there is provided a computer program that enables a client to enhance the copy of a media file copied from a server to that client over a network, by using spare bandwidth not required to play the media file at the client. In a related aspect, there is provided a client programmed with such a computer program.

[0015] In yet another aspect, there is provided a computer program that enables a server to enhance the copy of a media file copied from the server to a client over a network, by using spare bandwidth not required to play the media file at the client. In a related aspect, there is provided a server programmed with such a computer program.

DETAILED DESCRIPTION

A. Key concepts

Block-based motion-compensated encoding schemes

[0016] There are several examples of block-based encoding schemes that compress video by using the discrete cosine transform together with motion estimation and compensation to remove spatial and temporal redundancy from the source. The most familiar of these is MPEG (i.e. MPEG-1 or MPEG-2). MPEG uses three types of compressed picture: I-frames, P-frames and B-frames. An I-frame is made up of square blocks of 16×16 pixels (macroblocks), each represented as a set of spatial frequencies obtained from the discrete cosine transform (DCT); low spatial frequencies are generally represented with much greater precision than high spatial frequencies. Temporal redundancy is removed using P (predicted) and B (bidirectional) frames. A given macroblock in a P-frame is generated at the encoder by searching a previously encoded I-frame (or P-frame) for the macroblock that best matches the one under consideration. When one is found, a vector representing the offset between the two is calculated. This motion vector, together with the error between the predicted block and the actual block, is all that needs to be sent for the P-frame to be reconstructed at the receiver. The third type is the B (bidirectional) frame. B-frames are based on motion vectors obtained from past and future I- and P-frames; they provide a smoothing effect, increase the compression factor and reduce noise.

[0017] Because P-frames are computed from previous P-frames, errors accumulate, so I-frames must be inserted at regular intervals. A typical MPEG frame sequence might look like this:

[0018] I B B P B B P B B I B B P B B P B B I ..........

From the point of view of the present invention, MPEG illustrates three properties of encoded media files that are factors in the design of a system such as the one described here.

[0019] The first property is that the initial sample structure is preserved through the encoding process, i.e. the identity of individual frames is not lost in the encoded bitstream. This means that temporal properties can be manipulated by adding or removing frames in the compressed domain. Second, a temporal window is defined (the MPEG Group of Pictures, or GOP) within which temporal redundancy is exploited to achieve compression. Third, a complex set of dependencies is defined by the encoding: in MPEG, a P-frame requires an I-frame, and a B-frame requires I- and P-frames, in order to be decoded.
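The decoding dependencies just described can be made concrete with a small sketch. It uses a deliberately simplified model, assumed here for illustration: a P-frame depends on the nearest preceding I- or P-frame (and, through it, on the chain back to an I-frame), and a B-frame depends on the nearest I- or P-frame on each side. The function names are hypothetical.

```python
def _nearest_anchor(gop, index, step):
    """Index of the nearest I- or P-frame before (step=-1) or after (step=+1) index."""
    i = index + step
    while 0 <= i < len(gop):
        if gop[i] in "IP":
            return i
        i += step
    return None


def required_frames(gop, index):
    """Indices of the other frames that must be decoded before frame `index`."""
    needed = set()

    def add_chain(i):
        # Follow P-frame references back until an I-frame is reached.
        while i is not None and i not in needed:
            needed.add(i)
            if gop[i] == "I":
                break
            i = _nearest_anchor(gop, i, -1)

    kind = gop[index]
    if kind in "PB":
        add_chain(_nearest_anchor(gop, index, -1))
    if kind == "B":
        add_chain(_nearest_anchor(gop, index, +1))
    return sorted(needed)


sequence = "IBBPBBPBBI"
for i, kind in enumerate(sequence):
    print(i, kind, required_frames(sequence, i))
```

In this model an I-frame needs nothing else, while a B-frame late in the group needs every I- and P-frame before it plus the one after it, which is why frames cannot be dropped from a stream arbitrarily.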

[0020] Other examples of block-based encoding schemes that use motion estimation and compensation include H.261 and H.263.

Block-based intra-frame-only encoding schemes

[0021] Other schemes use block-based encoding without motion compensation, notably JPEG and DV (as defined in SMPTE 314M-1999). In both JPEG and DV, the basic approach is to transform blocks of pixels into frequency components using the DCT, quantize the components by multiplying them by a set of weighting factors, and then variable-length code the result to produce the encoded bitstream. DV, however, introduces the concept of feed-forward, in which the compression process is optimized before compression is applied. To do this, the DCT-transformed image is examined and classified into areas of low, medium and high spatial detail. Using this information, different tables of quantization factors are selected and used according to area, with the aim of matching which frequency coefficients are represented, and with what fidelity, to the frequency response of the human visual system.

[0022] A second feature of DV is its use of block-based adaptive field/frame processing. This means that a DCT block of 64 entries can represent either an 8×8 area of pixels in a non-interlaced frame (8-8 DCT) or two 4×8 areas in the first and second fields of the frame (2-4-8 DCT). The choice between the two is made by motion detection: the former is used if little motion occurs between the fields, and the latter if motion is detected. The choice is made on a block-by-block basis.

[0023] As with MPEG encoding, three observations can be made about the nature of the encoded bitstream. As before, the sample structure is preserved through encoding. Second, a temporal window is defined, in this case spanning the two fields of a frame. Third, a set of dependencies is defined: for example, dependencies exist between the fields within a frame whenever a 2-4-8 DCT block is generated.

Subband encoding schemes

[0024] A more recent alternative to block-based encoding using transforms such as the DCT is subband encoding. Here the complete image (rather than small blocks of it) is processed into a set of frequency- and space-limited bands, often referred to as scales. An example is the wavelet transform.

[0025] The wavelet transform has only relatively recently matured as a tool for image analysis and compression. See, for example, Mallat, Stephane G., "A Theory for Multiresolution Signal Decomposition: The Wavelet Representation", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 7, pp. 674-692, July 1989, which describes the fast wavelet transform (FWT). The FWT generates a power-of-two hierarchy of images, or subbands, in which at each step the spatial sampling frequency, i.e. the "fineness" of the detail represented, is reduced by a factor of two in x and y. This procedure compacts most of the energy of the image samples into a small number of high-amplitude coefficients within the subbands, leaving the remaining coefficients zero or small, which offers considerable scope for compression.

[0026] Each subband describes the image in terms of a particular combination of spatial/frequency components. This is illustrated in FIG. 2A, where the wavelet filter is applied twice to an image. After the first application, four subbands at Level 0 result, each one quarter the scale of the original. After the second application, four new subbands at one-eighth scale are created. To reconstruct the image fully, the procedure is followed in reverse order using the inverse wavelet filter. The figure also illustrates the scheme used to name the subbands: "L" and "H" denote the low-pass and high-pass wavelet filters, which are applied in x and y to generate a subband. Applying the H filter in both x and y gives the "HH" subband, applying "L" in x and "H" in y gives "LH", and so on.
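The two-pass decomposition and the subband naming described above can be sketched with the simplest wavelet filter, the Haar filter (pairwise average as the low-pass output, pairwise difference as the high-pass output). The choice of Haar, the function names and the requirement of even image dimensions are assumptions made for illustration; the text does not say which wavelet filter is used.

```python
def haar_step(values):
    """Split an even-length sequence into low-pass (average) and high-pass
    (difference) halves, each half the original length."""
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def filter_rows(matrix):
    """Apply haar_step to every row; return (low matrix, high matrix)."""
    results = [haar_step(row) for row in matrix]
    return [low for low, _ in results], [high for _, high in results]


def analyze(image):
    """One application of the filter in x and then y, giving LL, LH, HL, HH.
    The first letter names the filter used in x, the second the filter in y."""
    x_low, x_high = filter_rows(image)                  # filter along x
    subbands = {}
    for x_name, half in (("L", x_low), ("H", x_high)):
        y_low, y_high = filter_rows(transpose(half))    # filter along y
        subbands[x_name + "L"] = transpose(y_low)
        subbands[x_name + "H"] = transpose(y_high)
    return subbands


def shape(matrix):
    return len(matrix), len(matrix[0])


image = [[float(x + 8 * y) for x in range(8)] for y in range(8)]
level0 = analyze(image)            # first application: Level 0 subbands
level1 = analyze(level0["LL"])     # second application: Level 1 subbands
print("original", shape(image))
print("level 0 ", {name: shape(band) for name, band in level0.items()})
print("level 1 ", {name: shape(band) for name, band in level1.items()})
```

Each application halves both dimensions, so a subband holds a quarter as many samples as the image it was computed from.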

[0027] The subbands need not always be fully reconstructed into the original image. They can be combined in different ways to produce a final image that meets a particular set of requirements. Three examples are shown in FIG. 2. In FIG. 2B, the Level 1 LL subband is used as a one-sixteenth-scale version of the original. In FIG. 2C, the four Level 0 subbands are inverse transformed to reconstruct a one-quarter-scale version of the original. In the third example, FIG. 2D, the Level 1 LL and LH subbands are used together with the Level 0 LH subband to reconstruct an image at the original resolution, but with horizontal features emphasized at all scales.

Image tree-based compression

[0028] Tree-based data structures are used by compression schemes to exploit spatial redundancy in images. A recent development of such schemes is the SPIHT algorithm; see "Very Low Bit-Rate Embedded Coding with 3D Set Partitioning in Hierarchical Trees" by Beong-Jo Kim, Zixiang Xiong and William Pearlman, submitted to IEEE Trans. Circuits & Systems for Video Technology, Special Issue on Image and Video Processing for Emerging Interactive Multimedia Services, Sept. 1998.

[0029] The SPIHT algorithm is an effective compression step to use in conjunction with the wavelet transform, because it can efficiently exploit the decorrelation of the input image provided by the transform to achieve a high level of data reduction. For the purposes of the present invention, the key feature of the SPIHT algorithm is its ability to analyze the transformed image in terms of significance. It efficiently locates and partially orders all the samples with respect to bit significance, i.e. the position at which the highest-order bit in a sample is set. Since this corresponds to the magnitude of the sample, the samples are effectively ordered with respect to their energy, or the contribution they make to the reconstructed object. A significance layer is generated by choosing a bit position and outputting the value of that bit position (1 or 0) for all the samples for which that bit position is defined.
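The idea of a significance layer can be shown with plain bit-plane extraction. This is not the SPIHT algorithm itself (it omits the tree-based set partitioning and sign coding); it only illustrates emitting one bit position at a time, most significant first. Treating a bit position as "defined" for a sample when it lies at or below that sample's highest set bit is an interpretation assumed here, and the function name is hypothetical.

```python
def significance_layers(samples):
    """Yield (bit_position, [(sample_index, bit), ...]) from the highest
    bit position down to bit 0, using sample magnitudes only."""
    magnitudes = [abs(s) for s in samples]
    top = max((m.bit_length() for m in magnitudes), default=0) - 1
    for position in range(top, -1, -1):
        layer = [(i, (m >> position) & 1)
                 for i, m in enumerate(magnitudes)
                 if m.bit_length() > position]
        yield position, layer


for position, layer in significance_layers([9, 2, 5, 0]):
    print(position, layer)
```

Decoding only the first few layers recovers the large-magnitude samples approximately; each further layer adds one more bit of precision.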

[0030] As with MPEG encoding, a set of dependencies between related image components is defined by schemes such as wavelet/SPIHT compression. In this case, a significance layer can be decoded only if the next most significant layer is also decoded. Similarly, to decode a subband (e.g. LH0 in FIG. 1), the parent subband (LH1 in FIG. 1) must also be decoded.

3D schemes

[0031] The wavelet and SPIHT compression schemes can be extended to the third (time) dimension in a fairly straightforward way. In this case, a sequence of frames is captured and treated as a three-dimensional block of data. The wavelet filter is applied along the time dimension as many times as it is along the vertical and horizontal dimensions, resulting in a transformed block comprising a set of spatio-temporal cubes, each of which is a subband. The 3D-SPIHT algorithm represents these subbands in terms of octrees (as opposed to quadtrees in the 2D case) and generates the bitstream accordingly.
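Because each filtering pass along one axis splits the data into a low-pass and a high-pass half, one application of a separable filter over n axes yields 2^n subbands. The short sketch below enumerates their names for the 2D and 3D cases; the function name is hypothetical, and the axis order of the letters is an assumption extending the two-letter naming used above.

```python
from itertools import product


def subband_names(dimensions: int) -> list[str]:
    """Names of the subbands produced by one low/high split along each axis."""
    return ["".join(letters) for letters in product("LH", repeat=dimensions)]


print(subband_names(2))   # 2D: x and y
print(subband_names(3))   # 3D: x, y and time
```

The same factor of two per axis gives four children per node in a quadtree and eight in an octree.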

[0032] As those skilled in the art will appreciate, other compression schemes can also be used with the present invention.

B. Overview

Encoding of images

[0033] In one embodiment, the method of the present invention requires image data to be encoded in such a way that the resulting data for a frame sequence is divided by the encoding process into two or more slices. Different slices encode different parts of the image sequence at different resolutions and qualities. It must be possible to decode a subset of the complete data set in order to reconstruct the frame sequence at a particular display resolution, quality and frame rate.

[0034] In a preferred example, described later in this specification, a bitstream generated using the wavelet transform and SPIHT compression is divided into slices. By decoding one or more of these slices of data, a frame or frame sequence can be reconstructed at a particular display resolution, quality and frame rate. In general, the quality of the resulting frame or frame sequence depends on the total amount of data decoded. The slices used to reconstruct a frame or frame sequence are selected on the basis of the desired resolution, quality and frame rate. The quality, resolution and frame rate of an existing frame or frame sequence can also be improved by adding more slices. This general process is known as "enhancement". (In the wavelet-based example given later, the term "refinement" is used for this process.) Typically, enhancement occurs by adding a new slice to existing slices, or by adding the first slice from a new layer. Possibly, though rarely, it may involve adding an entire new layer (all the slices for that layer).
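Slice selection and enhancement as described above can be sketched as set operations. The `Slice` fields, the rule that a lower layer number means coarser resolution, and the rule that a lower index within a layer means greater significance are all assumptions made for illustration; they are not the slice structure defined later in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    layer: int   # hypothetical: 0 = coarsest spatial resolution
    index: int   # hypothetical: position within the layer, 0 = most significant
    size: int    # payload size in bytes


def select(slices, max_layer, slices_per_layer):
    """Slices needed for a given resolution (max_layer) and quality
    (number of slices taken from each layer)."""
    return {s for s in slices
            if s.layer <= max_layer and s.index < slices_per_layer}


def enhancement(current, target):
    """Slices that must be added to move from `current` to `target`,
    coarsest layer and most significant slice first."""
    return sorted(target - current, key=lambda s: (s.layer, s.index))


all_slices = [Slice(layer, index, 100 * (index + 1))
              for layer in range(3) for index in range(4)]
held = select(all_slices, max_layer=1, slices_per_layer=2)
wanted = select(all_slices, max_layer=1, slices_per_layer=3)
extra = enhancement(held, wanted)
print(extra, sum(s.size for s in extra), "bytes to send")
```

Only the difference between the two selections has to be transmitted, which is what lets enhancement proceed incrementally.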

C. System description

[0035] A system capable of carrying out the method of the present invention comprises, as shown in FIG. 1, an encoder, a network, a server and N clients.

[0036] The encoder compresses the incoming media, structures it into layers and adds layer labeling information, as described in the next section, before transmitting it as a bitstream over a digital communications network for storage on the server.

[0037] In FIG. 1 there are N clients, each engaged in a session with the media server during which media content and control information are transmitted. For each client, a control channel to the server is maintained, which allows the client to request that media be transmitted to it. The types of request that can be made include (but are not limited to) the following:
• Seek to a particular point in the media file.
• Transmit a media clip at a particular quality.
• Transmit extra information to improve the quality of the media at the client in a specified way.
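The three request types listed above could be modeled as simple message types on the control channel. The class names, field names, units and the `describe` helper below are hypothetical; the text names the kinds of request but does not define a message format.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Seek:
    position_s: float        # point in the media file, in seconds


@dataclass(frozen=True)
class SendClip:
    start_s: float
    end_s: float
    quality: int             # hypothetical quality level


@dataclass(frozen=True)
class Enhance:
    start_s: float
    end_s: float
    aspect: str              # e.g. "spatial", "temporal", "distortion"


Request = Union[Seek, SendClip, Enhance]


def describe(request: Request) -> str:
    """Human-readable summary of a control-channel request."""
    if isinstance(request, Seek):
        return f"seek to {request.position_s}s"
    if isinstance(request, SendClip):
        return f"send {request.start_s}s-{request.end_s}s at quality {request.quality}"
    if isinstance(request, Enhance):
        return f"enhance {request.start_s}s-{request.end_s}s ({request.aspect})"
    raise TypeError(f"unknown request: {request!r}")


for request in (Seek(12.5), SendClip(0.0, 4.0, 1), Enhance(2.0, 3.0, "spatial")):
    print(describe(request))
```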

[0038] These requests support a variety of useful application-level functions at the client, such as video preview with fast forward and rewind, single-frame stepping forwards and backwards, and freeze frame. In a media browsing application, for example, a low-quality version of the media can be sent to the client with little delay, allowing the user to start on a task immediately, such as scanning rapidly through a file to locate items of interest. Initially the user may be less concerned with the absolute quality of the media than with the responsiveness of the system, since he or she needs to move rapidly from one point to another. On finding an item of interest, the application can determine that this particular section of the media should be rendered at better quality and issue an appropriate request to the server. The server calculates which extra layers must be delivered to improve the media information in the requested way and transmits only those layers. Note that no extra processing of the media file is required during this procedure other than locating and transmitting the appropriate sets of data. Note also that the quality improvement can occur in many ways, including:
• Increasing the temporal resolution of the media (i.e. for video, allowing more individual frames to be resolved).
• Increasing the spatial resolution of video.
• Increasing the sampling frequency of audio.
• Decreasing the degree of distortion of the media.
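The browse-then-refine behavior described above, in which only the layers not yet delivered are sent, can be sketched as follows. The class, its per-section bookkeeping and the assumption that the server (rather than the client) remembers what has been delivered are illustrative choices, not details given in the text.

```python
class LayerServer:
    """Toy server that remembers which layers of each media section it has sent."""

    def __init__(self, layers_per_section: int):
        self.layers_per_section = layers_per_section
        self.delivered: dict[int, set[int]] = {}

    def deliver(self, section: int, up_to_layer: int) -> list[int]:
        """Return the layers still to be sent so that `section` is held
        up to and including `up_to_layer`, and record them as delivered."""
        top = min(up_to_layer, self.layers_per_section - 1)
        wanted = set(range(top + 1))
        already = self.delivered.setdefault(section, set())
        extra = sorted(wanted - already)
        already.update(extra)
        return extra


server = LayerServer(layers_per_section=4)
print(server.deliver(section=7, up_to_layer=0))   # quick low-quality preview
print(server.deliver(section=7, up_to_layer=2))   # user pauses: refine this section
print(server.deliver(section=7, up_to_layer=2))   # nothing new to send
```

No re-encoding takes place in this model; the server only locates stored layers and sends the ones that are missing.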

【００３９】Central to the operation of this scheme is the conversion of media files in any encoding format, including those described in the previous section, into a layered format with properties that support the operations described above. This is the purpose of layer labelling, which is described in the next section.

【００４０】D. Description of Layer Labelling
Overview
A layered bitstream is constructed from an encoding of the media by dividing it into packets (here called chunks or slices) in a manner determined by the encoding type, and attaching a label to each chunk that uniquely describes its contribution to the reconstructed file. So that distributed tools can be built and configured to understand the many layered bitstream formats found in a distributed environment, static signalling information (constant for the lifetime of the stream) is made available either as a reference to a globally available file or as data sent once in the bitstream. For configuration information that must vary within the stream, provision is made for dynamic signalling information carried in the bitstream.

【００４１】To be successful, the labelling scheme must do three things well:
・allow parts of a media file to be selected efficiently and controllably, so that a particular quality of service is obtained and media quality, bit rate, or latency requirements are met;
・efficiently support locating and transmitting the parts of the media needed to reconstruct a particular item to a specified quality;
・encapsulate the underlying media encoding to hide its complex details, while presenting a uniform, encoding-independent, layered view of the file.

【００４２】The first requirement is addressed through the concept of a Filter, which specifies which parts of the file are to be selected for transmission. The second requirement calls for in-band signalling information that describes the layered structure within the bitstream. The third requirement is addressed through an indirection mechanism that maps the specifics of a particular encoding onto a uniform layered view. This indirection mechanism is called the Codebook. A Codebook is defined for each new encoding scheme, or for each variation of an existing scheme that uses new parameters.

【００４３】Layer Labelling Format
This section defines the format of the layer labels used in the present invention to convert a media file into a layered stream format. A stream is stored and transmitted as a contiguous group of chunks. Each chunk has a chunk header that defines the length and type of the chunk. Some chunk types are reserved for carrying data (these are termed slices), while others are used for signalling.

【００４４】Additional configuration information about these streams, such as the encoding format and the slice dependencies, is associated with the data stream. The static part of this information is stored either externally or in-band using signalling chunks. Dynamic information is always stored in-band.

【００４５】Chunk Header
Every chunk, regardless of its type, is introduced by a chunk header, the format of which is shown in FIG. 3. The result is a data stream with a fixed structure that nevertheless leaves room for future expansion.

【００４６】Type
The 2-bit type field is used to distinguish between four independent sets of chunk types. (The independence matters, because there is no way to specify a dependency between two labels belonging to different types.) The following type values are currently defined:
・data chunks (slices);
・signalling chunks.

【００４７】Length
The 22-bit length field gives the length of the chunk in bytes, including the header. A chunk that carries no payload (i.e., a header only) therefore has a length of 4.

【００４８】Label
The 8-bit label field is the layer label for the chunk. The interpretation of the label depends on the type field (see above), so the header format allows four sets of 256 labels.
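
Purely by way of illustration, the three header fields described above (2-bit type, 22-bit length, 8-bit label) might be packed into a 4-byte header as sketched below. The order of the fields within the 32-bit word and the byte order are assumptions made for this sketch, since they are defined by FIG. 3, which is not reproduced here; the function names are likewise hypothetical.

```python
import struct

HEADER_SIZE = 4              # 2 + 22 + 8 bits = 32 bits
MAX_LENGTH = (1 << 22) - 1   # largest value a 22-bit field can hold


def pack_header(chunk_type: int, label: int, payload_len: int) -> bytes:
    """Build a 4-byte chunk header; the length field counts the header too."""
    length = HEADER_SIZE + payload_len
    if not 0 <= chunk_type <= 3:
        raise ValueError("type must fit in 2 bits")
    if not 0 <= label <= 255:
        raise ValueError("label must fit in 8 bits")
    if not HEADER_SIZE <= length <= MAX_LENGTH:
        raise ValueError("length must fit in 22 bits")
    # Assumed layout: [type:2][length:22][label:8], most significant bit first.
    word = (chunk_type << 30) | (length << 8) | label
    return struct.pack(">I", word)


def unpack_header(data: bytes) -> tuple[int, int, int]:
    """Return (type, length, label) read from the first 4 bytes of a chunk."""
    (word,) = struct.unpack(">I", data[:HEADER_SIZE])
    return word >> 30, (word >> 8) & MAX_LENGTH, word & 0xFF
```

Under these assumptions, a header-only chunk carries the length value 4, in agreement with the description of the length field, and no chunk can exceed 4,194,303 bytes including its header.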

【００４９】Terminology
Data chunks (equivalently termed slices) are used to encapsulate a stream of multimedia data. The label attached to a slice indicates the nature of the slice contents and its dependency relationships with other slices in the stream. Signalling chunks are used within the stream to pass additional information about the data, such as details of how it has been encoded or filtered.

【００５０】Slice
The way in which slice labels are allocated depends on the exact nature of the multimedia data being encapsulated. There is a basic set of schemes mutually agreed between all tools, but new allocation schemes can be added so that the system can be tailored to suit different situations. The only restrictions are as follows:
・all slices within a data stream must be labelled using the same allocation scheme;
・slices must be labelled uniquely with respect to the data they contain;
・labels must be allocated in slice dependency order.

【００５１】The first restriction is self-evident, but the others merit further explanation. The slice label is taken as an indication of the data content of the slice, and is used by the system to separate and merge slices to obtain data streams of differing quality. So that the stream decoder can be stateless, the slice label must uniquely identify the actual data content. As a result, a slice that is too long to be encapsulated as a single chunk must be split in such a way that the decoder can determine which part is the base data and which is continuation information.

【００５２】Slice labelling must also reflect the dependency hierarchy, with every slice having a numerically lower label value than the slices that depend on it. This means that when the slices of a stream are encoded in dependency order, the label values increase in step with progress through a dependent unit (see "Context" below). Not only does this simplify the job of merging slices from previously split data streams, it also allows the boundaries between contexts to be implicit: if a slice label is not numerically greater than its predecessor, it must belong to a new context.

【００５３】Context
A group of slices related by dependency (either spatial or temporal) is called a context. As described above, the label values of the slices within a context must be arranged so that they are in numerically ascending order. This normally allows the boundaries between contexts to be detected implicitly, without the need for an explicit signalling chunk. Given the label allocation scheme described above, the slices within a context are therefore also arranged in dependency order, so that a slice always precedes those that depend on it.
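
The implicit boundary rule can be sketched as follows, assuming the stream has already been reduced to a plain sequence of slice labels; the function names are hypothetical. A new context is started whenever a label is not numerically greater than its predecessor, and two previously split parts of the same context are merged simply by label order.

```python
import heapq


def split_into_contexts(labels):
    """Group a sequence of slice labels into contexts.

    A label that is not greater than the previous one starts a new context.
    """
    contexts = []
    for label in labels:
        if contexts and label > contexts[-1][-1]:
            contexts[-1].append(label)
        else:
            contexts.append([label])
    return contexts


def merge_context(part_a, part_b):
    """Merge two ascending label lists that belong to the same context."""
    return list(heapq.merge(part_a, part_b))
```

For example, the label sequence 0, 1, 2, 5, 0, 1, 3, 3 splits into the contexts [0, 1, 2, 5], [0, 1, 3] and [3]. The sketch does not cover streams in which the rule is insufficient and an explicit signalling chunk marks the boundary.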

【００５４】Because of the number of slice labels available (and the restriction that slices be labelled uniquely), the maximum number of slices in a context is 256. The dependency hierarchy for the slice labels is defined in the Codebook (see below) and is fixed for the duration of the data stream. Note that slice labels need not be allocated consecutively, nor need a particular slice label be present in every context.

【００５５】As well as grouping together slices that are related by dependency, contexts play a very important role in defining the boundaries at which editing can be carried out. No explicit signalling chunks are needed to determine valid edit points.

【００５６】Contexts are assumed to be separate from their neighbours in the temporal domain and independent of them. In other words, when contexts are extracted from a data stream, each should be decodable as a stand-alone entity. The one exception to this is exemplified by encoding schemes such as MPEG that have temporal dependencies between adjacent groups of pictures. This is handled by associating with each context an "overlap", which allows temporal units (see "Sample" below) to be marked as dependent on the immediately preceding context.

【００５７】The overlap value, stored in the media header (see below), defines how these inter-context dependencies are handled: an overlap value of n implies that the first n samples of a context depend on the previous context. Since the normal situation is for contexts to be temporally self-contained, the default overlap value is zero. Where it is non-zero, the slice dependency hierarchy must reflect the inter-context characteristics of the encoding scheme, with "phantom" dependencies being added where necessary.

【００５８】Sample
A multimedia data stream is usually represented as a sequence of discrete units, each with a well-defined temporal position in the stream. This is particularly true of video data, which is usually modelled as a set of separate frames (even though the frames may have been reordered or grouped during encoding, as with MPEG, for example). The data stream preserves the natural temporal units of the encoding scheme, with each discrete unit being termed a sample. Whether a context contains a single sample or a group of samples depends on the encoding scheme used, but a particular technique usually follows a very strict repeating pattern. By default, therefore, each context contains a fixed number of samples, with the context spacing (the number of samples per context) defined in the system header (see below).
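
As a minimal sketch of how the context spacing and the overlap value interact, the following hypothetical helper lists the contexts that must be fetched to decode a given sample. It assumes a constant context spacing for the whole stream and samples numbered from zero, and it follows only the immediate dependency on the preceding context.

```python
def contexts_for_sample(sample: int, spacing: int, overlap: int = 0) -> list[int]:
    """Return the context sequence numbers needed to decode one sample.

    spacing: number of samples per context (assumed constant here).
    overlap: the first `overlap` samples of a context depend on the
             previous context; zero means contexts are self-contained.
    """
    context, offset = divmod(sample, spacing)
    needed = [context]
    if offset < overlap and context > 0:
        needed.insert(0, context - 1)
    return needed
```

With a spacing of 12 and an overlap of 2, sample 25 is the second sample of context 2 and therefore also requires context 1, whereas sample 26 requires context 2 alone.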

【００５９】The distinction between contexts and samples is important when considering the temporal dependencies of the multimedia data (as in the 3-D wavelet/SPIHT scheme) and the ability to perform temporal decimation (such as playing every fourth frame of a video sequence). A context contains the minimum number of samples that make up a temporal coding unit (a GOP in MPEG, a GOF in wavelet/SPIHT), with spatial and temporal dependencies being handled in exactly the same way. Within a context, each slice is given a temporal priority (see below) that allows the temporal sequence to be decimated to a high degree, but which does not in itself imply any kind of temporal dependency.

【００６０】System Header
The system header is used to define stream attributes that are relevant only at the system layer. Essentially, this means the parameters that the server needs in order to provide the required level of service. For example, the default context spacing is needed to construct the mapping between contexts and samples.

【００６１】Media Header
The media header is used to define stream attributes that are relevant only at the media layer. This means parameters that are needed by the stream decoder to make sense of the data stream, but that are not needed by the server. Examples in this category are the horizontal and vertical resolution of a video encoding, the inter-context dependency overlap, and a reference to the Codebook used for the encoding.

【００６２】The reason for having a separate media header chunk, rather than combining the information with that of the Codebook, is that Codebooks are generic in nature and tend to be independent of the raw properties of the original media. The media header fills in these details for a specific recording, thereby greatly reducing the number of Codebooks needed by the system as a whole.

【００６３】Codebook
The Codebook is used to define the dependency and quality relationships of the slices within a context, and is needed in order to construct filters for the data stream. Whether the dependencies are spatial or temporal, the relationships between the individual slices are represented as a simple hierarchy: each slice type (i.e., a slice with a particular label) has a predetermined set of other slice types on which it depends. The nature of the hierarchy obviously depends on the labelling scheme used, but the Codebook technique allows a great deal of flexibility, provided that the basic rules of slice labelling (see above) are not violated.

【００６４】In the context of the Codebook, "dependency" has the following meaning: for one slice to depend on another, it must be impossible to decode the slice without it. (This is true, for example, of MPEG video, where an I-frame is needed before the first P-frame can be decoded.) Note that this dependency relationship says nothing about the desirability of a slice when it comes to producing results of acceptable quality.
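
The dependency hierarchy might be held as a simple mapping from each slice label to the labels it depends on. The sketch below is an illustration only: the table layout, the function names and the three-label example are assumptions, not the Codebook format of the invention. It checks the labelling rule (a slice always has a higher label than anything it depends on) and computes the full set of slices that must accompany a requested slice.

```python
def validate_dependencies(deps: dict[int, list[int]]) -> None:
    """Raise ValueError if a slice depends on an equal or higher label."""
    for label, required in deps.items():
        for other in required:
            if other >= label:
                raise ValueError(f"label {label} may not depend on {other}")


def required_slices(deps: dict[int, list[int]], wanted: list[int]) -> list[int]:
    """Return the wanted labels plus everything they transitively depend on."""
    needed = set()
    stack = list(wanted)
    while stack:
        label = stack.pop()
        if label not in needed:
            needed.add(label)
            stack.extend(deps.get(label, ()))
    return sorted(needed)


# Hypothetical three-label table: label 1 needs 0, label 2 needs 0 and 1.
EXAMPLE_DEPS = {0: [], 1: [0], 2: [0, 1]}
```

Because the result is returned in ascending label order, it is also in dependency order and can be transmitted as it stands.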

【００６５】As well as defining the dependency hierarchy, it is the job of the Codebook to store the mapping between individual slice values and the "quality" of a decoded data stream that includes those slices. Of necessity, and in contrast to the strict slice dependencies described above, stream quality is a vague and subjective factor; indeed, there is likely to be more than one quality dimension. For example, 3-D wavelet/SPIHT encoded video can be regarded as having four quality axes: scale (image size), fidelity (image quality), colour, and temporal blur.

【００６６】The bulk of the Codebook consists of a series of tables (one for each slice label) containing the dependency data, the temporal priority, and a set of quality parameters, one for each quality axis. The number and type of quality axes defined depend on the nature of the multimedia data and the encoding scheme used. To simplify matters, three important restrictions are applied to the quality axes.

【００６７】Quality axes are assigned names (or quality tags) that have global significance. Codebooks that employ the same quality tag must use the same allocation scheme when it comes to the quality parameters for the relevant axis.

【００６８】Quality parameters are always specified with respect to the highest-quality (unfiltered) data stream. In the case of video, for example, this allows a common Codebook to be used irrespective of image size.

【００６９】The quality parameters themselves must, where practical, be independent of the actual encoding scheme used. For example, the scale parameter for video data might represent the width or height of the scaled image relative to the original, assuming an unchanged aspect ratio.

【００７０】The Codebook header contains the list of quality tags for which quality parameters will be found in the tables that follow. Combined with the restrictions outlined above, this allows filters to be created in a manner that is entirely independent of the multimedia data type and encoding scheme. Using a filter created with a quality axis that is either absent or unspecified always results in slices being unfiltered with respect to that axis.
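
A filter might be derived from the Codebook tables along the following lines. This is a sketch under stated assumptions: each slice is taken to carry one number per quality axis, a slice is kept when its number does not exceed the limit requested for that axis, and the names and table layout are hypothetical. The one behaviour taken directly from the description is that an axis which is absent from the Codebook, or unspecified for a slice, never causes a slice to be filtered.

```python
def build_slice_filter(quality_tags, table, limits):
    """Return the set of slice labels that pass the requested limits.

    quality_tags: axis names listed in the Codebook header.
    table:        {label: {axis: value}} quality parameters per slice.
    limits:       {axis: highest value to keep} requested by the client.
    """
    kept = set()
    for label, params in table.items():
        keep = True
        for axis, limit in limits.items():
            if axis not in quality_tags:
                continue  # axis absent from the Codebook: no filtering
            value = params.get(axis)
            if value is None:
                continue  # axis unspecified for this slice: no filtering
            if value > limit:
                keep = False
        if keep:
            kept.add(label)
    return kept


# Hypothetical Codebook with two axes and four slice labels.
TAGS = ["scale", "fidelity"]
TABLE = {
    0: {"scale": 1, "fidelity": 1},
    1: {"scale": 2, "fidelity": 1},
    2: {"scale": 1, "fidelity": 2},
    3: {"scale": 2, "fidelity": 2},
}
```

Requesting a limit on an axis named "colour" against this Codebook leaves all four slices in place, since the axis is not among its quality tags. A complete filter would additionally have to respect the dependency hierarchy, which this sketch leaves out.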

【００７１】Temporal Priority
It is often necessary to decimate a data stream, either to reduce its bandwidth requirement or to play it at faster than normal speed. Given the variety of data encoding schemes, it is not always sensible to operate with a naive sub-sampling scheme such as "one sample in every four". There are two reasons for this.

【００７２】The inter-slice dependencies of the samples involved may prohibit simple decimation. This is true of MPEG video, for example, where there is a complex web of temporal dependencies between I, P, and B frames.

【００７３】When attempting to maintain a cache of slice data and to ensure efficient use of the available bandwidth, it is essential that repeated (or refined) sub-sampling requests are carefully coordinated to maximise the overlap of data.

【００７４】Each slice of a context is therefore assigned a temporal priority that is used to control decimation. In MPEG video, for example, slices belonging to I, P, and B frames are assigned temporal priorities of 0, 1, and 2 respectively. It is these temporal priorities that are used when performing temporal filtering below the level of the context.
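
Using the MPEG priorities given above (I = 0, P = 1, B = 2), temporal filtering within a context can be sketched as below. The group-of-pictures pattern and the function names are illustrative assumptions. The second function shows why priorities suit a client-side cache: a refined request needs only the slices whose priority lies between the level already held and the level now wanted.

```python
TEMPORAL_PRIORITY = {"I": 0, "P": 1, "B": 2}


def decimate(frames, max_priority):
    """Keep only frames whose temporal priority is at most max_priority."""
    return [f for f in frames if TEMPORAL_PRIORITY[f] <= max_priority]


def refinement(frames, held_priority, wanted_priority):
    """Frames still to be fetched when raising the priority level held."""
    return [
        f for f in frames
        if held_priority < TEMPORAL_PRIORITY[f] <= wanted_priority
    ]


GOP = list("IBBPBBPBB")  # hypothetical nine-frame context
```

Decimating this context to priority 1 leaves the frames I, P, P, and a later request for full temporal resolution fetches only the six B frames.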

【００７５】Signalling Chunks
Signalling chunks are used to identify and annotate the data chunks in a data stream. Whether they are stored together with the data by the server or generated on the fly at delivery time is an implementation detail outside the scope of this document. Signalling chunks fall into three distinct categories: static, dynamic, and padding.

【００７６】Static chunks
Static chunks define parameters that do not change during the lifetime of the data stream. This includes the definition of basic default values that apply to the whole stream, such as the identity of the Codebook employed to generate the data slices. Because these chunks are static, they need not be included as part of the data stream: the information may equally well be sent out-of-band or, as in the case of reusable Codebook tables, by reference. If they are transmitted or stored in-band, however, they must be unique, must precede the first data slice, and must be stored in numerically ascending opcode order.
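
The three in-band rules for static chunks (unique, before the first data slice, ascending opcode order) lend themselves to a simple check. In the sketch below the stream is modelled as a list of (kind, opcode) pairs, and opcodes below 0x80 are treated as static. Both the model and the 0x80 threshold are assumptions made for illustration; the threshold is merely consistent with the opcode values listed in this description.

```python
STATIC_LIMIT = 0x80  # assumed: static opcodes are below this value


def static_chunks_valid(chunks) -> bool:
    """Check the placement rules for in-band static signalling chunks.

    chunks: iterable of (kind, opcode) pairs, where kind is "data" or
            "signal" and opcode is ignored for data chunks.
    """
    seen_data = False
    last_static = -1
    for kind, opcode in chunks:
        if kind == "data":
            seen_data = True
        elif opcode < STATIC_LIMIT:
            if seen_data:
                return False      # static chunk after the first data slice
            if opcode <= last_static:
                return False      # duplicate or out of ascending order
            last_static = opcode
    return True
```

A stream that carries opcodes 0x00, 0x01 and 0x02 ahead of its first slice passes the check; repeating 0x01 or placing 0x02 before 0x01 does not.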

【００７７】Dynamic chunks
Dynamic chunks define parameters that vary depending on position within the data stream. This includes filtering information, since the filter used to generate the stream can change as the slices are delivered. It also includes all variable context and sample information, such as the indication of a "seek" operation and the handling of contexts with a non-constant context spacing. By their nature, dynamic chunks carry positional information within the data stream and must therefore always be sent in-band. Where present, dynamic chunks are valid only at context boundaries.

【００７８】Padding chunks
Padding chunks have no semantic influence on the data stream, and can therefore be placed at any chunk boundary.

【００７９】Static Signalling Chunks

System Header
Type: static
Opcode: 0x00
Contents: Information specific to the system layer only (the media server); not needed to filter or decode the data stream. Parameters are:
・context spacing
Format: name/value pairs
Status: mandatory
Position: out-of-band, or in-band before the first data slice.

【００８０】Media Header
Type: static
Opcode: 0x01
Contents: Information specific to the media layer only (the decoder); not needed for the operation of the media server. Parameters are:
・original media parameters
・context dependency overlap
・code book reference
Format: name/value pairs
Status: mandatory
Position: out-of-band, or in-band before the first data slice.

【００８１】Codebook
Type: static
Opcode: 0x02
Contents: Information specific to the generation of slice filters only; not needed for the operation of the media server or the decoder:
・quality tag list
・slice dependency information
・slice quality information
・slice temporal priority information
Status: mandatory
Position: out-of-band, by reference, or in-band before the first data slice.

【００８２】Metadata
Type: static
Opcode: 0x04
Contents: Information associated with the data stream but not needed for its storage, filtering, or transmission. Examples are:
・date, time, and location
・data history
・copyright notice
Format: name/value pairs
Status: optional
Position: out-of-band, or in-band before the first data slice.

【００８３】User Data (1, 2, 3, 4)
Type: static
Opcode: 0x05, 0x06, 0x07, 0x08
Contents: Private information added to the data stream by the generator of the encoding, but not interpreted in any way by the system.
Format: opaque byte array
Position: out-of-band, or in-band before the first data slice.

【００８４】Dynamic Signalling Chunks

Baseline
Type: dynamic
Opcode: 0x80
Contents: Information about the current position in the stream in terms of context and sample sequence numbers; normally used to indicate a "seek" operation within the data stream. There are two parameters, either of which may be omitted if its value can be inferred from the stream position:
・context sequence
・sample sequence
Unless otherwise specified at the start of the data stream, the context and sample sequence numbers both start at zero.
Format: 64-bit little-endian binary
Status: optional
Position: in-band at a context boundary.

【００８５】Context Start
Type: dynamic
Opcode: 0x81
Contents: The context spacing for the immediately following context.
Format: 32-bit little-endian binary
Status: optional; needed only when the context spacing differs from the default.
Position: in-band at a context boundary.

【００８６】Context End
Type: dynamic
Opcode: 0x82
Contents: none
Format: n/a
Status: optional; needed only when context boundaries cannot be determined by a simple comparison of slice type values.
Position: in-band at a context boundary.

【００８７】Filter Definition
Type: dynamic
Opcode: 0x83
Contents: Used to encode the stream filter information for the data slices that follow. It takes the form of a pair of (possibly compressed) bit sequences that indicate which data has been filtered out of the subsequent stream.

【００８８】The slice mask indicates which of the 256 possible slice types have been filtered out of the data stream. It neither implies that the remaining slice types will be found in the stream, nor guarantees that the slices that are present conform to the dependency criteria specified in the Codebook.

【００８９】The context mask is used to indicate which entire contexts have been filtered out of the data stream. Each context has a unique position in the sequence, and the mask refers to the value of that position modulo 240 (a number chosen for its numerical properties, being a common multiple of 2, 3, 4, 5, 6, 8, 10, 12, 15, 16, 20, 24, and 30). Unlike the slice mask, the context mask can be used to determine which contexts are present and which are absent. In the case of inter-context dependencies, however, there is no guarantee that the relevant information is actually present.
Status: optional
Position: in-band at a context boundary.
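
The choice of 240 means that a regular decimation such as "keep every fourth context" repeats exactly within one mask period. A minimal sketch follows; it assumes that the mask is held as a 240-bit integer in which a set bit means "filtered out", which is an illustrative convention and not the encoded form of the filter definition chunk.

```python
MASK_PERIOD = 240


def decimation_mask(step: int) -> int:
    """Build a context mask that keeps every `step`-th context.

    Bit p is set when contexts at position p (modulo 240) are removed.
    The step must divide 240 so that the pattern repeats cleanly.
    """
    if step <= 0 or MASK_PERIOD % step != 0:
        raise ValueError("step must be a divisor of 240")
    mask = 0
    for position in range(MASK_PERIOD):
        if position % step != 0:
            mask |= 1 << position
    return mask


def context_present(mask: int, context_sequence: int) -> bool:
    """Tell whether the context at this sequence number was kept."""
    return ((mask >> (context_sequence % MASK_PERIOD)) & 1) == 0
```

For a step of 4, the mask marks 180 of the 240 positions as removed, and contexts 0, 4, 8 and so on (including 240 and 244) are reported as present.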

【００９０】Padding Signalling Chunks

Padding
Type: dynamic
Opcode: 0xff
Contents: Opaque byte array. Padding chunks may be needed when a fixed bit-rate stream is being encoded, or when a particular byte alignment is required. They can also be used to relabel slices in place for data filtering experiments.
Format: opaque byte array
Status: optional
Position: in-band.
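
Where padding is used for byte alignment, the padding chunk cannot be shorter than its own 4-byte header. The following sketch, with a hypothetical function name, computes the total length of the padding chunk needed so that the next chunk starts on an aligned offset; it assumes that alignment is measured from the start of the stream.

```python
HEADER_SIZE = 4  # a header-only chunk has length 4


def padding_length(offset: int, alignment: int) -> int:
    """Total length of the padding chunk to insert at `offset`.

    Returns 0 when the offset is already aligned. Otherwise the gap is
    enlarged by whole alignment units until a chunk header fits in it.
    """
    gap = -offset % alignment
    while 0 < gap < HEADER_SIZE:
        gap += alignment
    return gap
```

At offset 10 with 8-byte alignment a 6-byte padding chunk suffices, whereas at offset 14 the 2-byte gap is too small for a header and a 10-byte chunk is needed.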

【００９１】Examples of layer labeling, filtering, merging, and codebooks
FIG. 4 gives an overview of some typical transformations that a media file undergoes as it is layered, filtered, and encoded. In particular, the figure shows the two possible styles of filtering: filtering at the slice level (within a dependency unit) and filtering at the context level (across entire dependency units). Slice-level filtering is used to change the color depth, spatial resolution, fidelity, or temporal resolution of the media. Context-level filtering has a coarser "grain" and is designed primarily to support shortening the media in time.

【００９２】Wavelet example
FIG. 5 shows a more detailed example of labeling wavelet/SPIHT-encoded data. A wavelet decomposition of depth 2 with four significance levels results in a context of 28 slices. Part of one possible codebook for this encoding is shown in FIG. 6, and an example of how the codebook is used to generate a filter is shown in FIG. 7.

【００９３】In FIG. 6, the codebook header contains a CodebookID field for self-identification; a QualityTags field that specifies the quality axes available for this encoding; a QualityDivisions field that specifies, using plain-text names, how these axes are labeled; and an OriginalSize field that specifies the spatial resolution of the original media and so provides a baseline for scale calculations.

【００９４】Each encoding scheme may have its own codebook defined, but any codebooks that share common QualityTags entries must label those quality axes using exactly the same QualityDivisions scheme. The text names are used to map properties of different media encodings, which may be similar but not identical, onto a single quality scheme. This gives the system an important capability: applications can be written to manipulate media using these names in a manner that is completely independent of the encoding format.

【００９５】In the figure, assume that a file manipulation tool wants to extract a medium-fidelity, monochrome, half-scale encoding of a 352×288 video. To accomplish this, the codebook is searched for the highest-numbered label that has the required QualityParams values, namely "scale=half, fidelity=medium" (in practice, this search is optimized using a precomputed index). Once this label (13) has been found, all dependent labels are examined until the dependency graph has been completely resolved. As this proceeds, a bitmask representing the slice filter is built up: a "0" at bit position n means that the slice with label (n) is required in the filtered stream, and a "1" means that the slice is absent. The result of using this filter is shown in FIG. 7.
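The search-and-resolve procedure just described can be sketched as follows, purely as an illustration. The codebook contents, the dictionary layout, and the names `TOY_CODEBOOK` and `build_slice_filter` are hypothetical; they are not the codebook of FIG. 6, whose entries are not reproduced in the text.

```python
# Hypothetical toy codebook: label -> QualityParams and the labels it depends on.
TOY_CODEBOOK = {
    0: {"params": {"scale": "quarter", "fidelity": "low"}, "depends_on": []},
    1: {"params": {"scale": "quarter", "fidelity": "medium"}, "depends_on": [0]},
    2: {"params": {"scale": "half", "fidelity": "low"}, "depends_on": [0]},
    3: {"params": {"scale": "half", "fidelity": "medium"}, "depends_on": [1, 2]},
    4: {"params": {"scale": "full", "fidelity": "low"}, "depends_on": [2]},
    5: {"params": {"scale": "full", "fidelity": "medium"}, "depends_on": [3, 4]},
}


def build_slice_filter(codebook, wanted, n_labels):
    """Return a list of bits: 0 = slice required, 1 = slice absent."""
    # Find the highest-numbered label whose QualityParams match the request.
    matches = [
        label
        for label, entry in codebook.items()
        if all(entry["params"].get(k) == v for k, v in wanted.items())
    ]
    if not matches:
        raise LookupError("no label matches the requested QualityParams")

    # Resolve the dependency graph starting from that label.
    required = set()
    pending = [max(matches)]
    while pending:
        label = pending.pop()
        if label in required:
            continue
        required.add(label)
        pending.extend(codebook[label]["depends_on"])

    return [0 if n in required else 1 for n in range(n_labels)]


HALF_MEDIUM = build_slice_filter(
    TOY_CODEBOOK, {"scale": "half", "fidelity": "medium"}, n_labels=6
)
```

In this toy case the request resolves to labels 3, 2, 1, and 0, so only the two full-scale labels are marked absent.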

【００９６】The filtered slices are preceded by a filter definition signaling chunk that holds the slice mask used to perform the filtering. This value is used by downstream tools that need knowledge of the stream characteristics, for example to initialize a decoder or to set up memory areas for buffering. The slice mask is also used when filtered bitstreams are merged, as illustrated in FIG. 8, to determine the slice content of the resulting stream. There, a refinement stream containing slices 16, 17, 20, 21, 24, and 25 is merged with the half-scale, low-resolution filtered stream of FIG. 7 to obtain a full-scale, low-resolution encoding. The slice mask for the combined stream is obtained using a simple bitwise OR of the two input slice masks.
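As a hedged sketch of the merge, the example below treats each mask as a presence mask (bit n set means the slice with label n is present), which is an assumption made here so that a bitwise OR yields the union of the two streams. Under the opposite reading (a set bit means the slice is absent), the same result corresponds to a bitwise AND of the two absence masks. The labels chosen for the base stream are hypothetical; only the refinement labels 16, 17, 20, 21, 24, and 25 come from the text.

```python
N_LABELS = 28  # size of the wavelet context described earlier in the text


def presence_mask(labels):
    """Pack slice labels into an integer with bit n set for each present label."""
    mask = 0
    for n in labels:
        mask |= 1 << n
    return mask


def labels_in(mask):
    """Unpack an integer presence mask into a sorted list of labels."""
    return [n for n in range(mask.bit_length()) if (mask >> n) & 1]


def merge_presence(mask_a, mask_b):
    """Slices present in the merged stream: the union of both inputs."""
    return mask_a | mask_b


BASE = presence_mask([0, 1, 4, 5])  # hypothetical base-stream labels
REFINEMENT = presence_mask([16, 17, 20, 21, 24, 25])  # labels from the text
MERGED = merge_presence(BASE, REFINEMENT)

# Equivalent computation on absence masks (bit set = slice filtered out).
ALL = (1 << N_LABELS) - 1
ABSENT_MERGED = (ALL & ~BASE) & (ALL & ~REFINEMENT)
```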

【００９７】MPEG example
FIG. 9 shows an example of labeling MPEG-encoded data, and FIG. 10 shows part of the corresponding codebook. In this example, it is assumed that a file manipulation tool wants to generate a filter that operates on slices in such a way as to produce a fast-playback stream. In this encoding the slices represent MPEG frames, so the effect of the filter is to shorten the stream in time. From the codebook, the tool resolves the dependency hierarchy for the QualityParams value playrate=fast and builds the slice filter as described above. The result of applying the filter is shown in FIG. 11.

【００９８】DV example
FIG. 12 shows an example of labeling DV-encoded data, and FIG. 13 shows part of the corresponding codebook. In this example, the DV-encoded data is labeled according to the following scheme: data representing both the 2×4×8 and the 8×8 DCT blocks from field 1 is assigned slice level 0, and data representing the 8×8 DCT blocks from field 2 only is assigned slice level 1. This results in two layers: a base layer with half the vertical resolution, derived from field 1, and a refinement layer that yields a reconstructed image at full resolution. FIG. 14 illustrates the result of applying a filter that selects only the label 0 slices, producing a video stream with half the vertical resolution.

【００９９】E. Client-driven image delivery
The method described in this specification allows a client to specify information about the images it wishes to receive and to have the server deliver the appropriate slices of the image data at the appropriate times. The client can specify (among other parameters) the image size, the image quality, the frame rate, and the data rate. The server uses this information to decide which sections of the image data should be sent to the client, and at what times.

【０１００】The client can also specify the playback rate at which the data is to be delivered. The display time of each frame is used to deliver the data in a manner that is timely for the selected rate.

【０１０１】F. Playback parameters
When a client requests delivery of data, it typically specifies a set of playback parameters specific to that client's needs and state. The playback parameters may specify permitted values for any of the quality parameters defined for the media. The playback parameters may be specified by the client, may already be stored on the server, or may consist of a combination of client and server parameters.

【０１０２】In particular, the playback parameters may define one or more of the following media parameters for an image or sequence of images: (i) the spatial resolution of the image or images; (ii) the level of distortion in the image or images; (iii) the number of images that can be displayed; (iv) the selection of color components within one or more images; (v) the subset of the available frames to be delivered; and (vi) a region of interest within one or more frames.

【０１０３】The playback parameters may also define one or more of the following media parameters for audio: (i) the audio distortion; (ii) the audio dynamic range; (iii) the number of audio channels transmitted; and (iv) the selection of mono, stereo, or four-channel audio.

【０１０４】Where a codebook is used, some of these parameters may be encapsulated in a client-generated or server-generated filter for the media.

【０１０５】G. Server playback
The enhancement may be specified by the client, may already be stored on the server, or may consist of a combination of client and server parameters. The server uses the playback parameters to deliver the appropriate data to the client. This process involves the following steps: (1) the server determines the complete set of slices to be transmitted; (2) the server delivers the slices at the appropriate rate.

【０１０６】H. Determining the slice set for playback
The server selects the slices that need to be transmitted on the basis of some of the playback parameters. It starts with the complete set of slices and then discards slices as follows:
・Discard all slices that encode color components that are not required.
・Discard all slices that encode only frames lying outside the specified frame set.
・Discard all slices that encode only frames not required for the selected frame rate.
・Discard all slices that encode a quality higher than the requested quality and whose parent layer encodes at least the requested quality.
・Discard all slices that encode a scale higher than the requested scale and whose parent layer encodes at least the requested scale.
・Discard all slices that are marked as already available at the client.
・Discard all slices for which any quality parameter lies outside the permitted range specified for that parameter in the playback parameters.
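The discard rules above can be illustrated with the following minimal sketch. The `Slice` record, its field names, and the function `select_for_playback` are hypothetical, and for brevity each slice is assumed to encode exactly one frame and one color component; only four of the listed rules are shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Slice:
    label: int
    frame: int
    color: str
    quality: int
    # Quality encoded by the parent layer; None for a base-layer slice.
    parent_quality: Optional[int] = None


def select_for_playback(slices, colors, frames, quality, at_client):
    """Start from the complete set and drop slices the rules rule out."""
    kept = []
    for s in slices:
        if s.color not in colors:
            continue  # color component not required
        if s.frame not in frames:
            continue  # frame outside the specified frame set
        if (
            s.quality > quality
            and s.parent_quality is not None
            and s.parent_quality >= quality
        ):
            continue  # refines beyond the requested quality
        if s.label in at_client:
            continue  # already available at the client
        kept.append(s)
    return kept


EXAMPLE_SLICES = [
    Slice(0, frame=0, color="Y", quality=1),
    Slice(1, frame=0, color="Y", quality=2, parent_quality=1),
    Slice(2, frame=0, color="Y", quality=3, parent_quality=2),
    Slice(3, frame=0, color="U", quality=1),
    Slice(4, frame=5, color="Y", quality=1),
]
```

For a monochrome request at quality 2 for frame 0, with slice 0 already held by the client, only slice 1 remains to be sent in this toy data set.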

【０１０７】Other selection criteria may be applied at this point.

【０１０８】I. Delivery of slices for playback
The server delivers the selected slices at the data rate required for display at the specified playback rate. Using the timing information associated with the slices, it delivers the data in accordance with standard pacing mechanisms for streaming time-dependent data.

【０１０９】J. Discarding data sections
For the resulting frames to be displayed in a timely manner, the data must be delivered at the required rate. When this is not possible, the server discards some of the slices. This may happen because the data rate of the selected slices is higher than the maximum rate specified in the playback parameters, or because the transmission channel does not have sufficient capacity for the selected data rate, whether permanently or temporarily. Data is discarded according to a list of criteria, including those contained in a subset of the playback parameters known as the discard parameters. The most important criterion is the dependency between a data section and its parents: the parents of a data section are never discarded before the data section itself. The remaining criteria, and the order in which they are applied, are specified by the client in the discard parameters. The following criteria may be applied:
・Slices required for a given frame rate or higher may be discarded.
・Slices that encode a given scale or higher may be discarded.
・Slices that encode a given quality or higher may be discarded.
・Slices that encode a particular region may be discarded.
・Slices that encode a particular color component may be discarded.
・Slices with a quality parameter exceeding a given limit may be discarded.

【０１１０】Other selection criteria may be applied at this point. Each criterion in the list may refer to any of the quality parameters that apply to the slices, and the same quality parameter may appear in several criteria. For example, the discard parameters may first discard all data at or above a certain quality, then discard alternate frames, and then discard all data at or above a second, lower quality. Data is discarded until the required data rate falls to or below the capacity available at that time.
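The ordered discard procedure can be sketched as below, under several assumptions: each slice carries a hypothetical `rate` (its contribution to the data rate), a criterion is modeled as a predicate, and discarding a slice first discards its descendants so that a parent never goes before the data that depends on it. The criteria mirror the example in the text (a quality threshold, then alternate frames, then a lower quality threshold).

```python
from collections import namedtuple

# Hypothetical slice record; `parent` is a label or None.
S = namedtuple("S", "label frame quality rate parent")


def discard_to_fit(slices, criteria, capacity):
    """Apply criteria in order until the total rate fits the capacity."""
    kept = {s.label: s for s in slices}

    def total_rate():
        return sum(s.rate for s in kept.values())

    def drop(label):
        # Drop dependents first: a parent is never discarded before its child.
        for child in [c.label for c in kept.values() if c.parent == label]:
            drop(child)
        kept.pop(label, None)

    for criterion in criteria:
        candidates = sorted(
            (s for s in kept.values() if criterion(s)),
            key=lambda s: s.label,
            reverse=True,
        )
        for s in candidates:
            if total_rate() <= capacity:
                return sorted(kept)
            if s.label in kept:
                drop(s.label)
    return sorted(kept)


SLICES = [
    S(0, 0, 1, 10, None), S(1, 0, 2, 10, 0), S(2, 0, 3, 10, 1),
    S(3, 1, 1, 10, None), S(4, 1, 2, 10, 3), S(5, 1, 3, 10, 4),
]
CRITERIA = [
    lambda s: s.quality >= 3,     # data at or above a certain quality
    lambda s: s.frame % 2 == 1,   # alternate frames
    lambda s: s.quality >= 2,     # a second, lower quality threshold
]
```

With a capacity of 45 the first criterion alone suffices; with a capacity of 25 the alternate frame is also dropped, and the third criterion is never reached.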

【０１１１】K. Changes to the playback parameters
The client may change the playback parameters at any time. Taking into account the slices already delivered to the client, the server recalculates a new set of required slices. If a slice is currently being transmitted, its transmission may be completed, unless the slice no longer appears in the required list, in which case the transmission is terminated. If the transmission is terminated, the data stream is marked accordingly so that the client is notified of the incomplete slice.

【０１１２】L. Data storage by the client
When the client successfully receives data, the data is stored for decoding and reuse. At a later time, the server can improve the stored data by sending further slices for a frame or sequence of frames. The client adds this data to the existing stored data to give compressed data that can be decoded to produce the frame or frame sequence at a higher quality, resolution, or frame rate. When the time comes to display a frame or frame sequence, the client uses a combination of the stored data and the data currently being delivered from the server to provide the compressed form of the frame or frame sequence. This data is then decoded and displayed as appropriate.

【０１１３】M. Handling limited channel capacity
A data channel has a limited capacity that depends on factors such as the overall capacity of the network, the utilization of the network, and the capacity allocated to the channel. If the capacity required by the playback profile is greater than the available capacity, the server reduces the amount of data transmitted using the criteria given by the discard parameters. If the capacity required by the quality profile is less than the capacity available on the channel, or if it is impossible to match exactly the data rate available for playback, the channel is said to have surplus capacity.

【０１１４】N. Surplus channel capacity
Surplus capacity can become available in a number of ways. For example, the client may specify a maximum quality setting in the playback profile while the required frame rate varies with the user's actions. When video is played in slow motion, the capacity needed for the video data is reduced, which creates surplus capacity. In the extreme case, the user may pause playback and display a still image. Once the still image has been transferred at the required quality, no further data is needed and the entire channel capacity is available as surplus. Similarly, the client may replay a previously delivered frame or frame sequence. Some of the data required for the frames is already stored by the client, so the capacity required to play the data back at the specified quality will be smaller. The playback parameters may also be chosen to specify a quality that can be delivered without using all of the available channel capacity; the remaining capacity is surplus. Some or all of the channel capacity may therefore be surplus.

【０１１５】Use of surplus capacity
When a channel has surplus capacity, it is used to improve the quality of video frames by loading some or all of the image data for those frames in advance. This may involve loading data for new frames that have not previously been downloaded (that is, "supplementing" the previously downloaded frames), or it may involve improving existing frames by downloading more slices for them. This process is known as enhancement and is under the control of a set of enhancement parameters. "Enhancement" therefore includes supplementing previously downloaded frames. The set of enhancement parameters is also known as an enhancement profile; it is used to select the order in which frames are preloaded and the amount of data loaded for each frame.

【０１１６】For example, the surplus capacity may be used to load a low-resolution version of an entire sequence, allowing subsequent fast playback through the data. Alternatively, the surplus capacity may be used to load keyframes of the sequence so that fine details of the data can be refined, or to improve the quality of the frames around the current playback position.

【０１１７】The data may be retained at the client for later reuse, or it may be treated as temporary and discarded once the client has no further immediate use for it.

【０１１８】In some cases there may be problems in delivering data from the server to the client, for example when delivery takes place over a network using an unreliable transport protocol. In that case there may be gaps in the stored data owing to data loss in the network. Surplus capacity is used to fill in these gaps and so provide a complete set of compressed data.

【０１１９】O. Server enhancement
The server uses the enhancement parameters to deliver the appropriate data to the client. This process involves the following steps:
・The server determines the complete set of slices to be transmitted.
・The server orders the slices according to the enhancement parameters.
・The server delivers the slices using all of the available surplus capacity.

【０１２０】Slice selection for enhancement
The enhancement parameters are used to select slices for enhancement in the same way that the playback parameters are used to select slices for playback.

【０１２１】Ordering of slices for enhancement
Once the slices for enhancement have been selected, there is an additional step in which the slices are reordered to meet the requirements of the enhancement parameters. This allows the client to specify the order in which slices are delivered and hence the way in which surplus bandwidth is used to improve or supplement the local copy of the images.

【０１２２】The slices are ordered by assigning a score to each slice. The score is calculated from the value of each available quality parameter and the importance assigned to that quality parameter in the enhancement parameters.

【０１２３】The following parameters are used in scoring a slice:
・the scale or resolution available within the given slice;
・the distortion present within the given slice;
・the frames encoded by the given slice;
・the region of interest, within a frame, encoded by the given slice;
・the color components encoded by the given slice;
・any other quality parameters specified for the slice;
・other ordering criteria that may be applied at this point.

【０１２４】The slices are ordered according to the score of each slice. Finally, the slices are ordered so as to guarantee that the parents of a slice always appear before the slice itself, which ensures that each slice can be decoded as soon as it arrives.
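A minimal sketch of the scoring and ordering steps follows. The text does not give the scoring formula, so the weighted sum used here is an assumption, as are the dictionary layout, the numeric parameter values, and the name `order_for_enhancement`.

```python
def order_for_enhancement(slices, weights):
    """Order slice labels by descending score, with parents always first."""

    def score(s):
        # Assumed formula: weighted sum of the slice's quality parameter values.
        return sum(weights.get(name, 0.0) * value for name, value in s["params"].items())

    by_label = {s["label"]: s for s in slices}
    ordered, emitted = [], set()

    def emit(s):
        if s["label"] in emitted:
            return
        if s["parent"] is not None:
            emit(by_label[s["parent"]])  # the parent must arrive first
        emitted.add(s["label"])
        ordered.append(s["label"])

    for s in sorted(slices, key=score, reverse=True):
        emit(s)
    return ordered


TOY_SLICES = [
    {"label": 0, "parent": None, "params": {"scale": 1, "fidelity": 1}},
    {"label": 1, "parent": 0, "params": {"scale": 2, "fidelity": 1}},
    {"label": 2, "parent": 0, "params": {"scale": 1, "fidelity": 2}},
    {"label": 3, "parent": 2, "params": {"scale": 1, "fidelity": 3}},
]
FIDELITY_FIRST = order_for_enhancement(TOY_SLICES, {"fidelity": 2.0, "scale": 1.0})
```

With fidelity weighted above scale, the highest-scoring slice (label 3) pulls its ancestors 0 and 2 ahead of it, and the scale refinement (label 1) is sent last.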

【０１２５】Delivery of slices for enhancement
Slices for enhancement are delivered using all of the surplus capacity available on the communication channel. Unlike in playback, there is no pacing or streaming of the media data. The full capacity of the communication channel is therefore used at all times, even when it is not needed for immediate playback of the media.

【０１２６】Extension to other uses
In the description above, the resulting video frames are displayed on the client computer. The technique can also be used to provide media data for other uses. For example, the frames may be used as input to an image analysis system to provide features such as shot-change detection and object recognition. The frames may also be recompressed into a new format and stored on a medium such as paper, film, magnetic tape, disk, CD-ROM, or another digital storage medium.

【０１２７】Extension to other data types
The technique can be applied to other data types such as audio, graphics, and animation. It can support any data that is compressed into a format that permits enhancement as described above.

【０１２８】P. Appendix 1
Some definitions of terms used in the context of the illustrated embodiment.
Scale
A function is analyzed into a set of components with different time, space, and frequency content. These components are called scales, and the process of analysis into scales is called multiscale decomposition. The analysis is performed with a waveform of limited duration and zero integral, called a wavelet. Components that are highly localized in time or space but have a poorly defined frequency spectrum are small-scale and capture fine detail. Components that are spread out in space but have a precisely defined frequency spectrum are large-scale and capture overall trends.

【０１２９】Layer
A layer is a conceptual part of a media file that initializes or refines a single homogeneous component of that file, the component being data that represents spatial resolution, temporal resolution, sample fidelity, color, or some other quality axis. It is therefore the data set for one level of quality within the complete file.

【０１３０】Slice
A slice is the part of a media file that encodes the data for a layer within a single context. Each layer within a particular media file is therefore itself divided into a series of slices, one for each context.

【０１３１】Base Layer
The layer that is sent first to a client in order to initialize the display of a media object at that client, and to which refinement layers are added.

【０１３２】Refinement Layer
A layer that is sent to a client after the base layer and that "improves" the quality of the display of the media object at that client according to some metric.

【０１３３】Significance Layer
A layer in which all the refinement information refers to a particular bit position for all the coefficients undergoing refinement.

【０１３４】Scale Layer
A layer in which all the refinement information relates to a particular scale.

【０１３５】Region Layer
A layer in which all the refinement information relates to a particular region of interest in a function that varies in space or time.

【０１３６】Distortion
Image distortion is measured by the peak signal-to-noise ratio (PSNR), where PSNR = 10 log₁₀(255²/MSE) and MSE is the mean squared error of the image.
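As a worked illustration of the distortion measure, the sketch below computes the PSNR for 8-bit samples given as flat sequences. The function name and the flat-list representation are assumptions made for this example, and the result is expressed in decibels using a base-10 logarithm.

```python
import math


def psnr(original, reconstructed):
    """Peak signal-to-noise ratio in dB for 8-bit samples (peak value 255)."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between corresponding samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10 * math.log10(255 ** 2 / mse)
```

A higher PSNR means lower distortion; an error of one gray level on every sample gives roughly 48 dB.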

【０１３７】Quality Tags
The set of axes available for representing different aspects of perceived quality, for example spatial resolution, temporal resolution, and the fidelity of reconstructed pixels with respect to the original.

【０１３８】Quality Divisions
The scheme by which the Quality Tags axes are marked; for example, the spatial resolution axis might be marked low, medium, and high.

【０１３９】QualityParams
The set of classifiers used to describe the contribution that an item of encoded media makes to the perceived quality of the reconstruction. A QualityParams classifier is defined by a (QualityTags=QualityDivisions) pair, for example fidelity=6 or resolution=high.

【０１４０】Codebook
A table that provides an abstraction mechanism so that quality-manipulation operations can be defined on media files in a way that is valid irrespective of the original compression format. The Codebook achieves format independence through the use of the QualityParams classification system.

【０１４１】Filter
An information structure that defines the partitioning of a layered media file into two parts: one part represents the media file at reduced quality, and the other represents the enhancement information needed to reconstruct the original file. The simplest implementation is a bitmask in which a "0" or "1" at bit position n specifies whether the data item with label (n) is required or not required in the reduced-quality output file.
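The bitmask form of the filter can be sketched as a simple partition. The representation of a labeled data item as a `(label, payload)` tuple and the function name `split_by_filter` are assumptions for this illustration; the bit convention (0 = required in the reduced-quality output, 1 = not required) follows the definition above.

```python
def split_by_filter(items, mask):
    """Partition (label, payload) items into reduced-quality and enhancement parts.

    mask[n] == 0 keeps label n in the reduced-quality output;
    mask[n] == 1 moves it to the enhancement part.
    """
    reduced, enhancement = [], []
    for label, payload in items:
        target = enhancement if mask[label] else reduced
        target.append((label, payload))
    return reduced, enhancement


ITEMS = [(0, b"base"), (1, b"detail"), (2, b"chroma")]
REDUCED, ENHANCEMENT = split_by_filter(ITEMS, mask=[0, 1, 0])
```

Recombining the two parts in label order recovers the original set of items, which is the sense in which the second part holds the information needed to reconstruct the original file.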

【０１４２】Filter Mask: A piece of information appended to a filtered file that informs downstream tools about the information content of that file (that is, what has been filtered out of it). If the filter is implemented as a simple bit mask, the filter mask is simply a copy of that filter.
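As an illustration of the bit-mask case, the sketch below attaches a copy of the filter to the filtered file so that a downstream tool can tell which labels are absent. The class `FilteredFile`, the functions `apply_filter` and `missing_labels`, and the convention that a set bit means "label present" are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class FilteredFile:
    """A filtered file together with its filter mask (hypothetical layout)."""
    filter_mask: int                             # copy of the filter that was applied
    items: dict = field(default_factory=dict)    # label -> payload that survived


def apply_filter(items, filter_bits):
    """Keep only the labels whose bit is set, and record the filter as the mask."""
    kept = {label: payload for label, payload in items.items()
            if (filter_bits >> label) & 1}
    return FilteredFile(filter_mask=filter_bits, items=kept)


def missing_labels(filtered, label_count):
    """What a downstream tool can learn from the mask: labels filtered out."""
    return [n for n in range(label_count)
            if not (filtered.filter_mask >> n) & 1]


original = {0: b"a", 1: b"b", 2: b"c"}
filtered = apply_filter(original, 0b101)
absent = missing_labels(filtered, label_count=3)
```

Because the mask travels with the file, a downstream tool does not need access to the original file or to the filter to know what enhancement data would be needed.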

[Brief description of the drawings]

The present invention is described with reference to the accompanying drawings, which are as follows.

FIG. 1 is a diagram of a network used to carry out a media file distribution method according to the present invention.

FIG. 2 is a diagram showing the subbands obtained by applying a wavelet transform to an image in a conventional multi-scale decomposition, together with an example of partial reconstruction according to the present invention.

FIG. 3 is a diagram showing the format of a "chunk" data structure according to the present invention.

FIG. 4 is a diagram showing a typical path of one unit of media through a layering system according to the present invention.

FIG. 5 is a diagram showing a labeling mechanism as applied to wavelet coding according to the present invention.

FIG. 6 is an example of part of a codebook for the labeling example of FIG. 5, according to the present invention.

FIG. 7 is a diagram showing slice filtering as applied to the labeling example of FIG. 5, according to the present invention.

FIG. 8 is a diagram showing slice merging as applied to the labeling example of FIG. 5, according to the present invention.

FIG. 9 is a diagram showing labeling as applied to MPEG coding according to the present invention.

FIG. 10 is an example of part of a codebook for the labeling example of FIG. 9, according to the present invention.

FIG. 11 is a diagram showing slice filtering as applied to the labeling example of FIG. 9, according to the present invention.

FIG. 12 is a diagram showing a labeling mechanism as applied to DV coding according to the present invention.

FIG. 13 is an example of part of a codebook for the labeling example of FIG. 12, according to the present invention.

FIG. 14 is a diagram showing slice filtering as applied to the labeling example of FIG. 12, according to the present invention.

Continuation of the front page
(71) Applicant: Mount Pleasant House, 2 Mount Pleasant, Huntingdon Road, Cambridge CB3 0RN, United Kingdom
(72) Inventor: King, Tony, Richard; 28 Marlow Road, Nunham, Cambridge CB3 9JW, United Kingdom
(72) Inventor: Grouert, Timothy, Holloyd; 44 Fitlessford Road, Little Shelford, Cambridge CB2 5EW, United Kingdom
F-term (reference): 5C052 AB02 CC01 CC11 DD04 DD06; 5C053 FA15 GA11 GB17 GB21 HA33 KA01 LA15

Claims

1. A method of copying a media file to a device over a network so that the copy of the media file can be played, wherein extra bandwidth that is not required to play the media file on the device is used to improve the copy of the media file.

2. The method according to claim 1, wherein the extra bandwidth becomes available as a result of (i) playing the media file at a reduced speed or quality, (ii) playing a portion of the media file, or (iii) pausing playback of the media file.

3. The method according to claim 1, wherein the improvement is performed according to a profile stored in the server.

4. The method according to claim 1, wherein the improvement is performed according to a profile sent by the device to a server before or during the copying.

5. The method according to claim 1 or 2, wherein the improvement is performed according to a profile that is a combination of a value stored on the server and a value transmitted to the server by the device before or during copying.

6. The method according to claim 1, wherein the data required for playback is selected according to a profile stored in a server.

7. The method according to claim 1, wherein the data required for playback is selected according to a profile transmitted to a server before or during copying.

8. The method according to any of claims 1 to 7, wherein the data required for playback is selected according to a profile that is a combination of a value stored on the server and a value transmitted to the server by the device before or during copying.

9. The method according to any one of claims 1 to 8, wherein the copy of the media file on the client acts as a temporary cache, all or part of the media file being deleted during or at the end of playback.

10. The method according to any of the preceding claims, wherein the media file encodes video, image data corresponding to each video frame or sequence of frames is generated using a wavelet transform and compressed using SPIHT compression, and the resulting bitstream comprises several discrete bitstream layers, each bitstream layer allowing image data to be displayed on a display at a different spatial resolution.

11. The method according to any of the preceding claims, wherein the data is provided for purposes other than playback, including (i) analyzing the media data, (ii) recompressing the media data into a new format, and (iii) transferring the media data to another medium, such as paper, film, magnetic tape, disk, CD-ROM, or another digital storage medium.

12. The method according to claim 1, wherein the media file encodes audio data.

13. The method according to claim 1, wherein the media file encodes composite video and audio data.

14. The method according to any one of claims 1 to 13, wherein a local cache is used to store data that is transmitted using the extra bandwidth and that is decoded only when needed to improve the copy of the media file on the device.

15. The method according to any one of claims 1 to 14, wherein a local buffer is used to store data that is transmitted using the extra bandwidth and that is decoded only when needed to improve the copy of the media file on the device.

16. The method according to any of the preceding claims, wherein the extra capacity is used for one or more of the following enhancements: (i) loading a low-quality version of the data for the entire sequence to enable high-speed playback of the data; (ii) loading certain key frames for the sequence to allow scrubbing of the data; and (iii) enhancing the quality of the frames surrounding the current playback position.

17. The method according to claim 1, wherein the device is a client in a client-server network.

18. The method according to claim 1, wherein the device is a server or an edge server.

19. A media file copied using the method according to claim 1.

20. A computer program that enables a client to improve or supplement a copy of a media file copied from a server to the client over a network, using extra bandwidth not required to play the media file at the client.

21. A client programmed with the computer program according to claim 20.

22. A computer program that enables a server to improve the copy of a media file copied from said server to a client over a network by using extra bandwidth not required to play said media file at said client.

23. A server programmed with the computer program according to claim 22.