JP2007324722A

JP2007324722A - Moving picture data distribution apparatus and moving picture data communication system

Info

Publication number: JP2007324722A
Application number: JP2006150151A
Authority: JP
Inventors: Toru Suneya; 亨強矢; Masahiko Takaku; 雅彦高久
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2006-05-30
Filing date: 2006-05-30
Publication date: 2007-12-13

Abstract

PROBLEM TO BE SOLVED: To enable distribution, reception, and reproduction of live video and audio, and reuse of the received data as content, even when the specifications defined by HTTP/1.1 are not implemented on the transmitting side, the receiving side, or both.

SOLUTION: A moving picture distribution apparatus calculates the size (Content-Length) of the distribution data on the basis of request information from a client (S503), notifies the reproduction apparatus of that size, generates blocks each comprising a set of metadata and coded data required for reproduction processing (S505, S509), and transmits the blocks sequentially. The apparatus transmits ignorable data in place of coded data so that the total transmitted size equals the Content-Length.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a moving image data distribution apparatus and a moving image data communication system, and relates, for example, to a technique suitable for use in data transmission and reception.

In recent years, so-called surveillance cameras, used for purposes such as crime monitoring and checking on remote locations, have become widespread. RTP (A Transport Protocol for Real-Time Applications) is commonly used to distribute video and audio encoded in real time in this way (hereinafter, live video distribution).

However, live video distribution over RTP has problems of its own: firewalls set up for network security may stand in the way, and the data packets carrying the live video are easily lost. For this reason, live video distribution using a file transfer style based on a simpler protocol such as HTTP (Hyper Text Transfer Protocol) has been studied.

A conventional file recording format suited to such real-time processing is the fragmented video format (Fragmented Movie) of the ISO Base Media File Format. In this format, metadata relating to the moving image data as a whole is written at the beginning of the file, followed by a portion of the actual moving image data covering a fixed period; after that, pairs consisting of the metadata for the next portion of moving image data and the corresponding moving image data are recorded one after another. In other words, a given portion of moving image data and its metadata can be handled together as a single block of data. Recording in this format allows real-time encoded moving image data to be recorded continuously with a smaller load.

Protocols suited to file transfer, such as HTTP, by their nature need to notify the receiving side in advance of the size of the data to be transferred, and mechanisms have been added to deal with this requirement.

In live video distribution, the end of the stream cannot be known before distribution starts. HTTP therefore defines a mechanism in which, for example, the data to be transmitted is divided into small blocks and the distributing side notifies the receiving side of the transfer size block by block, so that the total size is not needed before distribution starts.
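The block-by-block size notification described here corresponds to HTTP/1.1 chunked transfer coding. The following is a minimal sketch of that framing, given only for illustration; the function names `encode_chunk` and `encode_stream` are hypothetical and do not come from the specification.

```python
def encode_chunk(payload: bytes) -> bytes:
    """Frame one block as an HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return b"%x\r\n" % len(payload) + payload + b"\r\n"


# A zero-length chunk ends the body, so the total size is never declared.
LAST_CHUNK = b"0\r\n\r\n"


def encode_stream(blocks) -> bytes:
    """Frame every non-empty block, then append the terminating chunk."""
    # Empty blocks are skipped because a zero-size chunk would end the body.
    return b"".join(encode_chunk(b) for b in blocks if b) + LAST_CHUNK
```

Each chunk carries its own size, which is why the sender does not need a Content-Length for the whole body; both peers, however, must understand this framing.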

It has also been proposed to realize a similar mechanism by having the receiving side notify the transmitting side of the size it requests (see, for example, Patent Document 1). This approach uses the Range function defined in HTTP: the receiving side fetches parts of the data one after another while keeping the connection open, and thereby receives the data in segments.
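As a rough illustration of segmented reception with Range, the sketch below builds the `Range` header values a receiver might send for successive fixed-size segments. The helper names and the fixed segment size are assumptions made for this example; the cited document may divide the data differently.

```python
def range_header(start: int, length: int) -> str:
    """Return an HTTP Range header value covering `length` bytes from `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # Byte ranges are inclusive at both ends.
    return f"bytes={start}-{start + length - 1}"


def successive_ranges(segment_size: int, count: int) -> list:
    """Header values for `count` back-to-back segments of `segment_size` bytes."""
    return [range_header(i * segment_size, segment_size) for i in range(count)]
```

For example, `successive_ranges(1024, 3)` gives the values for the first three 1024-byte segments.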

JP 2005-27010 A

However, when live video whose total file size is not fixed is distributed using a technique such as the Chunked Transfer Encoding described above, the specifications defined in HTTP/1.1 must be implemented on both the transmitting side and the receiving side. If either side implements only HTTP/1.0, the preceding version of HTTP, the technique cannot be used. Devices with limited CPU resources or memory often have no choice but to use an HTTP/1.0-level implementation, and in such cases the technique is not available.

Alternatively, because the distribution file size cannot be determined, the data can be distributed without announcing a file size and the communication ended by closing the connection. This is undesirable, however, because such a disconnection is an exceptional event. Moreover, because the connection is closed before the transfer is complete, and from a protocol standpoint in particular the session ends with incomplete received data, the received file is incomplete and becomes difficult to reuse.

The Range function defined in HTTP is primarily intended to let the receiving side fetch a file in pieces of whatever size it finds convenient. Using it therefore not only risks the connection-closing problem described above, but also makes processing difficult unless the transmitting side and the receiving side have agreed on such a mechanism in advance.

In view of the above problems, an object of the present invention is to enable live distribution, reception, and reproduction of video and audio, and reuse of the received data as content, even if the specifications defined in HTTP/1.1 are not implemented on at least one of the transmitting side and the receiving side.

The present invention has been made in view of the above problems, and provides, among other things, a moving image data distribution apparatus comprising: moving image data generation means for generating moving image data, consisting of at least one of video and audio, as one or more data blocks each made up of encoded moving image data and management information for that encoded data, in the order in which the moving image data arises; moving image data size estimation means for estimating the size of the moving image data in advance; first communication data generation means for putting the size information estimated by the moving image data size estimation means, together with the first data block, into a format suited to the communication procedure; second communication data generation means for putting the moving image data generated by the moving image data generation means, in sequence, into a format suited to the communication procedure; communication data accumulation means for accumulating the sizes of the communication data generated in sequence by the first and second communication data generation means; end determination means for determining the end from the size information estimated by the moving image data size estimation means and the size information accumulated by the communication data accumulation means; third communication data generation means for generating, when the end determination means determines the end, ignorable data whose size equals the difference between the estimated size of the moving image data and the accumulated size of the communication data; and communication means for transmitting the communication data generated by the third communication data generation means in sequence.
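The interplay of the size estimate, the running total, the end determination, and the ignorable filler can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the end condition used here (stop when the next block would exceed the declared size) and the use of zero bytes as filler are both assumptions, and `send_with_padding` is a hypothetical name.

```python
from typing import Iterable, Iterator


def send_with_padding(declared_size: int, blocks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield blocks while they fit in declared_size, then pad to the exact size."""
    sent = 0  # running total of bytes handed to the transport
    for block in blocks:
        if sent + len(block) > declared_size:
            break  # assumed end condition: the next block would overflow
        sent += len(block)
        yield block
    remaining = declared_size - sent
    if remaining > 0:
        # Filler the receiver can ignore; zero bytes are only a placeholder here.
        yield bytes(remaining)
```

Because the bytes sent always add up to the size declared in advance, the transfer completes normally from the protocol's point of view even though the live source had no known end.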

According to the present invention, moving image data in which live video and audio are encoded can be transmitted and received; the receiving side can reproduce the video and audio live, and the received data can be stored in a reusable state.

(First embodiment)
A first embodiment, in which the present invention is applied to a network camera having a communication function, will now be described with reference to the drawings.
FIG. 1 is a conceptual diagram showing an example of a system configuration suitable for implementing this embodiment.
As shown in FIG. 1, the system comprises a network camera 101 that generates moving images and distributes them over a network, and a playback device 102 that receives and displays the distributed moving images.

The network camera 101 is an imaging device with a communication function, a type of camera that has become widespread in recent years for surveillance and similar uses. The playback device 102 is the counterpart of the network camera 101: a device with a communication function that plays back moving images. The playback device 102 may be implemented, for example, as an application program running on a PC (personal computer), or as part of a monitoring apparatus with a dedicated display.

In conceptual, functional terms, the network camera 101 comprises a moving image generation module 103 that captures moving images and generates encoded data, a communication data generation module 104 that converts the encoded moving images into data suitable for communication, and a communication server 105 that actually sends out moving images in response to requests.

In conceptual, functional terms, the playback device 102 comprises a communication client 106 that requests moving images from the network camera 101 and acquires them, and a moving image display module 107 that decodes the encoded data of the received moving images and makes it visible (by rendering and the like).

Next, the general flow of processing will be described.
First, the user operates the playback device 102 and instructs it to acquire images from the network camera 101. This operation may consist, for example, of entering the URL (Uniform Resource Locator) of the network camera 101 through a GUI (graphical user interface). It may also be an operation such as pressing a switch, using connection information defined in the playback device 102 in advance.

The image acquisition instruction is sent from the communication client 106 to the communication server 105 over the network using a communication procedure (protocol) such as HTTP (Hyper Text Transfer Protocol).

Specifically, in this embodiment the communication procedure used between the communication client 106 and the communication server 105 is HTTP, and the communication server 105 is itself an HTTP server of the kind known as HTTPd. The request is made, for example, by specifying a CGI (Common Gateway Interface) program through a URL.

In this case, the CGI program is a program for acquiring moving images. When moving images are requested through the URL, the communication server 105 invokes the corresponding resource and attempts to perform the transmission processing; that processing is the processing of the communication data generation module 104. As noted above, when a CGI program is used, the processing of the communication data generation module 104 is itself implemented as CGI.

The communication data generation module 104 acquires the moving images generated by the moving image generation module 103 and shapes them into the format best suited to transmission to the playback device 102. In this embodiment it also performs processing such as generating the HTTP header, and outputs the data for transmission. The communication server 105 sends this data to the communication client 106.
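As a hedged illustration of the header generation step, the sketch below builds an HTTP/1.0-style response header that declares the body size up front. The function name `build_response_header` and the `video/mp4` content type are assumptions for this example; the text here does not state which header fields the module emits.

```python
def build_response_header(content_length: int, content_type: str = "video/mp4") -> bytes:
    """Build a minimal HTTP/1.0 response header declaring the body size."""
    lines = [
        "HTTP/1.0 200 OK",
        f"Content-Type: {content_type}",      # assumed media type
        f"Content-Length: {content_length}",  # size announced before any data
        "",                                   # blank line ends the header
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

A header of this kind needs nothing beyond HTTP/1.0 on either side, which is the constraint the background section raises.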

The communication client 106 receives the moving images it requested by URL as the response to its request. If reception succeeds, the received data is passed to the moving image display module 107, which plays it back and displays it.

The processing on the system configuration described with reference to FIG. 1 is, in outline, equivalent to the HTTP processing commonly used on the so-called WWW (World Wide Web), and is not a special usage. The one exception is that the moving image generation module 103 generates moving images on the fly rather than serving a pre-recorded static resource.

Before the features of this embodiment are described in detail, the device configuration of a system suitable for this embodiment, made up of the network camera 101 and the playback device 102, will be described further with reference to FIG. 2. FIG. 2 is a block diagram showing an example of the internal configuration of the network camera 101 and of the playback device 102 according to this embodiment.

The internal configuration of the network camera 101 is as follows. First, the image of the subject, which is the source of the moving images to be sent out, passes through the photographing lens unit 201, an optical system, and is formed on the optical sensor 205.

The photographing lens unit 201 is designed so that, for example, a motor moves a lens group for focusing; it is controlled directly by the photographing lens unit drive circuit 202. At the same time, the aperture unit 203, which contains the aperture, is operated by the aperture unit drive circuit 204 so that the amount of light forming the image is adjusted appropriately.

The optical sensor 205 is a solid-state image sensor (such as a CCD or CMOS sensor) that converts incident light into electric charge according to the amount of light and accumulates it. Reading out this charge and digitizing it with the A/D converter 207 produces an uncompressed digital image.

The optical sensor 205 is controlled by pulse signals and the like output from the optical sensor drive circuit 206. By repeating a series of operations in which the charge accumulated over a specified period is read out at a specified timing, a continuous sequence of digital images is obtained. The sequence of images acquired in this way constitutes a moving image.

The successively acquired digital images are then passed, for compression coding, to the image signal processing circuit 208, which applies image corrections such as white balance correction and gamma correction, and then on to the code compression circuit 209. Digital images are handed over between these processes through high-speed access to the memory 210 using, for example, a DMA (Direct Memory Access) circuit. The code compression circuit 209 compresses and encodes the image data in the memory 210.

For example, for so-called Motion JPEG, in which successive images are encoded with JPEG (ISO/IEC 10918), if the digital image input to the image signal processing circuit 208 is an RGB signal, it is converted into a YC signal consisting of a luminance signal Y and chroma signals CbCr. These are divided into blocks of 8 × 8 pixels, and discrete cosine transform, quantization, and Huffman coding are then applied to output the final compressed image.
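The first two steps of this pipeline, color conversion and block division, can be sketched as follows. The conversion coefficients are the full-range ones used by JFIF, which is an assumption here since the text does not say which matrix the circuit uses; the function names are hypothetical, and the plane dimensions are assumed to be multiples of 8.

```python
def _clamp(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, round(value)))


def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple:
    """Convert one 8-bit RGB pixel to Y, Cb, Cr (JFIF full-range coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return _clamp(y), _clamp(cb), _clamp(cr)


def split_into_blocks(plane: list, size: int = 8) -> list:
    """Split a 2-D plane (list of rows) into size x size blocks in row-major order."""
    height, width = len(plane), len(plane[0])
    return [
        [row[x:x + size] for row in plane[y:y + size]]
        for y in range(0, height, size)
        for x in range(0, width, size)
    ]
```

Each resulting block would then go through the discrete cosine transform, quantization, and Huffman coding, which are omitted from this sketch.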

Alternatively, when the compression coding scheme performs inter-frame prediction, as in MPEG-2 (ISO/IEC 13818) or MPEG-4 (ISO/IEC 14496), the preceding and following frames are referenced for the particular image (frame) being compressed. Motion-compensated prediction, macroblock processing, and the like are performed with reference to those frames, and a compressed image sequence (bitstream) whose frames depend on one another is output.

The compressed moving images output in this way are stored temporarily in the memory 210, shaped into the recording file format, and sent out onto the network via the network controller 211 and the communication circuit 212.

The CPU 214 controls this series of operations as the processing described with reference to FIG. 1 and optimizes it for communication. In other words, this is program operation by what is commonly called a control program (firmware).

The control program (firmware) is normally stored in the ROM 213 as static information. The network camera 101 reads the control program (firmware) from this ROM as needed, and the CPU 214 executes it to carry out the overall operation.

The control program (firmware) executed by the CPU 214 also handles several other functions; for example, it directs the operation of the photographing lens unit 201 described above to perform zooming (changing the focal length) and autofocus. Processing such as moving image requests from the playback device 102 is not described here, because including that flow would complicate the description of the device configuration. Needless to say, messages such as moving image requests are also communicated via the communication circuit 212.

Next, the playback device 102 will be described, again taking the reception of moving images as the example. The internal configuration of the playback device 102 is as follows.
Moving image data is taken into the playback device from the communication circuit 221 via the network controller 222. As in the network camera 101, this processing is carried out by a control program (firmware) running on the CPU 229, for example from a program stored in the ROM 228.

The received data is stored temporarily in the memory 230. If necessary, it may be stored in the recording device 228 by the I/O controller 229. The stored data is converted into a form that the code decompression circuit 223 can process and is decoded by that circuit; the image signal processing circuit 224 then processes the image to make it suitable for display, and it is shown on the display device 226.

For the display device 226, the refresh timing and the like are adjusted by the display device drive circuit 227; if the display device is analog-driven, the signal is converted from digital to analog by the A/D converter 225.

Although FIG. 2 has been described in terms of control by a control program (firmware), part or all of the firmware processing can of course be performed in hardware for higher speed. For example, part of the protocol processing is generally considered well suited to hardware implementation, and it is often implemented as a device known as an offloader. This embodiment is described in terms of processing by a control program simply because that is the most general configuration.

Next, to make the subsequent description easier to follow, the MP4 file used as the fragmented video data in this embodiment will be described. In this embodiment, the encoding format handled by the code compression circuit 209 and the code decompression circuit 223 is assumed to be, for example, MPEG-4 (ISO/IEC 14496).

The MP4 file format is the internationally standardized standard file format of MPEG-4, and fragmented video data is one of the forms defined within it, under the name Fragmented Movie. Although this embodiment uses the MP4 file format with MPEG-4, it operates in the same way with, for example, an ISO Base Media File Format file using Motion JPEG 2000. H.264 may also be used as the encoding format.

These encoding formats differ from one another, but as will become clear later, this embodiment is applicable to any of them as long as the data is fragmented video data.

The structure of the MP4 file format and of fragmented video data (Fragmented Movie) will now be described. FIG. 3 is a conceptual diagram showing an example of the data structure of the MP4 file format including a Fragmented Movie.
In the MP4 file format, the presentation of the content as a whole is called a “movie”, and the presentation of each media stream making up the content is called a “track”.

The first header, Movie_BOX ('moov') 302, typically contains a video track 303, which logically handles the whole of the moving image data, and an audio track 307, which logically handles the whole of the audio data. The basic structure of the video track 303 and the audio track 307 is almost the same: each track records various attribute information about the actual media data, and the contents differ only slightly according to the characteristics of the media data.

The data in the video track 303 includes, for example, the configuration information of the decoder used to decode the encoded data and the rectangular dimensions of the moving image. It also records an offset 304 indicating the position in the file at which the media data is actually recorded, a sample size 305 indicating the size of each frame of media data (sometimes called a picture), and a time stamp 306 indicating the decoding time and presentation time of each frame.

As a whole, the MP4 file 301 consists of a header information (metadata) part, which indicates the physical position, temporal position, characteristics, and so on of the video and audio data, and a media data part, which is the encoded video and audio data itself.

FIG. 3 includes the boxes specific to Fragmented Movie. An MP4 file with the simple structure, without Fragmented Movie, consists only of the Movie_BOX ('moov') 302, minus the Movie_Extends_BOX ('mvex') 308 that carries the Fragmented Movie extension information, and its paired media data part, the Media_Data_BOX ('mdat') 311.

In an MP4 file with the Fragmented Movie structure, by contrast, the header information and media data of the content can be divided at arbitrary time intervals, and the resulting “fragments” are recorded in chronological order from the beginning of the file.

In this case, as shown in FIG. 3, the leading Movie_BOX ('moov') 302, which contains the attribute information for the content as a whole, includes a Movie_Extends_BOX ('mvex') 308 that stores information such as the total playback time (duration) including the fragments. The Movie_BOX ('moov') 302 also holds the information about the data contained in the Media_Data_BOX ('mdat') 311 that follows it.

The Movie_Fragment_BOX ('moof') 312 that appears next is the header information for a fragment, and holds the information about the data contained in the Media_Data_BOX ('mdat') 313. The file is built up in this way, with further pairs of Movie_Fragment_BOX ('moof') and Media_Data_BOX ('mdat') appended in the same manner.
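The sequence of boxes described here can be inspected with a short routine. In the ISO Base Media File Format each box starts with a 32-bit big-endian size followed by a four-character type; a size of 1 means a 64-bit size follows, and a size of 0 means the box runs to the end of the data. The sketch below walks the top-level boxes of a buffer; `iter_boxes` and `make_box` are hypothetical helper names, and `make_box` exists only to build test data.

```python
import struct


def iter_boxes(data: bytes):
    """Yield (type, offset, size) for each top-level box in the buffer."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # 64-bit largesize follows the type field
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - offset
        if size < header:
            raise ValueError("corrupt box size")
        yield box_type.decode("latin-1"), offset, size
        offset += size


def make_box(box_type: str, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size field (helper for illustration only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload
```

Applied to a buffer laid out as in FIG. 3, the routine would report 'moov', 'mdat', and then repeating 'moof' and 'mdat' pairs.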

In an MP4 file with the Fragmented Movie structure, the Movie_BOX ('moov') 302 contains a Movie_Extends_BOX ('mvex') 308 that stores the Fragmented Movie extension information. The Movie_Extends_BOX ('mvex') 308 can carry the playback time (duration) 309 of the whole movie including the fragments, and information 310 such as default values for the sample size and per-sample duration of the media data in the fragments.

When default values are set here, per-sample values can be omitted from the sample information in the subsequent Movie_Fragment_BOX ('moof') 312 wherever the default applies.

In this way, a file in the MP4 file format holds the various attribute information about the media data in a metadata area separate from the media data itself. This makes it easy to access any desired sample data regardless of how the media data is physically stored.

Furthermore, the Fragmented Movie structure allows a file structure in which metadata and media data form one block and multiple such blocks are concatenated in chronological order. Next, with FIG. 1 and FIG. 2 in mind, the characteristic parts of this embodiment are described more specifically and in detail.

FIG. 4 is a diagram showing an example of a URL used when the playback apparatus 102 requests a moving image from the network camera 101.
The communication client 106 and the communication server 105 both support HTTP processing, and the communication client 106 connects to the communication server 105 using a URL such as those shown in URL example 401 of FIG. 4.

The first half is identical in all five examples of URL example 401. It means that, using HTTP (http:), access is made to the CGI located at the relative path (cgi-bin/getmp4.cgi) at the address given by hostname (//hostname/). As a result, the HTTP processing server included in the communication server 105 starts the CGI program. The parameters in the second half are values passed to the CGI program.

FIG. 4 also shows the last of the five examples broken down into its parts, as the entire block 402. The protocol part 403, the address part 404, and the relative path part 405 are the parts described above. The separator part 406 marks the boundary with the parameter part that follows. The parameter part can be divided into five parts by the parameter separator parts 408.

The first parameter 407 (duration=180) expresses that the requested moving image has a length of 180. The following second parameter 409 (scale=1) specifies that the unit is one second divided into one part, that is, one second, so a 180-second moving image is requested.

The third parameter 410 (bitrate=512000) specifies a bit rate of 512000 bps. The fourth parameter 411 (top=6) and the fifth parameter 412 (frg=4) specify that the first data block sent should cover 6 seconds and that subsequent data blocks should be sent in units of 4 seconds. In other words, this example requests a 3-minute moving image at 512 kbps, with a first data block of 6 seconds followed by data blocks of 4 seconds each.
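As an illustration only, the request described above can be decomposed as in the following sketch. FIG. 4 is not reproduced here, so the use of "?" for the separator part 406 and "&" for the parameter separator parts 408 is an assumption based on ordinary CGI practice. The function name `parse_request` and the returned key names are hypothetical, as is the choice to apply `scale` only to `duration`.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed spelling of the last URL example; "?" and "&" are assumptions.
url = "http://hostname/cgi-bin/getmp4.cgi?duration=180&scale=1&bitrate=512000&top=6&frg=4"

def parse_request(url: str) -> dict:
    """Split the URL and convert the five parameters to integers."""
    parts = urlsplit(url)
    params = {key: int(values[0]) for key, values in parse_qs(parts.query).items()}
    scale = params.get("scale", 1)          # number of units per second
    return {
        "path": parts.path,
        "total_seconds": params["duration"] / scale,
        "bitrate_bps": params["bitrate"],
        "first_block_seconds": params["top"],   # treated as seconds (assumption)
        "fragment_seconds": params["frg"],      # treated as seconds (assumption)
    }

request = parse_request(url)
print(request)
```

A server that keeps default parameters on its own side could merge them into `params` before the conversion, so that missing values are filled in.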

A data block corresponds to the combination of the first header ('moov') and data ('mdat') in the MP4 file serving as fragmented video data described earlier, or to a combination of a fragment header ('moof') and data ('mdat'). Note that the URL examples described here are implementation examples; the types, number, order, and format of the parameters may follow a different specification. These depend purely on the application.

The parameters have been described as though they are always passed in the URL, but some or all of them may be defined, stored, and recorded on the communication server 105 in advance. These stored values can then be used even when the communication client 106 does not supply the necessary parameters.

Next, the operation of the CGI program started by the communication server 105 in the above example is described.
The CGI program needs to acquire, for transmission, the encoded moving image data output by the moving image generation module 103.

As explained earlier, the encoded moving image data is generated by driving the compression encoding circuit 209. To simplify the description, however, the functions of the CGI program are assumed to be included in the communication data generation module 104, and the moving image generation module 103 is assumed to handle everything up to the encoding of the moving image data.

If the moving image generation module 103 is stopped, the communication data generation module 104 needs to start it. This may correspond to the control program (firmware) starting to drive the compression encoding circuit 209 and related components, or it may simply amount to initializing the interface used to acquire the encoded data.

FIG. 5 is a flowchart showing an example of the operation of the CGI program.
First, the started CGI program initializes the variables and other state used in later steps (step S501), and obtains the parameters 531 passed to it (step S502).

Here, the parameters 531 correspond to the parameter portion 402 described with reference to FIG. 4. Using the parameters 531, the communication data length is then calculated (step S503). The specific calculation method of step S503 is described in more detail later; concretely, the communication data length is the length of the MP4 file sent from the communication server 105 to the communication client 106.

What should be noted here is that, if the parameters are the same as in the example described with FIG. 4, they do not include a data length; all the CGI program learns from them is something like "3 minutes of 512 kbps data". That is, in step S503 the data length of the MP4 file that will actually be sent is calculated from the parameters.

After the communication data length is calculated, the HTTP header is generated (step S504).
The generated HTTP header is also described in detail later, but the particularly important point is that the data length of the MP4 file to be sent is written in the HTTP header. This data length has already been calculated in step S503, and that value is used.
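A minimal sketch of step S504 follows, under the assumption that the CGI program writes only the header fields and leaves the status line to the HTTP server. The function name `build_cgi_header` and the media type `video/mp4` are assumptions introduced for illustration; only the presence of the data length in the header is taken from the description.

```python
def build_cgi_header(content_length: int, content_type: str = "video/mp4") -> bytes:
    """Return header fields that announce the precomputed MP4 file length."""
    lines = [
        f"Content-Type: {content_type}",       # assumed media type
        f"Content-Length: {content_length}",   # value calculated in step S503
    ]
    # A blank line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

# Hypothetical length: 180 seconds at 512000 bps, ignoring header overhead.
header = build_cgi_header(180 * 512000 // 8)
print(header.decode("ascii"))
```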

Next, encoded data is read from the encode buffer 532 as the moving image data to be stored in the MP4 file being sent, and the first data block is generated (step S505). The first data block is the combination of the first header ('moov') and data ('mdat') in the MP4 file described earlier. The HTTP header and first data block generated in this way are output as communication data 534 in the next step, data output 1 (step S506).

In a typical HTTP server, communication data output by a CGI program is passed through the operating system's standard input/output, and the HTTP server sends it to the already connected communication client. It is assumed here that output is performed according to some such implementation specification of the HTTP server. Because moving image data is generally large, it may also be output via a primary file or the like.

Next, the end condition is estimated (step S507), and it is determined whether the end condition is satisfied (step S508). If it is satisfied, processing proceeds to step S511, which generates a free data block; if not, processing proceeds to step S509, which generates a subsequent data block.

Here, estimating the end condition means judging, from the communication data length calculated in advance in step S503 and the accumulated data length of the communication data 534 actually output so far, whether the accumulated data length of the communication data 534 finally output would exceed the calculated communication data length. If it would, the output would contradict the data length already notified to the communication client 106 in accordance with the HTTP protocol. The end condition is estimated so that no such contradiction arises.

If the end condition is not satisfied, step S509, which generates a subsequent data block, is executed. A subsequent data block corresponds to the combination of a fragment header ('moof') and data ('mdat') in the MP4 file described earlier.

Specifically, as in the first data block generation process (step S505), encoded data is read from the encode buffer 532 and the subsequent data block is generated. The generated subsequent data block is output as communication data 534 in the data output 2 process (step S510), in the same way as the data output 1 process (step S506) that outputs the first data block. After the subsequent data block has been output, the end condition estimation step S507 is performed again. This series of processes continues as long as the end condition is not satisfied.

The end condition estimation (step S507) performed immediately after the data output 1 process (step S506) and that performed after the data output 2 process (step S510) are basically the same process. They differ only in that the accumulated data length of the communication data 534 actually output increases each time the data output 2 process (step S510) is passed. Details of the end condition estimation step S507 are also described later.

On the other hand, if the determination in step S508 finds that the end condition is satisfied, a free data block is generated (step S511). Free data block generation is a process that generates a data area ('free') that is valid as communication data but carries no particular meaning: it contains neither moving image data nor data relating to the moving image data itself. Such areas arise, for example, from implementation considerations or when data becomes unnecessary during moving image editing, and they are defined in international standards such as the MP4 file format so that applications can use them as needed.

The data generated by the free data block generation process (step S511) is handled in the same way as in the data output 1 process (step S506), which outputs the first data block, and the data output 2 process (step S510). That is, it is output as communication data 534 in the data output 3 process (step S512).

The communication data 534 output in this way is itself a valid MP4 file. Finally, the CGI program performs close processing as its last step, which completes the operation (step S513).
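The flow of steps S505 to S512 can be sketched as follows. This is a simplified illustration, not the implementation of the embodiment: the function names `stream_mp4` and `make_box` are hypothetical, header output (S504) is omitted, the blocks are supplied ready-made instead of being read from the encode buffer 532, and each block is assumed never to exceed the expected fragment size. A remainder of 1 to 7 bytes cannot hold a box header and is left unhandled.

```python
import io
import struct

def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Hypothetical helper: 32-bit size + 4-character type + payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def stream_mp4(out, content_length, first_block, fragments, expected_fragment_size):
    """Write blocks to `out` and pad with a 'free' box up to content_length."""
    sent = out.write(first_block)                        # S505, S506
    fragments = iter(fragments)
    # S507, S508: continue only while one more fragment is expected to fit.
    while sent + expected_fragment_size <= content_length:
        block = next(fragments, None)
        if block is None:
            break
        sent += out.write(block)                         # S509, S510
    remaining = content_length - sent
    if remaining >= 8:                                   # S511, S512
        sent += out.write(make_box(b"free", b"\x00" * (remaining - 8)))
    return sent

# Dummy data: a 100-byte first block and 50-byte fragments, declared length 330.
out = io.BytesIO()
first = make_box(b"moov", b"\x00" * 92)
frags = [make_box(b"moof", b"\x00" * 42)] * 10
sent = stream_mp4(out, 330, first, frags, expected_fragment_size=50)
print(sent)
```

In this run four fragments are written, after which a fifth would overshoot the declared length, so the last 30 bytes are filled with a 'free' box.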

The specific calculation method of the communication data length calculation process (step S503) is now described in detail. As explained earlier, the communication data length is, concretely, the length of the MP4 file sent from the communication server 105 to the communication client 106.

An MP4 file serving as fragmented video data consists of the combination of the first header ('moov') and data ('mdat'), followed by zero or more combinations of a fragment header ('moof') and data ('mdat'). The communication data length calculated here, more specifically the MP4 file length, is therefore the sum of these data lengths.

However, it is difficult to calculate these individual data lengths exactly, because at the very least the data length of the compression-encoded moving image data depends on the complexity of the moving image before encoding and on the compression encoding parameters. Many encoders are designed so that a target bit rate is set and compression encoding is performed at a bit rate close to it, but determining the data length rigorously in advance is quite difficult.

The size of the first header ('moov') and similar boxes is easier to estimate in advance than that of the moving image data, but it can still be difficult because parts of the data storage format use run-length compression and the like. In other words, the communication data length calculation process (step S503) is not an exact calculation; it amounts to estimating the data length as accurately as can be anticipated.

For most of the first header ('moov'), the data size of most items it contains can be estimated in advance.
Explaining how to estimate every item contained in these headers would be redundant, so the details are omitted here. For example, the MovieHeaderBox ('mvhd') is a typical example of an item whose size can be determined in advance.

On the other hand, consider for example the case where the frame rate is fixed and the time stamp increases by a constant value relative to the time scale. In this case, the recorded size of the Time-to-sample Box ('stts') can be kept constant, because it is recorded using run-length compression.

However, if the differences between recorded time stamps are not constant, the recorded size is not constant. In such a case the recorded size must be approximated in relation to the number of samples (frames) of the moving image data recorded together with the first header. Even so, the variation in recorded size is bounded above by the number of samples (frames), so an approximate data length for this part can be estimated relatively easily.

More specifically, for this Time-to-sample Box, the international standard ISO/IEC 14496-12 gives the definition

aligned(8) class TimeToSampleBox
    extends FullBox('stts', version = 0, 0) {
    unsigned int(32) entry_count;
    int i;
    for (i=0; i < entry_count; i++) {
        unsigned int(32) sample_count;
        unsigned int(32) sample_delta;
    }
}

so its size can be estimated as 24 bytes at minimum and, at maximum, roughly 16 bytes plus 4 bytes multiplied by the number of samples.
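The 24-byte minimum can be checked with a small sketch that serializes the box field by field, following the definition quoted above. The function name `build_stts` and the sample values (5400 samples with a delta of 1, corresponding to a hypothetical constant frame rate) are assumptions for illustration.

```python
import struct

def build_stts(entries):
    """Serialize a TimeToSampleBox from (sample_count, sample_delta) pairs."""
    body = struct.pack(">I", 0)               # version (8 bits) + flags (24 bits), all zero
    body += struct.pack(">I", len(entries))   # entry_count
    for sample_count, sample_delta in entries:
        body += struct.pack(">II", sample_count, sample_delta)
    # Box header: 32-bit size followed by the 4-character type.
    return struct.pack(">I4s", 8 + len(body), b"stts") + body

# With a fixed frame rate, one run-length entry covers every sample.
constant_rate_box = build_stts([(5400, 1)])
print(len(constant_rate_box))
```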

Also, for example, the SampleSizeBox ('stsz') is defined so that its size increases in proportion to the number of samples. In this case its size can be estimated from the number of samples in the data ('mdat') paired with the first header ('moov'). In this way, in the communication data length calculation process (step S503), the size of the first header ('moov') can be estimated almost exactly from the number of samples in the corresponding data ('mdat').

Next, estimating the size of the data ('mdat') portion corresponding to the first header ('moov') is described. This size can be approximated, for example, from the bit rate and the time (Duration) of the data block. If the bit rate is 8 Mbps and the data block covers 5 seconds, the size follows directly as 8 Mbps × 5 seconds ÷ 8 bits per byte = 5 Mbytes. Of course, the number of samples (frames) and the frame rate may instead be fixed in advance and the time of the data block calculated from those values.
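The arithmetic of this estimate can be written as a short sketch. The function name `estimate_mdat_bytes` is hypothetical, megabytes are taken as decimal (10^6 bytes), and the 8-byte box header of the 'mdat' itself is ignored because the value is only an approximation.

```python
def estimate_mdat_bytes(bitrate_bps: int, duration_s: float) -> int:
    """Approximate payload size: bits per second x seconds / 8 bits per byte."""
    return int(bitrate_bps * duration_s / 8)

# The example from the text: 8 Mbps for 5 seconds.
example_bytes = estimate_mdat_bytes(8_000_000, 5)
print(example_bytes)
```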

In general, such values are defined in some form and encoding is controlled accordingly. The problem here is that it is difficult to estimate the size of the encoded data exactly. As stated above, what matters is not an exact figure but an estimate of the data length that is as accurate as can be anticipated.

The data length of a combination of a fragment header ('moof') and data ('mdat') can be estimated in the same way as for the combination of the first header ('moov') and data ('mdat') in the MP4 file.

The fragment header ('moof') differs from the header ('moov') in its storage format and stored data, but it is defined along much the same lines. The difference is that the combination of a fragment header ('moof') and data ('mdat') occurs multiple times. For example, suppose the entire MP4 file to be sent covers 180 seconds, the combination of the first header ('moov') and data ('mdat') covers 5 seconds, and each subsequent combination of a fragment header ('moof') and data ('mdat') covers 4 seconds. Fragments then occur repeatedly over the 175 seconds that remain after subtracting the first 5 seconds from 180. Dividing 175 seconds by 4 seconds gives 43.75, so 44 fragments are counted.
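The fragment count in this example can be reproduced with the following sketch. Rounding up to the next whole fragment is an assumption inferred from the figure of 44 given in the text; the function name `count_fragments` is hypothetical.

```python
import math

def count_fragments(total_s: float, first_block_s: float, fragment_s: float) -> int:
    """Number of 'moof' + 'mdat' pairs needed after the first data block."""
    remaining_s = total_s - first_block_s
    # Round up so that a final, partly filled fragment is still counted.
    return math.ceil(remaining_s / fragment_s)

fragment_count = count_fragments(180, 5, 4)
print(fragment_count)
```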

First, a single fragment is described. The fragment header, Movie_Fragment_BOX ('moof') 312, contains one Movie_Fragment_Header_BOX ('mfhd') per Movie_Fragment_BOX ('moof') 312, and one Track_Fragment_BOX ('traf') per video or audio track. Each Track_Fragment_BOX ('traf') in turn contains one Track_Fragment_Header_BOX ('tfhd') and one or more Track_Fragment_Run_BOX ('trun').

Here, the Track_Fragment_BOX ('traf'), like the Movie_Fragment_BOX ('moof') 312, serves only as a container for the boxes it holds, so its own size is constant (8 bytes). The Movie_Fragment_Header_BOX ('mfhd') holds a sequence number that is incremented each time a Movie_Fragment_BOX ('moof') 312 appears, and its size is always constant (16 bytes).

The Track_Fragment_Header_BOX ('tfhd') holds the track ID. Optionally, it can also set the base offset value of the fragment and default values such as the size and duration of the samples stored in the data ('mdat'). The size of the box therefore differs depending on whether these defaults are set.

In general, however, whether default values are set is decided when the file is generated, according to the encoding method and encoding parameters, and these settings do not change dynamically for a track with the same ID. The size of the Track_Fragment_Header_BOX ('tfhd') for a given track ID can therefore be regarded as constant within one file.

According to ISO/IEC 14496-12, Track_Fragment_Header_BOX ('tfhd') is defined as follows. When, for example, only the base offset value (8 bytes) is set as an option, the size of this box is 24 bytes. The tf_flags setting indicates which optional fields are present.

aligned(8) class TrackFragmentHeaderBox
    extends FullBox('tfhd', 0, tf_flags){
    unsigned int(32) track_ID;
    // all the following are optional fields
    unsigned int(64) base_data_offset;
    unsigned int(32) sample_description_index;
    unsigned int(32) default_sample_duration;
    unsigned int(32) default_sample_size;
    unsigned int(32) default_sample_flags
}
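How the box size follows from tf_flags can be sketched as below. The flag bit values are not given in this description; they are quoted here as understood from ISO/IEC 14496-12 and should be checked against the standard. The names `TFHD_OPTIONAL_FIELDS` and `tfhd_size` are hypothetical.

```python
# Flag bit -> size in bytes of the optional field it enables.
# Bit values as understood from ISO/IEC 14496-12 (verify against the standard).
TFHD_OPTIONAL_FIELDS = {
    0x000001: 8,  # base_data_offset
    0x000002: 4,  # sample_description_index
    0x000008: 4,  # default_sample_duration
    0x000010: 4,  # default_sample_size
    0x000020: 4,  # default_sample_flags
}

def tfhd_size(tf_flags: int) -> int:
    """Size of a 'tfhd' box for a given combination of optional fields."""
    size = 8 + 4 + 4  # box header + version/flags + track_ID
    for bit, field_bytes in TFHD_OPTIONAL_FIELDS.items():
        if tf_flags & bit:
            size += field_bytes
    return size

# Only the base offset is set, as in the example in the text.
print(tfhd_size(0x000001))
```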

When there are multiple tracks, the sample data in the data ('mdat') is commonly stored by interleaving: data from different tracks is recorded alternately, one continuous "group of samples" covering a certain period of time at a time. In the MP4 file format such a group of samples is called a "chunk".

One or more Track_Fragment_Run_BOX ('trun') exist in a Track_Fragment_BOX ('traf'), and each stores information about the data of one chunk. Thus, for example, when video and audio tracks are present, the video and audio chunks are stored interleaved in the data ('mdat'), and there is one Track_Fragment_Run_BOX ('trun') for each chunk.

Even with only one track, there may be multiple Track_Fragment_Run_BOX ('trun'). For example, when the video data uses a coding format with inter-frame prediction, such as MPEG-4, a chunk is generally formed for each GOV (Group of VOP; VOP: Video Object Plane), which is a unit of encoded data. The number of Track_Fragment_Run_BOX ('trun') in one Movie_Fragment_BOX ('moof') 312 is therefore determined by the time covered by one data block and the time covered by one chunk.

The Track_Fragment_Run_BOX ('trun') holds the number of samples stored in its chunk and can optionally specify the duration, size, and so on of each sample. With these options set, the size of the box varies with the number of samples in the chunk, but the time covered by each chunk is normally constant within one file. Therefore, if data rates such as the video frame rate and the audio sampling rate are constant, the number of samples per chunk is also constant for a given track ID. The number of samples in one data block is determined by the time per data block and the data rate of each media track.

Accordingly, the number of Track_Fragment_Run_BOX ('trun') per data block can be obtained for each track as "time per fragment / time per chunk". The number of samples per chunk is obtained as "time of one data block / data rate of each media track".

According to ISO/IEC 14496-12, Track_Fragment_Run_BOX ('trun') is defined as shown below. Suppose, for example, that data_offset and first_sample_flags are set as options of this box, and that sample_duration and sample_size are set as the per-sample options. The size of the box can then be obtained as "number of boxes × 24 bytes + number of samples per chunk × 8 bytes".

As with Track_Fragment_Header_BOX ('tfhd'), the tr_flags setting below indicates which optional fields are present.

aligned(8) class TrackRunBox
    extends FullBox('trun', 0, tr_flags) {
    unsigned int(32) sample_count;
    // the following are optional fields
    signed int(32) data_offset;
    unsigned int(32) first_sample_flags;
    // all fields in the following array are optional
    {
        unsigned int(32) sample_duration;
        unsigned int(32) sample_size;
        unsigned int(32) sample_flags
        unsigned int(32) sample_composition_time_offset;
    }[ sample_count ]
}
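The size of a single 'trun' box for a given tr_flags value can be sketched in the same way. The flag bit values are again quoted as understood from ISO/IEC 14496-12 and are not part of this description; the names `trun_size`, `TRUN_HEADER_FIELDS`, and `TRUN_SAMPLE_FIELDS` are hypothetical.

```python
# Bit values as understood from ISO/IEC 14496-12 (verify against the standard).
TRUN_HEADER_FIELDS = {
    0x000001: 4,  # data_offset
    0x000004: 4,  # first_sample_flags
}
TRUN_SAMPLE_FIELDS = {
    0x000100: 4,  # sample_duration
    0x000200: 4,  # sample_size
    0x000400: 4,  # sample_flags
    0x000800: 4,  # sample_composition_time_offset
}

def trun_size(tr_flags: int, sample_count: int) -> int:
    """Size of one 'trun' box holding sample_count samples."""
    size = 8 + 4 + 4  # box header + version/flags + sample_count
    size += sum(n for bit, n in TRUN_HEADER_FIELDS.items() if tr_flags & bit)
    per_sample = sum(n for bit, n in TRUN_SAMPLE_FIELDS.items() if tr_flags & bit)
    return size + per_sample * sample_count

# data_offset, first_sample_flags, sample_duration and sample_size are set,
# as in the example in the text: 24 bytes plus 8 bytes per sample.
flags = 0x000001 | 0x000004 | 0x000100 | 0x000200
print(trun_size(flags, 30))
```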

From the above, an approximate size for the Movie_Fragment_BOX ('moof') 312 can be obtained by summing the sizes of the boxes it contains, as described above. The approximate size of the fragment's data ('mdat') portion can be obtained from the bit rate and the time (Duration) of the data block, in the same way as for the data ('mdat') corresponding to the first header ('moov').
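The summation can be sketched as follows, using the fixed sizes stated in the description: 8 bytes for each container, 16 bytes for 'mfhd', and 24 bytes plus 8 bytes per sample for each 'trun' with the options of the earlier example. The `TrackPlan` class, the function `estimate_moof_bytes`, and the figures of one video track with 1-second chunks of 30 samples are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TrackPlan:
    tfhd_bytes: int          # size of this track's 'tfhd' (e.g. 24 with base offset)
    chunk_seconds: float     # time covered by one chunk
    samples_per_chunk: int   # samples stored in one chunk

def estimate_moof_bytes(fragment_seconds: float, tracks) -> int:
    """Sum the sizes of the boxes inside one Movie_Fragment_BOX."""
    total = 8 + 16                                   # 'moof' container + 'mfhd'
    for track in tracks:
        # Number of 'trun' boxes: time per fragment / time per chunk.
        truns = round(fragment_seconds / track.chunk_seconds)
        trun_bytes = 24 + 8 * track.samples_per_chunk
        total += 8 + track.tfhd_bytes + truns * trun_bytes   # 'traf' + contents
    return total

video = TrackPlan(tfhd_bytes=24, chunk_seconds=1.0, samples_per_chunk=30)
moof_bytes = estimate_moof_bytes(4.0, [video])
print(moof_bytes)
```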

From the approximate sizes of the fragment header ('moof') and data ('mdat') calculated in this way, together with the number of fragments described earlier, the data length of all fragment portions can be estimated. Alternatively, instead of calculating from the bit rate and duration as described so far, a target size may be fixed in advance for each data block; the bit rate is then determined from this target size, and the encoding parameters are set in the code compression circuit 209 that performs the encoding.
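As a rough illustration of the estimate described above, the following Python sketch adds the 'moof' size to an 'mdat' size derived from bit rate and duration, then multiplies by the number of fragments. All function and parameter names are assumptions made for this example, and the 8-byte 'mdat' box header is ignored because the value is only an approximation.

```python
def estimate_fragment_size(moof_size: int, bitrate_bps: int, duration_s: float) -> int:
    """Approximate size in bytes of one fragment ('moof' + 'mdat')."""
    # Payload estimate: bits per second times seconds, converted to bytes.
    mdat_size = int(bitrate_bps * duration_s / 8)
    return moof_size + mdat_size


def estimate_all_fragments_size(
    moof_size: int, bitrate_bps: int, duration_s: float, fragment_count: int
) -> int:
    """Approximate total size in bytes of all fragment data blocks."""
    return fragment_count * estimate_fragment_size(moof_size, bitrate_bps, duration_s)
```

For example, ten one-second fragments at 2,000,000 bit/s with a 300-byte 'moof' come to 10 × (300 + 250,000) = 2,503,000 bytes.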

Next, the end condition estimation process (step S507) is described in detail. As explained so far, the communication data length calculated in advance by the communication data length calculation process (step S503) is compared, in the end condition estimation process (step S507), with the data length of the communication data 543 actually transmitted.

The end condition estimation process (step S507) determines whether the sum of (a) the accumulated data length of the communication data 543 actually output by the data output 1 process (step S506) and the data output 2 process (step S510) and (b) the expected data block length of the fragment to be output next is larger than the communication data length calculated in advance. If the sum is larger, the end condition is satisfied; if it is smaller, the end condition is not satisfied.

Here, the expected data block length of the fragment to be output next is simply the data block length of a single fragment calculated in the communication data length calculation process (step S503).

Because the size of the data actually output may exceed the expected size calculated in advance, the expected size may be corrected. That is, the comparison may be made after adding the largest difference observed so far between the actual size of an output fragment and its expected size, or after adding a preset error value.

When a preset error value is used, it is added to the sum of the accumulated data length of the communication data 543 actually output by the data output 1 process (step S506) and the data output 2 process (step S510) and the expected data block length of the fragment to be output next. The process then determines whether the result is larger than the communication data length calculated in advance.
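The end condition and its two correction options can be sketched in Python as follows. The function names and the margin parameter are hypothetical; the comparison itself follows the description of step S507 above.

```python
from typing import Sequence


def end_condition_met(
    sent_total: int, next_fragment_estimate: int, content_length: int, margin: int = 0
) -> bool:
    """True when sending one more fragment would exceed the announced length."""
    return sent_total + next_fragment_estimate + margin > content_length


def observed_margin(actual_sizes: Sequence[int], estimated_size: int) -> int:
    """Largest overshoot of an emitted fragment over its estimate (0 if none)."""
    return max([0] + [actual - estimated_size for actual in actual_sizes])
```

A caller could pass either a fixed margin or the value returned by observed_margin for the fragments sent so far; which of the two is preferable is left open by the text.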

Next, the free data block generation process (step S511) is described.
The data block generated by the free data block generation process (step S511) is output immediately after the last fragment (or, depending on conditions, immediately after the output data of the data output 1 process (step S506)).

In MP4, the generated data is what is called a free box or skip box: data that carries no particular meaning in itself. The details are defined in the MP4 standard and are not repeated here, but the data length deserves some explanation.

The size of the data block generated by the free data block generation process (step S511) is the data length calculated in the communication data length calculation process (step S503) minus the data length actually output so far. If a free box is used, then, following its definition, the size of this data block must include the box's own data length field and the 4-byte identifier that marks it as a free box. In this way, the data length calculated in the communication data length calculation process (step S503) and the data length actually output can be made to match exactly.
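A minimal Python sketch of this padding step is shown below. The function name make_free_box is hypothetical, and filling the payload with zero bytes is an assumption, since the content of a free box is ignored by readers. The text does not say what happens when fewer than 8 bytes remain, which is too small for a box header, so this sketch simply raises an error in that case.

```python
import struct


def make_free_box(content_length: int, sent_total: int) -> bytes:
    """Build a 'free' box that pads the output up to content_length bytes."""
    remaining = content_length - sent_total
    if remaining == 0:
        return b""  # lengths already match, nothing to pad
    if remaining < 8:
        # A box needs a 4-byte size field and a 4-byte type identifier.
        raise ValueError("remaining length cannot hold a box header")
    # The size field counts the whole box, including the 8-byte header.
    header = struct.pack(">I4s", remaining, b"free")
    return header + bytes(remaining - 8)
```

The 32-bit size field limits this sketch to boxes smaller than 4 GiB.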

To conclude the description of the flowchart in FIG. 5, the generated HTTP header is described. FIG. 6 is a schematic diagram showing the overall structure of the communication data 543 described above.
In FIG. 6, the communication data 543 consists of an HTTP header 601; a first data block that follows it, made up of the first header ('moov') and data ('mdat') of the MP4 file; two subsequent data blocks, each made up of a fragment header ('moof') and data ('mdat') of the MP4 file; and a free data block.

In the schematic diagram of FIG. 6, the combination of a fragment header ('moof') and data ('mdat') occurs twice, but, as explained above, it may of course occur any number of times, including zero. To make clear that the data ('mdat') is the part that stores the moving image data, this is indicated by arrows from the encode buffer 532.

The HTTP header 601 is an example of a status line and response headers in the HTTP protocol. It does not list every standard message, but it contains the minimum set of items that matter in this embodiment. The first line is the status line; in this example it indicates that the HTTP request succeeded under HTTP/1.0. From this, the communication client 106 knows that processing is proceeding correctly.

The next line is part of the response header and reports the size of the data transmitted over HTTP. From this line the communication client 106 knows exactly how much data will be sent, that is, how much it will receive. The size of the transmitted data is simply the MP4 file length calculated in the communication data length calculation process (step S503) of the flowchart described with reference to FIG. 5. Through this information, the communication client 106 obtains the file size estimated by the CGI program of the communication server 105.

The CGI program, for its part, adjusts the amount of data it outputs to match this file size, so that the client can process the stream as if it had reliably downloaded moving image data that had already been recorded and stored as a file of exactly that size.

The third line indicates the format of the moving image data. If the communication server 105 and the communication client 106 have not agreed on the format of the moving image data in advance, this line tells the communication client 106 the format of the received data.
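The three header lines described above can be sketched in Python as follows. FIG. 6 is not reproduced in this text, so the exact strings are assumptions: "200 OK" is the usual success status, and the media type "video/mp4" is a plausible value that the description itself does not state. The function name is hypothetical.

```python
def build_http_header(content_length: int, content_type: str = "video/mp4") -> bytes:
    """Build an HTTP/1.0 response header announcing the estimated file size."""
    lines = [
        "HTTP/1.0 200 OK",                    # status line: request succeeded
        f"Content-Length: {content_length}",  # length computed in step S503
        f"Content-Type: {content_type}",      # format of the moving image data
        "",                                   # blank line ends the header
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

The MP4 data blocks and the final free data block would follow this header, for a total body length equal to the announced Content-Length.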

As described above, with this method live video can be distributed, received, and played back as long as HTTP/1.0 is supported, even if the specifications defined in HTTP/1.1 are not implemented on the sending side, the receiving side, or both. In addition, because the received data is valid as an MP4 file in the fragmented video format, it can also be reused as content.

(Second Embodiment)
The first embodiment used HTTP as the communication protocol, but this embodiment is also applicable to communication formats other than HTTP. For example, the communication server 105 and the communication client 106 of the first embodiment may have a mechanism, such as SOAP (Simple Object Access Protocol), for invoking functions between terminals. Techniques by which the communication client 106 obtains moving image data through a remote function call are technically well established.

SOAP itself is a lightweight protocol for communication between objects over a network, and HTTP or similar protocols can be used as its transport. However, SOAP itself has no standard way of describing an operation such as downloading moving image data, so in practice the application that uses it defines its own description method.

An example of a message using SOAP is shown below.
<SOAP-ENV:Envelope>
  <SOAP-ENV:Body>
    <ns:getMP4>
      <ns:duration>180</ns:duration>
    </ns:getMP4>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

In this example, 180 seconds of data are requested through a function named "getMP4". When this SOAP message is sent from the communication client 106 to the communication server 105, the communication server 105 processes the function in the same way as a CGI request. The details of this processing depend on the implementation: it may run in a separate process, as is common for CGI programs, or inside the program that received the message.
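How a server might read the requested duration from such a message can be sketched in Python with the standard library. The example message in the text omits namespace declarations, so they are added here as assumptions: the SOAP-ENV URI is the SOAP 1.1 envelope namespace, while "urn:example:camera" and the function name requested_duration are purely hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:camera"  # hypothetical application namespace

REQUEST = f"""<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_ENV}" xmlns:ns="{APP_NS}">
  <SOAP-ENV:Body>
    <ns:getMP4>
      <ns:duration>180</ns:duration>
    </ns:getMP4>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def requested_duration(message: str) -> int:
    """Return the duration in seconds requested by a getMP4 call."""
    root = ET.fromstring(message)
    prefixes = {"soap": SOAP_ENV, "ns": APP_NS}
    node = root.find("soap:Body/ns:getMP4/ns:duration", prefixes)
    if node is None or node.text is None:
        raise ValueError("message does not contain a getMP4 duration")
    return int(node.text)
```

The returned value would then feed the same length calculation as the duration taken from the URL in the first embodiment.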

The function called in this way calculates the communication data length in the same manner as the communication data length calculation process (step S503) shown in FIG. 5 for the first embodiment. An example of the result of processing by SOAP is shown below.
<SOAP-ENV:Envelope>
  <SOAP-ENV:Body>
    <ns:getMP4Response>
      <ns:length>45000000</ns:length>
      <ns:source>(data)</ns:source>
    </ns:getMP4Response>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

In this example, the result returned to the communication client 106 consists of the data length given by "ns:length" and the actual moving image data given by "ns:source". The point to note is that the data length and the actual moving image data are handled in exactly the same way as in the flowchart shown in FIG. 5 for the first embodiment.

Thus, this embodiment is also applicable to communication formats other than HTTP. In this example the moving image data is carried as part of the SOAP message, but, because of constraints such as the encoding of the transmitted data, the response may instead return the protocol and access method for obtaining the moving image data; the technical basis of the present invention is the same either way.

(Other embodiments according to the present invention)
The present invention is applicable to the ISO Base Media File format and to any similar format that interleaves media data, such as moving images, with its management information along the time axis. Such file formats include AVC files (MPEG-4 Advanced Video Coding files) and Motion JPEG 2000 files. All of these share the same underlying structure as the ISO Base Media File format and can be regarded as substantially equivalent to it.

The means constituting the moving image data distribution apparatus in the embodiments described above, and the steps of the method for controlling the moving image data distribution apparatus, can be realized by running a program stored in the RAM, ROM, or the like of a computer. This program, and a computer-readable recording medium on which the program is recorded, are included in the present invention.

The present invention can also be embodied as, for example, a system, an apparatus, a method, a program, or a recording medium. Specifically, it may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device.

The present invention also covers the case where a software program that realizes the functions of the embodiments described above (in the embodiments, a program corresponding to the flowchart shown in FIG. 5) is supplied to a system or apparatus directly or remotely, and the functions are achieved by a computer of that system or apparatus reading and executing the supplied program code.

Accordingly, the program code itself, installed on a computer in order to realize the functional processing of the present invention on that computer, also realizes the present invention. In other words, the present invention includes the computer program itself for realizing the functional processing of the present invention.

In that case, as long as it has the functions of the program, it may take the form of object code, a program executed by an interpreter, script data supplied to an OS, or the like.

Recording media for supplying the program include, for example, floppy (registered trademark) disks, hard disks, optical disks, and magneto-optical disks, as well as MO, CD-ROM, CD-R, CD-RW, magnetic tape, nonvolatile memory cards, ROM, and DVD (DVD-ROM, DVD-R).

Another way of supplying the program is to connect to a website on the Internet using the browser of a client computer and download, from that website to a recording medium such as a hard disk, either the computer program of the present invention itself or a compressed file that includes an automatic installation function.

The program can also be supplied by dividing the program code that constitutes the program of the present invention into a plurality of files and downloading each file from a different website. In other words, a WWW server that allows a plurality of users to download the program files for realizing the functional processing of the present invention on a computer is also included in the present invention.

As yet another method, the program of the present invention may be encrypted, stored on a recording medium such as a CD-ROM, and distributed to users; users who satisfy predetermined conditions are then allowed to download, from a website via the Internet, the key information that decrypts it. Using that key information, the encrypted program can be executed and installed on a computer.

The functions of the embodiments described above are realized when the computer executes the program it has read. In addition, an OS or the like running on the computer may perform part or all of the actual processing based on the instructions of the program, and the functions of the embodiments described above can also be realized by that processing.

As yet another method, the program read from the recording medium is first written to a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer. A CPU or the like provided on the function expansion board or function expansion unit then performs part or all of the actual processing based on the instructions of the program, and the functions of the embodiments described above are also realized by that processing.

FIG. 1 is a conceptual diagram showing an example of the system configuration in the first embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the internal configuration of the network camera 101 and the internal configuration of the playback apparatus 102 according to the first embodiment of the present invention.
FIG. 3 is a conceptual diagram showing an example of the data structure of the MP4 file format, including the fragmented video format, in the first embodiment of the present invention.
FIG. 4 is a diagram showing an example of a URL in the first embodiment of the present invention.
FIG. 5 is a flowchart showing an example of the operation of the CGI program in the first embodiment of the present invention.
FIG. 6 is a diagram showing the overall structure of the communication data in the first embodiment of the present invention.

Explanation of symbols

101 Network camera
102 Playback apparatus
103 Moving image generation module
104 Communication data generation module
105 Communication server
106 Communication client
107 Moving image display module
201 Shooting lens unit
202 Shooting lens unit drive circuit
203 Aperture unit
204 Aperture unit drive circuit
205 Optical sensor
206 Optical sensor drive circuit
207 A/D converter
208 Image signal processing circuit
209 Code compression circuit
210 Memory
211 Network controller
212 Communication circuit
213 ROM
214 CPU
221 Communication circuit
222 Network controller
223 Code expansion circuit
224 Image signal processing circuit
225 A/D converter
226 Display device
227 Display device drive circuit
228 Recording device
229 I/O controller

Claims

1. A moving image data distribution apparatus comprising:
moving image data generation means for generating moving image data, consisting of at least one of video and audio, so that it is composed, in order of generation, of one or more data blocks each consisting of encoded data of the moving image and management information for the encoded data;
moving image data size estimation means for estimating the size of the moving image data in advance;
first communication data generation means for generating, in a format suited to a communication procedure, the size information of the moving image data estimated by the moving image data size estimation means and the first data block;
second communication data generation means for sequentially generating, in a format suited to the communication procedure, the moving image data generated by the moving image data generation means;
communication data integrating means for accumulating the sizes of the communication data sequentially generated by the first communication data generation means and the second communication data generation means;
end determination means for determining the end from the size information of the moving image data estimated by the moving image data size estimation means and the size information of the communication data accumulated by the communication data integrating means;
third communication data generation means for generating, when the end determination means determines the end, data to be ignored whose size is equal to the difference between the size information of the moving image data estimated by the moving image data size estimation means and the size information of the communication data accumulated by the moving image data size estimation means; and
communication means for sequentially transmitting the communication data generated by the third communication data generation means.

2. The moving image data distribution apparatus according to claim 1, wherein the moving image data size estimation means takes, as the approximate size of the moving image data, the sum of the size of the encoded data calculated from the encoding rate and the distribution time of the moving image data and the management information of the encoded data.

3. The moving image data distribution apparatus according to claim 1, wherein the end determination means makes the determination by judging whether the size information of the communication data is smaller than the sum of the size information of the communication data accumulated by the moving image data size estimation means, the size of the encoded data calculated from the encoding rate of the moving image data and the duration of the next data block of the moving image data, the management information of the encoded data, and an error value of 0 or more given in advance.

4. The moving image data distribution apparatus according to any one of claims 1 to 3, wherein the moving image data includes MPEG or JPEG encoded data, or encoded data comparable to these, and the body of the communication data distributed by the communication means is in the ISO Base Media File format or a format derived from it.

5. The moving image data distribution apparatus according to claim 1, wherein the communication means uses HTTP or SOAP as the communication procedure.

6. A moving image data communication system comprising:
the moving image data distribution apparatus according to any one of claims 1 to 5; and
a playback apparatus including client communication means for communicating with the moving image data distribution apparatus and moving image reproduction means for reproducing the moving image data received by the client communication means.