JP2007159058A

JP2007159058A - Recording apparatus and recording method, and reproduction apparatus and reproduction method

Info

Publication number: JP2007159058A
Application number: JP2005355308A
Authority: JP
Inventors: Akio Fujii; 昭雄藤井
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2005-12-08
Filing date: 2005-12-08
Publication date: 2007-06-21
Anticipated expiration: 2025-12-08
Also published as: JP4630805B2

Abstract

PROBLEM TO BE SOLVED: To realize techniques for increasing data access speed, while reducing the memory capacity necessary for reproducing or searching for image data or audio data recorded as a file.

SOLUTION: An apparatus recording image data and audio data as a file includes an image data encoding means, an image data code amount control means, an audio data encoding means, an audio data code amount control means, an image stream generating means, an audio stream generating means, a multiplexing means for multiplexing the generated image stream and audio stream, and a recording means for recording a stream multiplexed by the multiplexing means on a recording medium, wherein the image data code amount control means and the audio data code amount control means control the image data encoding means and the audio data encoding means so as to fix the lengths of the image stream and the audio stream for each unit to be multiplexed by the multiplexing means.

COPYRIGHT: (C)2007,JPO&INPIT

Description

本発明は、フレーム間符号化を用いた動画像データとブロック符号化を用いたオーディオデータとを記録及び再生する技術に関するものである。 The present invention relates to a technique for recording and reproducing moving image data using interframe coding and audio data using block coding.

近年、デジタルビデオカメラやデジタルカメラ等、動画像データをＭＰＥＧ等のフレーム間予測符号化を用いて符号化して記録再生する装置が普及している。 In recent years, devices such as digital video cameras and digital cameras that record and reproduce moving image data by encoding using inter-frame predictive encoding such as MPEG have become widespread.

また、デジタルビデオカメラやデジタルカメラ等で撮影されたＭＰＥＧ（ＭＰＥＧ２／ＭＰＥＧ４）の動画像データやオーディオデータはＭＰ４という汎用ファイルフォーマットで記録されている。これらのデータをＭＰ４ファイルとして記録することで、他の機器で再生できる互換性が保証される。尚、ＭＰ４は、MPEG audio Layer-4の略称である。 In addition, MPEG (MPEG2 / MPEG4) moving image data and audio data shot by a digital video camera, a digital camera, or the like are recorded in a general-purpose file format called MP4. By recording these data as MP4 files, compatibility that can be reproduced by other devices is guaranteed. MP4 is an abbreviation for MPEG audio Layer-4.

ＭＰ４ファイルは、符号化された画像データ及びオーディオデータのストリームが入っているmdat boxと各ストリームに関する情報が入っているmoov boxとからなる。mdat boxは更にチャンク(chunk)で構成され、例えば、動画像のチャンクとオーディオのチャンクとから構成される。 The MP4 file includes an mdat box containing encoded image data and audio data streams and a moov box containing information about each stream. The mdat box is further composed of chunks, for example, a moving image chunk and an audio chunk.

moov boxには、ファイルの各先頭から各チャンクへのオフセットバイト数や、サイズ、サンプル数等が格納される。 The moov box stores, for each chunk, the offset in bytes from the head of the file, the size, the number of samples, and the like.

ＭＰ４ファイルを再生する場合、記録媒体からＭＰ４ファイルのmoov boxを読み込み、そのmoov boxに基づいて動画像データのチャンク、オーディオデータのチャンクへのアクセスができるようになる。 When playing back an MP4 file, the moov box of the MP4 file is read from the recording medium, and the chunk of moving image data and the chunk of audio data can be accessed based on the moov box.

しかし、moov boxのデータはチャンク数に比例してそのサイズが増大する。つまり、記録時間が長くなればそれに比例してサイズが増大するものであり、ストリームへのアクセス情報であるmoov boxデータを全てデジタルビデオカメラやデジタルカメラ等の携帯型小型撮影機器に読み込むためには大きなメモリを必要とする。 However, the size of the moov box data increases in proportion to the number of chunks. In other words, the size grows in proportion to the recording time, so a large memory is required to read all of the moov box data, which is the access information for the streams, into a small portable imaging device such as a digital video camera or digital camera.

このような問題に対処するために、moov boxのデータとは別にストリームにアクセスする情報を選別して別データとし、ＭＰ４ファイルにリンクする情報と共に別ファイルとして記録しておく技術が提案されている。この技術によれば、再生時にその別ファイルからＭＰ４のストリームデータへアクセスすることで、再生時にストリームへのアクセス情報を展開するメモリを削減している（例えば、特許文献１参照）。 To deal with this problem, a technique has been proposed in which the information for accessing the streams is selected separately from the moov box data and recorded as a separate file together with information linking it to the MP4 file. With this technique, the MP4 stream data is accessed through the separate file during reproduction, which reduces the memory needed to expand the stream access information (see, for example, Patent Document 1).

ところが、上記のようにmdat box内のストリームデータへのアクセス情報を選別して、アクセス情報の縮小を図っても、記録時間の増大に伴って大きくなることには変わりない。また、アクセスデータを別ファイルとして記録しておくので、ＭＰ４ファイルを別の記録媒体にコピーした場合には使用できない。 However, even if the access information for the stream data in the mdat box is selected as described above in order to reduce its size, it still grows as the recording time increases. Further, since the access data is recorded as a separate file, it cannot be used when the MP4 file is copied to another recording medium.
特開２００４−１２８９３８号公報 JP 2004-128938 A

上述したようにＭＰ４ファイルを再生する場合、データストリームへのアクセス情報を格納するためには大きなメモリ容量を必要とするという問題がある。 As described above, when an MP4 file is reproduced, there is a problem that a large memory capacity is required to store access information to the data stream.

また、記録媒体に記録されている１つのＭＰ４ファイルからmoov box内のデータストリームへのアクセス情報を一度に機器内に取り込めない場合には、以下のように対処しなければならない。即ち、ＭＰ４ファイルデータを再生していくに従い、読み込んだアクセス情報の切れ目が近づくと、最初に読み込んだアクセス情報を破棄し、切れ目に続くアクセス情報をファイルの先頭にシークしてmoov boxから読み込んで取得しなければならない。しかし、記録レートの大きいファイルや、記録時間の長いファイルの後ろのほうを再生している場合や、特殊再生（サーチ）している場合等でリアルタイムにシームレスに再生できなくなるという問題がある。 Further, when the access information for the data streams in the moov box of a single MP4 file recorded on the recording medium cannot be loaded into the device all at once, it must be handled as follows. As reproduction of the MP4 file data proceeds and the end of the loaded access information approaches, the access information loaded first is discarded, and the access information that follows must be obtained by seeking to the head of the file and reading it from the moov box. As a result, real-time seamless reproduction becomes impossible in cases such as reproducing a file with a high recording rate, reproducing the latter part of a file with a long recording time, or performing special reproduction (search).

本発明は、上記課題に鑑みてなされ、その目的は、ファイルとして記録された画像データやオーディオデータの再生時やサーチ時に必要なメモリ容量を削減し、データアクセスを高速化できる技術を実現することである。 The present invention has been made in view of the above problems, and its object is to realize a technique that reduces the memory capacity required when reproducing or searching image data or audio data recorded as a file and that speeds up data access.

上記課題を解決し、目的を達成するために、本発明は、画像データ及びオーディオデータをファイルとして記録する装置において、画像データを符号化する画像データ符号化手段と、前記画像データ符号化手段を制御する画像データ符号量制御手段と、オーディオデータを符号化するオーディオデータ符号化手段と、前記オーディオデータ符号化手段を制御するオーディオデータ符号量制御手段と、前記画像データ符号化手段によって符号化された画像データをストリーム化する画像ストリーム生成手段と、前記オーディオデータ符号化手段によって符号化されたオーディオデータをストリーム化するオーディオストリーム生成手段と、それぞれ予め決められた期間分の前記画像データ及びオーディオデータを単位として、前記生成された画像ストリーム及びオーディオストリームを多重する多重化手段と、前記多重化手段によって多重されたストリームを記録媒体に記録する記録手段とを有し、前記画像データ符号量制御手段及び前記オーディオデータ符号量制御手段は、前記多重化手段によって多重化する単位毎に前記画像ストリーム及びオーディオストリームを夫々固定長化するように前記画像データ符号化手段及びオーディオデータ符号化手段を制御する。 In order to solve the above problems and achieve the object, the present invention provides an apparatus for recording image data and audio data as a file, comprising: image data encoding means for encoding image data; image data code amount control means for controlling the image data encoding means; audio data encoding means for encoding audio data; audio data code amount control means for controlling the audio data encoding means; image stream generating means for forming the image data encoded by the image data encoding means into a stream; audio stream generating means for forming the audio data encoded by the audio data encoding means into a stream; multiplexing means for multiplexing the generated image stream and audio stream in units of the image data and audio data for respective predetermined periods; and recording means for recording the stream multiplexed by the multiplexing means on a recording medium, wherein the image data code amount control means and the audio data code amount control means control the image data encoding means and the audio data encoding means so that the image stream and the audio stream each have a fixed length for each unit multiplexed by the multiplexing means.

また、本発明は、記録媒体にファイルとして記録された画像データ及びオーディオデータを再生する再生手段と、前記再生手段を制御する再生制御手段と、前記記録媒体に記録されているヘッダの情報を解析するヘッダ情報解析手段とを有し、前記ヘッダ情報解析手段が前記記録媒体に記録されているヘッダ内に固有のユーザーデータを検出した場合、当該ヘッダ内に記録されているストリームへのアクセスデータの読み込みを禁止し、前記再生制御手段は、前記ヘッダ情報解析手段で解析されたファイルに記録されているストリームのパラメータ情報に基づいて、前記再生手段を制御して前記記録媒体にファイルとして記録された画像データ及びオーディオデータを再生する。 The present invention also provides an apparatus comprising: reproducing means for reproducing image data and audio data recorded as a file on a recording medium; reproduction control means for controlling the reproducing means; and header information analysis means for analyzing header information recorded on the recording medium, wherein, when the header information analysis means detects unique user data in the header recorded on the recording medium, reading of the stream access data recorded in that header is prohibited, and the reproduction control means controls the reproducing means, based on the stream parameter information recorded in the file and analyzed by the header information analysis means, to reproduce the image data and audio data recorded as a file on the recording medium.

本発明は、画像データ及びオーディオデータをファイルとして記録する方法であって、画像データの符号化を制御する画像データ符号化制御工程と、オーディオデータの符号化を制御するオーディオデータ符号化制御工程と、符号化された前記画像データをストリーム化する画像ストリーム生成工程と、符号化された前記オーディオデータをストリーム化するオーディオストリーム生成工程と、それぞれ予め決められた期間分の前記画像データ及びオーディオデータを単位として、前記生成された画像ストリーム及びオーディオストリームを多重する多重化工程と、多重された前記ストリームを記録媒体に記録する記録工程とを有し、前記画像データ符号化制御工程及び前記オーディオデータ符号化制御工程では、前記多重化工程にて多重化する単位毎に前記画像ストリーム及びオーディオストリームを夫々固定長化するように前記画像データ及び前記オーディオデータを符号化する。 The present invention also provides a method for recording image data and audio data as a file, comprising: an image data encoding control step of controlling encoding of image data; an audio data encoding control step of controlling encoding of audio data; an image stream generation step of forming the encoded image data into a stream; an audio stream generation step of forming the encoded audio data into a stream; a multiplexing step of multiplexing the generated image stream and audio stream in units of the image data and audio data for respective predetermined periods; and a recording step of recording the multiplexed stream on a recording medium, wherein, in the image data encoding control step and the audio data encoding control step, the image data and the audio data are encoded so that the image stream and the audio stream each have a fixed length for each unit multiplexed in the multiplexing step.

また、本発明は、記録媒体にファイルとして記録された画像データ及びオーディオデータを再生する再生方法であって、前記記録媒体に記録されているヘッダの情報を解析するヘッダ情報解析工程と、前記ヘッダ情報解析工程により前記記録媒体に記録されているヘッダ内に固有のユーザーデータを検出した場合、当該ヘッダ内に記録されているストリームへのアクセスデータの読み込みを禁止し、前記ヘッダ情報解析工程で解析されたファイルに記録されているストリームのパラメータ情報に基づいて、前記記録媒体に記録された画像データ及びオーディオデータを再生する再生工程とを有する。 The present invention also provides a reproduction method for reproducing image data and audio data recorded as a file on a recording medium, comprising: a header information analysis step of analyzing header information recorded on the recording medium; and a reproduction step of, when unique user data is detected in the header recorded on the recording medium in the header information analysis step, prohibiting reading of the stream access data recorded in that header and reproducing the image data and audio data recorded on the recording medium based on the stream parameter information recorded in the file and analyzed in the header information analysis step.

本発明によれば、ファイルとして記録された画像データやオーディオデータの再生時やサーチ時に必要なメモリ容量を削減し、データアクセスを高速化できる。 According to the present invention, it is possible to reduce the memory capacity required for reproduction or search of image data and audio data recorded as a file, and to speed up data access.

以下に、添付図面を参照して本発明を実施するための最良の形態について詳細に説明する。 The best mode for carrying out the present invention will be described below in detail with reference to the accompanying drawings.

尚、以下に説明する実施の形態は、本発明の実現手段としての一例であり、本発明が適用される装置の構成や各種条件によって適宜修正又は変更されるべきものであり、本発明は以下の実施の形態に限定されるものではない。 The embodiment described below is one example of means for realizing the present invention and should be modified or changed as appropriate according to the configuration and various conditions of the apparatus to which the present invention is applied; the present invention is not limited to the following embodiment.

［記録装置の説明］ [Description of recording device]
図３は、本発明に係る実施形態の記録装置の構成を示すブロック図である。 FIG. 3 is a block diagram showing the configuration of the recording apparatus according to the embodiment of the present invention.

図３において、３０１は画像データの入力端子である。３０２は画像符号化回路である。３０３は画像とデータとオーディオデータのストリームを多重化する多重化回路である。３０４は記録回路である。３０５はオーディオデータの入力端子である。３０６はオーディオの符号化回路である。３０７はファイルのヘッダの付加回路である。３０８は記録時の画質等のパラメータの入力端子である。３０９は符号量設定回路である。３１０はチャンク構成設定回路である。３１１は記録媒体である。 In FIG. 3, reference numeral 301 denotes an image data input terminal. Reference numeral 302 denotes an image encoding circuit. Reference numeral 303 denotes a multiplexing circuit that multiplexes the image data and audio data streams. Reference numeral 304 denotes a recording circuit. Reference numeral 305 denotes an audio data input terminal. Reference numeral 306 denotes an audio encoding circuit. Reference numeral 307 denotes a file header addition circuit. Reference numeral 308 denotes an input terminal for parameters such as image quality during recording. Reference numeral 309 denotes a code amount setting circuit. Reference numeral 310 denotes a chunk configuration setting circuit. Reference numeral 311 denotes a recording medium.

本実施形態では、動画像データ及び音声データをＭＰ４ファイル形式で記録再生する。 In this embodiment, moving image data and audio data are recorded and reproduced in the MP4 file format.

ここで、ＭＰ４ファイルの構成について説明する。 Here, the configuration of the MP4 file will be described.

ＭＰ４ファイルはQuicktime（登録商標）ファイルフォーマットに基づいている。Quicktime（登録商標）ファイルフォーマットは、図９に示すように、box（又はatom）と呼ばれる基本単位で構成される。 The MP4 file is based on the Quicktime (registered trademark) file format. As shown in FIG. 9, the Quicktime (registered trademark) file format is composed of basic units called boxes (or atoms).

boxは、図９に示すようにboxのサイズを示す４バイトのsizeフィールドと、boxのタイプを示す４バイトのtypeフィールドと、それに続くdataフィールドとからなる。 As shown in FIG. 9, the box includes a 4-byte size field indicating the size of the box, a 4-byte type field indicating the type of the box, and a subsequent data field.
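
The box layout described above can be illustrated with a minimal Python sketch. It assumes, as in the QuickTime and MP4 formats, that the size field is big-endian and counts the 8 header bytes as well as the data; the function names and the sample bytes are hypothetical and are not taken from the patent.

```python
import struct

def read_box_header(buf: bytes, offset: int = 0):
    """Return (size, box_type, data_offset) for the box starting at offset."""
    # 4-byte big-endian size field followed by a 4-byte type field.
    size, box_type = struct.unpack_from(">I4s", buf, offset)
    return size, box_type.decode("ascii"), offset + 8

def iter_boxes(buf: bytes):
    """Yield (box_type, data) for each box laid end to end in buf."""
    offset = 0
    while offset + 8 <= len(buf):
        size, box_type, data_offset = read_box_header(buf, offset)
        if size < 8:
            # The special size values 0 and 1 are not handled in this sketch.
            break
        yield box_type, buf[data_offset:offset + size]
        offset += size

# Hypothetical sample: an mdat box with 4 data bytes, then an empty moov box.
sample = (struct.pack(">I4s", 12, b"mdat") + b"\x00\x01\x02\x03"
          + struct.pack(">I4s", 8, b"moov"))
boxes = list(iter_boxes(sample))
```

Because every box starts with its own size, a reader can skip a box it does not need by advancing `size` bytes without inspecting the data field.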

ＭＰ４ファイルは基本的には図７（ａ）に示すように、符号化された画像データ及びオーディオデータのストリームが入っているmdat boxと、各ストリームに関する情報が入っているmoov boxとからなる。 As shown in FIG. 7A, the MP4 file basically includes an mdat box containing encoded image data and audio data streams, and a moov box containing information about each stream.

mdat boxの中は更に、図７（ｂ）に示すようにチャンク(chunk)で構成され、この例の場合には動画像のチャンクvideo chunk v(n)とオーディオのチャンクaudio chunk a(n)とから構成される。 The mdat box is further composed of chunks, as shown in FIG. 7B. In this example, it consists of video chunks v(n) and audio chunks a(n).

動画像及びオーディオの各チャンクは、更に図７（ｄ）、（ｆ）に示すようにサンプル（sample）から構成される。図７（ｄ）は動画像のchunk v(1)がsample sv(1)、sv(2)、・・・、sv(K)から構成されることを示している。また、図７（ｆ）はオーディオのchunk a(1)がsample sa(1)、sa(2)、・・・、sa(M)から構成されることを示している。 Each video and audio chunk is further composed of samples, as shown in FIGS. 7(d) and 7(f). FIG. 7(d) shows that the video chunk v(1) is composed of samples sv(1), sv(2), ..., sv(K). FIG. 7(f) shows that the audio chunk a(1) is composed of samples sa(1), sa(2), ..., sa(M).

上記各sampleは、例えば動画像の場合、図７（ｅ）に示すように、sample sv(1)、sv(2)、sv(3)、sv(4)、・・・に対してＩ₀、Ｂ₋₂、Ｂ₋₁、Ｐ₃、・・・の符号化されたＭＰＥＧストリームが対応する。ここでＩₙはイントラ符号化（フレーム内符号化）されたフレーム画像データである。Ｂₙは双方向から参照して符号化（フレーム間符号化）されるフレーム画像データである。Ｐₙは一方向（順方向）から参照して符号化（フレーム間符号化）されるフレーム画像データである。上記画像データはいずれも可変長データである。 For example, in the case of a moving image, as shown in FIG. 7(e), the encoded MPEG streams I₀, B₋₂, B₋₁, P₃, ... correspond to the samples sv(1), sv(2), sv(3), sv(4), ..., respectively. Here, Iₙ is frame image data that has been intra-coded (intra-frame coded). Bₙ is frame image data that is coded (inter-frame coded) with reference to both directions. Pₙ is frame image data that is coded (inter-frame coded) with reference to one direction (forward). All of this image data is variable-length data.

オーディオデータの各sampleは、図７（ｇ）に示すように、オーディオデータの符号化の単位であるフレームに対応付けられる。 Each sample of audio data is associated with a frame, which is a unit for encoding audio data, as shown in FIG.

図７（ａ）のmoov boxは、図７（ｃ）に示すように、作成日時等を入れるヘッダ情報からなるmvhd boxを有する。また、moov boxは、mdat boxに格納されるストリームデータの情報を入れる動画像及びオーディオの各trak box(video)とtrak box(audio)を有する。 As shown in FIG. 7C, the moov box in FIG. 7A has an mvhd box made up of header information into which the creation date and time is entered. Also, the moov box has a trak box (video) and a trak box (audio) for moving images and audio into which information of stream data stored in the mdat box is stored.

trak boxに格納されるboxは、図８（ａ）に示すようなboxであるが、ここでは本発明に関係するstsc、stsz、stcoの各boxについてのみ説明する。 The box stored in the trak box is a box as shown in FIG. 8A. Here, only the stsc, stsz, and stco boxes related to the present invention will be described.

stco boxは、図８（ｂ）に示すように、ストリームの各チャンクへのオフセットアドレス値を格納する。図８（ａ）、（ｂ）は動画像のstco boxのみを示しているが、stco boxは動画像、オーディオの各trackに１つずつ存在する。 As shown in FIG. 8B, the stco box stores an offset address value for each chunk of the stream. FIGS. 8A and 8B show only the stco box of the moving image, but one stco box exists for each track of the moving image and audio.

stsz boxは、図８（ｃ）に示すように、ストリームの各サンプルのサイズを示す情報を格納する。図８（ａ）、（ｃ）は動画像のstsz boxのみを示しているが、stsz boxは動画像とオーディオの各trackに１つずつ存在する。 As shown in FIG. 8C, the stsz box stores information indicating the size of each sample of the stream. FIGS. 8A and 8C show only the stsz box of the moving image, but one stsz box exists for each track of the moving image and the audio.

stsc boxは、図８（ｄ）に示すように、各チャンクのサンプル数を示す情報を格納する。図８（ａ）、（ｄ）)は動画像のstsc boxのみを示しているが、stsc boxは動画像とオーディオの各trackに１つずつ存在する。 As shown in FIG. 8D, the stsc box stores information indicating the number of samples of each chunk. FIGS. 8A and 8D show only the stsc box of the moving image, but there is one stsc box for each track of the moving image and the audio.

尚、上記stco、stsz、stscの各boxに格納されるデータは記録時間に伴って増大していく。 Note that the data stored in the boxes of stco, stsz, and stsc increase with the recording time.

例えば秒間30フレーム／秒の画像を１５フレーム毎に１チャンクに格納するようにしたとして、１時間記録したとすると、動画像の総チャンク数がおよそ、2×60×60=7200となる。 For example, if images at 30 frames per second are stored with 15 frames per chunk and recorded for one hour, the total number of video chunks is approximately 2×60×60=7200.

各チャンクへのオフセットアドレス情報を格納するstco boxは、2×60×60×4=28800(byte)となり、およそ２８８００バイト（約２８kbyte）ほどのデータエリアを必要とする。 The stco box that stores offset address information for each chunk is 2 × 60 × 60 × 4 = 28800 (bytes), and requires a data area of about 28800 bytes (about 28 kbytes).

また、各サンプルのサイズを示す情報を格納するstsz boxは、30×60×60×4=432000(byte)となり、およそ４３２０００バイト（約４２２kbyte）ほどのデータエリアを必要とする。 The stsz box for storing information indicating the size of each sample is 30 × 60 × 60 × 4 = 432000 (bytes), and requires a data area of about 432000 bytes (about 422 kbytes).

また、各チャンクのサンプル数を示す情報を格納するstsc boxは、2×60×60×12=86400(byte)となり、およそ８６４００バイト（約８４kbyte）ほどのデータエリアを必要とする。 The stsc box that stores information indicating the number of samples of each chunk is 2 × 60 × 60 × 12 = 86400 (bytes), and requires a data area of approximately 86400 bytes (approximately 84 kbytes).
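
The video figures above can be reproduced with a short Python sketch. The per-entry sizes (4 bytes per stco entry, 4 bytes per stsz entry, 12 bytes per stsc entry) are taken from the arithmetic in the text; the function name and dictionary keys are hypothetical.

```python
def video_table_sizes(fps=30, frames_per_chunk=15, seconds=3600):
    """Estimate the sizes in bytes of the video stco, stsz and stsc tables."""
    samples = fps * seconds               # one sample per frame
    chunks = samples // frames_per_chunk  # 2 chunks per second in the example
    return {
        "chunks": chunks,
        "stco": chunks * 4,    # one 4-byte offset per chunk
        "stsz": samples * 4,   # one 4-byte size per sample
        "stsc": chunks * 12,   # one 12-byte entry per chunk
    }

video = video_table_sizes()
print(video)
```

All three tables scale linearly with `seconds`, which is the growth with recording time that the text describes.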

オーディオデータに関しては、例えばＡＡＣを用いてオーディオデータを圧縮するとする。サンプリング周波数を４８kHz、フレームサイズ（サンプルサイズ）を１０２４オーディオサンプルとする。そして、１チャンク当たり４６フレーム（サンプル）（約１秒）とすると、１時間記録した時のオーディオの総チャンク数はおよそ、48000×60×60/(1024×46)≒3668となる。 As for audio data, for example, audio data is compressed using AAC. The sampling frequency is 48 kHz, and the frame size (sample size) is 1024 audio samples. Assuming 46 frames (samples) per chunk (about 1 second), the total number of audio chunks when recording for 1 hour is approximately 48000 × 60 × 60 / (1024 × 46) ≈3668.

stco boxは、3668×4=14672(byte) The stco box is 3668 × 4 = 14672 bytes,
stsz boxは、46×3668×4=674912(byte) the stsz box is 46 × 3668 × 4 = 674912 bytes,
stsc boxは、3668×12= 44016(byte)となり、それぞれ約１４kbyte、６５９kbyte、４２kbyteほどになる。 and the stsc box is 3668 × 12 = 44016 bytes; these are about 14 kbytes, 659 kbytes, and 42 kbytes, respectively.

上記からＭＰ４ファイルとして記録していく場合、１時間で１．２Ｍｂｙｔｅ以上のmoov boxの容量を必要とする。 From the above, recording as an MP4 file requires a moov box of 1.2 Mbytes or more per hour of recording.
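
As a check on the audio figures and the overall total, the following Python sketch repeats the arithmetic from the text. The video table sizes are copied from the video example above; the function and variable names are hypothetical.

```python
def audio_table_sizes(sample_rate=48000, frame_size=1024,
                      frames_per_chunk=46, seconds=3600):
    """Estimate the sizes in bytes of the audio stco, stsz and stsc tables."""
    # Whole chunks recorded in the given time (46 AAC frames is about 1 second).
    chunks = sample_rate * seconds // (frame_size * frames_per_chunk)
    return {
        "chunks": chunks,
        "stco": chunks * 4,
        "stsz": frames_per_chunk * chunks * 4,
        "stsc": chunks * 12,
    }

# stco + stsz + stsc for video, as computed in the text.
VIDEO_TABLE_BYTES = 28800 + 432000 + 86400

audio = audio_table_sizes()
total_bytes = (VIDEO_TABLE_BYTES
               + audio["stco"] + audio["stsz"] + audio["stsc"])
print(audio, total_bytes, total_bytes / 2**20)
```

The six tables add up to 1,280,800 bytes for one hour, which is consistent with the figure of 1.2 Mbytes or more stated in the text.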

［記録動作及び再生動作の説明］ [Description of recording and playback operations]
先ず、具体的な記録動作について説明する。 First, a specific recording operation will be described.

操作者の設定により、入力端子３０８から画質の記録パラメータ、若しくは記録媒体３１１への所望する記録時間のパラメータが符号量設定回路３０９に入力される。 Depending on the setting of the operator, a recording parameter for image quality or a parameter for a desired recording time to the recording medium 311 is input to the code amount setting circuit 309 from the input terminal 308.

更に、チャンク構成設定回路３１０から画像とオーディオデータのチャンク構成の情報が符号量設定回路３０９、画像符号化回路３０２、オーディオ符号化回路３０６、ヘッダ付加回路３０７に送られる。 Further, information on the chunk configuration of the image and audio data is sent from the chunk configuration setting circuit 310 to the code amount setting circuit 309, the image encoding circuit 302, the audio encoding circuit 306, and the header addition circuit 307.

ここでチャンク構成設定回路３１０は例えば図７（ｄ）、（ｆ）に示すように、動画像データをＭＰＥＧ２若しくはＭＰＥＧ４で圧縮し、Ｋフレームで動画像データの１チャンクとする。また、オーディオデータはＭＰＥＧ２‐ＡＡＣで圧縮し、Ｍフレームで１チャンクを構成するように符号量設定回路３０９、画像符号化回路３０２、オーディオ符号化回路３０６、ヘッダ付加回路３０７に設定情報を送る。本実施形態では、動画像データの１チャンクをＭＰＥＧ符号化におけるＧＯＰ(Group of Picture)の整数倍のフレーム数とする。 Here, for example, as shown in FIGS. 7(d) and 7(f), the chunk configuration setting circuit 310 sends setting information to the code amount setting circuit 309, the image encoding circuit 302, the audio encoding circuit 306, and the header addition circuit 307 so that the moving image data is compressed with MPEG2 or MPEG4 with K frames forming one video chunk, and the audio data is compressed with MPEG2-AAC with M frames forming one audio chunk. In the present embodiment, the number of frames in one chunk of moving image data is an integer multiple of the GOP (Group of Pictures) length used in MPEG encoding.

符号量設定回路３０９は、入力端子３０８から入力されたパラメータとチャンク構成設定回路３１０から供給されるチャンクの構成情報とから画像及びオーディオデータの１チャンク当たりの符号量を決める。 The code amount setting circuit 309 determines the code amount per chunk of image and audio data from the parameters input from the input terminal 308 and the chunk configuration information supplied from the chunk configuration setting circuit 310.

例えば図５（ａ）に示すように、画像データ１チャンクの符号量をＶＬ、オーディオデータ１チャンクの符号量をＡＬとし、画像データ１チャンクとオーディオデータ１チャンクの合計の符号量ＶＬ＋ＡＬ＝Ｌfixと決める。図５（ａ）では、１ＧＯＰを１５フレームとし、１チャンクを２ＧＯＰ分、即ち、Ｋ＝３０フレームとしている。 For example, as shown in FIG. 5A, let the code amount of one chunk of image data be VL and the code amount of one chunk of audio data be AL; the total code amount of one image data chunk and one audio data chunk is then set to VL + AL = Lfix. In FIG. 5A, 1 GOP is 15 frames, and 1 chunk is 2 GOPs, that is, K = 30 frames.

Ｌfixは少なくとも記録開始から記録停止まで、固定の値として設定される。 Lfix is set as a fixed value at least from recording start to recording stop.
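
One consequence of a constant Lfix can be sketched in Python: the position of any chunk follows from arithmetic rather than from a per-chunk offset table. This is only an illustration under stated assumptions, namely that each video chunk is immediately followed by its audio chunk and that the first video chunk starts at a known byte offset. Neither the layout nor the function name is specified in this part of the text.

```python
def chunk_offsets(n, first_chunk_offset, vl, al):
    """Return (video_offset, audio_offset) of the n-th chunk pair (0-based).

    Assumes the layout v(1), a(1), v(2), a(2), ... with fixed lengths
    VL and AL, so each pair occupies Lfix = VL + AL bytes.
    """
    lfix = vl + al
    video_offset = first_chunk_offset + n * lfix
    audio_offset = video_offset + vl
    return video_offset, audio_offset

# Hypothetical numbers: the first chunk starts 8 bytes into the file,
# VL = 1000 bytes, AL = 100 bytes.
print(chunk_offsets(3, 8, 1000, 100))
```

Under these assumptions only `first_chunk_offset`, VL and AL need to be held in memory, regardless of the recording time.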

符号量設定回路３０９は決定した１チャンク当たりの画像データ符号量ＶＬ、オーディオデータ符号量ＡＬを画像符号化回路３０２、オーディオ符号化回路３０６、ヘッダ付加回路３０７に設定する。 The code amount setting circuit 309 sets the determined image data code amount VL and audio data code amount AL per chunk in the image encoding circuit 302, the audio encoding circuit 306, and the header addition circuit 307.

次に記録が開始されると、画像符号化回路３０２、オーディオ符号化回路３０６は画像データ及びオーディオデータの符号化を開始する。 Next, when recording is started, the image encoding circuit 302 and the audio encoding circuit 306 start encoding image data and audio data.

ここで、図１を用いて画像符号化回路３０２の詳細について説明する。 Here, the details of the image encoding circuit 302 will be described with reference to FIG.

図１において、１０１は画像データの入力端子である。１０２はカメラ信号処理回路である。１０３は符号化のためのフレーム順の並べ替えを行う並べ替え回路である。１０４は加算回路である。１０５はスイッチ回路である。１０６はＤＣＴ回路である。１０７は量子化回路である。１０８は逆量子化回路である。１０９は逆ＤＣＴ回路である。１１０は加算回路である。１１１はメモリである。１１２は動き補償回路である。１１３はスイッチ回路である。１１４は可変長符号化回路である。１１５はストリーム生成回路である。１１６はバッファである。１１７は符号量制御回路である。１１８はストリームデータの出力端子である。１１９はストリームパラメータ付加回路である。１２０はヘッダ情報生成回路である。１２１はヘッダに付加する情報の出力端子である。１２２は符号量設定回路である。１２３は記録制御回路である。１２４は記録制御回路への制御データ入力端子である。 In FIG. 1, reference numeral 101 denotes an image data input terminal. Reference numeral 102 denotes a camera signal processing circuit. Reference numeral 103 denotes a rearrangement circuit that rearranges the frames in order for encoding. Reference numeral 104 denotes an adder circuit. Reference numeral 105 denotes a switch circuit. Reference numeral 106 denotes a DCT circuit. Reference numeral 107 denotes a quantization circuit. Reference numeral 108 denotes an inverse quantization circuit. Reference numeral 109 denotes an inverse DCT circuit. Reference numeral 110 denotes an adder circuit. Reference numeral 111 denotes a memory. Reference numeral 112 denotes a motion compensation circuit. Reference numeral 113 denotes a switch circuit. Reference numeral 114 denotes a variable length coding circuit. Reference numeral 115 denotes a stream generation circuit. Reference numeral 116 denotes a buffer. Reference numeral 117 denotes a code amount control circuit. Reference numeral 118 denotes an output terminal for stream data. Reference numeral 119 denotes a stream parameter addition circuit. Reference numeral 120 denotes a header information generation circuit. Reference numeral 121 denotes an output terminal for information to be added to the header. Reference numeral 122 denotes a code amount setting circuit. Reference numeral 123 denotes a recording control circuit. Reference numeral 124 denotes a control data input terminal to the recording control circuit.

入力端子１２４から符号量設定回路３０９と、チャンク構成設定回路３１０からの記録に関するパラメータであるチャンク当たりの符号量ＶＬ及びチャンク構成を示すパラメータＫが入力される。記録制御回路１２３は入力されたチャンク当たりの符号量ＶＬとチャンク構成を示すパラメータＫを符号量設定回路１２２に供給する。符号量設定回路１２２はチャンク当たりのＧＯＰ数ｇｎを算出し、ＧＯＰがｇｎ個で符号量ＶＬとなるように符号量制御回路１１７に設定すると共に、ストリームパラメータ付加回路１１９にも供給する。 The recording parameters from the code amount setting circuit 309 and the chunk configuration setting circuit 310, namely the code amount VL per chunk and the parameter K indicating the chunk configuration, are input from the input terminal 124. The recording control circuit 123 supplies the input code amount VL per chunk and the parameter K to the code amount setting circuit 122. The code amount setting circuit 122 calculates the number of GOPs per chunk, gn, configures the code amount control circuit 117 so that gn GOPs together amount to the code amount VL, and also supplies these values to the stream parameter addition circuit 119.
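
A minimal Python sketch of this calculation follows. The text only requires that gn GOPs together amount to VL; the even split per GOP, with the remainder given to the last GOP, is an assumption made for illustration, as are the function names and the 15-frame default GOP length taken from the FIG. 5A example.

```python
def gops_per_chunk(k_frames, gop_frames=15):
    """Number of GOPs gn in a chunk of K frames (K must be a multiple of the GOP length)."""
    if k_frames % gop_frames != 0:
        raise ValueError("K must be an integer multiple of the GOP length")
    return k_frames // gop_frames

def per_gop_budget(vl, gn):
    """Split the chunk code amount VL over gn GOPs so the total is exactly VL.

    Assumption: an even split, with the remainder assigned to the last GOP.
    """
    base, rem = divmod(vl, gn)
    return [base] * (gn - 1) + [base + rem]

gn = gops_per_chunk(30)            # K = 30 frames, 15-frame GOP
budget = per_gop_budget(1_000_001, gn)
print(gn, budget)
```

Whatever split is used, the constraint that matters for the fixed-length chunk is that the budgets sum to VL.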

記録が開始されると、不図示の撮像素子で撮像された画像データが入力端子１０１から入力され、カメラ信号処理回路１０２でカメラ信号処理され、輝度信号／色差信号として並べ替え回路１０３に出力される。 When recording is started, image data captured by an image sensor (not shown) is input from the input terminal 101, processed by the camera signal processing circuit 102, and output to the rearrangement circuit 103 as a luminance signal and color difference signals.

並べ替え回路１０３では、図５（ｂ）に示す時間軸ｔに対してカメラ信号処理回路１０２から各フレーム画像Ｆ₋₂、Ｆ₋₁、Ｆ₀、Ｆ₁、Ｆ₂、Ｆ₃、Ｆ₄、Ｆ₅、Ｆ₆、・・・が出力される。そして、並べ替え回路１０３は、これらの画像フレームをＦ₀、Ｆ₋₂、Ｆ₋₁、Ｆ₃、Ｆ₁、Ｆ₂、Ｆ₆、Ｆ₄、Ｆ₅、・・・の順に並べ替える。また、並べ替え回路１０３は、フレーム内の画像データを符号化する単位の小ブロックに分割する処理を行い、加算回路１０４、スイッチ回路１０５に出力する。 The frame images F₋₂, F₋₁, F₀, F₁, F₂, F₃, F₄, F₅, F₆, ... are output from the camera signal processing circuit 102 to the rearrangement circuit 103 along the time axis t shown in FIG. 5B. The rearrangement circuit 103 rearranges these image frames into the order F₀, F₋₂, F₋₁, F₃, F₁, F₂, F₆, F₄, F₅, .... The rearrangement circuit 103 also divides the image data in each frame into small blocks, which are the units of encoding, and outputs them to the addition circuit 104 and the switch circuit 105.
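
The reordering from display order to coding order can be sketched in Python as follows. The sketch assumes, based only on the example sequence in the text, that every third frame (index divisible by 3) is a reference frame (I or P) and that the two frames before it are B frames; the function name and the parameter `m` are hypothetical.

```python
def to_coding_order(display_order, m=3):
    """Reorder frame indices so each reference frame precedes its B frames.

    Assumption: frames whose index is a multiple of m are reference
    frames (I or P); the frames between them are B frames.
    """
    coded = []
    pending_b = []
    for idx in display_order:
        if idx % m == 0:
            # Emit the reference frame first, then the B frames held back.
            coded.append(idx)
            coded.extend(pending_b)
            pending_b = []
        else:
            pending_b.append(idx)
    # Any trailing B frames without a later reference frame are emitted as-is.
    coded.extend(pending_b)
    return coded

order = to_coding_order(list(range(-2, 7)))  # F-2 ... F6 in display order
print(order)
```

The B frames are held back because they are predicted from the reference frame that follows them in display order, so that reference frame must be encoded first.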

記録制御回路１２３はスイッチ回路１０５の端子を記録開始時、まず端子ａを選択するよう指示し、スイッチ回路１０５は記録制御回路１２３により端子ａを選択し、フレーム画像Ｆ₀をＤＣＴ回路１０６に送出する。ＤＣＴ回路１０６はフレーム画像Ｆ₀に対して離散コサイン変換（Discrete Cosine Transform、以下、「ＤＣＴ」）を施し、量子化回路１０７に送出する。 At the start of recording, the recording control circuit 123 first instructs the switch circuit 105 to select terminal a; the switch circuit 105 selects terminal a accordingly and sends the frame image F₀ to the DCT circuit 106. The DCT circuit 106 applies a discrete cosine transform (hereinafter, "DCT") to the frame image F₀ and sends the result to the quantization circuit 107.

量子化回路１０７では符号量制御回路１１７から設定された量子化処理で、ＤＣＴ回路１０６から供給されるＤＣＴ処理されたデータを量子化し、可変長符号化回路１１４と逆量子化回路１０８に送出する。 The quantization circuit 107 quantizes the DCT-processed data supplied from the DCT circuit 106 using the quantization processing set by the code amount control circuit 117, and sends it to the variable length coding circuit 114 and the inverse quantization circuit 108.

可変長符号化回路１１４は量子化回路１０７から供給される量子化された画像データをハフマン符号等を用いて可変長符号化する。そして可変長符号化回路１１４は、図５（ｂ）に示すように、フレーム画像Ｆ₀をフレーム内符号化データＩ₀としてストリーム生成回路１１５に送出する。 The variable length coding circuit 114 performs variable length coding on the quantized image data supplied from the quantization circuit 107 using a Huffman code or the like. Then, as shown in FIG. 5B, the variable length coding circuit 114 sends the frame image F₀ to the stream generation circuit 115 as intra-frame coded data I₀.

逆量子化回路１０８は量子化回路１０７から供給された量子化された画像データを逆量子化し、逆ＤＣＴ回路１０９に送出する。逆ＤＣＴ回路１０９は逆量子化回路１０８で逆量子化された画像データを逆離散コサイン変換（逆ＤＣＴ）し、加算回路１１０に送出する。 The inverse quantization circuit 108 inversely quantizes the quantized image data supplied from the quantization circuit 107 and sends it to the inverse DCT circuit 109. The inverse DCT circuit 109 performs inverse discrete cosine transform (inverse DCT) on the image data inversely quantized by the inverse quantization circuit 108 and sends it to the adder circuit 110.

加算回路１１０は逆ＤＣＴ回路１０９から供給される逆ＤＣＴされた画像データにスイッチ回路１１３から供給されるデータを加算してメモリ１１１に供給する。ここで、スイッチ回路１１３は記録制御回路１２３の指示により、端子ａが選択され、’０’データを加算回路１１０に供給する。従って加算回路１１０は逆ＤＣＴ回路１０９から供給されるデータと同じ値をメモリ１１１に供給し、メモリ１１１は供給された画像データを記憶する。 The adder circuit 110 adds the data supplied from the switch circuit 113 to the image data subjected to the inverse DCT supplied from the inverse DCT circuit 109 and supplies it to the memory 111. Here, the switch circuit 113 selects the terminal a in accordance with an instruction from the recording control circuit 123, and supplies the “0” data to the adder circuit 110. Therefore, the adder circuit 110 supplies the same value as the data supplied from the inverse DCT circuit 109 to the memory 111, and the memory 111 stores the supplied image data.
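
The intra-frame path described above (DCT, quantization, inverse quantization, inverse DCT, then adding '0' before storing the result as the reference) can be sketched in pure Python. This is a simplified illustration: it assumes 8×8 blocks and a single uniform quantization step, whereas MPEG encoders use quantization matrices, and all function names are hypothetical.

```python
import math

N = 8  # block size assumed for this sketch

def _basis(k, n):
    """Orthonormal DCT-II basis value for frequency k and position n."""
    scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return scale * math.cos(math.pi * (2 * n + 1) * k / (2 * N))

C = [[_basis(k, n) for n in range(N)] for k in range(N)]

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2(block):
    """2-D DCT: C * block * C^T."""
    return matmul(matmul(C, block), transpose(C))

def idct2(coef):
    """2-D inverse DCT: C^T * coef * C."""
    return matmul(matmul(transpose(C), coef), C)

def quantize(coef, step):
    return [[round(v / step) for v in row] for row in coef]

def dequantize(levels, step):
    return [[v * step for v in row] for row in levels]

def encode_intra_block(block, step):
    """Return (quantized levels, locally decoded block kept as reference)."""
    levels = quantize(dct2(block), step)
    recon = idct2(dequantize(levels, step))
    # Switch circuit 113 on terminal a supplies '0', so the adder
    # leaves the inverse-DCT output unchanged before it is stored.
    recon = [[v + 0 for v in row] for row in recon]
    return levels, recon

flat = [[128] * N for _ in range(N)]
levels, recon = encode_intra_block(flat, step=16)
```

The encoder stores the locally decoded block rather than the original so that its reference matches what a decoder will reconstruct from the quantized data.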

フレーム画像Ｆ₀の処理が終わると、引き続きフレーム画像Ｆ₋₂の画像データが並べ替え回路１０３から加算回路１０４、動き補償回路１１２に入力される。この時、スイッチ回路１０５は記録制御回路１２３の指示により、端子ｂが選択される。更に記録制御回路１２３は動き補償回路１１２を制御する。動き補償回路１１２は、記録開始時であるので図５に示すようにフレーム画像Ｆ₀のみを参照して予測誤差が最も小さくなるフレーム画像Ｆ₀のブロックをメモリ１１１から読み出し加算回路１０４に供給する。 When the processing of the frame image F₀ is completed, the image data of the frame image F₋₂ is then input from the rearrangement circuit 103 to the addition circuit 104 and the motion compensation circuit 112. At this time, the switch circuit 105 selects terminal b in accordance with an instruction from the recording control circuit 123. The recording control circuit 123 also controls the motion compensation circuit 112. Because this is the start of recording, the motion compensation circuit 112 refers only to the frame image F₀, as shown in FIG. 5, reads from the memory 111 the block of the frame image F₀ that gives the smallest prediction error, and supplies it to the addition circuit 104.

加算回路１０４は並べ替え回路１０３から供給されるフレーム画像Ｆ₋₂の画像データと動き補償回路１１２から供給されるデータを引き算し、予測誤差データとしてスイッチ回路１０５に供給する。 The adder circuit 104 subtracts the data supplied from the motion compensation circuit 112 from the image data of the frame image F₋₂ supplied from the rearrangement circuit 103, and supplies the result to the switch circuit 105 as prediction error data.
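
The block search and subtraction can be illustrated with a small Python sketch. The text only says that the block with the smallest prediction error is chosen; the sum of absolute differences (SAD) as the error measure, the exhaustive search over a small window, and all names below are assumptions for illustration.

```python
def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame given as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def best_match(ref, cur_block, top, left, search=2):
    """Search ref around (top, left); return (cost, (dy, dx), block)."""
    size = len(cur_block)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate block would fall outside the frame
            cand = get_block(ref, t, l, size)
            cost = sad(cur_block, cand)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), cand)
    return best

def prediction_error(cur_block, pred_block):
    """Current block minus the motion-compensated prediction."""
    return [[c - p for c, p in zip(rc, rp)]
            for rc, rp in zip(cur_block, pred_block)]

# Hypothetical 6x6 reference frame; the current block sits at (1, 2) and
# holds the content found at (2, 3) in the reference frame.
ref = [[r * 10 + c for c in range(6)] for r in range(6)]
cur = get_block(ref, 2, 3, 2)
cost, motion, pred = best_match(ref, cur, top=1, left=2)
residual = prediction_error(cur, pred)
```

When the match is good, the residual is close to zero, so it costs far fewer bits after the DCT and quantization than the block itself would.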

スイッチ回路１０５は加算回路１０４から供給された引き算された予測誤差データをＤＣＴ回路１０６に送出する。ＤＣＴ回路１０６は加算回路１０４から供給される予測誤差データをＤＣＴ処理して量子化回路１０７に供給する。 The switch circuit 105 sends the subtracted prediction error data supplied from the adder circuit 104 to the DCT circuit 106. The DCT circuit 106 performs DCT processing on the prediction error data supplied from the adder circuit 104 and supplies it to the quantization circuit 107.

量子化回路１０７では符号量制御回路１１７から設定された量子化処理で、ＤＣＴ回路１０６から供給されるＤＣＴ処理された予測誤差データを量子化し、可変長符号化回路１１４に送出する。 The quantization circuit 107 quantizes the DCT-processed prediction error data supplied from the DCT circuit 106 by the quantization process set by the code amount control circuit 117, and sends it to the variable-length coding circuit 114.

可変長符号化回路１１４は量子化回路１０７から供給される量子化された予測誤差データを可変長符号化する。そして、可変長符号化回路１１４は、図５に示すように、フレーム画像Ｆ_-2をフレーム間符号化データＢ_-2としてストリーム生成回路１１５に送出する。 The variable length coding circuit 114 performs variable length coding on the quantized prediction error data supplied from the quantization circuit 107. Then, as shown in FIG. 5, the variable length coding circuit 114 sends the frame image F₋₂ to the stream generation circuit 115 as interframe encoded data B₋₂.

次のフレーム画像Ｆ_-1の処理もフレーム画像Ｆ_-2と同様に処理される。処理されたフレーム画像は可変長符号化回路１１４からフレーム間符号化（双方向予測符号化）データＢ_-1としてストリーム生成回路１１５に送出される。 The next frame image F₋₁ is processed in the same manner as the frame image F₋₂. The processed frame image is sent from the variable length coding circuit 114 to the stream generation circuit 115 as interframe encoded (bidirectionally predictive encoded) data B₋₁.

次にフレーム画像Ｆ₃が並べ替え回路１０３から加算回路１０４、動き補償回路１１２に入力される。この時、スイッチ回路１０５は記録制御回路１２３の指示により、端子ｂが選択される。そして動き補償回路１１２はフレーム画像Ｆ₀を参照して予測誤差が最も小さくなるフレーム画像Ｆ₀のブロックをメモリ１１１から読み出し加算回路１０４に供給する。 Next, the frame image F₃ is input from the rearrangement circuit 103 to the adder circuit 104 and the motion compensation circuit 112. At this time, terminal b of the switch circuit 105 is selected in accordance with an instruction from the recording control circuit 123. The motion compensation circuit 112 then refers to the frame image F₀, reads from the memory 111 the block of F₀ that minimizes the prediction error, and supplies it to the adder circuit 104.

加算回路１０４は並べ替え回路１０３から供給されるフレーム画像Ｆ₃の画像データと動き補償回路１１２から供給される動き補償されたフレーム画像Ｆ₀のデータを引き算する。また、加算回路１０４は、予測誤差データとしてスイッチ回路１０５に供給する。スイッチ回路１０５は加算回路１０４から供給された引き算された予測誤差データをＤＣＴ回路１０６に送出する。ＤＣＴ回路１０６は加算回路１０４から供給される予測誤差データをＤＣＴ処理して量子化回路１０７に供給する。 The adder circuit 104 subtracts the motion-compensated data of the frame image F₀ supplied from the motion compensation circuit 112 from the image data of the frame image F₃ supplied from the rearrangement circuit 103, and supplies the difference to the switch circuit 105 as prediction error data. The switch circuit 105 sends the prediction error data supplied from the adder circuit 104 to the DCT circuit 106. The DCT circuit 106 performs DCT processing on the prediction error data and supplies it to the quantization circuit 107.

量子化回路１０７では符号量制御回路１１７から設定された量子化処理で、ＤＣＴ回路１０６から供給されるＤＣＴ処理された予測誤差データを量子化し、可変長符号化回路１１４と逆量子化回路１０８に送出する。 The quantization circuit 107 quantizes the DCT-processed prediction error data supplied from the DCT circuit 106 using the quantization process set by the code amount control circuit 117, and sends the quantized data to the variable length coding circuit 114 and the inverse quantization circuit 108.

可変長符号化回路１１４は量子化回路１０７から供給される量子化された予測誤差データを可変長符号化する。そして、可変長符号化回路１１４は、図５（ｂ）に示すように、フレーム画像Ｆ₃をフレーム間符号化（片方向予測符号化）データＰ₃としてストリーム生成回路１１５に送出する。 The variable length coding circuit 114 performs variable length coding on the quantized prediction error data supplied from the quantization circuit 107. Then, as shown in FIG. 5(b), the variable length coding circuit 114 sends the frame image F₃ to the stream generation circuit 115 as interframe encoded (unidirectionally predictive encoded) data P₃.

逆量子化回路１０８は量子化回路１０７から供給された量子化された予測誤差データを逆量子化し、逆ＤＣＴ回路１０９に送出する。逆ＤＣＴ回路１０９は逆量子化回路１０８で逆量子化された予測誤差データを逆ＤＣＴ処理し、加算回路１１０に送出する。 The inverse quantization circuit 108 inversely quantizes the quantized prediction error data supplied from the quantization circuit 107 and sends it to the inverse DCT circuit 109. The inverse DCT circuit 109 performs inverse DCT processing on the prediction error data inversely quantized by the inverse quantization circuit 108 and sends it to the adder circuit 110.

加算回路１１０は逆ＤＣＴ回路１０９から供給される逆ＤＣＴ処理された予測誤差データにスイッチ回路１１３から供給されるデータを加算してメモリ１１１に供給する。ここで、スイッチ回路１１３は記録制御回路１２３の指示により、端子ｂが選択され、動き補償回路１１２から供給される動き補償されたフレーム画像Ｆ₀のデータを加算回路１１０に供給する。従って加算回路１１０は逆ＤＣＴ回路１０９から供給される予測誤差データに動き補償回路１１２から供給される動き補償されたデータを加算してメモリ１１１に供給し、メモリ１１１は供給された画像データを記憶する。 The adder circuit 110 adds the data supplied from the switch circuit 113 to the inverse DCT processed prediction error data supplied from the inverse DCT circuit 109, and supplies the result to the memory 111. Here, terminal b of the switch circuit 113 is selected in accordance with an instruction from the recording control circuit 123, so that the switch circuit 113 supplies the motion-compensated data of the frame image F₀ from the motion compensation circuit 112 to the adder circuit 110. The adder circuit 110 therefore adds the motion-compensated data to the prediction error data and supplies the sum to the memory 111, which stores the supplied image data.
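
The local decoding loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the circuit itself: the DCT and inverse DCT stages are omitted, a uniform scalar quantizer with a hypothetical step `qstep` stands in for the quantization set by the code amount control circuit 117, and the function names are introduced only for this example. It shows why the encoder stores the locally decoded frame rather than the input frame: the reference then matches what a decoder will later reconstruct.

```python
def encode_block(current, predicted, qstep):
    """Quantize the prediction error (current - predicted) of one block."""
    residual = [c - p for c, p in zip(current, predicted)]
    return [round(r / qstep) for r in residual]


def local_decode(levels, predicted, qstep):
    """Rebuild the block as a decoder would: dequantize, then add the prediction."""
    return [level * qstep + p for level, p in zip(levels, predicted)]


QSTEP = 4  # hypothetical quantizer step

# Intra-coded frame (terminal a selected): the added data is all zeros.
block = [100, 104, 98, 120]
zeros = [0] * len(block)
ref = local_decode(encode_block(block, zeros, QSTEP), zeros, QSTEP)

# Predicted frame (terminal b selected): the prediction is the stored reference.
nxt = [101, 103, 99, 118]
recon = local_decode(encode_block(nxt, ref, QSTEP), ref, QSTEP)
```

In this sketch the reconstruction error of each sample never exceeds half the quantizer step, and it does not accumulate from frame to frame because the prediction is always taken from the reconstructed reference.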

次にフレーム画像Ｆ₁、Ｆ₂が処理される。即ち、動き補償回路１１２で、局所復号されたフレーム画像Ｆ₀、Ｆ₃の双方向から予測される以外は上記Ｆ_-2、Ｆ_-1の処理と同じであり、可変長符号化回路１１４で可変長符号化される。そして、図５（ｂ）に示すように、フレーム画像Ｆ₁、Ｆ₂はフレーム間符号化データＢ₁、Ｂ₂としてストリーム生成回路１１５に送出される。 Next, the frame images F₁ and F₂ are processed. The processing is the same as that of F₋₂ and F₋₁ described above, except that the motion compensation circuit 112 predicts bidirectionally from the locally decoded frame images F₀ and F₃; the data is then variable length coded by the variable length coding circuit 114. As shown in FIG. 5(b), the frame images F₁ and F₂ are sent to the stream generation circuit 115 as interframe encoded data B₁ and B₂.

以下、フレーム画像Ｆ₆、Ｆ₄、Ｆ₅、Ｆ₉、Ｆ₇、Ｆ₈、Ｆ₁₂、Ｆ₁₀、Ｆ₁₁と処理し、図５（ｂ）に示すようにＰ₆、Ｂ₄、Ｂ₅、Ｐ₉、Ｂ₇、Ｂ₈、Ｐ₁₂、Ｂ₁₀、Ｂ₁₁を得、ストリーム生成回路１１５に供給されていく。 Thereafter, the frame images F₆, F₄, F₅, F₉, F₇, F₈, F₁₂, F₁₀, and F₁₁ are processed in that order to obtain P₆, B₄, B₅, P₉, B₇, B₈, P₁₂, B₁₀, and B₁₁, as shown in FIG. 5(b), which are supplied in turn to the stream generation circuit 115.

ストリーム生成回路１１５では、図５（ｃ）に示すように、シーケンスヘッダ、ＧＯＰヘッダ、ピクチャヘッダ等を付加してストリームを生成し、バッファ１１６に供給する。そして、ストリーム生成回路１１５は、図５（ｂ）に示すように、符号化データとして、Ｉ₀、Ｂ_-2、Ｂ_-1、Ｐ₃、Ｂ₁、・・・、Ｐ₁₂、Ｂ₁₀、Ｂ₁₁で１ＧＯＰ（ＧＯＰ₍₁₎）を形成する。 As shown in FIG. 5(c), the stream generation circuit 115 generates a stream by adding a sequence header, a GOP header, picture headers, and the like, and supplies the stream to the buffer 116. As shown in FIG. 5(b), the encoded data I₀, B₋₂, B₋₁, P₃, B₁, ..., P₁₂, B₁₀, B₁₁ form one GOP (GOP₍₁₎).
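
The coding order of GOP₍₁₎ described above can be reproduced with a short sketch. It assumes, as the frame numbers in the text suggest, that an anchor frame (I or P) occurs every three frames and that each anchor is followed in coding order by the two B frames that precede it in display order. The function name and the parameter `m` are hypothetical.

```python
def coding_order(first_anchor, last_anchor, m=3):
    """Return (picture_type, frame_number) pairs in coding order.

    Anchor frames (I or P) occur every m frames in display order; each anchor
    is followed by the m - 1 B frames that precede it in display order.
    """
    order = []
    for anchor in range(first_anchor, last_anchor + 1, m):
        ptype = "I" if anchor == first_anchor else "P"
        order.append((ptype, anchor))
        order.extend(("B", n) for n in range(anchor - m + 1, anchor))
    return order


gop1 = coding_order(0, 12)
```

With these assumptions `gop1` holds 15 pictures, starting with I₀, B₋₂, B₋₁, P₃ and ending with P₁₂, B₁₀, B₁₁.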

ストリームパラメータ付加回路１１９は、図５（ｃ）に示すように、ストリーム生成回路１１５に、画像データの符号量ＶＬ、フレーム内符号化データＩ₀の符号化サイズＩ₀(size)をユーザーデータとして付加する。また、ストリームパラメータ付加回路１１９は、図５（ｃ）に示すように、ストリーム生成回路１１５に、次のＧＯＰのＩフレームまでのオフセット値Ｉ₀(offset)をユーザーデータとして付加する。 As shown in FIG. 5(c), the stream parameter addition circuit 119 causes the stream generation circuit 115 to add, as user data, the code amount VL of the image data and the encoded size I₀(size) of the intra-frame encoded data I₀. As also shown in FIG. 5(c), the stream parameter addition circuit 119 causes the stream generation circuit 115 to add, as user data, the offset value I₀(offset) to the I frame of the next GOP.

バッファ１１６は記録制御回路１２３の制御を受け、出力端子１１８から図３の多重化回路３０３に画像のストリームデータを送出する。 Under the control of the recording control circuit 123, the buffer 116 sends the image stream data from the output terminal 118 to the multiplexing circuit 303 of FIG. 3.

符号量制御回路１１７は符号量設定回路１２２から設定されたチャンク当たりのＧＯＰ数ｇｎとチャンク当たりの符号量ＶＬに従い、ＧＯＰがｇｎ個で符号量がＶＬとなるようにバッファ１１６の符号量を監視し、量子化回路１０７での量子化処理を制御する。 The code amount control circuit 117 monitors the code amount of the buffer 116 according to the number of GOPs per chunk gn set by the code amount setting circuit 122 and the code amount VL per chunk so that the number of GOPs is gn and the code amount is VL. Then, the quantization processing in the quantization circuit 107 is controlled.

ヘッダ情報生成回路１２０は、ストリーム生成回路１１５からＭＰ４ファイルのヘッダに必要とされるチャンクへのオフセット値、チャンク当たりのサンプル数、サンプルのサイズ等の情報を生成する。そして、ヘッダ情報生成回路１２０は、生成された情報を出力端子１２１から出力し図３のヘッダ付加回路３０７に供給する。 From the output of the stream generation circuit 115, the header information generation circuit 120 generates the information required for the header of the MP4 file, such as the offset value to each chunk, the number of samples per chunk, and the sample sizes. The header information generation circuit 120 then outputs the generated information from the output terminal 121 and supplies it to the header addition circuit 307 of FIG. 3.

次に、図２を参照してオーディオの符号化について説明する。 Next, audio encoding will be described with reference to FIG. 2.

図２において、２０１はオーディオデータの入力端子である。２０２はＭＤＣＴ回路である。２０３はＴＮＳ回路である。２０４はインテンシティステレオ回路である。２０５はＭ／Ｓステレオ回路である。２０６はスケールファクタ計算回路である。２０７は量子化回路である。２０８は可変長符号化回路である。２０９はストリーム生成回路である。２１０は出力端子である。２１１はヘッダ情報生成回路である。２１２は出力端子である。２１３は聴覚心理モデル回路である。２１４はビット割り当て回路である。２１５は記録制御回路である。２１６は入力端子である。 In FIG. 2, reference numeral 201 denotes an audio data input terminal. Reference numeral 202 denotes an MDCT circuit. Reference numeral 203 denotes a TNS circuit. Reference numeral 204 denotes an intensity stereo circuit. Reference numeral 205 denotes an M / S stereo circuit. Reference numeral 206 denotes a scale factor calculation circuit. Reference numeral 207 denotes a quantization circuit. Reference numeral 208 denotes a variable length coding circuit. Reference numeral 209 denotes a stream generation circuit. 210 is an output terminal. Reference numeral 211 denotes a header information generation circuit. Reference numeral 212 denotes an output terminal. Reference numeral 213 denotes an auditory psychological model circuit. Reference numeral 214 denotes a bit allocation circuit. Reference numeral 215 denotes a recording control circuit. Reference numeral 216 denotes an input terminal.

入力端子２１６から符号量設定回路３０９、チャンク構成設定回路３１０からの記録に関するパラメータであるチャンク当たりの符号量ＡＬとチャンク構成を示すパラメータＭが記録制御回路２１５に入力される。 The recording parameters from the code amount setting circuit 309 and the chunk configuration setting circuit 310, namely the code amount AL per chunk and the parameter M indicating the chunk configuration, are input to the recording control circuit 215 via the input terminal 216.

記録が開始されると、不図示のＡ／Ｄ変換回路等から入力端子２０１にオーディオデータが入力され、ＭＤＣＴ回路２０２、聴覚心理モデル回路２１３に供給される。 When recording is started, audio data is input to the input terminal 201 from an A / D conversion circuit (not shown) or the like, and is supplied to the MDCT circuit 202 and the psychoacoustic model circuit 213.

ＭＤＣＴ回路２０２では入力端子２０１から入力されるオーディオデータを１０２４サンプルで１フレームとして変形離散コサイン変換（以下、「ＭＤＣＴ」）を施し、ＴＮＳ回路２０３に出力する。尚、ＭＤＣＴは、Modified Discrete Cosine Transformの略称である。ＴＮＳ回路２０３ではＴＮＳ（Temporal Noise Shaping）処理を施し、インテンシティステレオ回路２０４に出力する。ＴＮＳは量子化雑音を信号波形の振幅値に応じて整形することにより音質の向上を図る処理である。ここではＭＤＣＴ係数の一部を時系列信号とみなして線形予測する。これにより、量子化雑音は信号波形の振幅が大きいところに集中する。 The MDCT circuit 202 applies a modified discrete cosine transform (hereinafter "MDCT") to the audio data input from the input terminal 201, treating 1024 samples as one frame, and outputs the result to the TNS circuit 203. MDCT is an abbreviation for Modified Discrete Cosine Transform. The TNS circuit 203 performs TNS (Temporal Noise Shaping) processing and outputs the result to the intensity stereo circuit 204. TNS improves sound quality by shaping the quantization noise according to the amplitude of the signal waveform. Here, some of the MDCT coefficients are regarded as a time-series signal and linearly predicted. As a result, the quantization noise is concentrated where the amplitude of the signal waveform is large.

インテンシティステレオ回路２０４では、高域では左右チャネルの信号パワー差により音源の位置を感じるという聴覚の特性を利用し、左右チャネルの和信号とパワー比を本来の２チャネルデータの代わりに用いる。インテンシティステレオ回路２０４から出力されたオーディオデータはＭ／Ｓステレオ回路２０５に送出される。 Intensity stereo circuit 204 uses the auditory property of feeling the position of the sound source due to the difference in signal power between the left and right channels at high frequencies, and uses the sum signal and power ratio of the left and right channels instead of the original two-channel data. The audio data output from the intensity stereo circuit 204 is sent to the M / S stereo circuit 205.

Ｍ／Ｓステレオ回路２０５では、左右のチャネルの和信号と差信号を本来の２チャネルデータの代わりに用いる。主に低域の信号処理に用いられる。Ｍ／Ｓステレオ回路２０５から出力されたオーディオデータはスケールファクタ計算回路２０６に供給される。 The M/S stereo circuit 205 uses the sum and difference signals of the left and right channels instead of the original two-channel data. This technique is used mainly for low-frequency signal processing. The audio data output from the M/S stereo circuit 205 is supplied to the scale factor calculation circuit 206.
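
The sum and difference representation can be illustrated with a minimal sketch. The text states only that sum and difference signals replace the two channels; the scaling by 1/2 and the function names below are assumptions made for this example.

```python
def ms_encode(left, right):
    """Convert left/right samples to mid (sum) and side (difference) signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right samples from mid and side signals."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left_in = [0.5, 0.25, -1.0]
right_in = [0.5, 0.75, -0.5]
mid, side = ms_encode(left_in, right_in)
left_out, right_out = ms_decode(mid, side)
```

When the two channels are similar, the side signal is close to zero and can be coded with few bits, which is the motivation for the substitution.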

スケールファクタ計算回路２０６では聴覚特性に合わせて複数の帯域（スケールファクタバンド）に分割された各帯域ごとの代表値（スケールファクタ）を算出し、帯域毎にスケールファクタを用いて正規化処理を行う。そして、スケールファクタ計算回路２０６は正規化されたデータを量子化回路２０７に送出する。 The scale factor calculation circuit 206 calculates a representative value (scale factor) for each of the bands (scale factor bands) into which the spectrum is divided in accordance with the auditory characteristics, and normalizes each band using its scale factor. The scale factor calculation circuit 206 then sends the normalized data to the quantization circuit 207.

量子化回路２０７はスケールファクタ計算回路２０６から供給されたデータに対して、ビット割り当て回路２１４からの制御に従って量子化を行い、可変長符号化回路２０８に送出する。 The quantization circuit 207 quantizes the data supplied from the scale factor calculation circuit 206 according to the control from the bit allocation circuit 214 and sends the data to the variable length coding circuit 208.

ここで、入力端子２０１から入力されたオーディオデータは聴覚心理モデル回路２１３にも入力されている。聴覚心理モデル回路２１３では入力されたオーディオデータをＦＦＴ処理し、スケールファクタバンド毎に絶対可聴閾値とマスキング効果を加味して、補正された可聴閾値を算出し、ビット割り当て回路２１４に送出する。 Here, the audio data input from the input terminal 201 is also input to the psychoacoustic model circuit 213. The psychoacoustic model circuit 213 performs FFT processing on the input audio data, calculates a corrected audible threshold value by adding an absolute audible threshold value and a masking effect for each scale factor band, and sends the corrected audible threshold value to the bit allocation circuit 214.

ビット割り当て回路２１４は、記録制御回路２１５により、聴覚心理モデル回路２１３から供給されたスケールファクタバンド毎の補正された可聴閾値から閾値を越える成分についてＭフレーム（１チャンク）でＡＬビットとなるようにビットを割り当てていく。 Under the control of the recording control circuit 215, the bit allocation circuit 214 uses the corrected audible threshold for each scale factor band supplied from the psychoacoustic model circuit 213 and allocates bits to the components that exceed the threshold so that M frames (one chunk) total AL bits.

可変長符号化回路２０８は量子化回路２０７から量子化されたデータを供給され、ハフマン符号化を施し、符号化したデータをストリーム生成回路２０９、ビット割り当て回路２１４に送出する。 The variable length coding circuit 208 is supplied with the quantized data from the quantization circuit 207, performs Huffman coding, and sends the coded data to the stream generation circuit 209 and the bit allocation circuit 214.

ストリーム生成回路２０９は、可変長符号化回路２０８から供給される符号化データと、回路２０２〜２０７からの処理帯域の範囲やスケールファクタ等の符号化パラメータとを用いてオーディオのストリームを生成する。そして、ストリーム生成回路２０９は、生成されたオーディオストリームを出力端子２１０から出力し図３の多重化回路３０３に供給する。 The stream generation circuit 209 generates an audio stream using the encoded data supplied from the variable length coding circuit 208 and the encoding parameters from the circuits 202 to 207, such as the processing band range and the scale factors. The stream generation circuit 209 then outputs the generated audio stream from the output terminal 210 and supplies it to the multiplexing circuit 303 of FIG. 3.

また、ヘッダ情報生成回路２１１は、ストリーム生成回路２０９からＭＰ４ファイルのヘッダに必要とされるチャンクへのオフセット値、チャンク当たりのサンプル数、サンプルのサイズ等の情報を生成する。そして、ヘッダ情報生成回路２１１は、生成された情報を出力端子２１２から出力し図３のヘッダ付加回路３０７に供給する。 From the output of the stream generation circuit 209, the header information generation circuit 211 generates the information required for the header of the MP4 file, such as the offset value to each chunk, the number of samples per chunk, and the sample sizes. The header information generation circuit 211 then outputs the generated information from the output terminal 212 and supplies it to the header addition circuit 307 of FIG. 3.

多重化回路３０３は画像符号化回路３０２、オーディオ符号化回路３０６から供給される画像とオーディオの符号化データを図５（ａ）に示すようにビデオ、オーディオ１チャンク毎にインターリーブして多重化する。そして、多重化回路３０３は、ビデオ１チャンクとオーディオ１チャンクで符号量ＶＬ＋ＡＬ＝Ｌfixとなるようにしてmdat boxのデータを生成し、記録媒体３１１に記録していく。 The multiplexing circuit 303 multiplexes the encoded image and audio data supplied from the image encoding circuit 302 and the audio encoding circuit 306 by interleaving them one video chunk and one audio chunk at a time, as shown in FIG. 5(a). The multiplexing circuit 303 generates the mdat box data so that one video chunk plus one audio chunk has the code amount VL + AL = Lfix, and records the data on the recording medium 311.
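
The interleaving described above can be sketched as follows. This is an illustration only: the function name is hypothetical, chunk sizes are assumed to be expressed in bytes, and the sketch simply rejects a chunk of the wrong length, whereas in the apparatus the code amount control already guarantees the fixed lengths.

```python
def multiplex(video_chunks, audio_chunks, vl, al):
    """Interleave fixed-length video and audio chunks into an mdat payload."""
    payload = bytearray()
    for video, audio in zip(video_chunks, audio_chunks):
        if len(video) != vl or len(audio) != al:
            raise ValueError("chunk does not have the fixed length")
        payload += video  # video chunk n occupies VL bytes
        payload += audio  # audio chunk n follows and occupies AL bytes
    return bytes(payload)


VL, AL = 6, 2  # toy sizes, so Lfix = 8
video = [b"V" * VL, b"v" * VL]
audio = [b"A" * AL, b"a" * AL]
mdat_payload = multiplex(video, audio, VL, AL)
```

Every video and audio pair occupies exactly Lfix bytes, so the payload length is a multiple of Lfix.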

更にヘッダ付加回路３０７は画像符号化回路３０２及びオーディオ符号化回路３０６からチャンクへのオフセット値、チャンク当たりのサンプル数、サンプルのサイズ等の情報が供給される。そして、ヘッダ付加回路３０７は、これらの情報から、video track及びaudio trackのstco、stsc、stsz boxのデータを作成し、多重化回路３０３に供給する。 Further, the header addition circuit 307 is supplied with information such as the offset value to the chunk, the number of samples per chunk, and the sample size from the image encoding circuit 302 and the audio encoding circuit 306. Then, the header addition circuit 307 creates stco, stsc, and stsz box data of the video track and the audio track from these pieces of information, and supplies them to the multiplexing circuit 303.

また、ヘッダ付加回路３０７は、ビデオチャンク、オーディオチャンクの符号量ＶＬ，ＡＬ、ビデオチャンク、オーディオチャンクの合計符号量Ｌfixをユーザーデータとして記録するfree boxを生成し、多重化回路３０３に供給する。また、ヘッダ付加回路３０７は、ビデオチャンクの構成を示す１チャンク当たりのフレーム数Ｋをユーザーデータとして記録するfree boxを生成し、多重化回路３０３に供給する。更に、ヘッダ付加回路３０７は、ＧＯＰ数ｇｎ、オーディオチャンクの構成を示す１チャンク当たりのフレーム数Ｍ、記録される画像データのフレームレート等をユーザーデータとして記録するfree boxを生成し、多重化回路３０３に供給する。 The header addition circuit 307 generates a free box that records, as user data, the code amounts VL and AL of the video chunk and the audio chunk and their total code amount Lfix, and supplies it to the multiplexing circuit 303. The header addition circuit 307 also generates a free box that records, as user data, the number of frames K per chunk, which indicates the configuration of the video chunk, and supplies it to the multiplexing circuit 303. Furthermore, the header addition circuit 307 generates a free box that records, as user data, the number of GOPs gn, the number of frames M per chunk, which indicates the configuration of the audio chunk, the frame rate of the recorded image data, and the like, and supplies it to the multiplexing circuit 303.

多重化回路３０３は、図５（ｄ）に示すように、moov box内にfree box Fを生成し、moov boxを記録媒体３１１に記録する。またユーザーデータには、データが上記形式であることを示す識別情報を入れておくようにする。 As shown in FIG. 5(d), the multiplexing circuit 303 generates the free box F in the moov box and records the moov box on the recording medium 311. Identification information indicating that the data is in the above format is also placed in the user data.

尚、図５（ｄ）ではmoov box内にfree box Fを設けたが、free box F はmoov box 内ではなく、moov box、mdat boxと同等の位置に設けてもよい。 In FIG. 5D, the free box F is provided in the moov box. However, the free box F may be provided not in the moov box but in the same position as the moov box and mdat box.
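
As an illustration of how such user data might be carried, the sketch below packs the parameters into a free box and reads them back. Only the box header (a 32-bit big-endian size followed by a four-character type) follows the general MP4 box structure; the identifier `FORMAT_ID`, the field order, the field widths, and the integer frame rate are all hypothetical, since the text does not specify the payload layout.

```python
import struct

FORMAT_ID = b"FXLN"   # hypothetical identification information
FIELDS = ">IIIHHHH"   # VL, AL, Lfix, K, gn, M, frame_rate (hypothetical layout)
NAMES = ("VL", "AL", "Lfix", "K", "gn", "M", "frame_rate")


def build_free_box(vl, al, k, gn, m, frame_rate):
    """Build a free box whose payload is the identifier plus the parameters."""
    payload = FORMAT_ID + struct.pack(FIELDS, vl, al, vl + al, k, gn, m, frame_rate)
    return struct.pack(">I4s", 8 + len(payload), b"free") + payload


def parse_free_box(box):
    """Return the parameters, or None if the box lacks the identifier."""
    size, box_type = struct.unpack(">I4s", box[:8])
    if box_type != b"free" or box[8:12] != FORMAT_ID:
        return None
    return dict(zip(NAMES, struct.unpack(FIELDS, box[12:size])))


box = build_free_box(vl=60000, al=4000, k=15, gn=1, m=24, frame_rate=30)
params = parse_free_box(box)
```

A player that does not know the identifier can skip the free box, so a file carrying it remains readable as an ordinary MP4 file.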

［再生装置の説明］ [Description of playback device]
次に、図４を参照して上記のように生成されたＭＰ４ファイルを再生する装置について説明する。 Next, an apparatus for reproducing the MP4 file generated as described above will be described with reference to FIG. 4.

図４において、４０１は記録媒体である。４０２は記録媒体４０１に記録されたデータを再生する再生回路である。４０３はバッファである。４０４は分離回路である。４０５は可変長復号化回路である。４０６は逆量子化回路である。４０７は逆ＤＣＴ回路である。４０８は加算回路である。４０９はメモリである。４１０は動き補償回路である。４１１はスイッチ回路である。４１２は並べ替え回路である。４１３は出力端子である。４１４は可変長復号化回路である。４１５は逆量子化回路である。４１６はスケールファクタ回路である。４１７はＭ／Ｓステレオ回路である。４１８はインテンシティステレオ回路である。４１９はＴＮＳ回路である。４２０は逆ＭＤＣＴ回路である。４２１は出力端子である。４２２はヘッダ情報解析回路である。４２３はストリームユーザーデータ解析回路である。４２４は再生制御回路である。４２５は入力端子である。 In FIG. 4, 401 is a recording medium. Reference numeral 402 denotes a reproduction circuit for reproducing data recorded on the recording medium 401. Reference numeral 403 denotes a buffer. Reference numeral 404 denotes a separation circuit. Reference numeral 405 denotes a variable length decoding circuit. Reference numeral 406 denotes an inverse quantization circuit. Reference numeral 407 denotes an inverse DCT circuit. Reference numeral 408 denotes an adder circuit. Reference numeral 409 denotes a memory. Reference numeral 410 denotes a motion compensation circuit. Reference numeral 411 denotes a switch circuit. Reference numeral 412 denotes a rearrangement circuit. Reference numeral 413 denotes an output terminal. Reference numeral 414 denotes a variable length decoding circuit. Reference numeral 415 denotes an inverse quantization circuit. Reference numeral 416 denotes a scale factor circuit. Reference numeral 417 denotes an M / S stereo circuit. Reference numeral 418 denotes an intensity stereo circuit. Reference numeral 419 denotes a TNS circuit. Reference numeral 420 denotes an inverse MDCT circuit. Reference numeral 421 denotes an output terminal. Reference numeral 422 denotes a header information analysis circuit. Reference numeral 423 denotes a stream user data analysis circuit. Reference numeral 424 denotes a reproduction control circuit. Reference numeral 425 denotes an input terminal.

図６は再生処理を示すフローチャートである。 FIG. 6 is a flowchart showing the reproduction process.

先ず、入力端子４２５から再生指示を受けると再生制御回路４２４は指定されたファイルを再生するために、指定されたファイルが上記形式で記録されたファイルであるかどうかを調べる。 First, when a reproduction instruction is received from the input terminal 425, the reproduction control circuit 424 checks whether or not the designated file is a file recorded in the above format in order to reproduce the designated file.

再生制御回路４２４は再生回路４０２を制御して、図６に示すように記録媒体４０１に記録されたファイルをオープンし（Ｓ６０２）、再生回路４０２を通してファイルデータを再生し、バッファ４０３に供給する。 The reproduction control circuit 424 controls the reproduction circuit 402 to open the file recorded on the recording medium 401 as shown in FIG. 6 (S602), reproduce the file data through the reproduction circuit 402, and supply it to the buffer 403.

ヘッダ情報解析回路４２２は、再生されてバッファ４０３に記憶される再生データを解析し、moov boxを検出する。moov boxが検出されると、記録されている画像の大きさ、オーディオの符号化方式等の符号化の基本パラメータを取得する（Ｓ６０４）。 The header information analysis circuit 422 analyzes the reproduction data reproduced and stored in the buffer 403, and detects the moov box. When the moov box is detected, basic encoding parameters such as the size of the recorded image and audio encoding method are acquired (S604).

次に、free boxから、上記形式で記録したユーザー定義データがあることを示す識別情報があるかどうかを検出する（Ｓ６０５）。 Next, it is detected from the free box whether there is identification information indicating that there is user-defined data recorded in the above format (S605).

free box Fが検出されると、ビデオチャンクの構成を示す１チャンク当たりのフレーム数Ｋ、ＧＯＰ数ｇｎ、オーディオチャンクの構成を示す１チャンク当たりのフレーム数Ｍを取得する（Ｓ６０６）。更にビデオチャンク、オーディオチャンクの符号量ＶＬ、ＡＬ、ビデオチャンク、オーディオチャンクの合計符号量Ｌfixを取得する（Ｓ６０７）。そしてstsc、stsz、stco boxは読み込まず、ストリームデータのあるmdat boxを検出する（Ｓ６０８）。 When the free box F is detected, the number K of frames per chunk indicating the configuration of the video chunk, the number of GOPs gn, and the number M of frames per chunk indicating the configuration of the audio chunk are acquired (S606). Further, the code amount VL, AL of the video chunk and the audio chunk, and the total code amount Lfix of the video chunk and the audio chunk are acquired (S607). Then, stsc, stsz, and stco box are not read, and an mdat box with stream data is detected (S608).

また、上記Ｓ６０５でfree box Fがない場合、再生制御回路４２４は、通常のファイルとして、stsc、stsz、stco boxを検出する（Ｓ６１０）。また、チャンクの構成パラメータのチャンクオフセット値、サンプル数、サンプルサイズを再生回路４０２を介してバッファ４０３に可能な限り読み出す（Ｓ６１１）。 If there is no free box F in S605, the playback control circuit 424 detects stsc, stsz, and stco box as normal files (S610). In addition, the chunk offset value, the number of samples, and the sample size of the chunk configuration parameters are read out as much as possible to the buffer 403 via the reproduction circuit 402 (S611).

本実施形態においては、ヘッダ情報解析回路４２２がチャンクの構成及びサイズを示すユーザデータを検出すると、再生制御回路４２４はstsc box、stsz box、stco boxのパラメータを使わずにmdat box内のストリームデータにアクセスする。 In the present embodiment, when the header information analysis circuit 422 detects user data indicating the chunk configuration and size, the playback control circuit 424 accesses the stream data in the mdat box without using the parameters of the stsc, stsz, and stco boxes.

即ち、ユーザデータが検出された場合、そのＭＰ４ファイルは、１チャンク毎に固定長化されて記録されているため、オフセット値を使用せず、所望のチャンクを検出することができる。 That is, when user data is detected, the MP4 file is recorded with a fixed length for each chunk, so that a desired chunk can be detected without using an offset value.

再生時において、各ビデオチャンクの先頭からのオフセット位置は（Ｌfix × ｎ）バイト（ｎ＝０，１，２、・・・）で求められる。また、オーディオチャンクの先頭からのオフセット位置は（Lfix × ｎ＋ＶＬ）バイト（ｎ＝０，１，２、・・・）で求められる。 At the time of reproduction, the offset position from the head of each video chunk is obtained by (Lfix × n) bytes (n = 0, 1, 2,...). Also, the offset position from the beginning of the audio chunk is obtained by (Lfix × n + VL) bytes (n = 0, 1, 2,...).
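
The offset calculation above can be written as a short sketch. The function names are hypothetical, and `base` is an assumed parameter standing for the position at which the stream data begins; with `base=0` the results are offsets within the stream itself.

```python
def video_chunk_offset(n, lfix, base=0):
    """Byte offset of the n-th video chunk (n = 0, 1, 2, ...)."""
    return base + lfix * n


def audio_chunk_offset(n, lfix, vl, base=0):
    """Byte offset of the n-th audio chunk, which follows the n-th video chunk."""
    return base + lfix * n + vl


def read_chunk_pair(payload, n, vl, al):
    """Slice the n-th video and audio chunks out of an mdat payload."""
    lfix = vl + al
    v = video_chunk_offset(n, lfix)
    a = audio_chunk_offset(n, lfix, vl)
    return payload[v:v + vl], payload[a:a + al]


# Toy payload with VL = 6 and AL = 2, so Lfix = 8.
payload = b"VVVVVVAA" + b"vvvvvvaa"
second_video, second_audio = read_chunk_pair(payload, 1, 6, 2)
```

Because the offsets follow from n, VL, and Lfix alone, no per-chunk table has to be held in memory to locate a chunk.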

再生回路４０２は再生制御回路４２４の制御により、記録媒体４０１に記録されたファイルのmdat boxのストリームデータを再生し、バッファ４０３に供給する。バッファ４０３に記憶されたストリームデータはバッファの占有状態等をみて読み出しを始め、分離回路４０４に供給される。 The playback circuit 402 plays back the stream data of the mdat box of the file recorded on the recording medium 401 under the control of the playback control circuit 424, and supplies it to the buffer 403. The stream data stored in the buffer 403 starts to be read in view of the buffer occupation state and the like, and is supplied to the separation circuit 404.

分離回路４０４ではビデオのチャンクデータとオーディオのチャンクデータを分離し、ビデオのチャンクデータは可変長復号化回路４０５、ストリームユーザーデータ解析回路４２３に供給し、オーディオチャンクのデータは可変長復号化回路４１４に供給する。 The separation circuit 404 separates the video chunk data from the audio chunk data, supplies the video chunk data to the variable length decoding circuit 405 and the stream user data analysis circuit 423, and supplies the audio chunk data to the variable length decoding circuit 414.

可変長復号化回路４０５は分離回路４０４から供給される再生されたストリームデータを可変長復号し、逆量子化回路４０６に供給する。 The variable length decoding circuit 405 performs variable length decoding on the reproduced stream data supplied from the separation circuit 404 and supplies the decoded data to the inverse quantization circuit 406.

逆量子化回路４０６は可変長復号化回路４０５から供給される可変長復号化されたデータを逆量子化し、逆ＤＣＴ回路４０７に送出する。逆ＤＣＴ回路４０７は逆量子化回路４０６から供給される逆量子化されたデータに逆ＤＣＴ処理を施し、加算回路４０８に送出する。加算回路４０８は逆ＤＣＴ回路４０７から供給される逆ＤＣＴ処理されたデータとスイッチ回路４１１から供給されるデータを加算する。 The inverse quantization circuit 406 performs inverse quantization on the variable length decoded data supplied from the variable length decoding circuit 405 and sends the data to the inverse DCT circuit 407. The inverse DCT circuit 407 performs inverse DCT processing on the inversely quantized data supplied from the inverse quantization circuit 406 and sends it to the addition circuit 408. The adder circuit 408 adds the inverse DCT processed data supplied from the inverse DCT circuit 407 and the data supplied from the switch circuit 411.

ここで、記録媒体４０１から再生されるストリームデータは、先ずＧＯＰ₍₁₎のフレーム内符号化されたＩ₀が再生される。再生制御回路４２４はスイッチ回路４１１の端子ａを選択するよう制御し、スイッチ回路４１１は加算回路４０８にデータ‘０’を供給する。加算回路４０８はスイッチ回路４１１から供給される‘０’データと逆ＤＣＴ回路４０７から供給される逆ＤＣＴ処理されたデータとを加算して、再生されたフレームＦ₀としてメモリ４０９と並べ替え回路４１２に供給する。メモリ４０９は加算回路４０８から供給される加算データを記憶する。 Here, of the stream data reproduced from the recording medium 401, the intra-frame encoded data I₀ of GOP₍₁₎ is reproduced first. The reproduction control circuit 424 controls the switch circuit 411 to select terminal a, and the switch circuit 411 supplies '0' data to the adder circuit 408. The adder circuit 408 adds the '0' data supplied from the switch circuit 411 to the inverse DCT processed data supplied from the inverse DCT circuit 407, and supplies the result, as the reproduced frame F₀, to the memory 409 and the rearrangement circuit 412. The memory 409 stores the sum data supplied from the adder circuit 408.

ＧＯＰ₍₁₎のフレーム内符号化データＩ₀の次に再生されるのは双方向予測符号化されたピクチャデータＢ_-2、Ｂ_-1である。また、逆ＤＣＴ回路４０７までの再生手続きは上記フレーム内符号化データＩ₀で説明した再生処理と同様であるので省略する。 After the intra-frame encoded data I₀ of GOP₍₁₎, the bidirectionally predictive encoded picture data B₋₂ and B₋₁ are reproduced. The reproduction procedure up to the inverse DCT circuit 407 is the same as that described for the intra-frame encoded data I₀, and its description is omitted.

逆ＤＣＴ回路４０７から逆ＤＣＴ処理された双方向予測符号化されたピクチャデータが加算回路４０８に供給される。再生制御回路４２４はこの時、スイッチ回路４１１の端子ｂを選択するようスイッチ回路４１１を制御し、スイッチ回路４１１は端子ｂを選択し、加算回路４０８に動き補償回路４１０からのデータを供給する。 The inverse DCT circuit 407 supplies the inverse DCT-processed picture data that has been subjected to bidirectional prediction encoding to the addition circuit 408. At this time, the reproduction control circuit 424 controls the switch circuit 411 to select the terminal b of the switch circuit 411, and the switch circuit 411 selects the terminal b and supplies data from the motion compensation circuit 410 to the adder circuit 408.

動き補償回路４１０は、再生されてくるストリームから、符号化時に生成されストリーム内に記録された動きベクトルを検出する。そして、動き補償回路４１０は、参照ブロックのデータ（この場合は記録開始時であるため、再生されたフレーム内符号化データＦ₀からのデータのみ）をメモリ４０９から読み出してスイッチ回路４１１の端子ｂに供給する。 The motion compensation circuit 410 detects, in the reproduced stream, the motion vectors that were generated at the time of encoding and recorded in the stream. The motion compensation circuit 410 then reads the reference block data from the memory 409 (in this case only data from the reproduced intra-frame encoded frame F₀, because these frames are from the start of recording) and supplies it to terminal b of the switch circuit 411.

加算回路４０８は逆ＤＣＴ回路４０７から供給される逆ＤＣＴ処理されたデータとスイッチ回路４１１から供給される動き補償されたデータを加算する。また、加算回路４０８は、再生されたフレームＦ_-2、Ｆ_-1として並べ替え回路４１２に供給する。 The adder circuit 408 adds the inverse DCT processed data supplied from the inverse DCT circuit 407 to the motion-compensated data supplied from the switch circuit 411, and supplies the results to the rearrangement circuit 412 as the reproduced frames F₋₂ and F₋₁.

次に、片方向予測符号化されたピクチャデータＰ₃が再生され、逆ＤＣＴ回路４０７までの再生処理は上記フレーム内符号化データＩ₀で説明した再生処理と同様であるので省略する。 Next, the unidirectionally predictive encoded picture data P₃ is reproduced. The reproduction process up to the inverse DCT circuit 407 is the same as that described for the intra-frame encoded data I₀, and its description is omitted.

逆ＤＣＴ回路４０７から逆ＤＣＴ処理された片方向予測符号化されたピクチャデータが加算回路４０８に供給される。再生制御回路４２４は、スイッチ回路４１１の端子ｂを選択するようスイッチ回路４１１を制御し、スイッチ回路４１１は端子ｂを選択し、加算回路４０８に動き補償回路４１０からのデータを供給する。 The inverse DCT circuit 407 supplies the picture data subjected to the inverse DCT process and subjected to the one-way prediction encoding to the addition circuit 408. The reproduction control circuit 424 controls the switch circuit 411 so as to select the terminal b of the switch circuit 411, and the switch circuit 411 selects the terminal b and supplies the data from the motion compensation circuit 410 to the adder circuit 408.

動き補償回路４１０は再生されてくるストリームから、符号化時に生成され、ストリーム内に記録された動きベクトルを検出する。そして、動き補償回路４１０は、参照ブロックのデータ（再生されたフレーム内符号化データＦ₀からのデータ）をメモリ４０９から読み出してスイッチ回路４１１の端子ｂに供給する。 The motion compensation circuit 410 detects, in the reproduced stream, the motion vectors that were generated at the time of encoding and recorded in the stream. The motion compensation circuit 410 then reads the reference block data (data from the reproduced intra-frame encoded frame F₀) from the memory 409 and supplies it to terminal b of the switch circuit 411.

加算回路４０８は逆ＤＣＴ回路４０７から供給される逆ＤＣＴ処理されたデータとスイッチ回路４１１から供給される動き補償されたデータを加算し、再生されたフレームＦ₃としてメモリ４０９と並べ替え回路４１２に供給する。メモリ４０９は加算回路４０８から供給される加算データを記憶する。 The adder circuit 408 adds the inverse DCT processed data supplied from the inverse DCT circuit 407 to the motion-compensated data supplied from the switch circuit 411, and supplies the result, as the reproduced frame F₃, to the memory 409 and the rearrangement circuit 412. The memory 409 stores the sum data supplied from the adder circuit 408.

次にピクチャＢ₁、Ｂ₂が再生される。Ｂ_-2、Ｂ_-1と違う点は、記録開始時のフレームではないため、双方向予測としてフレームＦ₀とＦ₃から再生されることである。それ以外は上記Ｂ_-2、Ｂ_-1と同様の処理によって再生される。 Next, pictures B₁ and B₂ are reproduced. They differ from B₋₂ and B₋₁ in that, not being frames from the start of recording, they are reproduced by bidirectional prediction from frames F₀ and F₃. Otherwise they are reproduced by the same processing as B₋₂ and B₋₁.

以上説明したように、Ｐ₆、Ｂ₄、Ｂ₅、・・・が順次再生されていく。 In the manner described above, P₆, B₄, B₅, ... are reproduced in sequence.

並べ替え回路４１２は順次再生されてくるフレームＦ₀、Ｆ_-2、Ｆ_-1、Ｆ₃、Ｆ₁、Ｆ₂、Ｆ₆、Ｆ₄、Ｆ₅、・・・を並べ換える。即ち、並べ替え回路４１２は、再生されるフレームをＦ_-2、Ｆ_-1、Ｆ₀、Ｆ₁、Ｆ₂、Ｆ₃、Ｆ₄、Ｆ₅、Ｆ₆、・・・に並べ換えて出力端子４１３に出力する。 The rearrangement circuit 412 rearranges the sequentially reproduced frames F₀, F₋₂, F₋₁, F₃, F₁, F₂, F₆, F₄, F₅, .... That is, the rearrangement circuit 412 rearranges the reproduced frames into the order F₋₂, F₋₁, F₀, F₁, F₂, F₃, F₄, F₅, F₆, ... and outputs them to the output terminal 413.
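
One common way to perform this rearrangement is to output B frames as soon as they are decoded and to hold each anchor frame (I or P) until the next anchor arrives. The sketch below illustrates that approach; it is an assumption for illustration and not necessarily how the rearrangement circuit 412 is built, and the names are hypothetical.

```python
def display_order(decoded):
    """Yield frame numbers in display order.

    decoded: iterable of (picture_type, frame_number) in decoding order.
    B frames are emitted immediately; an anchor frame (I or P) is held until
    the next anchor arrives, because the B frames decoded after it precede
    it in display order.
    """
    held = None
    for ptype, number in decoded:
        if ptype == "B":
            yield number
        else:
            if held is not None:
                yield held
            held = number
    if held is not None:
        yield held


stream = [("I", 0), ("B", -2), ("B", -1), ("P", 3), ("B", 1), ("B", 2),
          ("P", 6), ("B", 4), ("B", 5)]
frames = list(display_order(stream))
```

For the decoding order given in the text, `frames` runs from F₋₂ to F₆ in ascending order, and only one anchor frame has to be buffered at a time.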

可変長復号化回路４１４に供給されたオーディオチャンクの再生データは、可変長復号され、逆量子化回路４１５に供給される。逆量子化回路４１５では復号された再生データを逆量子化し、スケールファクタ回路４１６に送出する。 The audio chunk playback data supplied to the variable length decoding circuit 414 is variable length decoded and supplied to the inverse quantization circuit 415. The inverse quantization circuit 415 inversely quantizes the decoded reproduction data and sends it to the scale factor circuit 416.

The scale factor circuit 416 restores the normalized components to their original values using the scale factor values and sends them to the M/S stereo circuit 417. The M/S stereo circuit 417 reproduces the left and right channel signals from the sum and difference signals and sends them to the intensity stereo circuit 418.
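The M/S reconstruction can be sketched as follows. The description only states that the left and right channels are recovered from the sum and difference signals. The scaling convention used here (mid = (L + R) / 2, side = (L − R) / 2, hence L = mid + side and R = mid − side) is an assumption, and the function name is hypothetical.

```python
def ms_to_lr(mid, side):
    """Recover left/right samples from mid (sum) and side (difference) samples.

    Assumes mid = (L + R) / 2 and side = (L - R) / 2.
    """
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left, right = ms_to_lr([1.0, 0.5], [0.25, -0.5])
print(left, right)  # [1.25, 0.0] [0.75, 1.0]
```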

The intensity stereo circuit 418 reproduces the left and right channel signals from the sum signal and the power ratio, and sends them to the TNS circuit 419.

The TNS circuit 419 reproduces the data using a filter with the inverse characteristic of the prediction filter used at encoding time, and sends it to the inverse MDCT circuit 420.

The inverse MDCT circuit 420 applies an inverse MDCT to the reproduced data to obtain reproduced audio data, which is output from the output terminal 421.

<Special reproduction>
Normal reproduction has been described above; special reproduction is described next.

At the start of reproduction of a file, the reproduction control circuit 424 acquires and holds the offset value St(offset) of the mdat box stream from the beginning of the file.

Suppose that, while the n-th image frame in the stream is being reproduced, a forward special reproduction instruction is input from the input terminal 425 in FIG. 4. For example, suppose that a time s seconds ahead is specified.

The number of frames u in s seconds is
u = s × frame_rate ... (1)
where frame_rate is acquired at the start of reproduction from the parameters recorded as user data in the free box F of the MP4 file.

The frame to be reproduced is the (n+u)-th frame. However, when MPEG-compressed stream data is specially reproduced, reproduction must start from intra-frame-coded data I. Therefore, the intra-frame-coded image data (intra frame) immediately preceding the (n+u)-th frame is reproduced.

First, the number of chunks r up to the chunk containing the (n+u)-th frame is calculated:
r = (n + u) / K ... (2)
where the fractional part of r is discarded, rounding down to the nearest integer.

Next, the offset value Pr to the chunk to be reproduced is calculated as
Pr = St(offset) + Lfix × r ... (3)
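Equations (1) to (3) can be evaluated as in the sketch below. The function and parameter names are hypothetical. Truncating u to an integer for non-integer frame rates is an assumption that the description does not address. K is taken to be the number of image frames per chunk, and Lfix the fixed code amount of one multiplexed chunk pair.

```python
def chunk_seek_offset(n, s, frame_rate, K, st_offset, l_fix):
    """Return (u, r, pr) following equations (1) to (3).

    n          index of the frame currently being reproduced
    s          requested jump in seconds
    frame_rate frames per second read from the free box
    K          image frames per chunk
    st_offset  St(offset), byte offset of the mdat stream in the file
    l_fix      Lfix, fixed byte length of one multiplexed chunk pair
    """
    u = int(s * frame_rate)      # equation (1); truncation is an assumption
    r = (n + u) // K             # equation (2), fractional part discarded
    pr = st_offset + l_fix * r   # equation (3)
    return u, r, pr


u, r, pr = chunk_seek_offset(n=45, s=2, frame_rate=30, K=30,
                             st_offset=4096, l_fix=1_000_000)
print(u, r, pr)  # 60 3 3004096
```

Because every chunk pair has the same length Lfix, the byte position follows from one multiplication and no per-chunk offset table has to be consulted.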

Next, the number q, within the chunk, of the GOP containing the intra frame to be reproduced is calculated:
q = (n + u − r) / (K / gn) ... (4)
where the fractional part of q is discarded, rounding down to the nearest integer.

After calculating each value by equations (1) to (4), the reproduction control circuit 424 initializes the decoding circuits, such as the buffer 403 and the variable length decoding circuit 405, and mutes the audio output to the output terminal 421.

The reproduction control circuit 424 then controls the reproducing circuit 402 to seek the recording medium 401 to GOP(Pr), located at byte offset Pr, reproduces data starting from GOP(Pr), and supplies it to the buffer 403. The reproduced data stored in the buffer 403 is immediately supplied to the stream user data analysis circuit 423 via the separation circuit 404.

The stream user data analysis circuit 423 analyzes the user data of GOP(Pr), acquires the offset value I(offset) to the next intra frame, and supplies it to the reproduction control circuit 424.

As soon as the offset value I(offset) to the next intra frame is acquired, the reproduction control circuit 424 causes the reproducing circuit 402 to seek the file on the recording medium 401 by I(offset) and to reproduce the next GOP, GOP(Pr+1). The offset value to the following intra frame is then acquired from the user data of GOP(Pr+1).

By repeating this process, the target of the special reproduction, GOP(Pr+q), is reached. The reproduction control circuit 424 reproduces GOP(Pr+q) via the reproducing circuit 402 and stores it in the buffer 403. The separation circuit 404 supplies the reproduced data of GOP(Pr+q) stored in the buffer 403 to the stream user data analysis circuit 423. The stream user data analysis circuit 423 acquires the intra frame size I(size) from the user data and passes it to the reproduction control circuit 424. Based on the size information I(size), the reproduction control circuit 424 sends only the intra frame to the variable length decoding circuit 405 and the subsequent stages, so that a reproduced image for special reproduction is obtained. Calculation for the next special reproduction image then begins using equations (1) to (4).
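The GOP-to-GOP hopping described above can be sketched as a short loop. Everything concrete here is an assumption made for illustration: the user data of each GOP is modeled as a lookup returning (I(offset), I(size)), I(offset) is taken relative to the start of the current GOP, and the function name and sample table are hypothetical.

```python
def locate_target_gop(read_user_data, pr, q):
    """Follow I(offset) q times starting at byte offset pr.

    read_user_data(pos) is assumed to return (i_offset, i_size) for the
    GOP that starts at byte position pos.
    Returns the position of GOP(Pr+q) and the size of its intra frame.
    """
    pos = pr
    for _ in range(q):
        i_offset, _ = read_user_data(pos)
        pos += i_offset          # seek to the next GOP's intra frame
    _, i_size = read_user_data(pos)
    return pos, i_size


# Hypothetical stream: GOP start position -> (I(offset), I(size))
gop_user_data = {
    1000: (500, 120),
    1500: (700, 150),
    2200: (600, 130),
}

position, intra_size = locate_target_gop(gop_user_data.__getitem__, pr=1000, q=2)
print(position, intra_size)  # 2200 130
```

Only the first `intra_size` bytes at `position` would then be passed to the decoder, which corresponds to sending only the intra frame to the variable length decoding stage.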

As described above, at recording time the chunk configuration of the image data and audio data is determined, and encoding is performed so that, with VL the code amount of an image data chunk and AL the code amount of an audio data chunk, the combined code amount Lfix = VL + AL of the two chunks is constant.

In addition, for each GOP, the code amount VL of the image data, the code amount I(size) of the intra-frame-coded data (intra frame), and the offset value I(offset) to the intra frame of the next GOP are recorded in the image data stream as user data.

Furthermore, information indicating the configuration of each chunk, such as the numbers of frames K and M per chunk and the number of encoding units gn, is stored in the free box as user data and recorded. The encoding frame rate, the code amounts VL and AL of each chunk, the total code amount Lfix, identification information indicating that the file has the above configuration, and the like are also stored in the free box as user data and recorded.
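The parameters stored in the free box can be pictured as a small record, sketched below. The class and field names are hypothetical, the mapping of K and M to image and audio frame counts is an assumption, and the byte-level layout of the free box is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkLayout:
    """Illustrative container for the free box user data."""
    frames_k: int        # K, frames per chunk (assumed: image frames)
    frames_m: int        # M, frames per chunk (assumed: audio frames)
    coding_units: int    # gn, number of encoding units per chunk
    frame_rate: float    # encoding frame rate
    vl: int              # VL, code amount of one image chunk in bytes
    al: int              # AL, code amount of one audio chunk in bytes
    l_fix: int           # Lfix, total code amount of the chunk pair

    def is_consistent(self) -> bool:
        """Check the recorded total against Lfix = VL + AL."""
        return self.l_fix == self.vl + self.al


layout = ChunkLayout(frames_k=30, frames_m=43, coding_units=2,
                     frame_rate=30.0, vl=900_000, al=100_000,
                     l_fix=1_000_000)
print(layout.is_consistent())  # True
```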

At reproduction time, when the apparatus recognizes from the free box that the file contains a stream of the above type, it accesses the stream data of each video and audio chunk using the user data recorded in the free box and in the stream. Both normal and special reproduction are performed this way, so reproduction is possible without reading header information such as the stco, stsc, and stsz boxes. A memory for storing a large header is therefore unnecessary.

In addition, because the encoding-related parameters in the header, such as the stco, stsc, and stsz boxes, need not be read and analyzed, the stream data can be accessed quickly, the waiting time before reproduction is reduced, and seamless reproduction becomes possible.

Conventionally, when the amount of parameter information in the stco, stsc, and stsz boxes and the like was too large to be read into memory at once, the parameters had to be re-read from the recording medium during reproduction to update the parameter information in memory. This is no longer necessary, so the processing load on the device is also reduced.

Moreover, since the file is an ordinary MP4 file, it can also be reproduced on other devices.

Furthermore, since everything is contained in a single file, data copied between recording media can be reproduced by this apparatus in the same way as in the reproduction processing described above.

In the above example, the audio encoding method was described as MPEG-AAC, but other encoding methods such as MPEG I, II, and III can be recorded and reproduced in the same way, with the same effect.

[Other Embodiments]
The embodiment of the present invention has been described in detail above using a specific example. The present invention can also be embodied as, for example, a system, an apparatus, a method, a program, or a storage medium (recording medium). Specifically, it may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device.

It goes without saying that the object of the present invention is achieved regardless of which parts of the illustrated functional blocks and operations are realized by hardware circuits and which by software processing using a computer.

The present invention also includes the case where it is achieved by supplying a software program that realizes the functions of the above-described embodiment to a system or apparatus, either directly or remotely. In that case, a computer of the system or apparatus reads and executes the program code.

Accordingly, the program code itself, installed in a computer in order to realize the functional processing of the present invention on that computer, also realizes the present invention. In other words, the present invention includes the computer program itself for realizing the functional processing of the present invention.

In that case, as long as it has the functions of the program, it may take the form of object code, a program executed by an interpreter, script data supplied to an OS, or the like.

Examples of the recording medium (storage medium) for supplying the program include a flexible disk, a hard disk, an optical disk, and a magneto-optical disk. Others include an MO, CD-ROM, CD-R, CD-RW, magnetic tape, nonvolatile memory card, ROM, and DVD (DVD-ROM, DVD-R).

As another method of supplying the program, a browser on a client computer can be used to connect to a web page on the Internet, and the computer program of the present invention itself can be downloaded from that page. The program can also be supplied by downloading a compressed file that includes an automatic installation function to a recording medium such as a hard disk. It can also be supplied by dividing the program code constituting the program of the present invention into a plurality of files and downloading each file from a different web page. That is, a WWW server that allows a plurality of users to download the program files for realizing the functional processing of the present invention on a computer is also included in the present invention.

The program of the present invention can also be encrypted, stored in a storage medium such as a CD-ROM, and distributed to users, and users who satisfy predetermined conditions can download key information for decryption from a web page via the Internet. In this case, the downloaded key information is used to execute the encrypted program and install it on a computer.

The functions of the above-described embodiment are realized when the computer executes the read program. They can also be realized when an OS or the like running on the computer performs part or all of the actual processing based on the instructions of the program.

Furthermore, the functions are also realized when the program read from the recording medium is written into the memory of a function expansion board inserted into the computer or a function expansion unit connected to the computer, and a CPU or the like on the board or unit then performs part or all of the actual processing.

A block diagram showing the detailed configuration of the image encoding circuit 302 in FIG. 3.
A block diagram showing the detailed configuration of the audio encoding circuit 305 in FIG. 3.
A block diagram showing the recording apparatus according to the embodiment of the present invention.
A block diagram showing the reproducing apparatus according to the embodiment of the present invention.
(a) is a diagram explaining the relationship between the image stream, the audio stream, and chunks; (b) is a diagram explaining the frame order of encoding; (c) is a diagram explaining the data structure of the image stream; (d) is a diagram explaining the structure of the free box of this embodiment.
A flowchart showing the reproduction processing of this embodiment.
A diagram explaining the structure of the image data and audio data recorded in an MP4 file.
A diagram explaining the stsc, stsz, and stco boxes in the moov box of an MP4 file.
A diagram explaining the structure of the Quicktime (registered trademark) format.

Explanation of symbols

101 Image data input terminal
102 Camera signal processing circuit
103 Rearrangement circuit
104 Adder circuit
105 Switch circuit
106 DCT circuit
107 Quantization circuit
108 Inverse quantization circuit
109 Inverse DCT circuit
110 Adder circuit
111 Memory
112 Motion compensation circuit
113 Switch circuit
114 Variable length encoding circuit
115 Stream generation circuit
116 Buffer
117 Code amount control circuit
118 Stream data output terminal
119 Stream parameter addition circuit
120 Header information generation circuit
121 Header information output terminal
122 Code amount setting circuit
123 Recording control circuit
124 Control data input terminal
201 Audio data input terminal
202 MDCT circuit
203 TNS circuit
204 Intensity stereo circuit
205 M/S stereo circuit
206 Scale factor calculation circuit
207 Quantization circuit
208 Variable length encoding circuit
209 Stream generation circuit
210 Output terminal
211 Header information generation circuit
212 Output terminal
213 Psychoacoustic model circuit
214 Bit allocation circuit
215 Recording control circuit
216 Input terminal
301 Image data input terminal
302 Image encoding circuit
303 Multiplexing circuit
304 Recording circuit
305 Audio data input terminal
306 Audio encoding circuit
307 Header addition circuit
308 Parameter input terminal
309 Code amount setting circuit
310 Chunk configuration setting circuit
311 Recording medium
401 Recording medium
402 Reproducing circuit
403 Buffer
404 Separation circuit
405 Variable length decoding circuit
406 Inverse quantization circuit
407 Inverse DCT circuit
408 Adder circuit
409 Memory
410 Motion compensation circuit
411 Switch circuit
412 Rearrangement circuit
413 Output terminal
414 Variable length decoding circuit
415 Inverse quantization circuit
416 Scale factor circuit
417 M/S stereo circuit
418 Intensity stereo circuit
419 TNS circuit
420 Inverse MDCT circuit
421 Output terminal
422 Header information analysis circuit
423 Stream user data analysis circuit
424 Reproduction control circuit
425 Input terminal

Claims

A recording apparatus for recording image data and audio data as a file, comprising:
image data encoding means for encoding image data;
image data code amount control means for controlling the image data encoding means;
audio data encoding means for encoding audio data;
audio data code amount control means for controlling the audio data encoding means;
image stream generating means for streaming the image data encoded by the image data encoding means;
audio stream generating means for streaming the audio data encoded by the audio data encoding means;
multiplexing means for multiplexing the generated image stream and audio stream in units of the image data and audio data for a predetermined period; and
recording means for recording the stream multiplexed by the multiplexing means on a recording medium,
wherein the image data code amount control means and the audio data code amount control means control the image data encoding means and the audio data encoding means so that the image stream and the audio stream have a fixed length for each unit multiplexed by the multiplexing means.

The recording apparatus according to claim 1, wherein the recording means records the image stream with additional information added thereto, the additional information including the size of each multiplexing unit of the encoded image stream and the number of frames of image data in the multiplexing unit.

The recording apparatus according to claim 1, wherein the recording means records the stream multiplexed by the multiplexing means on the recording medium as a file, and records, as a header of the file, an offset value from the beginning of the file for each multiplexing unit of the image stream and audio stream and information on the number of samples per multiplexing unit.

The recording apparatus according to claim 2, wherein the additional information further includes information on the number of encoding units included in one multiplexing unit of the image stream and audio stream, and the total code amount of the mutually corresponding multiplexing units of the image data and audio data.

The recording apparatus according to claim 1, wherein the encoding unit of the image data coincides with an image data group formed when the image data is MPEG-encoded.

The recording/reproducing apparatus according to claim 1, wherein the encoding unit of the audio data coincides with a frame that is the encoding unit of the audio data encoding means.

The recording/reproducing apparatus according to claim 1, wherein the file recorded by the recording means is in the MPEG audio Layer-4 format.

A reproducing apparatus comprising:
reproducing means for reproducing image data and audio data recorded as a file on a recording medium;
reproduction control means for controlling the reproducing means; and
header information analysis means for analyzing header information recorded on the recording medium,
wherein, when the header information analysis means detects unique user data in the header recorded on the recording medium, reading of the access data for the stream recorded in the header is prohibited, and
the reproduction control means controls the reproducing means on the basis of the stream parameter information recorded in the file and analyzed by the header information analysis means, to reproduce the image data and audio data recorded as a file on the recording medium.

The reproducing apparatus according to claim 8, further comprising stream user data analysis means for detecting user data recorded in a stream of image data and analyzing encoding information of the stream,
wherein the reproduction control means controls the reproducing means on the basis of the stream parameter information detected by the header information analysis means and the stream encoding information analyzed by the stream user data analysis means, to reproduce the image data and audio data recorded as a file on the recording medium.

The reproducing apparatus according to claim 9, wherein the stream user data analysis means detects, from the user data recorded in the image stream, the encoded size of the minimum reproduction unit of the image data and the offset value to the encoding unit recorded next on the recording medium.

The reproducing apparatus according to claim 8, wherein the header information analysis means detects the user data and analyzes, as the parameter information of the stream, the number of encoding units of the multiplexed image data and audio data, the code amount corresponding to the number of encoding units of the image data, the code amount corresponding to the number of encoding units of the audio data, and the total of the code amounts corresponding to the numbers of encoding units of the image data and the audio data.

The reproducing apparatus according to claim 11, wherein the encoding unit of the image data coincides with an image data group formed when the image data is MPEG-encoded.

The reproducing apparatus according to claim 8, wherein the file recorded on the recording medium is recorded in the MPEG audio Layer-4 format.

A method of recording image data and audio data as a file, comprising:
an image data encoding control step of controlling encoding of image data;
an audio data encoding control step of controlling encoding of audio data;
an image stream generating step of streaming the encoded image data;
an audio stream generating step of streaming the encoded audio data;
a multiplexing step of multiplexing the generated image stream and audio stream in units of the image data and audio data for a predetermined period; and
a recording step of recording the multiplexed stream on a recording medium,
wherein, in the image data encoding control step and the audio data encoding control step, the image data and the audio data are encoded so as to have a fixed length for each unit multiplexed in the multiplexing step.

A reproduction method for reproducing image data and audio data recorded as a file on a recording medium, comprising:
a header information analysis step of analyzing header information recorded on the recording medium; and
a reproduction step of reproducing the image data and audio data recorded on the recording medium on the basis of the stream parameter information recorded in the file and analyzed in the header information analysis step,
wherein, when unique user data is detected in the header recorded on the recording medium in the header information analysis step, reading of the access data for the stream recorded in the header is prohibited.

A program for causing a computer to execute the method according to claim 14 or 15.

A computer-readable storage medium storing the program according to claim 16.