JP4379869B2

JP4379869B2 - Moving image processing apparatus, program, and information recording medium

Info

Publication number: JP4379869B2
Application number: JP2004057273A
Authority: JP
Inventors: 宏幸作山
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2004-03-02
Filing date: 2004-03-02
Publication date: 2009-12-09
Anticipated expiration: 2024-03-02
Also published as: JP2005252434A

Description

The present invention relates to the processing of encoded moving images and, more particularly, to frame thinning for reducing the code amount of intra-frame-coded moving images.

Moving image coding methods can be broadly divided into two types: intra-frame coding and inter-frame coding. Intra-frame coding encodes each frame of a moving image independently and concatenates the codes of the frames to generate the code of the moving image; typical examples are DV, Motion-JPEG, and Motion-JPEG2000. Inter-frame coding treats a plurality of consecutive frames of a moving image as one group, encodes each group, and concatenates the codes of the groups to generate the code of the moving image; typical examples are MPEG1, MPEG2, and MPEG4.

The encoding of each frame is further divided into field encoding, in which the odd-field image and the even-field image of an interlaced image are encoded separately, and frame-based encoding, in which the odd field and the even field are encoded together as one frame.

The moving images targeted by the present invention are moving images in which an interlaced image is frame-based encoded independently for each frame.

Known documents related to the present invention include, for example, Patent Document 1 and Patent Document 2.

Patent Document 1 describes a technique in which the encoding side detects motion between frames by, for example, block matching between adjacent frame images and encodes only the frames at which the motion changes (frames with uniform motion are thinned out), and the decoding side synthesizes the thinned-out frame images by motion-compensated frame interpolation using the decoded frame images as reference images.

Patent Document 2 describes, in connection with frame-based encoding of interlaced images, the "comb shape" that arises when a subject moves between the fields constituting a frame.

Japanese Patent No. 2919211
JP 2002-64830 A

In recent years, there has been a growing need to extract the codes of some frames of an encoded moving image without decoding it. For example, if frames with little motion are thinned out of a moving image held on a network server, and a moving image consisting only of the codes of frames with large motion is created and transmitted, the network load and the transfer time can be reduced. Moreover, because the codes of the frames with large motion are preserved, the motion of the decoded moving image does not become very unnatural.

Because an intra-frame-coded moving image is structured as a set of per-frame codes, its structure allows codes to be thinned out frame by frame. The problem is how to select frames with little motion from the frames constituting the moving image without decoding the moving image.

Accordingly, an object of the present invention is to provide a novel moving image processing apparatus and method that, for a moving image in which an interlaced image is frame-based encoded independently for each frame, select frames with little motion without decoding the moving image and, further, delete the codes of the selected frames to generate a new moving image with a reduced code amount.

The invention of claim 1 is
a moving image processing apparatus that processes a moving image in which an interlaced image is frame-based encoded independently for each frame, wherein
the frame-based encoding uses a two-dimensional wavelet transform as the frequency transform and encodes the wavelet coefficients subband by subband, and
the apparatus comprises: means for calculating the code amount of the 1LH subband on the basis of the portion of the header in the code of each frame that describes information on the code amount of the 1LH subband; and means for evaluating the motion amount as larger the larger the code amount calculated by that means is, and selecting frames with a small motion amount.

The invention of claim 2 is
a moving image processing apparatus that processes a moving image in which an interlaced image is frame-based encoded independently for each frame, wherein
the frame-based encoding uses a two-dimensional wavelet transform as the frequency transform and encodes the wavelet coefficients subband by subband, and
the apparatus comprises: means for calculating the code amount of the 1LH subband and the code amount of the 1HL subband, respectively, on the basis of the portion of the header in the code of each frame that describes information on the code amounts of the 1LH subband and the 1HL subband; and means for, using the code amounts calculated by that means, evaluating the motion amount as larger the larger the code amount ratio calculated by
  code amount ratio = (code amount of 1LH subband) / (code amount of 1HL subband)
is, and selecting frames with a small motion amount.

The invention of claim 3 is
a moving image processing apparatus that processes a moving image in which an interlaced image is frame-based encoded independently for each frame, wherein
the frame-based encoding uses a two-dimensional wavelet transform as the frequency transform and encodes the wavelet coefficients subband by subband, and
the apparatus comprises: means for calculating the code amounts of the 1LH, 1HL, 2LH, and 2HL subbands, respectively, on the basis of the portion of the header in the code of each frame that describes information on the code amounts of the 1LH, 1HL, 2LH, and 2HL subbands; and means for, using the code amounts calculated by that means, evaluating the motion amount as larger the larger the code amount ratio calculated by
  code amount ratio = [(1LH subband code amount / 1HL subband code amount) / (2LH subband code amount / 2HL subband code amount)]
is, and selecting frames with a small motion amount.

The invention of claim 4 is
a moving image processing apparatus that processes a moving image in which an interlaced image is frame-based encoded independently for each frame, wherein
the frame-based encoding uses a two-dimensional wavelet transform as the frequency transform and encodes the wavelet coefficients subband by subband, and
the apparatus comprises: means for calculating, in units of blocks smaller than a subband, the code amount of the 1LH subband and the code amount of the 1HL subband on the basis of the portion of the header in the code of each frame that describes information on the code amounts of the 1LH subband and the 1HL subband; and means for, using the per-block code amounts of positionally corresponding blocks calculated by that means, obtaining a code amount ratio by summing over all blocks the block code amount ratio calculated by
  block code amount ratio = (per-block code amount of 1LH subband) / (per-block code amount of 1HL subband)
evaluating the motion amount as larger the larger this code amount ratio is, and selecting frames with a small motion amount.

The invention of claim 5 is
a moving image processing apparatus that processes a moving image in which an interlaced image is frame-based encoded independently for each frame, wherein
the frame-based encoding uses a two-dimensional wavelet transform as the frequency transform and encodes the wavelet coefficients subband by subband, and
the apparatus comprises: means for calculating, in units of blocks smaller than a subband, the code amounts of the 1LH, 1HL, 2LH, and 2HL subbands, respectively, on the basis of the portion of the header in the code of each frame that describes information on the code amounts of the 1LH, 1HL, 2LH, and 2HL subbands; and means for, using the per-block code amounts of positionally corresponding blocks calculated by that means, obtaining a code amount ratio by summing over all blocks the block code amount ratio calculated by
  block code amount ratio = [(per-block code amount of 1LH subband / per-block code amount of 1HL subband) / (per-block code amount of 2LH subband / per-block code amount of 2HL subband)]
evaluating the motion amount as larger the larger this code amount ratio is, and selecting frames with a small motion amount.

The invention of claim 6 is the moving image processing apparatus according to claim 2 or 3, wherein the code amounts are corrected in accordance with the difference in truncation amount between subbands.

The invention of claim 7 is the moving image processing apparatus according to claim 2 or 3, wherein the code amounts are corrected in accordance with the difference in truncation amount between subbands and the difference in truncation amount between frames.

The invention of claim 8 is the moving image processing apparatus according to claim 4 or 5, wherein the code amounts are corrected in accordance with the difference in truncation amount between blocks.

The invention of claim 9 is the moving image processing apparatus according to claim 4 or 5, wherein the code amounts are corrected in accordance with the difference in truncation amount between blocks and the difference in truncation amount between frames.

The invention of claim 10 is the moving image processing apparatus according to any one of claims 2 to 9, wherein the code amounts are corrected in accordance with the difference in the number of linear quantization steps between subbands.

The invention of claim 11 is the moving image processing apparatus according to any one of claims 2 to 9, wherein the code amounts are corrected in accordance with the difference in the number of linear quantization steps between subbands and the difference in the number of linear quantization steps between frames.

The invention of claim 12 is the moving image processing apparatus according to any one of claims 1 to 11, wherein the calculated code amount is the code amount of the luminance component.

The invention of claim 13 is the moving image processing apparatus according to any one of claims 1 to 12, wherein the means for selecting frames selects frames whose evaluated motion amount is smaller than a predetermined value.

The invention of claim 14 is the moving image processing apparatus according to any one of claims 1 to 12, wherein the means for selecting frames selects a predetermined number of frames in ascending order of evaluated motion amount.

The invention of claim 15 is the moving image processing apparatus according to claim 13 or 14, wherein the means for selecting frames excludes the first frame of the moving image from the selection targets.

The invention of claim 16 is the moving image processing apparatus according to claim 13, 14, or 15, further comprising means for deleting from the moving image the codes of the frames selected by the means for selecting frames.

The invention of claim 17 is a program that causes a computer to function as each means of the moving image processing apparatus according to any one of claims 1 to 16.

The invention of claim 18 is a computer-readable information recording medium on which the program according to claim 17 is recorded.

The inventions according to the above claims are described below in order.

First, the "comb shape" in frame-based encoding of interlaced images is described with reference to FIG. 1. In FIG. 1, (a) shows the image of field (n), (b) shows the image of field (n+1) 1/60 second later, and (c) shows an example of the frame image obtained by combining the images of these two fields (an interlaced image). In frame-based encoding, the frame image obtained by combining the two fields is encoded as it is.

When the subject moves to the right between the two fields as illustrated, the left and right edge portions of the subject are displaced by several pixels from one scanning line to the next in the combined frame image, forming a comb shape. (d) is an enlarged view of a comb-shaped edge portion; the length L of each horizontal edge corresponds to the amount of subject motion between the two fields (within the frame).

In other words, the larger the motion of the subject, the larger the amount of horizontal edges in the frame image. When shooting with a video camera, subject movement is overwhelmingly horizontal, so it is reasonable to use the amount of horizontal edges, such as those seen at the comb-shaped edge portions, as an index of the motion amount of the frame.

On the other hand, the length of the vertical edges in a comb-shaped edge portion produced by horizontal movement of the subject is substantially constant regardless of the amount of movement. It is therefore also reasonable to use the ratio of the horizontal edge amount to the vertical edge amount (= horizontal edge amount / vertical edge amount) as an index of the motion amount of the frame.
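The relationship described above can be checked on a synthetic frame. The following is a minimal sketch, not part of the patent text: it weaves two fields in which a bright rectangle has moved to the right, and measures edge amounts as sums of absolute differences between neighboring pixels. The function names, the frame size, and the choice of this particular edge measure are all assumptions made for illustration.

```python
def weave_fields(width, height, x0, obj_w, top, bottom, shift):
    """Return a frame whose even lines come from field n and odd lines from
    field n+1, in which a bright rectangle has moved `shift` pixels right."""
    frame = []
    for y in range(height):
        left = x0 + (shift if y % 2 else 0)  # odd lines belong to the later field
        row = [0] * width
        if top <= y < bottom:
            for x in range(left, left + obj_w):
                row[x] = 255
        frame.append(row)
    return frame


def horizontal_edge_amount(frame):
    """Sum of absolute differences between vertically adjacent pixels
    (responds to horizontal edges such as the comb teeth)."""
    return sum(abs(a - b)
               for upper, lower in zip(frame, frame[1:])
               for a, b in zip(upper, lower))


def vertical_edge_amount(frame):
    """Sum of absolute differences between horizontally adjacent pixels
    (responds to vertical edges)."""
    return sum(abs(a - b) for row in frame for a, b in zip(row, row[1:]))


def edge_amounts(shift):
    """Edge amounts of a hypothetical 64x32 frame with a 20x16 rectangle."""
    frame = weave_fields(64, 32, 10, 20, 8, 24, shift)
    return horizontal_edge_amount(frame), vertical_edge_amount(frame)


if __name__ == "__main__":
    for shift in (0, 2, 4, 8):
        h, v = edge_amounts(shift)
        print(f"shift={shift}: horizontal={h} vertical={v} ratio={h / v:.2f}")
```

In this toy setting the horizontal edge amount grows with the shift while the vertical edge amount stays the same, so the ratio rises with motion, which is the behavior the paragraph above relies on.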

When an encoding method that uses a frequency transform (such as a wavelet transform) is used for the frame-based encoding, the horizontal edge amount and the vertical edge amount are, in general, each reflected in the code amount of the frame's code. Moreover, as described later, in JPEG2000 and similar methods such code amounts can be obtained from header information in the code.

Based on these considerations, the invention of claim 1 calculates, for a moving image in which an interlaced image is frame-based encoded for each frame, a code amount that reflects the horizontal edge amount from the header information of the code of each frame, evaluates the motion amount of the frame on the basis of the calculated code amount, and selects frames with a small motion amount. The invention of claim 2 calculates, for a moving image encoded by frame-based encoding, a code amount that reflects the horizontal edge amount and a code amount that reflects the vertical edge amount from the header information of the code of each frame, evaluates the motion amount of the frame on the basis of the ratio of these code amounts (= code amount reflecting the horizontal edge amount / code amount reflecting the vertical edge amount), and selects frames with a small motion amount.

In an encoding method that encodes frequency transform coefficients subband by subband (that is, frequency band by frequency band), such as JPEG2000, the horizontal edge amount and the vertical edge amount are each reflected in the code amount of a specific subband.

In view of this, the invention of claim 1 calculates the code amount of the 1LH subband, which reflects the horizontal edge amount, and the invention of claim 2 calculates both the code amount of the 1LH subband, which reflects the horizontal edge amount, and the code amount of the 1HL subband, which reflects the vertical edge amount.
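As an illustration of how the selection in claims 1, 2, and 13 to 16 might be organized, the following is a minimal sketch. It assumes the per-frame code amounts of the 1LH and 1HL subbands have already been read from the frame headers and are supplied as dictionaries; that data layout, the function names, and the handling of an empty 1HL subband are assumptions, not part of the patent text.

```python
def motion_score(code_amounts, use_ratio=True):
    """Claim 1 style score (1LH code amount) or claim 2 style score (1LH / 1HL)."""
    lh = code_amounts["1LH"]
    if not use_ratio:
        return float(lh)
    hl = code_amounts["1HL"]
    # Assumption: a frame with an empty 1HL subband is never treated as low motion.
    return lh / hl if hl > 0 else float("inf")


def select_low_motion(frames, threshold=None, count=None, skip_first=True):
    """Return the indices of frames judged to have little motion.

    threshold: select frames whose score is below this value (claim 13).
    count:     select this many frames in ascending order of score (claim 14).
    skip_first: exclude the first frame from the candidates (claim 15).
    """
    scored = [(motion_score(f), i) for i, f in enumerate(frames)]
    if skip_first:
        scored = scored[1:]
    if threshold is not None:
        return [i for score, i in scored if score < threshold]
    if count is not None:
        return sorted(i for score, i in sorted(scored)[:count])
    raise ValueError("give either threshold or count")


def thin_out(frame_codes, selected):
    """Drop the codes of the selected frames without decoding them (claim 16)."""
    drop = set(selected)
    return [code for i, code in enumerate(frame_codes) if i not in drop]


# Hypothetical code amounts in bytes for four frames.
FRAMES = [
    {"1LH": 900, "1HL": 1000},
    {"1LH": 1200, "1HL": 1000},
    {"1LH": 3000, "1HL": 1000},
    {"1LH": 1000, "1HL": 1000},
]

if __name__ == "__main__":
    picked = select_low_motion(FRAMES, threshold=1.5)
    print(picked)
    print(thin_out([b"f0", b"f1", b"f2", b"f3"], picked))
```

Because an intra-frame-coded moving image is a concatenation of per-frame codes, `thin_out` only needs to omit whole entries; nothing is decoded.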

In an encoding method that performs multi-resolution subband decomposition (for example, JPEG2000), edge components are reflected most prominently in the subbands of the highest resolution level (the subbands to which the frequency transform has been applied the fewest times); the degree to which edge components are reflected decreases as the resolution level falls, and below a certain resolution level they are hardly reflected at all. In particular, the horizontal edge components at one-line intervals that appear in the comb-shaped edge portions caused by subject movement are reflected in a specific subband of the highest resolution level and are substantially not reflected in subbands of lower resolution levels.
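The ratio used in claim 3 divides the level-1 ratio by the level-2 ratio. One possible reading, offered here as an assumption rather than as the patent's stated rationale, is that the level-2 ratio serves as a baseline for the scene's own horizontal and vertical texture, since the comb edges barely reach level 2. A minimal sketch with hypothetical names and numbers, assuming all four code amounts are nonzero:

```python
def level_ratio(code_amounts, level):
    """LH / HL code amount ratio at the given decomposition level."""
    return code_amounts[f"{level}LH"] / code_amounts[f"{level}HL"]


def normalized_ratio(code_amounts):
    """Claim 3 style ratio: (1LH / 1HL) / (2LH / 2HL)."""
    return level_ratio(code_amounts, 1) / level_ratio(code_amounts, 2)


# Hypothetical code amounts: a static scene rich in horizontal texture,
# and a scene whose extra 1LH code comes from comb-shaped edges.
STATIC_TEXTURED = {"1LH": 2000, "1HL": 1000, "2LH": 1600, "2HL": 800}
MOVING = {"1LH": 3000, "1HL": 1000, "2LH": 900, "2HL": 900}

if __name__ == "__main__":
    print(normalized_ratio(STATIC_TEXTURED), normalized_ratio(MOVING))
```

With these made-up numbers the static textured frame scores 1.0 even though its plain 1LH/1HL ratio is 2.0, while the moving frame scores 3.0.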

Also, in a frame in which a small subject moves, the change in the code amount reflecting the horizontal edge amount caused by the subject's movement (that is, by the comb shape) may not be very large when viewed subband by subband.

The invention of claim 4 or 5 calculates the code amount reflecting the horizontal edge amount, or the code amounts reflecting the edge amounts in the vertical and horizontal directions, in units of blocks smaller than a subband, so that the motion amount of a small subject can be evaluated more accurately.
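The effect of working in block units can be seen with made-up numbers. The sketch below, which is illustrative only, compares the subband-level ratio of claim 2 with the summed block ratio of claim 4 for a frame in which fifteen heavily textured background blocks are static and one low-texture block contains a small moving subject. The block code amounts, the function names, and the rule of skipping blocks with an empty 1HL code are assumptions.

```python
def subband_ratio(lh_blocks, hl_blocks):
    """Claim 2 style ratio computed from whole-subband code amounts."""
    return sum(lh_blocks) / sum(hl_blocks)


def block_ratio_sum(lh_blocks, hl_blocks):
    """Claim 4 style score: sum of per-block 1LH / 1HL ratios over
    positionally corresponding blocks."""
    total = 0.0
    for lh, hl in zip(lh_blocks, hl_blocks):
        if hl > 0:  # assumption: blocks with an empty 1HL code are skipped
            total += lh / hl
    return total


# Hypothetical per-block code amounts (bytes) for 16 blocks.
BACKGROUND = [1000] * 15
HL = BACKGROUND + [10]
LH_STATIC = BACKGROUND + [10]   # small subject at rest
LH_MOVING = BACKGROUND + [40]   # same subject with comb-shaped edges

if __name__ == "__main__":
    print(subband_ratio(LH_STATIC, HL), subband_ratio(LH_MOVING, HL))
    print(block_ratio_sum(LH_STATIC, HL), block_ratio_sum(LH_MOVING, HL))
```

Here the subband-level ratio moves by about 0.2 percent, whereas the summed block ratio rises from 16.0 to 19.0, because each block's ratio carries equal weight regardless of how much code the block holds.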

The present invention assumes that a two-dimensional wavelet transform is used for the frame-based encoding. A representative example of such an encoding method is JPEG2000; in a Motion-JPEG2000 moving image, each frame is encoded by JPEG2000. An outline of JPEG2000 is given below.

FIG. 2 shows the basic flow of JPEG2000 compression (encoding) and decompression (decoding) processing.

First, compression (encoding) is described. A color image composed of, for example, the three components R, G, and B is divided, component by component, into one or more non-overlapping tiles, and processing is performed on each tile of each component. First, a DC level shift and a component transform (color transform) into luminance and color-difference components are applied to each tile; next, a two-dimensional wavelet transform (discrete wavelet transform) is applied to each tile of each component. The wavelet coefficients are linearly quantized subband by subband as necessary and then entropy-coded in units of bit planes (more precisely, each bit plane is divided into three sub-bit-planes (coding passes) and encoded). Unnecessary codes are then discarded (truncated), the necessary codes are gathered into packets, the packets are arranged in a predetermined order, and the necessary tags and tag information are added, forming a code stream in a predetermined format.

JPEG2000 offers a reversible wavelet transform called the 5×3 transform and an irreversible wavelet transform called the 9×7 transform. Linear quantization of the wavelet coefficients can be applied only when the 9×7 wavelet transform is used. When linear quantization is applied, the bit planes formed by the linearly quantized wavelet coefficients are entropy-coded. When linear quantization is not applied, the codes of unnecessary bit planes are discarded, or only the bit planes that are needed are encoded (both are referred to below as truncation). When the 5×3 wavelet transform is used, linear quantization cannot be applied, and the specification is that codes are discarded by truncation.

Decompression (decoding) is exactly the reverse of compression. The code stream is separated into the code streams of the tiles of each component and entropy-decoded in units of bit planes; the decoded wavelet coefficients are inverse-quantized and subjected to a two-dimensional inverse wavelet transform, after which the inverse component transform (inverse color transform) and the inverse DC level shift restore the original RGB pixel values.

The forward and inverse transform formulas of the wavelet transform called the 5×3 transform and of the wavelet transform called the 9×7 transform are as follows.

<5×3 transform>
(Forward transform)
C(2i+1) = P(2i+1) - floor((P(2i) + P(2i+2)) / 2)  [step1]
C(2i) = P(2i) + floor((C(2i-1) + C(2i+1) + 2) / 4)  [step2]
(Inverse transform)
P(2i) = C(2i) - floor((C(2i-1) + C(2i+1) + 2) / 4)  [step1]
P(2i+1) = C(2i+1) + floor((P(2i) + P(2i+2)) / 2)  [step2]
Here, floor(x) denotes the floor function of x (the function that replaces a real number x with the largest integer not exceeding x).
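The 5×3 formulas above can be exercised in one dimension as follows. This is a minimal sketch: the function names are hypothetical, the input is assumed to be a list of at least two integers, and the boundary handling by index reflection is one assumed way of realizing the mirroring the text mentions. Python's `//` operator is floor division, so it matches floor() for negative values as well.

```python
def mirror(i, n):
    """Reflect an out-of-range index back into 0..n-1 (assumed mirroring rule)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def forward_53(p):
    """Forward 5x3 transform; the result is interleaved (even = L, odd = H)."""
    n = len(p)
    c = list(p)
    # step1: high-pass centered on odd positions
    for k in range(1, n, 2):
        c[k] = p[k] - (p[mirror(k - 1, n)] + p[mirror(k + 1, n)]) // 2
    # step2: low-pass centered on even positions, using the step1 results
    for k in range(0, n, 2):
        c[k] = p[k] + (c[mirror(k - 1, n)] + c[mirror(k + 1, n)] + 2) // 4
    return c


def inverse_53(c):
    """Inverse 5x3 transform of an interleaved coefficient sequence."""
    n = len(c)
    p = list(c)
    # step1: inverse low-pass recovers the even samples
    for k in range(0, n, 2):
        p[k] = c[k] - (c[mirror(k - 1, n)] + c[mirror(k + 1, n)] + 2) // 4
    # step2: inverse high-pass recovers the odd samples
    for k in range(1, n, 2):
        p[k] = c[k] + (p[mirror(k - 1, n)] + p[mirror(k + 1, n)]) // 2
    return p


SAMPLE = [10, 12, 200, 13, 9, 255, 0, 7]

if __name__ == "__main__":
    coeffs = forward_53(SAMPLE)
    print(coeffs)
    print(inverse_53(coeffs) == SAMPLE)
```

Because every step adds or subtracts an integer computed from values that the inverse can recompute exactly, the round trip is lossless, which is why this transform is called reversible.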

<9×7 transform>
(Forward transform)
C(2n+1) = P(2n+1) + α * (P(2n) + P(2n+2))  [step1]
C(2n) = P(2n) + β * (C(2n-1) + C(2n+1))  [step2]
C(2n+1) = C(2n+1) + γ * (C(2n) + C(2n+2))  [step3]
C(2n) = C(2n) + δ * (C(2n-1) + C(2n+1))  [step4]
C(2n+1) = K * C(2n+1)  [step5]
C(2n) = (1/K) * C(2n)  [step6]
(Inverse transform)
P(2n) = K * C(2n)  [step1]
P(2n+1) = (1/K) * C(2n+1)  [step2]
P(2n) = P(2n) - δ * (P(2n-1) + P(2n+1))  [step3]
P(2n+1) = P(2n+1) - γ * (P(2n) + P(2n+2))  [step4]
P(2n) = P(2n) - β * (P(2n-1) + P(2n+1))  [step5]
P(2n+1) = P(2n+1) - α * (P(2n) + P(2n+2))  [step6]
where
α = -1.586134342059924
β = -0.052980118572961
γ = 0.882911075530934
δ = 0.443506852043971
K = 1.230174104914001
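The 9×7 lifting steps can likewise be sketched in one dimension. In the sketch below each inverse step undoes the forward step of the same parity, so steps 5 and 6 of the inverse update P(2n) from P(2n-1) and P(2n+1), and P(2n+1) from P(2n) and P(2n+2), respectively. The function names and the index-reflection boundary rule are assumptions made for illustration, and the input is assumed to hold at least two samples.

```python
ALPHA = -1.586134342059924
BETA = -0.052980118572961
GAMMA = 0.882911075530934
DELTA = 0.443506852043971
K = 1.230174104914001


def mirror(i, n):
    """Reflect an out-of-range index back into 0..n-1 (assumed mirroring rule)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def lift(values, start, coef):
    """In place: values[k] += coef * (both neighbors) for k = start, start+2, ..."""
    n = len(values)
    for k in range(start, n, 2):
        values[k] += coef * (values[mirror(k - 1, n)] + values[mirror(k + 1, n)])


def forward_97(p):
    """Forward 9x7 transform; the result is interleaved (even = L, odd = H)."""
    c = [float(v) for v in p]
    lift(c, 1, ALPHA)  # step1
    lift(c, 0, BETA)   # step2
    lift(c, 1, GAMMA)  # step3
    lift(c, 0, DELTA)  # step4
    # step5 scales odd positions by K, step6 scales even positions by 1/K
    return [v * K if k % 2 else v / K for k, v in enumerate(c)]


def inverse_97(c):
    """Inverse 9x7 transform of an interleaved coefficient sequence."""
    # step1 and step2 undo the scaling
    p = [v / K if k % 2 else v * K for k, v in enumerate(c)]
    lift(p, 0, -DELTA)  # step3
    lift(p, 1, -GAMMA)  # step4
    lift(p, 0, -BETA)   # step5
    lift(p, 1, -ALPHA)  # step6
    return p


SAMPLE = [10, 12, 200, 13, 9, 255, 0, 7]

if __name__ == "__main__":
    restored = inverse_97(forward_97(SAMPLE))
    print(max(abs(a - b) for a, b in zip(restored, SAMPLE)))
```

The round trip is exact only up to floating-point rounding, consistent with the text's description of the 9×7 transform as irreversible.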

With reference to FIGS. 3 to 7, the process of applying the wavelet transform called the 5×3 transform in two dimensions (vertically and horizontally) to a 16×16-pixel monochrome image is described.

XY coordinates are taken as shown in FIG. 3, and, for a given X coordinate, the pixel value of the pixel whose Y coordinate is y is denoted P(y) (0 ≤ y ≤ 15). In JPEG2000, a high-pass filter is first applied in the vertical direction (Y direction) centered on the pixels with odd Y coordinates (y = 2i+1) to obtain the coefficients C(2i+1). Next, a low-pass filter is applied centered on the pixels with even Y coordinates (y = 2i) to obtain the coefficients C(2i). This is done for every X coordinate. In the forward 5×3 transform formulas above, step1 is the high-pass filter and step2 is the low-pass filter.

At the edges of the image, the center pixel may have no adjacent pixel; in that case the missing pixel values are supplied by mirroring.

For simplicity, let H denote the coefficients obtained by the high-pass filter and L those obtained by the low-pass filter. The vertical transform then converts the image of FIG. 3 into an array of L and H coefficients as shown in FIG. 4.

Next, on the coefficient array of FIG. 4, a high-pass filter is applied in the horizontal direction centered on the coefficients with odd X coordinates (x = 2i+1), and then a low-pass filter is applied centered on the coefficients with even X coordinates (x = 2i). This is done for every Y coordinate; in this case, P(2i) and so on in the transform formulas are read as denoting coefficient values.

For simplicity, let LL denote the coefficients obtained by applying the low-pass filter centered on L coefficients, HL those obtained by applying the high-pass filter centered on L coefficients, LH those obtained by applying the low-pass filter centered on H coefficients, and HH those obtained by applying the high-pass filter centered on H coefficients. The coefficient array of FIG. 4 is then converted into a coefficient array as shown in FIG. 5. A group of coefficients bearing the same label is called a subband.

This completes one wavelet transform (one decomposition). If only the LL coefficients are collected (that is, the coefficients are gathered per subband as in FIG. 6 and only the LL subband is taken out), an "image" with exactly half the resolution of the original image is obtained. Sorting the coefficients by subband in this way is called deinterleaving, and arranging them as in FIG. 5 is called interleaving.
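
Deinterleaving can be sketched as plain slicing of the interleaved array. This is a minimal sketch, assuming the coefficients are held as a list of rows with the layout of FIG. 5 (even rows are vertical low-pass, odd rows vertical high-pass, and likewise for columns); the function name `deinterleave` is illustrative.

```python
def deinterleave(coeffs):
    """Split an interleaved 2-D coefficient array (list of rows) into subbands."""
    even_rows = coeffs[0::2]  # rows produced by the vertical low-pass filter (L)
    odd_rows = coeffs[1::2]   # rows produced by the vertical high-pass filter (H)
    return {
        "LL": [row[0::2] for row in even_rows],  # horizontal low-pass on L rows
        "HL": [row[1::2] for row in even_rows],  # horizontal high-pass on L rows
        "LH": [row[0::2] for row in odd_rows],   # horizontal low-pass on H rows
        "HH": [row[1::2] for row in odd_rows],   # horizontal high-pass on H rows
    }
```

The `"LL"` entry alone is the half-resolution "image" referred to above.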

For the second wavelet transform, the LL subband is regarded as the original image and the same transform as above is applied. Performing the second wavelet transform and deinterleaving the coefficients yields the schematic arrangement of FIG. 7.
In FIGS. 6 and 7, the prefix 1 or 2 of a coefficient indicates the number of wavelet transforms applied to obtain that coefficient and is called the decomposition level. In the wavelet transform, the decomposition level corresponds to a frequency band. FIG. 8 shows the definition of the resolution level, which runs roughly opposite to the decomposition level.

In the inverse of this 5×3 wavelet transform, the interleaved coefficient array of FIG. 5 is first processed in the horizontal direction: an inverse low-pass filter is applied centered on each coefficient whose X coordinate is even (x = 2i), and then an inverse high-pass filter is applied centered on each coefficient whose X coordinate is odd (x = 2i+1). This is done for all Y coordinates. Step 1 of the inverse 5×3 transform formulas represents the inverse low-pass filter, and step 2 represents the inverse high-pass filter. As in the forward transform, a coefficient at the edge of the image may have no neighboring coefficient, in which case the missing coefficient values are supplied by mirroring.

As a result, the coefficient array of FIG. 5 is converted (inverse-transformed) into a coefficient array like that of FIG. 4. Next, in the vertical direction, an inverse low-pass filter is applied centered on each coefficient whose Y coordinate is even (y = 2i), and then an inverse high-pass filter is applied centered on each coefficient whose Y coordinate is odd (y = 2i+1); this is done for all X coordinates. One inverse wavelet transform is then complete, and the image of FIG. 3 is restored (reconstructed). When the wavelet transform has been applied several times, the result corresponding to FIG. 3 is regarded as an LL subband, and the same inverse transform is repeated using the other coefficients such as HL.
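
One direction of the inverse transform can be sketched as follows. This is a minimal sketch, assuming the reversible 5×3 lifting steps of JPEG2000 run in reverse (inverse low-pass: P(2i) = C(2i) - floor((C(2i-1) + C(2i+1) + 2) / 4); inverse high-pass: P(2i+1) = C(2i+1) + floor((P(2i) + P(2i+2)) / 2)) with mirroring at the ends. The names `mirror` and `inverse_53` are illustrative.

```python
def mirror(k, n):
    """Reflect an out-of-range index k back into 0..n-1 (mirroring at the edges)."""
    if k < 0:
        return -k
    if k >= n:
        return 2 * (n - 1) - k
    return k


def inverse_53(c):
    """One-dimensional inverse 5x3 transform of interleaved coefficients c (length >= 2)."""
    n = len(c)
    p = list(c)
    # Step 1: inverse low-pass filter centered on even positions x = 2i.
    # The neighbors at odd positions are still the high-pass coefficients.
    for x in range(0, n, 2):
        p[x] = c[x] - (c[mirror(x - 1, n)] + c[mirror(x + 1, n)] + 2) // 4
    # Step 2: inverse high-pass filter centered on odd positions x = 2i + 1,
    # using the even samples reconstructed in step 1.
    for x in range(1, n, 2):
        p[x] = c[x] + (p[mirror(x - 1, n)] + p[mirror(x + 1, n)]) // 2
    return p
```

Because both steps use integer arithmetic that exactly undoes the forward lifting steps, the original sample values are recovered without loss.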

As described above, the 5×3 wavelet transform obtains one low-pass filter output (low-pass coefficient) from five pixels and one high-pass filter output (high-pass coefficient) from three pixels. The 9×7 wavelet transform, in contrast, obtains one low-pass filter output from nine pixels and one high-pass filter output from seven pixels. The main difference between the 5×3 and 9×7 transforms is thus the extent of the filters; in both, the low-pass filter is applied centered on even positions and the high-pass filter centered on odd positions. FIGS. 4 to 7 therefore apply equally to the 9×7 transform.

FIG. 9 shows the relationship among tiles, subbands, precincts, and code blocks. In JPEG2000, each subband is divided into precincts. A precinct is a rectangular division of a subband (of a user-specifiable size); the corresponding rectangles of the three subbands HL, LH, and HH together form one unit, whereas a precinct obtained by dividing the LL subband forms a unit by itself. A precinct roughly represents a position in the image. A precinct may also be made the same size as the subband. Code blocks are obtained by further dividing a precinct into rectangles (of a user-specifiable size).

サブバンドの係数はビットプレーン符号化されることは前述したが、より具体的にはコードブロック単位でビットプレーン符号化される。 Although the sub-band coefficients are bit-plane encoded as described above, more specifically, the bit-plane encoding is performed in units of code blocks.

そして、プリシンクトに含まれる全てのコードブロックから、符号の一部を取り出して集めたもの（例えば、全てのコードブロックのMSBから３枚目までのビットプレーンの符号を集めたもの）がパケットである。パケットの中身が符号的には“空(から)”ということもある。 A packet is a collection of a part of the code extracted from all the code blocks included in the precinct (for example, a collection of codes from the MSB of all the code blocks to the third bit plane). . The contents of the packet may be “empty” in terms of sign.

Collecting the packets of all precincts (= all code blocks = all subbands) yields a part of the code of the entire image (for example, the codes of the bit planes from the MSB down to the third plane of the wavelet coefficients of the entire image); this is called a layer. Since a layer is roughly a part of the bit-plane codes of the whole image, image quality improves as the number of decoded layers increases. A layer is, so to speak, a unit of image quality. Collecting all layers yields the codes of all bit planes of the entire image.

FIG. 10 shows an example of the layer structure when the number of wavelet decomposition levels is 2 and the precinct size equals the subband size. FIG. 11 illustrates the packets contained in some of those layers; each region enclosed by a thick line in FIG. 11 is a packet.

In this example, the precinct size equals the subband size, and code blocks of the same size as the precinct are adopted, so that each subband at decomposition level 2 is divided into four code blocks and each subband at decomposition level 1 into nine code blocks. Since a packet is formed per precinct, when the precinct equals the subband a packet spans the HL to HH subbands.

Here, a packet is "a collection of parts of the codes of code blocks", and codes that are not needed do not have to be generated as packets. For example, the codes of lower bit planes such as those contained in layer 9 of FIG. 10 are normally discarded (truncated).

The order in which packets are arranged in the codestream is called the progression order. JPEG2000 defines five progression orders, LRCP, RLCP, RPCL, PCRL, and CPRL, as combinations of resolution (R), precinct (P), component (C), and layer (L).

For example, in the case of LRCP, packets are arranged (at encoding) and interpreted (at decoding) in the order determined by the following nested for loops:
for (layer) {
    for (resolution) {
        for (component) {
            for (precinct) {
                when encoding: place the packet
                when decoding: interpret the packet
            }
        }
    }
}
For the other progression orders as well, similar nested for loops determine the order in which packets are placed at encoding and interpreted at decoding.
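
The LRCP loop nest can be written as a small generator that enumerates packet identities in codestream order. This is a minimal sketch; the function name `lrcp_order` and the parameter `precincts_per_resolution` (a list giving the number of precincts at each resolution level) are hypothetical, and the sketch assumes every component has the same precinct layout.

```python
def lrcp_order(num_layers, num_components, precincts_per_resolution):
    """Yield (layer, resolution, component, precinct) tuples in LRCP order."""
    for layer in range(num_layers):
        for res, num_precincts in enumerate(precincts_per_resolution):
            for comp in range(num_components):
                for prec in range(num_precincts):
                    # Encoder: place the packet here.
                    # Decoder: interpret the next packet as this one.
                    yield (layer, res, comp, prec)
```

A decoder that walks the codestream in step with this generator knows the layer and resolution of the k-th packet without reading them from the packet header.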

A packet consists of its body, the code, and a packet header. The packet header describes items such as the length of the code, but not the layer number, the resolution level, and so on. At decoding, for loops like the above are formed from the progression order specified in the COD marker in the main header of the codestream, and the layer and resolution to which a packet belongs are determined from the position in the for loops at which the packet is handled. The number of layers, the number of resolutions, and the number of precincts can be read from the COD marker in the main header, and the number of components from the SIZ marker in the main header (the number of precincts can be calculated because the precinct size is known from the COD marker). Beyond that, the packets can be counted as long as the boundaries between packets can be identified. Because the packet header records the length of the code contained in the packet, those boundaries can be located.

以上の様に、ＪＰＥＧ２０００の符号の場合、プログレッシブ順序を基に全てのパケットのパケットヘッダを読み出せば、パケットに含まれるコードブロックの符号長が得られる。そして、そのサブバンドに含まれる全てのパケットのコードブロックの符号長の和をサブバンド毎にとれば、サブバンド毎の符号量を、復号することなしに得ることができるのである。これが、請求項１，２に関連して述べた、復号することなく、ヘッダ情報から符号量を計算する具体例の一つである。 As described above, in the case of JPEG2000 codes, the code lengths of code blocks included in a packet can be obtained by reading the packet headers of all packets based on the progressive order. If the sum of the code lengths of the code blocks of all packets included in the subband is taken for each subband, the code amount for each subband can be obtained without decoding. This is one of the specific examples of calculating the code amount from the header information without decoding, as described in relation to claims 1 and 2.
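
The per-subband summation can be sketched as follows. This is a minimal sketch that assumes the packet headers have already been parsed into lists of (subband, code length) entries, one entry per code-block contribution; that input layout and the name `subband_code_amounts` are hypothetical, and the header parsing itself is not shown.

```python
from collections import defaultdict


def subband_code_amounts(packet_headers):
    """Sum code-block code lengths per subband without decoding any code.

    packet_headers: iterable of packet headers, each a list of
    (subband_name, code_length_in_bytes) tuples.
    """
    totals = defaultdict(int)
    for header in packet_headers:
        for subband, length in header:
            totals[subband] += length
    return dict(totals)
```

For instance, two packets contributing 10 and 4 bytes to the 1HL subband give a 1HL code amount of 14 bytes.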

さて、２次元ウェーブレット変換のサブバンドの中で、横方向エッジ成分すなわち垂直方向の高周波成分は１ＬＨサブバンド（デコンポジションレベル１，垂直方向高周波，水平方向低周波）の係数に最も大きく反映され、一方、縦方向エッジ成分すなわち横方向の高周波成分は１ＨＬサブバンド（デコンポジションレベル１，垂直方向低周波，水平方向高周波）の係数に最も大きく反映される。したがって、横方向のエッジ量は１ＬＨサブバンドのエントロピー符号の符号量に最も大きく反映され、縦方向のエッジ量は１ＨＬサブバンドのエントロピー符号の符号量に最も大きく反映される。 Now, among the subbands of the two-dimensional wavelet transform, the horizontal edge component, that is, the high frequency component in the vertical direction is most greatly reflected in the coefficients of the 1LH subband (decomposition level 1, vertical high frequency, horizontal low frequency), On the other hand, the vertical edge component, that is, the high frequency component in the horizontal direction is most greatly reflected in the coefficient of 1HL subband (decomposition level 1, vertical low frequency, horizontal high frequency). Accordingly, the amount of edge in the horizontal direction is most greatly reflected in the code amount of the entropy code of the 1LH subband, and the amount of edge in the vertical direction is most reflected in the code amount of the entropy code of 1HL subband.

In view of the above, the invention of claim 2 evaluates the amount of motion of a frame, when frame-based encoding uses a two-dimensional wavelet transform as in JPEG2000, for example, by the ratio of the code amount of the 1LH subband to the code amount of the 1HL subband, that is,
code amount ratio = 1LH subband code amount / 1HL subband code amount.

なお、１ＬＨ，１ＨＬサブバンドの符号量比を用いる場合と同等の評価精度は期待し難いが、符号量比に代えて１ＬＨサブバンドの符号量のみを動き量の評価に用いることも可能である。このような動き量の評価を行う態様は、請求項１の発明に包含されるものである。 Although it is difficult to expect an evaluation accuracy equivalent to the case where the code amount ratio of the 1LH and 1HL subbands is used, it is also possible to use only the code amount of the 1LH subband for the motion amount evaluation instead of the code amount ratio. . Such an aspect of evaluating the amount of motion is included in the invention of claim 1 .

さて、被写体の動き量すなわち「櫛形」の横方向エッジ量は１ＬＨサブバンド符号量に反映されるが、櫛形以外の被写体の横方向エッジ量も１ＬＨサブバンド符号量に反映される。したがって、例えば、長い横方向エッジを持つ被写体が含まれるフレームでは、その被写体の移動速度が遅い場合であっても、符号量比（＝１ＬＨサブバンド符号量／１ＨＬサブバンド符号量）が大きな値をとることがある。一方、１ＬＨサブバンドより解像度が１レベル低い２ＬＨサブバンドの符号量は、１ライン置きの「櫛形」の横方向エッジの影響をほとんど受けず、櫛形以外の垂直方向の高周波成分の影響をもっぱら受けるため、被写体の移動によらず安定した値をとる。 The amount of movement of the subject, that is, the “comb-shaped” lateral edge amount is reflected in the 1LH subband code amount, but the lateral edge amount of the subject other than the comb shape is also reflected in the 1LH subband code amount. Therefore, for example, in a frame including a subject having a long horizontal edge, even if the subject moving speed is slow, the code amount ratio (= 1LH subband code amount / 1HL subband code amount) is a large value. May take. On the other hand, the code amount of the 2LH subband, which is one level lower in resolution than the 1LH subband, is almost unaffected by the horizontal edges of the “comb” every other line, and is exclusively influenced by the high-frequency components in the vertical direction other than the comb. Therefore, it takes a stable value regardless of the movement of the subject.

Therefore, let A be the ratio 1LH subband code amount / 1HL subband code amount, let B be the ratio 2LH subband code amount / 2HL subband code amount, and consider the ratio A/B. When the subject moves at high speed, the comb-shaped horizontal edge amount increases, so the numerator A increases while the denominator B does not, and A/B becomes large. When the same subject is stationary or moves slowly, A decreases while B is unchanged, so A/B becomes small. The same holds when the subject has long horizontal edges. Hence, using the ratio A/B, that is, the 1LH-to-1HL code amount ratio normalized by the 2LH-to-2HL code amount ratio, makes it possible to distinguish more reliably between, for example, a subject with long horizontal edges moving at high speed and the same subject moving at low speed.

In view of this consideration, the invention of claim 3 evaluates the amount of motion of a frame more reliably on the basis of a code amount ratio corresponding to the ratio A/B, that is,
code amount ratio = (1LH subband code amount / 1HL subband code amount) / (2LH subband code amount / 2HL subband code amount).
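
The two subband-level measures, the plain 1LH/1HL ratio and the ratio A/B, can be computed as follows. This is a minimal sketch, assuming the per-subband code amounts are available in a dictionary keyed by subband name and are all nonzero; the function names are illustrative.

```python
def code_amount_ratio(amounts):
    """Ratio A: 1LH subband code amount / 1HL subband code amount."""
    return amounts["1LH"] / amounts["1HL"]


def normalized_code_amount_ratio(amounts):
    """Ratio A/B, where B = 2LH subband code amount / 2HL subband code amount."""
    a = amounts["1LH"] / amounts["1HL"]
    b = amounts["2LH"] / amounts["2HL"]
    return a / b
```

With code amounts of 300 (1LH), 100 (1HL), 60 (2LH), and 40 (2HL), A is 3.0, B is 1.5, and A/B is 2.0.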

移動する被写体が画像サイズに比べて小さい場合、その被写体が高速で移動してもサブバンド単位の符号量比の増加は小さい。１ＬＨサブバンドの符号量にはフレーム全体の横方向エッジ量が反映されるからである。サブバンドより小さなブロックを単位として同様の符号量比を計算するならば、被写体の位置に対応したブロックの符号量比に被写体の動き量が鋭敏に反映される。 If the moving subject is smaller than the image size, the increase in the code amount ratio in subband units is small even if the subject moves at high speed. This is because the amount of horizontal edge of the entire frame is reflected in the code amount of the 1LH subband. If a similar code amount ratio is calculated in units of blocks smaller than the subband, the amount of movement of the subject is reflected sharply in the code amount ratio of the block corresponding to the position of the subject.

かかる考察に基づき、請求項４及び５の発明は、小さな被写体の移動をより確実に反映する符号量比を計算し、それに基づいてフレームの動き量を評価しようとするものである。請求項４，５の発明における符号量算出単位であるブロックとして、ＪＰＥＧ２０００のプリシンクトや１つ又は複数のコードブロックを用いることができる。 Based on such considerations, the inventions of claims 4 and 5 try to calculate a code amount ratio that more reliably reflects the movement of a small subject, and to evaluate the motion amount of the frame based on the calculated code amount ratio. As a block which is a code amount calculation unit in the fourth and fifth aspects of the invention, a JPEG 2000 precinct or one or a plurality of code blocks can be used.

The code amount ratio in the invention of claim 4 can be expressed as follows.
code amount ratio = Σ (code amount of block i of the 1LH subband / code amount of block j of the 1HL subband)
Here, the coefficients of block i and block j are derived from substantially the same pixel positions, and the sum is taken at least over all blocks of the 1LH subband for which a block of the 1HL subband derived from substantially the same pixel positions exists.

The code amount ratio in the invention of claim 5 can be expressed as follows.
code amount ratio = Σ (code amount of block i of the 1LH subband / code amount of block j of the 1HL subband) / (code amount of block k of the 2LH subband / code amount of block l of the 2HL subband)
Here, the coefficients of block i and block j are derived from substantially the same pixel positions, and the coefficients of block k and block l are likewise derived from substantially the same pixel positions. In addition, the coefficients of block i and block k include coefficients derived from substantially the same pixel positions, as do the coefficients of block j and block l. The sum is taken at least over all blocks of the 1LH subband for which a block of the 1HL subband derived from substantially the same pixel positions exists.
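
The block-wise sums of claims 4 and 5 can be sketched as follows. This is a minimal sketch under a simplifying assumption: the input lists are aligned so that the same index in each list refers to blocks derived from substantially the same pixel positions. Skipping terms with a zero denominator is also an assumption of this sketch, and the function names are illustrative.

```python
def blockwise_ratio(lh1, hl1):
    """Sum over co-located block pairs of (1LH block amount / 1HL block amount)."""
    return sum(a / b for a, b in zip(lh1, hl1) if b > 0)


def blockwise_normalized_ratio(lh1, hl1, lh2, hl2):
    """Sum over co-located blocks of (1LH / 1HL) / (2LH / 2HL)."""
    total = 0.0
    for a1, b1, a2, b2 in zip(lh1, hl1, lh2, hl2):
        # Skip terms whose denominators would be zero (assumption of this sketch).
        if b1 > 0 and a2 > 0 and b2 > 0:
            total += (a1 / b1) / (a2 / b2)
    return total
```

A small, fast-moving subject raises only the terms of the blocks it covers, so it shows up more clearly in these sums than in the subband-level ratio.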

なお、このような請求項４，５の発明を包括する上位概念の発明が請求項２，３の発明であることは明らかである。 It is obvious that the invention of the superordinate concept including the inventions of claims 4 and 5 is the invention of claims 2 and 3 .

さて、前述のように、ＪＰＥＧ２０００では必ずしも全てのビットプレーンが符号化されると限らない。図１及び図１０に関連して述べたように、符号化処理過程において、不要な下位ビットプレーンの符号又は係数のトランケートが行われることはごく普通のことである。 As described above, in JPEG 2000, not all bit planes are necessarily encoded. As described in connection with FIG. 1 and FIG. 10, it is common that truncation of unnecessary lower bitplane codes or coefficients is performed during the encoding process.

動画像の各フレームに関し、全てのビットプレーンが符号化されている場合は問題とはならないが、そうでない場合は、トランケートされたビットプレーン数（トランケート量）がサブバンドによって異なることがあることに留意しなければならない。例えば、１HLサブバンドと１LHサブバンドとでトランケート量は同一とは限らない。したがって、符号量比を計算するための符号量としては、サブバンドによるトランケート量の違いによる影響を補正したものを用いるのが望ましい。 For each frame of a moving image, there is no problem if all the bit planes are coded, but otherwise, the number of truncated bit planes (truncation amount) may vary depending on the subband. You have to be careful. For example, the truncation amount is not always the same between the 1HL subband and the 1LH subband. Therefore, it is desirable to use a code amount for calculating the code amount ratio, in which the influence of the difference in the truncation amount due to the subband is corrected.

例えば、１HLサブバンドがビットプレーン３枚分、１LHサブバンドがビットプレーン２枚分のトランケートを受けている場合、１LHサブバンドが有する符号のうちの最下位ビットプレーン（トランケートされなかったビットプレーンのうちの最下位のビットプレーン）分の符号を符号量計算から除外し、１ＨＬサブバンドと同じトランケート量での符号量とすべきである。あるいは、１HLサブバンドが有する最下位ビットプレーンの符号を、１HLサブバンドのトランケートされたビットプレーンのうちの最上位ビットプレーンの符号に等しいとみなし、符号量の計算の際に、１HLサブバンドが有する最下位ビットプレーンの符号を２回加えることも可能である。こうしたトランケート量の調整は、もちろんサブビットプレーン単位で行うことも可能である。 For example, if the 1HL subband is truncated for 3 bit planes and the 1LH subband is truncated for 2 bitplanes, the least significant bit plane of the code of the 1LH subband (the bit plane that was not truncated) The code corresponding to the least significant bit plane) should be excluded from the code amount calculation, and the code amount should be the same truncation amount as that of the 1HL subband. Alternatively, the code of the least significant bit plane of the 1HL subband is regarded as being equal to the code of the most significant bit plane of the truncated bit planes of the 1HL subband, and the 1HL subband is calculated when calculating the code amount. It is also possible to add the code of the least significant bit plane it has twice. Such truncation amount adjustment can of course be performed in units of sub-bit planes.
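
The first correction described above, discarding the lowest retained bit planes so that both subbands are compared at the same truncation amount, can be sketched as follows. This is a minimal sketch that assumes the code length of each retained bit plane is known per subband; the function name `aligned_code_amount` and its parameters are hypothetical.

```python
def aligned_code_amount(plane_lengths, truncated, target_truncated):
    """Code amount of a subband after aligning it to a larger truncation amount.

    plane_lengths: code lengths of the retained bit planes, MSB first.
    truncated: number of bit planes already truncated from this subband.
    target_truncated: truncation amount to align to (e.g. that of the other subband).
    """
    extra = max(0, target_truncated - truncated)
    keep = max(0, len(plane_lengths) - extra)
    # Exclude the `extra` lowest retained bit planes from the total.
    return sum(plane_lengths[:keep])
```

In the example from the text, the 1LH subband (two planes truncated) is aligned to the 1HL subband (three planes truncated) by calling `aligned_code_amount(lh_planes, 2, 3)`, which drops the code of its lowest retained plane.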

請求項６の発明は、このようなサブバンド間のトランケート量の違いに応じて１ＬＨサブバンドの符号量及び１ＨＬサブバンドの符号量を補正し、動き量をより的確に評価しようとするものである。なお、このような符号量に対する補正は、符号量の算出の際に行っても符号量比の算出の際に行ってもよい。 According to the sixth aspect of the invention, the code amount of the 1LH subband and the code amount of the 1HL subband are corrected according to the difference in the truncation amount between the subbands, and the motion amount is more accurately evaluated. is there. Such correction for the code amount may be performed when the code amount is calculated or when the code amount ratio is calculated.

また、同一サブバンド内においても、全てのブロックでトランケート量が均一とは限らない。ブロック単位の符号量比を動き量の評価に利用する場合、ブロック間のトランケート量の違いを考慮すべきである。例えば、第nブロックがビットプレーン３枚分、第n+1ブロックがビットプレーン２枚分のトランケートを受けている場合、第n+1ブロックが有する符号のうちの最下位ビットプレーン（トランケートされなかったビットプレーンのうちの最下位のビットプレーン）分の符号を符号量計算から除外し、第nブロックと同じトランケート量での符号量を求めるべきである。あるいは、第nブロックが有する最下位ビットプレーンの符号を、第nブロックが有しないトランケートされたビットプレーンのうちの最上位ビットプレーンの符号と略同一とみなし、符号量計算の際に、第nブロックが有する最下位ビットプレーンの符号を２回加えることも可能である。あるいは、第n+1ブロックが有する最下位ビットプレーンの符号を、第nブロックが有しないトランケートされたビットプレーンのうちの最上位ビットプレーンの符号と略同一とみなし、隣接する第n+1ブロックの符号量を代用したり、近傍のブロックの符号量を利用することも可能である。このようなトランケート量の調整はサブビットプレーン単位で行うことも可能であることはもちろんである。 Even within the same subband, the truncation amount is not always uniform in all blocks. When the code amount ratio of the block unit is used for the motion amount evaluation, the difference in the truncation amount between the blocks should be considered. For example, if the nth block has been truncated by 3 bit planes and the n + 1 block has been truncated by 2 bit planes, the least significant bit plane of the code of the n + 1 block (not truncated) The code for the lowest bit plane of the bit planes) should be excluded from the code amount calculation, and the code amount at the same truncation amount as that of the nth block should be obtained. Alternatively, the code of the least significant bit plane that the nth block has is regarded as substantially the same as the code of the most significant bit plane of the truncated bit planes that the nth block does not have, and the nth block It is also possible to add the code of the least significant bit plane of the block twice. Alternatively, the code of the least significant bit plane of the (n + 1) th block is regarded as substantially the same as the code of the most significant bit plane of the truncated bit planes not possessed by the (n) th block, and is adjacent to the (n + 1) th block. It is also possible to substitute the code amount of, or use the code amount of neighboring blocks. 
Of course, such a truncation amount can be adjusted in units of sub-bit planes.

請求項８の発明は、このようなブロック間のトランケート量の違いに応じてブロック単位の符号量を補正することにより、動き量をより的確に評価しようとするものである。なお、このような符号量に対する補正は、符号量の算出の際に行っても符号量比の算出の際に行ってもよい。 The invention of claim 8 intends to more accurately evaluate the motion amount by correcting the code amount in units of blocks in accordance with the difference in the truncation amount between the blocks. Such correction for the code amount may be performed when the code amount is calculated or when the code amount ratio is calculated.

さて、各サブバンドのトランケートされたビットプレーン数が均一であっても、各サブバンドが異なる線形量子化を受けている場合には、その違いも考慮する必要がある。線形量子化時の量子化ステップ数は１HLサブバンドと１LHサブバンドとで同一とは限らない。例えば、１HLサブバンドがステップ数８で、１LHサブバンドがステップ数４で線形量子化されている場合、１LHサブバンドのビットプレーン数は、もともと１HLサブバンドのビットプレーン数よりも１枚多いことになるため（８／４＝２の１乗、これはビットプレーン１枚に相当する）、１LHサブバンドが有する符号のうちの最下位ビットプレーン（トランケートされなかったビットプレーンのうちの最下位のビットプレーン）分の符号を符号量計算から除外すべきである。あるいは、１HLサブバンドが有する最下位ビットプレーンの符号を、１HLサブバンドが有しないトランケートされたビットプレーンのうちの最上位ビットプレーンの符号と略等しいとみなし、符号量計算時に１HLサブバンドが有する最下位ビットプレーンの符号を２回加えることも可能である。 Now, even if the number of truncated bit planes in each subband is uniform, if each subband is subjected to different linear quantization, the difference needs to be considered. The number of quantization steps at the time of linear quantization is not necessarily the same between the 1HL subband and the 1LH subband. For example, if 1HL subband is linearly quantized with 8 steps and 1LH subband with 4 steps, the number of bitplanes in 1LH subband should be one more than the number of bitplanes in 1HL subband. (8/4 = 2 to the first power, which corresponds to one bit plane), the least significant bit plane of the code of the 1LH subband (the least significant bit plane of the non-truncated bit plane) The code for bit plane) should be excluded from the code amount calculation. Alternatively, the code of the least significant bit plane included in the 1HL subband is regarded as substantially equal to the code of the most significant bit plane of the truncated bit planes not included in the 1HL subband, and the 1HL subband has at the time of code amount calculation. It is also possible to add the code of the least significant bit plane twice.

Such adjustment for linear quantization can also be performed in units of sub-bit planes, because n sub-bit planes correspond to a ratio of 2^(n/3) between the numbers of linear quantization steps (one bit plane = three sub-bit planes = 2 to the first power). That is, when the ratio of the numbers of quantization steps between subbands is X, this corresponds to a difference of 3·log2(X) sub-bit planes, and the code amount need only be adjusted by that difference.
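
The relation between the quantization step ratio and the sub-bit-plane difference can be checked numerically. This is a minimal sketch; the function name `subbitplane_difference` is illustrative.

```python
import math


def subbitplane_difference(steps_a, steps_b):
    """Number of sub-bit planes corresponding to the step ratio X = steps_a / steps_b."""
    return 3 * math.log2(steps_a / steps_b)
```

For the example in the text, step numbers 8 and 4 give X = 2, that is, three sub-bit planes, which equals one bit plane.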

請求項１０の発明は、このようなサブバンド間の線形量子化ステップ数の違いに応じて符号量を補正することにより、動き量をより的確に評価しようとするものである。なお、このような符号量に対する補正は、符号量の算出の際に行っても符号量比の算出の際に行ってもよい。 The invention of claim 10 intends to more accurately evaluate the amount of motion by correcting the amount of code in accordance with the difference in the number of linear quantization steps between subbands. Such correction for the code amount may be performed when the code amount is calculated or when the code amount ratio is calculated.

ここまでは、トランケート量又は量子化ステップ数の違いに対する補正を個々のフレームの中でのみ検討した。しかし、動画像のフレーム間でトランケート量又は量子化ステップ数の違いがある場合には、その違いについても考慮するのが望ましい。特に、各フレームについて符号量比により評価した動き量を小さい順に序列付けし、動き量の小さい方からある枚数のフレームを選択する場合には、符号量比を求めるための符号量の計算の際に、フレーム間でのトランケート量又は量子化ステップ数の違いによる影響を排除するための補正を行うのが望ましい。そのためには、例えば、動画像の先頭フレームにおけるトランケート量又は量子化ステップ数を基準として、各フレームにおけるトランケート量又は量子化ステップ数に関連した符号量の補正を行えばよい。 So far, corrections for differences in the amount of truncation or the number of quantization steps have been considered only within individual frames. However, if there is a difference in the amount of truncation or the number of quantization steps between the frames of the moving image, it is desirable to consider the difference. In particular, when ordering the motion amount evaluated by the code amount ratio for each frame in ascending order and selecting a certain number of frames from the smaller motion amount, when calculating the code amount for obtaining the code amount ratio, In addition, it is desirable to perform correction to eliminate the influence of the truncation amount or the difference in the number of quantization steps between frames. For this purpose, for example, the amount of code related to the amount of truncation or the number of quantization steps in each frame may be corrected on the basis of the amount of truncation or the number of quantization steps in the first frame of the moving image.

請求項７，９，１１の発明は、そのようなフレーム間のトランケート量又は線形量子化ステップ数の違いによる影響をも排除するように符号量の補正を行うことにより、動き量をより的確に評価しようとするものである。 According to the seventh , ninth , and eleventh aspects of the present invention, by correcting the code amount so as to eliminate the influence of the truncation amount between the frames or the difference in the number of linear quantization steps, the motion amount is more accurately determined. It is something to be evaluated.

カラー動画像の場合、その符号は輝度成分の符号と色差成分の符号からなるが、色差成分の符号量は被写体の色の違いなどによる変動幅が大きいため、エッジ量を反映する符号量として輝度成分の符号量を用いるのが一般に妥当である。これを考慮したのが請求項１２の発明である。 In the case of a color moving image, the code consists of the code of the luminance component and the code of the color difference component, but since the code amount of the color difference component has a large fluctuation range due to the difference in the color of the subject, the luminance as the code amount reflecting the edge amount It is generally appropriate to use the code amount of the component. This is considered in the invention of claim 12 .

動画像のフレームを間引きする場合、動き量の大きいフレームを間引くと復号した際の動画像の動きの不自然さが際だってしまう。請求項１６の発明は、動画像から動き量の小さいフレームを間引くことにより、復号した際の動画像の動きをそれほど不自然にすることなく、動画像の符号量削減を図ろうとするものである。 When thinning out a frame of a moving image, if a frame with a large amount of motion is thinned out, the unnatural motion of the moving image at the time of decoding becomes prominent. The invention of claim 16 is intended to reduce the code amount of a moving image without thinning the motion of the moving image at the time of decoding by thinning out a frame with a small amount of motion from the moving image. .

この間引きフレームの選び方としては、動き量が基準より小さいフレームを枚数に関係なく選択する方法（１）と、動き量が小さいフレームから順に所定枚数のフレームを選択する方法（２）とが考えられる。前者の選択方法（１）は、基準より動き量が小さいフレームが多い動画像の符号量を大幅に削減できるため、符号量削減を優先したいような場合に効果的であるが、その反面、フレーム総数が元々少ない動画像では間引きフレーム数が過多になるような不都合も懸念される。後者の選択方法（２）は、間引きフレーム数が過多になるような不都合を容易に回避できる利点があるが、動き量の大きいフレームの多い動画像では動き量が比較的大きいフレームが間引かれるおそれがある。 As a method of selecting the thinned frame, there are a method (1) for selecting a frame whose motion amount is smaller than the reference regardless of the number of frames, and a method (2) for selecting a predetermined number of frames in order from the frame having the smallest motion amount. . The former selection method (1) can greatly reduce the code amount of a moving image having many frames with a smaller amount of motion than the reference, and is effective when priority is given to code amount reduction. There is also a concern that the number of thinned-out frames is excessive for moving images with a small total number. The latter selection method (2) has an advantage that it is possible to easily avoid the disadvantage that the number of thinned frames is excessive. However, in a moving image having a large amount of motion, a frame having a relatively large amount of motion is thinned out. There is a fear.

請求項１３の発明は、上記選択方法（１）による間引きフレームの選択を行うようなケースを想定したものである。請求項１４の発明は、上記選択方法（２）による間引きフレームの選択を行うようなケースを想定したものである。 The invention of claim 13 assumes a case where a thinned frame is selected by the selection method (1). The invention of claim 14 assumes a case in which a thinned frame is selected by the selection method (2).

なお、選択方法（１）と選択方法（２）を組み合わせた選択方法（３）も可能である。すなわち、動き量が基準以下のフレームの中で、動き量が小さい順に所定数のフレームを選択する方法である。請求項１〜１２の発明の画像処理装置において、かかる選択方法（３）によりフレームを選択するようにしてもよく、かかる構成の画像処理装置及び方法も本発明に包含される。 A selection method (3) that combines the selection method (1) and the selection method (2) is also possible. That is, this is a method in which a predetermined number of frames are selected in ascending order of motion amount, from among frames whose motion amount is below the reference. In the image processing apparatus according to any one of claims 1 to 12, a frame may be selected by the selection method (3), and the image processing apparatus and method having such a configuration are also included in the present invention.
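
The combined selection method (3) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the list of per-frame motion scores, and the optional exclusion of the first frame are assumptions made for the example.

```python
def select_frames_combined(motion, threshold, max_count, skip_first=True):
    """Return the indices of frames to thin out under selection method (3).

    motion     : hypothetical per-frame motion scores (smaller = less motion)
    threshold  : the reference; only frames at or below it are candidates
    max_count  : the predetermined number of frames to select
    skip_first : optionally keep frame 0 out of the selection
    """
    start = 1 if skip_first else 0
    # Method (1): restrict to frames whose motion amount is at or below the reference.
    candidates = [i for i in range(start, len(motion)) if motion[i] <= threshold]
    # Method (2): among those, take the frames with the smallest motion first.
    candidates.sort(key=lambda i: motion[i])
    return sorted(candidates[:max_count])
```

For example, `select_frames_combined([0.1, 0.9, 0.2, 0.5, 0.3, 1.4], 0.6, 2)` picks the two lowest-motion frames among those not exceeding the reference, leaving the first frame untouched.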

動画像を静止画としてアイコン表示するような場合、その表示に先頭フレームが用いられることが多いため、先頭フレームは間引きの対象から除外するのが好ましい。請求項１５の発明によれば動画像の先頭フレームは選択対象から除外されるため、先頭フレームの間引きを回避することができる。 When a moving image is displayed as an icon as a still image, the first frame is often used for the display, and therefore the first frame is preferably excluded from the thinning target. According to the fifteenth aspect of the invention, since the first frame of the moving image is excluded from the selection target, it is possible to avoid thinning out the first frame.

ここで付言すれば、動画像の符号化装置において、各フレームに対しエントロピー符号化工程まで実行した段階で、請求項１〜１６の発明と同様にしてフレームの動き量を評価し、動き量が小さいと評価したフレームについては処理を打ち切ることにより、符号化処理過程で動き量の小さいフレームの間引きを行うことも可能である。 It should be added that, in a moving image encoding apparatus, once processing up to the entropy coding step has been executed for a frame, the motion amount of the frame can be evaluated in the same manner as in the inventions of claims 1 to 16, and processing can be aborted for frames evaluated as having a small motion amount. In this way, frames with a small amount of motion can also be thinned out during the encoding process.

ここで、ＪＰＥＧ２０００の説明を補足する。
ＪＰＥＧ２０００のＤＣレベルシフトの変換式は次の通りである。
順変換
I(x,y)←I(x,y)−2^Ssiz(i)
逆変換
I(x,y)←I(x,y)＋2^Ssiz(i)
ただし、Ｓsiz(i)は原画像の各コンポーネントｉ（ＲＧＢ画像ならｉ＝０，１，２）のビット深さである。このＤＣレベルシフトは、ＲＧＢ信号値のような正の数である場合に、順変換では各信号値から信号のダイナミックレンジの半分を減算するレベルシフトを、逆変換では各信号値に信号のダイナミックレンジの半分を加算するレベルシフトを行うものである。ただし、このレベルシフトはＹＣｂＣｒ信号のＣｂ，Ｃｒ信号のような符号付き整数には適用されない。 Here, the description of JPEG2000 is supplemented.
The conversion formulas for the DC level shift of JPEG2000 are as follows.
Forward transform
I(x,y) ← I(x,y) − 2^Ssiz(i)
Inverse transform
I(x,y) ← I(x,y) + 2^Ssiz(i)
Here, Ssiz(i) is the bit depth of each component i of the original image (i = 0, 1, 2 for an RGB image). When the signal values are positive numbers, such as RGB signal values, the DC level shift subtracts half of the dynamic range of the signal from each signal value in the forward transform and adds half of the dynamic range to each signal value in the inverse transform. This level shift is not applied to signed integers such as the Cb and Cr signals of a YCbCr signal.

ＪＰＥＧ２０００では色変換（コンポ−ネント変換）として、可逆変換（ＲＣＴ）と非可逆変換（ＩＣＴ）が定義されている。 JPEG2000 defines reversible conversion (RCT) and irreversible conversion (ICT) as color conversion (component conversion).

ＲＣＴの順変換と逆変換は次式で表される。
順変換
Y0(x,y)=floor((I0(x,y)+2*I1(x,y)+I2(x,y))/4)
Y1(x,y)=I2(x,y)-I1(x,y)
Y2(x,y)=I0(x,y)-I1(x,y)
逆変換
I1(x,y)=Y0(x,y)-floor((Y2(x,y)+Y1(x,y))/4)
I0(x,y)=Y2(x,y)+I1(x,y)
I2(x,y)=Y1(x,y)+I1(x,y)
式中のＩは原信号、Ｙは変換後の信号を示す。ＲＧＢ信号ならば、Ｉ信号において０＝Ｒ，１＝Ｇ，２＝Ｂ、Ｙ信号において０＝Ｙ，１＝Ｃｂ，２＝Ｃｒと表される。 The forward and inverse transforms of the RCT are expressed by the following equations.
Forward transform
Y0(x,y) = floor((I0(x,y) + 2*I1(x,y) + I2(x,y)) / 4)
Y1(x,y) = I2(x,y) - I1(x,y)
Y2(x,y) = I0(x,y) - I1(x,y)
Inverse transform
I1(x,y) = Y0(x,y) - floor((Y2(x,y) + Y1(x,y)) / 4)
I0(x,y) = Y2(x,y) + I1(x,y)
I2(x,y) = Y1(x,y) + I1(x,y)
In the equation, I represents an original signal, and Y represents a signal after conversion. In the case of the RGB signal, 0 = R, 1 = G, 2 = B in the I signal, and 0 = Y, 1 = Cb, 2 = Cr in the Y signal.
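
The reversibility of the RCT can be checked with a short sketch. It assumes the standard JPEG2000 form of the forward transform, Y0 = floor((I0 + 2*I1 + I2) / 4), and integer inputs; the function names are chosen only for this illustration.

```python
def rct_forward(r, g, b):
    """Reversible component transform for integer I0=R, I1=G, I2=B."""
    y0 = (r + 2 * g + b) // 4   # // floors toward minus infinity, like floor()
    y1 = b - g                  # Cb-like difference
    y2 = r - g                  # Cr-like difference
    return y0, y1, y2


def rct_inverse(y0, y1, y2):
    """Exact inverse of rct_forward."""
    g = y0 - (y2 + y1) // 4
    r = y2 + g
    b = y1 + g
    return r, g, b
```

Because y1 + y2 = r + b - 2g, the term floor((y2 + y1) / 4) equals y0 - g, which is why the inverse recovers the original integers without loss.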

ＩＣＴの順変換と逆変換は次式で表される。
順変換
Y0(x,y)=0.299*I0(x,y)+0.587*I1(x,y)+0.114*I2(x,y)
Y1(x,y)=-0.16875*I0(x,y)-0.33126*I1(x,y)+0.5*I2(x,y)
Y2(x,y)=0.5*I0(x,y)-0.41869*I1(x,y)-0.08131*I2(x,y)
逆変換
I0(x,y)=Y0(x,y)+1.402*Y2(x,y)
I1(x,y)=Y0(x,y)-0.34413*Y1(x,y)-0.71414*Y2(x,y)
I2(x,y)=Y0(x,y)+1.772*Y1(x,y)
式中のＩは原信号、Ｙは変換後の信号を示す。ＲＧＢ信号ならば、Ｉ信号において０＝Ｒ，１＝Ｇ，２＝Ｂ、Ｙ信号において０＝Ｙ，１＝Ｃｂ，２＝Ｃｒと表される。 The forward and inverse transforms of the ICT are expressed by the following equations.
Forward transform
Y0(x,y) = 0.299*I0(x,y) + 0.587*I1(x,y) + 0.114*I2(x,y)
Y1(x,y) = -0.16875*I0(x,y) - 0.33126*I1(x,y) + 0.5*I2(x,y)
Y2(x,y) = 0.5*I0(x,y) - 0.41869*I1(x,y) - 0.08131*I2(x,y)
Inverse transform
I0(x,y) = Y0(x,y) + 1.402*Y2(x,y)
I1(x,y) = Y0(x,y) - 0.34413*Y1(x,y) - 0.71414*Y2(x,y)
I2(x,y) = Y0(x,y) + 1.772*Y1(x,y)
In the equation, I represents an original signal, and Y represents a signal after conversion. In the case of the RGB signal, 0 = R, 1 = G, 2 = B in the I signal, and 0 = Y, 1 = Cb, 2 = Cr in the Y signal.
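
For comparison with the RCT, the ICT can be sketched as below. The sketch assumes the standard luminance weights 0.299, 0.587, and 0.114 (which sum to 1); the function names are illustrative only. Since the coefficients are rounded decimals, the round trip is only approximately exact, which is what "irreversible" means here.

```python
def ict_forward(r, g, b):
    """Irreversible component transform: RGB -> (Y, Cb, Cr) as floats."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.16875 * r - 0.33126 * g + 0.5 * b
    cr = 0.5 * r - 0.41869 * g - 0.08131 * b
    return y, cb, cr


def ict_inverse(y, cb, cr):
    """Approximate inverse of ict_forward."""
    r = y + 1.402 * cr
    g = y - 0.34413 * cb - 0.71414 * cr
    b = y + 1.772 * cb
    return r, g, b
```

A gray pixel such as (128, 128, 128) maps to a luminance of 128 with chrominance values very close to zero.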

前述のように、ＪＰＥＧ２０００で９×７ウェーブレット変換を選択した場合には、各サブバンド毎にウェーブレット係数を線形（スカラー）量子化することができる。量子化式は次の通りである。
qb(u,v)=sign(ab(u,v))*floor(|ab(u,v)|/Δb)
ただし、ab(u,v)はサブバンドｂにおける係数
qb(u,v)はサブバンドｂにおける量子化後の係数
Δbはサブバンドｂにおける量子化ステップ数
量子化ステップ数Δｂは次式で表される。
Δb=2^(Rb-εb)*(1+μb/2^11)
ただし、Rbはサブバンドｂにおけるダイナミックレンジ
εbはサブバンドｂにおける量子化の指数
μbはサブバンドｂにおける量子化の仮数
指数εbと仮数μbは、コードストリーム中のメインヘッダ又はタイルパートヘッダのＱＣＤマーカ又はＱＣＣマーカで規定される。 As described above, when the 9×7 wavelet transform is selected in JPEG2000, the wavelet coefficients can be linearly (scalar) quantized for each subband. The quantization formula is as follows.
qb(u,v) = sign(ab(u,v)) * floor(|ab(u,v)| / Δb)
where ab(u,v) is a coefficient in subband b,
qb(u,v) is the quantized coefficient in subband b, and
Δb is the number of quantization steps (the quantization step size) in subband b.
The number of quantization steps Δb is expressed by the following equation.
Δb = 2^(Rb-εb) * (1 + μb/2^11)
where Rb is the dynamic range of subband b,
εb is the quantization exponent of subband b, and
μb is the quantization mantissa of subband b.
The exponent εb and the mantissa μb are specified by the QCD marker or the QCC marker in the main header or a tile-part header of the code stream.
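
The quantization rule can be illustrated with a small sketch. It assumes the standard JPEG2000 form Δb = 2^(Rb - εb) * (1 + μb / 2^11); the function and parameter names are hypothetical and exist only for this example.

```python
import math


def step_size(dynamic_range, exponent, mantissa):
    """Delta_b = 2^(Rb - eps_b) * (1 + mu_b / 2^11)."""
    return 2.0 ** (dynamic_range - exponent) * (1.0 + mantissa / 2 ** 11)


def quantize(coeff, delta):
    """q_b = sign(a_b) * floor(|a_b| / Delta_b) (dead-zone scalar quantizer)."""
    sign = (coeff > 0) - (coeff < 0)
    return sign * math.floor(abs(coeff) / delta)
```

With Rb = 10, εb = 8, and μb = 1024, the step size is 4 * 1.5 = 6. Because the magnitude is floored before the sign is restored, positive and negative coefficients are quantized symmetrically around zero.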

以上の説明から明らかなように、（１）請求項１〜１６の発明によれば、動画像の符号を復号することなく、フレームの動き量を評価して動き量の小さいフレームを選択することができ、さらに動き量の小さなフレームを削除し、復号した際の動画像の動きをそれほど不自然にすることなく動画像の符号量を削減することができる。（２）請求項４，５の発明によれば、移動する小さな被写体が含まれるフレームの動き量をより的確に評価することができる。（３）請求項３の発明によれば、長い横方向エッジを持つ被写体が含まれるフレームの動き量をより的確に評価することができる。（４）請求項６〜１１の発明によれば、フレームにおいてサブバンド間又はブロック間、さらにはフレーム間で量子化ステップ数及トランケート量に違いがあっても、フレームの動き量を的確に評価して動き量の小さなフレームを選択することができる。（５）請求項１３の発明によれば、動き量の小さなフレームが多い動画像の符号量を効果的に削減することができる。（６）請求項１４の発明によれば、動きの小さなフレームが多い動画像における過度なフレーム間引きを防止することができる。（７）先頭フレームが間引かれると、間引き後の動画像の先頭フレームのアイコン表示と間引き前の先頭フレームのアイコン表示とが相違し、別の動画像と誤認しやすくなるが、請求項１５の発明によれば、先頭フレームの動き量が小さい場合でもその間引きを防止することができるため、そのようなアイコン表示時の不都合を回避できる。（８）請求項１７，１８の発明によれば請求項１〜１６の発明をコンピュータを利用し容易に実施可能である、等々の効果を得られる。 As is clear from the above description, (1) according to the inventions of claims 1 to 16 , the frame motion amount is evaluated and a frame with a small motion amount is selected without decoding the moving image code. Furthermore, it is possible to delete a frame with a small amount of motion and reduce the amount of code of the moving image without making the motion of the moving image unnatural when decoded. (2) According to the inventions of claims 4 and 5 , it is possible to more accurately evaluate the motion amount of the frame including the small moving object. (3) According to the invention of claim 3, the amount of motion of a frame including a subject having a long lateral edge can be more accurately evaluated. (4) According to the inventions of claims 6 to 11 , even if there is a difference in the number of quantization steps and the truncation amount between subbands or blocks in the frame, and also between frames, the frame motion amount is accurately evaluated. Thus, a frame with a small amount of motion can be selected. (5) According to the invention of claim 13 , it is possible to effectively reduce the code amount of a moving image having many frames with a small amount of motion. 
(6) According to the invention of claim 14, excessive frame thinning can be prevented in a moving image having many frames with small motion. (7) If the first frame is thinned out, the icon display of the first frame of the thinned moving image differs from the icon display of the first frame before thinning, so the moving image is easily mistaken for a different one. According to the invention of claim 15, thinning out of the first frame can be prevented even when its motion amount is small, so this inconvenience in icon display can be avoided. (8) According to the inventions of claims 17 and 18, the inventions of claims 1 to 16 can be easily implemented using a computer. These and other effects can be obtained.

図１２は、本発明の実施の形態を説明するためのブロック図である。図１２において、１００は処理の対象となる動画像ファイルであり、１０１は動画像の各フレームの符号から特定方向のエッジ量を反映する符号量を算出する符号量算出手段である。１０２はフレーム選択手段であり、符号量算出手段１０１により算出された符号量に基づき各フレームの動き量を評価し、動き量の小さいフレームを選択する。１０３はフレーム削除等処理手段であり、動画像ファイル１００からフレーム選択手段１０２で選択されたフレームの符号を削除する等の処理を行って新しい動画像ファイル１０４を生成する。 FIG. 12 is a block diagram for explaining an embodiment of the present invention. In FIG. 12, reference numeral 100 denotes a moving image file to be processed, and reference numeral 101 denotes code amount calculation means for calculating a code amount reflecting an edge amount in a specific direction from the code of each frame of the moving image. A frame selection unit 102 evaluates the motion amount of each frame based on the code amount calculated by the code amount calculation unit 101, and selects a frame having a small motion amount. Reference numeral 103 denotes a processing unit such as frame deletion, which generates a new moving image file 104 by performing processing such as deleting the code of the frame selected by the frame selecting unit 102 from the moving image file 100.

本発明の好ましい実施の形態によれば、動画像ファイル１００，１０４はＭｏｔｉｏｎ−ＪＰＥＧ２０００の動画像ファイルである。Ｍｏｔｉｏｎ−ＪＰＥＧ２０００の動画像は、図１３に模式的に示すように、独立して符号化されたフレーム（Ｉ）の系列からなる。フレームレートは３０フレーム／秒である。各フレームはＪＰＥＧ２０００により符号化される。 According to a preferred embodiment of the present invention, the moving image files 100 and 104 are Motion-JPEG2000 moving image files. Motion-JPEG2000 moving images are composed of a series of independently encoded frames (I) as schematically shown in FIG. The frame rate is 30 frames / second. Each frame is encoded by JPEG2000.

Ｍｏｔｉｏｎ−ＪＰＥＧ２０００では、図１４に模式的に示すようにフレームを奇数フィールドと偶数フィールドとに分離してウェーブレット変換以降の処理を実行するフィールドベース符号化と、図１５に模式的に示すようにフィールドを分離せずにウェーブレット変換以降の処理を実行するフレームベース符号化のいずれも可能であるが、本発明が対象とするものはフレームベース符号化による動画像である。 In Motion-JPEG2000, as schematically shown in FIG. 14, field-based coding for performing processing after wavelet transform by separating a frame into odd and even fields and a field as schematically shown in FIG. 15. Any of the frame-based encoding that executes the processing after the wavelet transform without separating the image is possible. However, what is targeted by the present invention is a moving image by frame-based encoding.

Ｍｏｔｉｏｎ−ＪＰＥＧ２０００の動画像ファイルは、図１６に模式的に示すように、フレームのコードストリームが格納される”ｍｏｖｉｅｄａｔａ”（略称ｍｄａｔ）と、ｍｄａｔの属性を格納した”ｍｏｖｉｅｒｅｓｏｕｒｃｅ”（略称ｍｏｏｖ）からなる基本構造である。Ｍｏｔｉｏｎ−ＪＰＥＧ２０００のファイルの構成要素はボックス（ｂｏｘ）と呼ばれ、数多くのボックスが階層構造を形成している。このような階層構造を図１７に示す。図１７の表の左側に位置するボックスほど上の階層であり、したがって、”ｍｏｏｖ”と”ｍｄａｔ”は最上位階層のボックスである。 As schematically shown in FIG. 16, a Motion-JPEG2000 moving image file has a basic structure consisting of "movie data" (abbreviated mdat), in which the code streams of the frames are stored, and "movie resource" (abbreviated moov), in which the attributes of mdat are stored. The components of a Motion-JPEG2000 file are called boxes, and many boxes form a hierarchical structure. This hierarchical structure is shown in FIG. 17. The further to the left a box is located in the table of FIG. 17, the higher its layer; therefore, "moov" and "mdat" are boxes of the uppermost layer.

フレーム削除等処理手段１０３により処理された動画像ファイル１０４は”ｍｄａｔ”から一部のフレームのコードストリームが削除され、それに対応して”ｍｏｏｖ”の内容が更新されたものである。フレーム削除に伴い更新される”ｍｏｏｖ”の内容は、フレーム数とフレーム間隔に関連する”sample-table box（stbl）”の下位ボックスである、
”Time-to-Sample Box(stts)”と”Sample-to-Chunk Box(stsc)”の２つである。ただし、チャンク数が２以上の場合には、”ｓｔｂｌ”中の”Chunk-Offset Box(stco)”の更新も必要になる。チャンクとは、シーンのひとまとまりのことである。ここではチャンク数を１として説明を続ける。 In the moving image file 104 processed by the processing unit 103 such as frame deletion, the code stream of a part of the frame is deleted from “mdat”, and the content of “moov” is updated accordingly. The content of “moov” that is updated when a frame is deleted is a lower box of “sample-table box (stbl)” related to the number of frames and the frame interval.
“Time-to-Sample Box (stts)” and “Sample-to-Chunk Box (stsc)”. However, if the number of chunks is 2 or more, it is also necessary to update “Chunk-Offset Box (stco)” in “stbl”. A chunk is a group of scenes. Here, the description will be continued assuming that the number of chunks is 1.

Ｍｏｔｉｏｎ−ＪＰＥＧ２０００の標準書によれば、”stts”は次のように定義されている。 According to the Motion-JPEG2000 standard, "stts" is defined as follows.
aligned(8) class TimeToSampleBox
    extends FullBox('stts', version = 0, 0) {
    unsigned int(32) entry-count;
    int i;
    for (i = 1; i <= entry-count; i++) {
        unsigned int(32) sample-count;
        int(32) sample-delta;
    }
}

すなわち、”stts”は図１９に示すような表である。entry-countは、表の行数である。sample-countとsample-deltaは表の構成要素であり、sample-countは各行のフレーム数(ただし同じ行に関するフレーム間隔は同一)、sample-deltaは各行の時間間隔(1/30秒を単位とする)である。 That is, "stts" is a table as shown in FIG. 19. entry-count is the number of rows in the table. sample-count and sample-delta are the elements of the table: sample-count is the number of frames in each row (the frame interval is the same for all frames in a row), and sample-delta is the time interval of each row (in units of 1/30 second).
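
Reading this table from a file can be sketched as follows. The sketch assumes that the payload starts immediately after the box size and type fields, that the FullBox version and flags occupy the first four bytes, and that all fields are big-endian as in the ISO base media file format; the function name is hypothetical.

```python
import struct


def parse_stts_payload(payload):
    """Parse the body of an 'stts' box into a list of (sample_count, sample_delta).

    payload: bytes following the box size/type header (assumed layout:
             4 bytes version+flags, 4 bytes entry-count, then 8 bytes per row).
    """
    _version_flags, entry_count = struct.unpack_from(">II", payload, 0)
    entries = []
    offset = 8
    for _ in range(entry_count):
        # sample-count is unsigned, sample-delta is declared as int(32) (signed).
        sample_count, sample_delta = struct.unpack_from(">Ii", payload, offset)
        entries.append((sample_count, sample_delta))
        offset += 8
    return entries
```

A moving image with a constant frame interval needs only a single row, for example `(10, 1)` for ten frames at an interval of one time unit.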

また、”stsc”は次のように定義されている。 Also, "stsc" is defined as follows.
aligned(8) class SampleToChunkBox
    extends FullBox('stsc', version = 0, 0) {
    unsigned int(32) entry-count;
    for (i = 1; i <= entry-count; i++) {
        unsigned int(32) first-chunk;
        unsigned int(32) samples-per-chunk;
        unsigned int(32) sample-description-index;
    }
}

すなわち、”stsc”は図２０に示すような表である。entry-countは表の行数である。first-chunk，samples-per-chunk，sample-description-indexは、この表の構成要素である。first-chunkはfirst-chunk表を用いるチャンク番号である(ここでは１チャンクとしているので１)。samples-per-chunkは表の各行に対応するチャンクに含まれるフレーム数、sample-description-indexは各行に対応するチャンクに含まれるフレームのID(ここでは１チャンクとしているので１)である。 That is, "stsc" is a table as shown in FIG. 20. entry-count is the number of rows in the table. first-chunk, samples-per-chunk, and sample-description-index are the elements of this table. first-chunk is the number of the chunk to which the row applies (1 here, because there is only one chunk). samples-per-chunk is the number of frames contained in the chunk corresponding to each row, and sample-description-index is the ID of the frames contained in the chunk corresponding to each row (1 here, because there is only one chunk).

本発明の典型的な実施形態によれば、図１２に示した本発明に係る動画像処理装置の各手段１０１，１０２，１０３の機能、換言すれば、本発明に係る動画像処理方法の処理手順における当該各手段に対応した工程は、コンピュータを利用して１以上のプログラムにより実現される。かかる実施形態の一例をについて、図１８を参照し簡単に説明する。図１８において、２００はＣＰＵ、２０１はＲＡＭ、２０２はハードディスク装置であり、これらはバス２０３により相互に接続されている。 According to the exemplary embodiment of the present invention, the functions of the respective units 101, 102, 103 of the moving image processing apparatus according to the present invention shown in FIG. 12, in other words, the processing of the moving image processing method according to the present invention. The process corresponding to each means in the procedure is realized by one or more programs using a computer. An example of such an embodiment will be briefly described with reference to FIG. In FIG. 18, reference numeral 200 denotes a CPU, 201 denotes a RAM, and 202 denotes a hard disk device, which are connected to each other via a bus 203.

処理フローの概略は次の通りである。まず、ＣＰＵ２００からの命令によって、ハードディスク装置２０２に記録されている動画像ファイルの内容がＲＡＭ２０１に読み込まれる（１）。ＣＰＵ２００は、ＲＡＭ２０１上の動画像ファイルの”ｍｄａｔ”内の個々のフレームに関し、動き量評価に用いる符号量を算出し、それに基づいてフレームの動き量を評価し、動き量の小さいフレームを選択する（２）。次に、ＣＰＵ２００は、ＲＡＭ２０１上の動画像ファイルの”ｍｄａｔ”より、選択されたフレームの符号を削除し、それに応じて”ｍｏｏｖ”の内容を更新する処理を行う（３）。そして、ＣＰＵ２００の命令により、ＲＡＭ２０１上の動画像ファイルはハードディスク装置２０２へ転送され、元の動画像ファイルとは別の動画像ファイルとして保存される（４）。 The outline of the processing flow is as follows. First, in response to an instruction from the CPU 200, the content of the moving image file recorded on the hard disk device 202 is read into the RAM 201 (1). For each frame in "mdat" of the moving image file in the RAM 201, the CPU 200 calculates the code amount used for motion amount evaluation, evaluates the motion amount of the frame based on it, and selects frames with a small motion amount (2). Next, the CPU 200 deletes the codes of the selected frames from "mdat" of the moving image file in the RAM 201 and updates the content of "moov" accordingly (3). Then, in response to an instruction from the CPU 200, the moving image file in the RAM 201 is transferred to the hard disk device 202 and stored as a moving image file separate from the original one (4).

以下、本発明の実施例について説明する。 Examples of the present invention will be described below.

ここに述べる実施例において処理される動画像ファイルはＭｏｔｉｏｎ−ＪＰＥＧ２０００の動画像ファイルであり、各フレームの符号化において９×７ウェーブレット変換が用いられ、またRGBの画素値がYCbCrの輝度・色差値に変換されて処理されたものとする。また、動画像のチャンク数は１とする。また、１ＬＨ，１ＨＬサブバンドの符号量の比を、２ＬＨ，２ＨＬサブバンドの符号量の比で除した値、すなわち請求項３の発明における符号量比をフレームの動き量の評価に用いるものとする。また、この符号量比を計算する元となる１ＬＨ，１ＨＬ，２ＬＨ，２ＨＬ各サブバンドの符号量について、フレーム内のサブバンド間及び先頭フレームとの間での量子化ステップ数の違い及びトランケート量の違いの影響を吸収するための補正を行う。また、先頭フレームは、削除フレームとしての選択の対象外とする。 The moving image file processed in the example described here is a Motion-JPEG2000 moving image file. The 9×7 wavelet transform is used in encoding each frame, and the RGB pixel values are assumed to have been converted into YCbCr luminance and color difference values before processing. The number of chunks of the moving image is 1. The value obtained by dividing the ratio of the code amounts of the 1LH and 1HL subbands by the ratio of the code amounts of the 2LH and 2HL subbands, that is, the code amount ratio in the invention of claim 3, is used to evaluate the motion amount of a frame. The code amounts of the 1LH, 1HL, 2LH, and 2HL subbands from which this ratio is calculated are corrected to absorb the influence of differences in the number of quantization steps and in the truncation amount, both between subbands within a frame and relative to the first frame. The first frame is excluded from selection as a deletion frame.

図２１は処理の全体的な流れを示すフローチャートである。このフローチャート及び図１２を参照し、処理全体について簡単に説明する。 FIG. 21 is a flowchart showing the overall flow of processing. The entire process will be briefly described with reference to this flowchart and FIG.

まず、ｓｔｅｐ１において、符号量算出手段１０１で各フレーム（本実施例では先頭フレームは除外）の輝度成分の１ＬＨ，１ＨＬ，２ＬＨ，２ＨＬ各サブバンドの符号量を算出し、この際に上記符号量の補正も行う。ｓｔｅｐ２において、求められた符号量を用いて、フレームの動き量を評価するための符号量比の計算をフレーム選択手段１０２で行う。ただし、後述のように、ｓｔｅｐ１とｓｔｅｐ２の処理はひとまとまりの処理として実行するのが効率的である。 First, in step 1, the code amount calculation means 101 calculates the code amounts of the 1LH, 1HL, 2LH, and 2HL subbands of the luminance component of each frame (excluding the first frame in this embodiment), and at this time, the code amount Correction is also performed. In step 2, the frame selection means 102 calculates the code amount ratio for evaluating the motion amount of the frame using the obtained code amount. However, as will be described later, it is efficient to execute the processing of step 1 and step 2 as a group of processing.

ｓｔｅｐ３において、フレーム選択手段１０２で、算出した符号量比に基づいて評価される動き量の小さなフレーム（削除フレーム）の選択を行う。この選択の方法として、請求項１３又は請求項１４の発明の方法を選択することができる。また、本実施例においては、請求項１５の発明に定義されているように、先頭フレームは削除フレームとしては選択されない。 In step 3, the frame selection unit 102 selects a frame with a small amount of motion (deletion frame) evaluated based on the calculated code amount ratio. As the selection method, the method of the invention of claim 13 or claim 14 can be selected. Further, in this embodiment, as defined in the invention of claim 15 , the first frame is not selected as the deletion frame.

ｓｔｅｐ４において、フレーム削除等処理手段１０３で、選択されたフレームの全成分のコードストリームを削除する処理を行う。ｓｔｅｐ５において、フレーム削除等処理手段１０３で、フレーム削除に伴う”ｍｏｏｖ”の必要な更新処理を行う。これで動きの量の小さいフレームを削除した新しい動画像ファイルが生成され、ｓｔｅｐ６において、その動画像ファイルを当該画像処理装置の内部又は外部の記憶装置に保存する処理をフレーム削除処理手段１０３で行い、一連の処理を完了する。 In step 4, the frame deletion processing unit 103 performs processing for deleting the code stream of all the components of the selected frame. In step 5, the frame deletion processing unit 103 performs a necessary update process of “moov” accompanying the frame deletion. As a result, a new moving image file in which a frame with a small amount of motion is deleted is generated, and in step 6, the processing for saving the moving image file in the internal or external storage device of the image processing apparatus is performed by the frame deletion processing means 103. , Complete a series of processing.

ｓｔｅｐ１及びｓｔｅｐ２をひとまとまりの処理として実行する場合の処理フローを図２２に示す。図中のｓｔｅｐ１８がｓｔｅｐ２に対応する処理ステップである。
FIG. 22 shows a processing flow in the case where step 1 and step 2 are executed as a group of processes. Step 18 in the figure is a processing step corresponding to step 2.

まず、ｓｔｅｐ１１で、”ｍｏｏｖ”から”ｍｄａｔ”のアドレスとフレーム数が読み出される。ここでは、１チャンク、１０フレームとして説明する。 First, in step 11, the address of "mdat" and the number of frames are read from "moov". The description here assumes one chunk and 10 frames.

ｓｔｅｐ１２で、フレーム数に等しい数のエントリーを持つテーブルが作成される。このテーブルは、図２３に示すように、各フレーム対応にフレーム番号、符号量比、及び、削除／保存フラグを記憶するためのテーブルである。 At step 12, a table having the number of entries equal to the number of frames is created. As shown in FIG. 23, this table is a table for storing a frame number, a code amount ratio, and a deletion / save flag for each frame.

ｓｔｅｐ１３において、”ｍｄａｔ”内の先頭フレームのコードストリームのヘッダ情報を参照し、輝度成分の１ＬＨ，１ＨＬ，２ＬＨ，２ＨＬサブバンドの量子化ステップ数及びトランケート量を取得して保持する。本実施例では、符号量の補正をビットプレーン単位で行うため、トランケート量としてビットプレーン数を用いる。ただし、前述のようにサブビットプレーン単位での符号量補正を行うことも可能であり、その場合にはトランケートされたサブビットプレーン数をトランケート量として取得することになる。 In step 13, the header information of the code stream of the first frame in "mdat" is referenced, and the number of quantization steps and the truncation amount of the 1LH, 1HL, 2LH, and 2HL subbands of the luminance component are acquired and held. In this embodiment, the code amount is corrected in units of bit planes, so the number of bit planes is used as the truncation amount. However, as described above, the code amount can also be corrected in units of sub-bit planes, in which case the number of truncated sub-bit planes is acquired as the truncation amount.

ｓｔｅｐ１４において、次フレームのコードストリームを参照し、そのメインヘッダから１ＬＨ，１ＨＬ，２ＬＨ，２ＨＬサブバンドの量子化ステップ数を取得して保持し、次のｓｔｅｐ１５において、当該フレームのパケットヘッダに基づき各サブバンドの符号量とトランケート量を求めて保持する。これら各サブバンドの符号量に対し、ｓｔｅｐ１６で先頭フレームとの間の量子化ステップ数の違い及びトランケート量の違いを補正するためのフレーム間換算処理を行い、次いでｓｔｅｐ１７において当該フレームのサブバンド間の量子化ステップ数の違い及びトランケート量の違いを補正するためのフレーム内換算処理を行う。これら換算処理については後述する。 In step 14, the code stream of the next frame is referenced, and the numbers of quantization steps of the 1LH, 1HL, 2LH, and 2HL subbands are acquired from its main header and held. In the following step 15, the code amount and truncation amount of each subband are obtained from the packet headers of the frame and held. In step 16, inter-frame conversion processing is applied to the code amount of each subband to correct for differences from the first frame in the number of quantization steps and in the truncation amount. Then, in step 17, intra-frame conversion processing is performed to correct for differences between the subbands of the frame in the number of quantization steps and in the truncation amount. These conversion processes are described later.

そして、ｓｔｅｐ１８で、換算処理後の符号量を用いて符号量比（１ＬＨ／１ＨＬ／２ＬＨ／２ＨＬと略記する）を計算し、その結果をテーブル（図２３）の対応エントリーに書き込む。 In step 18, the code amount ratio (abbreviated as 1LH / 1HL / 2LH / 2HL) is calculated using the code amount after the conversion process, and the result is written in the corresponding entry of the table (FIG. 23).
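
The code amount ratio computed in step 18 can be written as a one-line function. The sketch below assumes the four inputs are the already-converted code amounts of the luminance subbands; the function name and the zero check are additions for illustration.

```python
def code_amount_ratio(c1lh, c1hl, c2lh, c2hl):
    """(1LH / 1HL) / (2LH / 2HL), using code amounts after conversion."""
    if c1hl == 0 or c2lh == 0 or c2hl == 0:
        # The text does not say how empty subbands are handled; this is a guard only.
        raise ValueError("subband code amounts used as divisors must be non-zero")
    return (c1lh / c1hl) / (c2lh / c2hl)
```

For instance, code amounts of 300 and 100 for 1LH and 1HL together with 150 and 100 for 2LH and 2HL give a ratio of 3 / 1.5 = 2.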

以上のｓｔｅｐ１４〜１８の処理を各フレームについて最後のフレームまで繰り返し実行する。 The above steps 14 to 18 are repeatedly executed for each frame up to the last frame.

次に、符号量比に基づくフレーム選択処理（図２１のｓｔｅｐ３）について説明する。本実施例では、符号量比が所定の閾値より小さいフレーム（先頭フレームを除く）を選択する方法（１）と、符号量比の小さい順に所定数のフレームを選択する方法（２）を選ぶことができる。 Next, the frame selection process based on the code amount ratio (step 3 in FIG. 21) will be described. In this embodiment, a method (1) for selecting frames (excluding the first frame) whose code amount ratio is smaller than a predetermined threshold and a method (2) for selecting a predetermined number of frames in ascending order of the code amount ratio are selected. Can do.

まず、前者の選択方法（１）について、図２４のフローチャートを参照し説明する。ｓｔｅｐ３１で、テーブル（図２３）の先頭エントリーに保存フラグを書き込む。ｓｔｅｐ３２でテーブルの次エントリーから符号量比を読み出し、ｓｔｅｐ３３で符号量比と所定の閾値ＴＨとの比較判定を行う。符号量比が閾値ＴＨより小さいときには当該エントリーに削除フラグを書き込む（つまり、当該エントリーに対応するフレームは削除フレームとして選択された）。符号量比が閾値以上ならば当該エントリーに保存フラグを書き込む。同様の処理をテーブルの最終エントリーまで繰り返す。 First, the former selection method (1) will be described with reference to the flowchart of FIG. 24. In step 31, a save flag is written in the first entry of the table (FIG. 23). In step 32, the code amount ratio is read from the next entry of the table, and in step 33 it is compared with a predetermined threshold TH. When the code amount ratio is smaller than the threshold TH, a deletion flag is written in the entry (that is, the frame corresponding to the entry is selected as a deletion frame). If the code amount ratio is equal to or greater than the threshold, a save flag is written in the entry. The same processing is repeated up to the last entry of the table.
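
Steps 31 to 33 can be sketched as a pass over the table. The `FrameEntry` class is a hypothetical stand-in for one row of the table of FIG. 23 (frame number, code amount ratio, delete/save flag), and the ratios in the usage note are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FrameEntry:
    frame_number: int
    ratio: float
    delete: bool = False   # False = save flag, True = deletion flag


def flag_by_threshold(table, th):
    """Write save/deletion flags according to selection method (1)."""
    if not table:
        return table
    table[0].delete = False              # step 31: the first frame is always saved
    for entry in table[1:]:              # steps 32-33 for every later entry
        entry.delete = entry.ratio < th  # strictly below TH -> deletion flag
    return table
```

With ten entries whose ratios are 0.1, 0.5, 0.7, 0.9, 1.3, 0.8, 1.1, 1.5, 1.0, 2.0 and TH = 1, frames 2 to 4 and 6 receive the deletion flag; frame 1 is saved despite its small ratio, and frame 9 is saved because its ratio equals the threshold.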

次に、後者の選択方法（２）について、図２５のフローチャートを参照し説明する。ｓｔｅｐ４１で、テーブル（図２３）の先頭エントリーに保存フラグを書き込む。ｓｔｅｐ４２で、テーブルの第２エントリーから最後のエントリーまでの符号量比を読み込み、値の小さい順にソートする。すなわち、テーブルのエントリーを符号量比の小さい順に序列化する。そして、ｓｔｅｐ４３で、この序列の先頭のエントリーから所定数のエントリーを選び、その各エントリーに削除フラグを書き込み、残りの各エントリーには保存フラグを書き込む。 Next, the latter selection method (2) will be described with reference to the flowchart of FIG. In step 41, a save flag is written in the first entry of the table (FIG. 23). In step 42, the code amount ratio from the second entry to the last entry of the table is read and sorted in ascending order of value. That is, the table entries are ordered in ascending order of the code amount ratio. Then, in step 43, a predetermined number of entries are selected from the top entries in this order, a deletion flag is written in each entry, and a save flag is written in each remaining entry.
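
Steps 41 to 43 amount to a sort followed by a cut. The sketch below works on a plain list of code amount ratios, with index 0 standing for the first frame; the function name and the boolean flag list are assumptions made for the example.

```python
def flag_smallest(ratios, count):
    """Return a list of flags (True = delete) according to selection method (2).

    ratios[0] belongs to the first frame, which always keeps the save flag.
    """
    flags = [False] * len(ratios)                  # step 41 (and the default save flag)
    order = sorted(range(1, len(ratios)),          # step 42: rank entries 2..last
                   key=lambda i: ratios[i])
    for i in order[:count]:                        # step 43: flag the first `count`
        flags[i] = True
    return flags
```

Because exactly `count` entries are flagged (or fewer if the table is shorter), the number of deleted frames is bounded regardless of how many ratios are small.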

図２１のｓｔｅｐ４においては、テーブル（図２３）の各エントリーに記録されたフラグを参照し、削除フラグが記録されているエントリーに対応したフレームのコードストリームを削除することになる。 In step 4 of FIG. 21, the flag recorded in each entry of the table (FIG. 23) is referenced, and the code stream of the frame corresponding to each entry in which the deletion flag is recorded is deleted.

図２６は、”ｍｏｏｖ”の更新処理（図２１のｓｔｅｐ５）のフローチャートである。ｓｔｅｐ５１で”time-to-sample box(stts)”を更新し、ｓｔｅｐ５２で”sample-to-chunk box(stsc)”を更新する。本実施例ではチャンク数を１としているので以上で更新は終わりであるが、チャンク数が複数の場合には"chunk-offset box(stco)”の更新処理（ｓｔｅｐ５３）も必要となる。 FIG. 26 is a flowchart of the “moov” update process (step 5 in FIG. 21). In step 51, “time-to-sample box (stts)” is updated, and in step 52, “sample-to-chunk box (stsc)” is updated. In this embodiment, since the number of chunks is 1, the update is completed as described above. However, when there are a plurality of chunks, an update process (step 53) of “chunk-offset box (stco)” is also required.

削除フレームの選択方法として図２４を参照して説明した方法が用いられ、符号量比の判定のための閾値ＴＨを１とした場合の処理結果例を図２７乃至図２９に示す。図２７はテーブル（図２３）の処理後の内容を表している。図２８は”ｓｔｔｓ”の更新前と更新後を、また図２９は”ｓｔｓｃ”の更新前と更新後を表している。フレーム番号２〜４，６が削除されるため、均一値１であったsample-delta（フレーム間隔）が３種類の値をとり、entry-countも３になる。これに伴いsample-countも更新される。４フレームが削除されるため、samples-per-chunkが１０−４＝６に減少する。 FIGS. 27 to 29 show an example of the processing results when the method described with reference to FIG. 24 is used to select deletion frames and the threshold TH for judging the code amount ratio is set to 1. FIG. 27 shows the contents of the table (FIG. 23) after processing. FIG. 28 shows "stts" before and after the update, and FIG. 29 shows "stsc" before and after the update. Because frames 2 to 4 and 6 are deleted, sample-delta (the frame interval), which had a uniform value of 1, takes three different values, and entry-count becomes 3. sample-count is updated accordingly. Because four frames are deleted, samples-per-chunk decreases to 10 − 4 = 6.
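
How the "stts" rows might be rebuilt after deletion can be sketched as follows. The sketch assumes that each surviving frame's interval is stretched to cover the deleted frames that followed it, and that the last surviving frame lasts until the original end of the sequence; the contents of FIG. 28 are not reproduced in the text, so the concrete row values are an inference, and the function name is hypothetical.

```python
def rebuild_stts(kept, total_frames, unit_delta=1):
    """Run-length encode the frame intervals of the surviving frames.

    kept         : sorted 1-based numbers of the frames that were saved
    total_frames : number of frames before deletion
    Returns a list of (sample_count, sample_delta) rows.
    """
    entries = []
    for pos, frame in enumerate(kept):
        nxt = kept[pos + 1] if pos + 1 < len(kept) else total_frames + 1
        delta = (nxt - frame) * unit_delta
        if entries and entries[-1][1] == delta:
            # Same interval as the previous row: extend that row.
            entries[-1] = (entries[-1][0] + 1, delta)
        else:
            entries.append((1, delta))
    return entries
```

Deleting frames 2 to 4 and 6 from ten frames leaves frames 1, 5, 7, 8, 9, 10. Under the stated assumption the table becomes three rows, (1, 4), (1, 2), (4, 1), which agrees with entry-count = 3 and three distinct sample-delta values, and the row counts sum to the new samples-per-chunk of 6.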

削除フレームの選択方法として図２５を参照して説明した方法が用いられ、削除フレーム数を５とした場合の処理結果例を図３０乃至図３２に示す。図３０はテーブル（図２３）の処理後の内容を表している。図３１は”ｓｔｔｓ”の更新前と更新後を、また図３２は”ｓｔｓｃ”の更新前と更新後を表している。５フレームが削除されるため、samples-per-chunkが１０−５＝５に減少する。 The method described with reference to FIG. 25 is used as a method for selecting a deletion frame, and FIGS. 30 to 32 show examples of processing results when the number of deletion frames is five. FIG. 30 shows the contents of the table (FIG. 23) after processing. FIG. 31 shows before and after updating “stts”, and FIG. 32 shows before and after updating “stsc”. Since 5 frames are deleted, samples-per-chunk is reduced to 10-5 = 5.

次に、符号量のフレーム間換算処理（図２２のｓｔｅｐ１６）について説明する。この換算処理は量子化ステップ数に関する換算処理とトランケート量に関する換算処理とからなる。 Next, the code amount inter-frame conversion process (step 16 in FIG. 22) will be described. This conversion process includes a conversion process related to the number of quantization steps and a conversion process related to the amount of truncation.

まず、量子化ステップ数に関するフレーム間換算処理について説明する。図３３は、１ＬＨ，１ＨＬサブバンドの符号量についての換算処理のフローチャートである。２ＬＨ，２ＨＬサブバンドについても同様の換算処理が行われる。 First, the inter-frame conversion process regarding the number of quantization steps will be described. FIG. 33 is a flowchart of the conversion process for the code amounts of the 1LH and 1HL subbands. Similar conversion processing is performed for the 2LH and 2HL subbands.

Referring to FIG. 33, in step 101 the larger of the quantization step numbers held for the 1HL and 1LH subbands of the first frame is selected and set as X. The code amounts are corrected with this X as the reference. In step 102, the held quantization step number of the 1HL subband of the frame being processed is set as B. In step 103, X and B are compared. If X − B > 0, the first frame is more heavily quantized, so in step 104 the code amount of the 1HL subband of the frame being processed is corrected by subtracting log(X − B) bit planes (rounded to the nearest integer), counted from the least significant side. If X − B < 0, the frame being processed is more heavily quantized, so in step 105 the code amount of the 1HL subband of the frame being processed is corrected by multiplying the code amount of its least significant bit plane by (log(B − X) + 1). The base of the logarithm is 2, here and below. If X − B = 0, the code amount is not corrected. The corrected code amount is held as the new code amount.

Next, in step 106, the held quantization step number of the 1LH subband of the frame being processed is set as B. In step 107, X and B are compared. If X − B > 0, the first frame is more heavily quantized, so in step 108 the code amount of the 1LH subband of the frame being processed is corrected by subtracting log(X − B) bit planes (rounded to the nearest integer), counted from the least significant side. If X − B < 0, the frame being processed is more heavily quantized, so in step 109 the code amount of the 1LH subband of the frame being processed is corrected by multiplying the code amount of its least significant bit plane by (log(B − X) + 1). If X − B = 0, the code amount is not corrected. The corrected code amount is held as the new code amount.
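The correction rule of FIG. 33 for a single subband can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the apparatus: it assumes that the code amount of a subband is available as a list of per-bit-plane code amounts ordered from most to least significant, and that quantization step numbers are integers. The names `round_half_up` and `correct_for_quantization` are hypothetical.

```python
import math


def round_half_up(value):
    """Round to the nearest integer, with halves rounded upward."""
    return math.floor(value + 0.5)


def correct_for_quantization(planes, x, b):
    """Return corrected per-bit-plane code amounts for one subband.

    planes: code amount of each bit plane, most significant first
            (an assumed data layout, not taken from the source).
    x: reference quantization step number chosen from the first frame.
    b: quantization step number of this subband in the frame being processed.
    """
    planes = list(planes)
    if not planes or x == b:
        return planes  # X - B = 0: no correction
    if x > b:
        # First frame is more heavily quantized:
        # drop round(log2(X - B)) bit planes from the least significant side.
        n = max(0, round_half_up(math.log2(x - b)))
        n = min(n, len(planes))
        return planes[:len(planes) - n]
    # Frame being processed is more heavily quantized:
    # scale the least significant bit plane by (log2(B - X) + 1).
    planes[-1] *= math.log2(b - x) + 1
    return planes


planes = [100, 80, 60, 40]
print(sum(correct_for_quantization(planes, 8, 4)))  # two low-order planes removed
print(sum(correct_for_quantization(planes, 4, 8)))  # last plane scaled by 3
```

Clamping the number of removed planes to the range from 0 to the number of available planes is a choice made for this sketch; the source does not say how such edge cases are handled.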

Next, the inter-frame conversion process for correcting differences in truncation amount is described. FIG. 34 is a flowchart of this conversion process for the 1LH and 1HL subbands. The same conversion process is applied to the 2LH and 2HL subbands.

Referring to FIG. 34, in step 111 the larger of the truncation amounts (numbers of bit planes) held for the 1LH and 1HL subbands of the first frame is selected and set as X. The code amounts are corrected with this X as the reference. In step 112, the held truncation amount of the 1HL subband of the frame being processed is set as B. In step 113, X and B are compared. If X − B > 0, the first frame is more heavily truncated, so in step 114 the code amount of the 1HL subband of the frame being processed is corrected by subtracting (X − B) bit planes, counted from the least significant side. If X − B < 0, the frame being processed is more heavily truncated, so in step 115 the code amount of the 1HL subband of the frame being processed is corrected by multiplying the code amount of its least significant bit plane by (B − X + 1). If X − B = 0, the code amount is not corrected. The corrected code amount is held as the new code amount.

Next, in step 116, the held truncation amount of the 1LH subband of the frame being processed is set as B. In step 117, X and B are compared. If X − B > 0, the first frame is more heavily truncated, so in step 118 the code amount of the 1LH subband of the frame being processed is corrected by subtracting (X − B) bit planes, counted from the least significant side. If X − B < 0, the frame being processed is more heavily truncated, so in step 115 the code amount of the 1LH subband of the frame being processed is corrected by multiplying the code amount of its least significant bit plane by (B − X + 1). If X − B = 0, the code amount is not corrected. The corrected code amount is held as the new code amount.
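The whole flow of FIG. 34, covering both subbands of one frame, might look like the sketch below. It differs from the quantization case only in that the bit-plane difference is used directly, without a logarithm. The per-bit-plane list layout (most significant plane first), the dictionary keys, and the function names `correct_for_truncation` and `convert_inter_frame_truncation` are assumptions introduced for illustration.

```python
def correct_for_truncation(planes, x, b):
    """Correct one subband's per-bit-plane code amounts (most significant first).

    x: reference truncation amount (bit planes) taken from the first frame.
    b: truncation amount of this subband in the frame being processed.
    """
    planes = list(planes)
    if not planes or x == b:
        return planes  # X - B = 0: no correction
    if x > b:
        # First frame is truncated more: remove (X - B) low-order bit planes.
        n = min(x - b, len(planes))
        return planes[:len(planes) - n]
    # Frame being processed is truncated more:
    # scale the least significant bit plane by (B - X + 1).
    planes[-1] *= b - x + 1
    return planes


def convert_inter_frame_truncation(first_trunc, frame_trunc, frame_planes):
    """Apply the correction to the 1HL and 1LH subbands of one frame.

    first_trunc, frame_trunc: {"1HL": int, "1LH": int} truncation amounts.
    frame_planes: {"1HL": [...], "1LH": [...]} per-bit-plane code amounts.
    """
    # Step 111: the larger truncation amount of the first frame is the reference.
    x = max(first_trunc["1HL"], first_trunc["1LH"])
    return {
        band: correct_for_truncation(frame_planes[band], x, frame_trunc[band])
        for band in ("1HL", "1LH")
    }


corrected = convert_inter_frame_truncation(
    {"1HL": 2, "1LH": 3},
    {"1HL": 1, "1LH": 5},
    {"1HL": [50, 40, 30, 20], "1LH": [60, 45, 30]},
)
print(corrected)
```

In this example the reference is X = 3, so the 1HL subband (B = 1) loses its two lowest planes, while the 1LH subband (B = 5) has its last plane scaled by 3.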

Next, the intra-frame conversion process for the code amounts (step 17 in FIG. 22) is described. This conversion process consists of a conversion process relating to the quantization step number and a conversion process relating to the truncation amount.

First, the intra-frame conversion process relating to the quantization step number is described. FIG. 35 is a flowchart of this conversion process for the code amounts of the 1LH and 1HL subbands. The same conversion process is applied to the 2LH and 2HL subbands.

Referring to FIG. 35, in step 121 the held quantization step number of the 1HL subband of the frame being processed is set as A, and that of the 1LH subband is set as B. In step 122, A and B are compared. If A − B > 0, the 1HL subband is more heavily quantized, so in step 123 the code amount of the 1LH subband is corrected by subtracting log(A − B) bit planes (rounded to the nearest integer), counted from the least significant side. If A − B < 0, the 1LH subband is more heavily quantized, so in step 124 the code amount of the 1HL subband is corrected by subtracting log(B − A) bit planes (rounded to the nearest integer), counted from the least significant side. If A − B = 0, the code amount is not corrected. The corrected code amount is held as the new code amount.

The process described above corrects the code amounts to match the number of bit planes of the subband with the larger quantization step number. The correction can instead match the number of bit planes of the subband with the smaller quantization step number; FIG. 36 shows the processing flow for that method.

Referring to FIG. 36, in step 131 the held quantization step number of the 1HL subband of the frame being processed is set as A, and that of the 1LH subband is set as B. In step 132, A and B are compared. If A − B > 0, the 1HL subband is more heavily quantized, so in step 133 the code amount of the 1HL subband is corrected by multiplying the code amount of its least significant bit plane by (log(A − B) + 1). If A − B < 0, the 1LH subband is more heavily quantized, so in step 134 the code amount of the 1LH subband is corrected by multiplying the code amount of its least significant bit plane by (log(B − A) + 1). If A − B = 0, the code amount is not corrected. The code amounts of the 2HL and 2LH subbands are corrected in the same way.
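The two intra-frame variants of FIG. 35 and FIG. 36 can be compared side by side in a short Python sketch. As before, this assumes per-bit-plane code amounts listed from most to least significant and integer quantization step numbers; the function names and the `match_larger` switch are hypothetical and serve only to contrast the two flows.

```python
import math


def drop_low_planes(planes, count):
    """Remove `count` bit planes from the least significant side."""
    count = min(max(count, 0), len(planes))
    return list(planes[:len(planes) - count])


def scale_last_plane(planes, factor):
    """Multiply the code amount of the least significant bit plane by `factor`."""
    planes = list(planes)
    if planes:
        planes[-1] *= factor
    return planes


def convert_intra_frame_quant(hl, lh, a, b, match_larger=True):
    """Return corrected (1HL, 1LH) per-bit-plane code amounts.

    a, b: quantization step numbers of the 1HL and 1LH subbands.
    match_larger=True follows FIG. 35; False follows FIG. 36.
    """
    if a == b:
        return list(hl), list(lh)  # A - B = 0: no correction
    diff = math.log2(abs(a - b))  # base-2 logarithm, as stated in the text
    if match_larger:
        # FIG. 35: trim the subband that is quantized less heavily.
        n = math.floor(diff + 0.5)  # round to the nearest integer
        if a > b:
            return list(hl), drop_low_planes(lh, n)
        return drop_low_planes(hl, n), list(lh)
    # FIG. 36: scale the last plane of the subband that is quantized more heavily.
    if a > b:
        return scale_last_plane(hl, diff + 1), list(lh)
    return list(hl), scale_last_plane(lh, diff + 1)


hl, lh = [90, 70, 50], [80, 60, 40, 20]
print(convert_intra_frame_quant(hl, lh, 6, 2))                      # FIG. 35 variant
print(convert_intra_frame_quant(hl, lh, 6, 2, match_larger=False))  # FIG. 36 variant
```

The first variant discards detail from the finer subband, while the second estimates the planes missing from the coarser subband by scaling its last plane, so the two generally produce different absolute code amounts.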

Next, the intra-frame conversion process relating to the truncation amount is described. FIG. 37 is a flowchart of this conversion process for the code amounts of the 1LH and 1HL subbands. The same conversion process is applied to the 2LH and 2HL subbands.

Referring to FIG. 37, in step 141 the held truncation amount of the 1HL subband of the frame being processed is set as A, and that of the 1LH subband is set as B. In step 142, A and B are compared. If A − B > 0, the 1HL subband is more heavily truncated, so in step 143 the code amount of the 1LH subband is corrected by subtracting (A − B) bit planes, counted from the least significant side. If A − B < 0, the 1LH subband is more heavily truncated, so in step 144 the code amount of the 1HL subband is corrected by subtracting (B − A) bit planes, counted from the least significant side. If A − B = 0, the code amount is not corrected. The corrected code amount is held as the new code amount.

The process described above corrects the code amounts to match the number of bit planes of the subband with the larger truncation amount. The correction can instead match the number of bit planes of the subband with the smaller truncation amount; FIG. 38 shows the processing flow for that method.

Referring to FIG. 38, in step 151 the held truncation amount of the 1HL subband of the frame being processed is set as A, and that of the 1LH subband is set as B. In step 152, A and B are compared. If A − B > 0, the 1HL subband is more heavily truncated, so in step 153 the code amount of the 1HL subband is corrected by multiplying the code amount of its least significant bit plane by (A − B + 1). If A − B < 0, the 1LH subband is more heavily truncated, so in step 154 the code amount of the 1LH subband is corrected by multiplying the code amount of its least significant bit plane by (B − A + 1). If A − B = 0, the code amount is not corrected. The code amounts of the 2HL and 2LH subbands are corrected in the same way.
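To show how the truncation correction feeds into the motion measure, the following sketch applies either the FIG. 37 or the FIG. 38 rule and then forms the ratio (code amount of 1LH subband) / (code amount of 1HL subband) given in the claims. The list layout (per-bit-plane code amounts, most significant first), the function name `truncation_corrected_ratio`, and the `match_larger` switch are assumptions for illustration; the sketch also assumes the corrected 1HL code amount is non-zero.

```python
def truncation_corrected_ratio(hl, lh, a, b, match_larger=True):
    """Return the 1LH/1HL code amount ratio after intra-frame truncation correction.

    hl, lh: per-bit-plane code amounts of the 1HL and 1LH subbands.
    a, b: truncation amounts (bit planes) of the 1HL and 1LH subbands.
    match_larger=True follows FIG. 37; False follows FIG. 38.
    """
    hl, lh = list(hl), list(lh)  # work on copies
    diff = abs(a - b)
    if diff and match_larger:
        # FIG. 37: trim `diff` low-order planes from the less truncated subband.
        target = lh if a > b else hl
        del target[max(len(target) - diff, 0):]
    elif diff:
        # FIG. 38: scale the last plane of the more truncated subband by (diff + 1).
        target = hl if a > b else lh
        if target:
            target[-1] *= diff + 1
    return sum(lh) / sum(hl)


hl, lh = [50, 50], [60, 50, 40, 30]
print(truncation_corrected_ratio(hl, lh, 3, 1))                      # FIG. 37 variant
print(truncation_corrected_ratio(hl, lh, 3, 1, match_larger=False))  # FIG. 38 variant
print(truncation_corrected_ratio(hl, lh, 2, 2))                      # no correction
```

With these sample values the uncorrected ratio is 1.8, while both corrected ratios are close to 1, which illustrates why an uncorrected difference in truncation can be mistaken for motion.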

When the code amount ratio does not need to be calculated strictly, one or both of the intra-frame and inter-frame conversion processes may be omitted. When a fixed number of frames to delete is selected according to the ranking of the frames' code amount ratios, the relative magnitudes of the code amount ratios between frames matter, so it is desirable to perform the inter-frame conversion process.

In the embodiments described above, the amount of motion of a frame was evaluated using the ratio of the code amounts of whole subbands. The following supplements that description with a mode in which the amount of motion of a frame is evaluated using a code amount ratio obtained by summing the code amount ratios of blocks smaller than a subband.

Here, a Motion-JPEG2000 moving image is assumed, with precincts used as the blocks. For the HL and LH subbands at decomposition levels 1 and 2, when the precincts are numbered as shown in FIG. 39, precinct-unit (block-unit) code amount ratios 0 to 8 are calculated using the code amounts of the positionally corresponding precincts shown in FIG. 40.

Code amount ratio 0 is obtained by dividing the ratio of the code amounts of precinct 0 in the 1LH and 1HL subbands by the ratio of the code amounts of precinct 0 in the 2LH and 2HL subbands. Code amount ratio 1 is obtained by dividing the ratio of the code amounts of precinct 1 in the 1LH and 1HL subbands by the ratio of the code amounts of precinct 0 in the 2LH and 2HL subbands. The remaining code amount ratios are obtained in the same way.

Summing the precinct-unit code amount ratios obtained in this way yields a code amount ratio that reflects the motion of the subject in each block (here, each precinct) more accurately. Processing that uses this code amount ratio, such as the selection of frames to delete, may be the same as in the embodiments described above. The inter-frame and intra-frame conversion processes for the code amounts may likewise be performed in units of precincts. A similar code amount ratio can also be obtained with code blocks as the block unit, which allows the motion of smaller subjects to be reflected in the code amount ratio.
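The summation of precinct-unit ratios can be sketched as follows. Each per-precinct ratio is formed as (1LH / 1HL) / (2LH / 2HL), consistent with the block code amount ratio in the claims. The correspondence between level-1 and level-2 precincts is defined by FIG. 40, which is not reproduced here, so the sketch takes it as an argument; the mapping used in the example is purely hypothetical, as is the function name `summed_block_ratio`. All code amounts are assumed to be non-zero.

```python
def summed_block_ratio(lh1, hl1, lh2, hl2, level2_index):
    """Sum the per-precinct code amount ratios over all level-1 precincts.

    lh1, hl1: per-precinct code amounts of the 1LH and 1HL subbands.
    lh2, hl2: per-precinct code amounts of the 2LH and 2HL subbands.
    level2_index[i]: number of the level-2 precinct that positionally
        corresponds to level-1 precinct i (to be supplied by the caller).
    """
    total = 0.0
    for i, j in enumerate(level2_index):
        level1_ratio = lh1[i] / hl1[i]  # 1LH / 1HL for precinct i
        level2_ratio = lh2[j] / hl2[j]  # 2LH / 2HL for the matching precinct j
        total += level1_ratio / level2_ratio
    return total


# Hypothetical example: two level-1 precincts that both map to level-2 precinct 0.
print(summed_block_ratio([20, 30], [10, 10], [8], [4], [0, 0]))
```

Passing the mapping explicitly keeps the sketch independent of any particular precinct layout, so the same function would apply if code blocks were used as the block unit.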

Embodiments of the present invention have been described above using Motion-JPEG2000 moving images as an example, but it is clear that the present invention can also be applied to moving images encoded by other coding schemes that use a wavelet transform or another frequency transform.

FIG. 1 is a diagram for explaining the "comb shape" caused by subject motion.
FIG. 2 is a block diagram for explaining JPEG2000.
FIG. 3 is a diagram showing an example of an original image and the coordinate system.
FIG. 4 is a diagram showing the arrangement of coefficients after filtering in the vertical direction.
FIG. 5 is a diagram showing the arrangement of coefficients after filtering in the horizontal direction.
FIG. 6 is a diagram showing the arrangement of deinterleaved coefficients.
FIG. 7 is a diagram showing the arrangement of deinterleaved coefficients after two transforms.
FIG. 8 is a diagram showing the subbands and resolution levels after three transforms.
FIG. 9 is a diagram showing the relationship among image, tile, subband, precinct, and code block.
FIG. 10 is a diagram showing an example of a layer structure.
FIG. 11 is a diagram showing an example of a palette.
FIG. 12 is a block diagram for explaining the moving image processing apparatus according to the present invention.
FIG. 13 is a diagram showing the picture structure of a Motion-JPEG2000 moving image.
FIG. 14 is an explanatory diagram of field-based encoding.
FIG. 15 is an explanatory diagram of frame-based encoding.
FIG. 16 is a diagram showing the basic structure of a Motion-JPEG2000 moving image file.
FIG. 17 is a diagram showing the hierarchical structure of a Motion-JPEG2000 moving image file.
FIG. 18 is a block diagram for explaining a mode in which the present invention is implemented by a program.
FIG. 19 is an explanatory diagram of the Time-to-Sample Box.
FIG. 20 is an explanatory diagram of the Sample-to-Chunk Box.
FIG. 21 is a flowchart showing the overall processing flow in one embodiment.
FIG. 22 is a flowchart showing the processing from calculation of the code amounts to calculation of the code amount ratio.
FIG. 23 is an explanatory diagram of the table created for the processing.
FIG. 24 is a flowchart showing an example of frame selection processing based on the code amount ratio.
FIG. 25 is a flowchart showing another example of frame selection processing based on the code amount ratio.
FIG. 26 is a flowchart of the moov update processing that accompanies frame deletion.
FIG. 27 is a diagram showing the contents of the table after processing.
FIG. 28 is a diagram showing the contents of the Time-to-Sample Box before and after updating.
FIG. 29 is a diagram showing the contents of the Sample-to-Chunk Box before and after updating.
FIG. 30 is a diagram showing the contents of the table after processing.
FIG. 31 is a diagram showing the contents of the Time-to-Sample Box before and after updating.
FIG. 32 is a diagram showing the contents of the Sample-to-Chunk Box before and after updating.
FIG. 33 is a flowchart of the inter-frame conversion process relating to the quantization step number.
FIG. 34 is a flowchart of the inter-frame conversion process relating to the truncation amount.
FIG. 35 is a flowchart of the intra-frame conversion process relating to the quantization step number.
FIG. 36 is a flowchart showing another example of the intra-frame conversion process relating to the quantization step number.
FIG. 37 is a flowchart of the intra-frame conversion process relating to the truncation amount.
FIG. 38 is a flowchart showing another example of the intra-frame conversion process relating to the truncation amount.
FIG. 39 is a diagram showing precincts and their numbers.
FIG. 40 is a diagram showing the precincts used to calculate the block-unit code amounts.

Explanation of symbols

100 Moving image file before processing
101 Code amount calculation means
102 Frame selection means
103 Processing means for frame deletion and related processing
104 Moving image file after processing

Claims

1. A moving image processing apparatus that processes a moving image obtained by frame-based encoding of interlaced images independently for each frame, wherein
the frame-based encoding uses a two-dimensional wavelet transform as the frequency transform and encodes the wavelet coefficients subband by subband,
the apparatus comprising: means for calculating the code amount of the 1LH subband based on information in the portion of the header in the code of each frame that describes information on the code amount of the 1LH subband; and means for selecting frames with a small amount of motion by evaluating the amount of motion as larger the larger the code amount calculated by said means.

2. A moving image processing apparatus that processes a moving image obtained by frame-based encoding of interlaced images independently for each frame, wherein
the frame-based encoding uses a two-dimensional wavelet transform as the frequency transform and encodes the wavelet coefficients subband by subband,
the apparatus comprising: means for calculating the code amount of the 1LH subband and the code amount of the 1HL subband, respectively, based on information in the portion of the header in the code of each frame that describes information on the code amount of the 1LH subband and the code amount of the 1HL subband; and means for selecting frames with a small amount of motion by evaluating the amount of motion as larger the larger the code amount ratio calculated, using the code amounts calculated by said means, as
  Code amount ratio = (code amount of 1LH subband) / (code amount of 1HL subband).

3. A moving image processing apparatus that processes a moving image obtained by frame-based encoding of interlaced images independently for each frame, wherein
the frame-based encoding uses a two-dimensional wavelet transform as the frequency transform and encodes the wavelet coefficients subband by subband,
the apparatus comprising: means for calculating the code amount of the 1LH subband, the code amount of the 1HL subband, the code amount of the 2LH subband, and the code amount of the 2HL subband, respectively, based on information in the portion of the header in the code of each frame that describes information on the code amounts of the 1LH, 1HL, 2LH, and 2HL subbands; and means for selecting frames with a small amount of motion by evaluating the amount of motion as larger the larger the code amount ratio calculated, using the code amounts calculated by said means, as
  Code amount ratio = [(code amount of 1LH subband / code amount of 1HL subband) / (code amount of 2LH subband / code amount of 2HL subband)].

4. A moving image processing apparatus that processes a moving image obtained by frame-based encoding of interlaced images independently for each frame, wherein
the frame-based encoding uses a two-dimensional wavelet transform as the frequency transform and encodes the wavelet coefficients subband by subband,
the apparatus comprising: means for calculating the code amount of the 1LH subband and the code amount of the 1HL subband in units of blocks smaller than a subband, based on information in the portion of the header in the code of each frame that describes information on the code amount of the 1LH subband and the code amount of the 1HL subband; and means for selecting frames with a small amount of motion by evaluating the amount of motion as larger the larger the code amount ratio obtained by summing, over all blocks, the block code amount ratio calculated, using the code amounts of positionally corresponding blocks calculated by said means, as
  Block code amount ratio = (block-unit code amount of 1LH subband) / (block-unit code amount of 1HL subband).

5. A moving image processing apparatus that processes a moving image obtained by frame-based encoding of interlaced images independently for each frame, wherein
the frame-based encoding uses a two-dimensional wavelet transform as the frequency transform and encodes the wavelet coefficients subband by subband,
the apparatus comprising: means for calculating the code amount of the 1LH subband, the code amount of the 1HL subband, the code amount of the 2LH subband, and the code amount of the 2HL subband in units of blocks smaller than a subband, based on information in the portion of the header in the code of each frame that describes information on the code amounts of the 1LH, 1HL, 2LH, and 2HL subbands; and means for selecting frames with a small amount of motion by evaluating the amount of motion as larger the larger the code amount ratio obtained by summing, over all blocks, the block code amount ratio calculated, using the code amounts of positionally corresponding blocks calculated by said means, as
  Block code amount ratio = [(block-unit code amount of 1LH subband / block-unit code amount of 1HL subband) / (block-unit code amount of 2LH subband / block-unit code amount of 2HL subband)].

6. The moving image processing apparatus according to claim 2 or 3, wherein the code amount is corrected in accordance with the difference in truncation amount between subbands.

7. The moving image processing apparatus according to claim 2 or 3, wherein the code amount is corrected in accordance with the difference in truncation amount between subbands and the difference in truncation amount between frames.

8. The moving image processing apparatus according to claim 4 or 5, wherein the code amount is corrected in accordance with the difference in truncation amount between blocks.

9. The moving image processing apparatus according to claim 4 or 5, wherein the code amount is corrected in accordance with the difference in truncation amount between blocks and the difference in truncation amount between frames.

10. The moving image processing apparatus according to any one of claims 2 to 9, wherein the code amount is corrected in accordance with the difference in the number of linear quantization steps between subbands.

11. The moving image processing apparatus according to any one of claims 2 to 9, wherein the code amount is corrected in accordance with the difference in the number of linear quantization steps between subbands and the difference in the number of linear quantization steps between frames.

12. The moving image processing apparatus according to any one of claims 1 to 11, wherein the calculated code amount is the code amount of the luminance component.

13. The moving image processing apparatus according to any one of claims 1 to 12, wherein the means for selecting frames selects frames whose evaluated amount of motion is smaller than a predetermined value.

14. The moving image processing apparatus according to any one of claims 1 to 12, wherein the means for selecting frames selects a predetermined number of frames in ascending order of the evaluated amount of motion.

15. The moving image processing apparatus according to claim 13 or 14, wherein the means for selecting frames excludes the first frame of the moving image from the selection targets.

16. The moving image processing apparatus according to claim 13, 14 or 15, further comprising means for deleting from the moving image the code of the frames selected by the means for selecting frames.

17. A program that causes a computer to function as each means of the moving image processing apparatus according to any one of claims 1 to 16.

18. A computer-readable information recording medium on which the program according to claim 17 is recorded.