JP2009094645A

JP2009094645A - Moving image encoding apparatus and method for controlling the same

Info

Publication number: JP2009094645A
Application number: JP2007261244A
Authority: JP
Inventors: Katsumi Otsuka; 克己大塚
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2007-10-04
Filing date: 2007-10-04
Publication date: 2009-04-30

Abstract

PROBLEM TO BE SOLVED: To control a target encoding amount according to the degree of deterioration in picture quality caused by block distortion and visual statistic information on a picture.

SOLUTION: An encoding unit 105 performs encoding in units of blocks comprising a plurality of pixels to generate encoded data. An encoding amount detection unit 107 detects the amount of encoded data of the picture generated by the encoding unit 105. An encoding distortion detection unit 104 calculates, as a picture distortion amount, the amount of distortion in a block boundary position between a picture obtained by decoding the encoded data and the picture before the encoding. A statistic information calculation unit 101 calculates statistic information on properties affecting the distortion of the block boundary position when a picture of interest is encoded from the picture of interest. Then a first picture target encoding amount calculation unit 103 generates encoding parameters of a picture following the picture of interest on the basis of a sequence target encoding amount, the amount of the encoded data, the statistic information, and the picture distortion amount, and sets them to the encoding unit 105.

COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a moving image coding technique for performing real-time coding at a variable bit rate.

Owing to the dramatic progress of digital signal processing technology in recent years, moving images are now recorded on storage media and transmitted over transmission paths, both of which were previously difficult. In such cases, each picture constituting the moving image is compression-encoded, which greatly reduces the data amount. A typical technique for this compression encoding is the MPEG (Moving Picture Experts Group) system.

When a series of pictures is compression-encoded in conformity with the MPEG system under a constant bit rate condition, the code amount differs greatly depending on the scene composed of multiple pictures, the spatial frequency characteristics of the pictures, the correlation between pictures, and the quantization scale value. In realizing an apparatus with such coding characteristics, code amount control is an important technique for minimizing coding distortion.

Algorithms for code amount control can be broadly classified into two types: constant bit rate encoding (hereinafter, the CBR method) and variable bit rate encoding (the VBR method). In general, the VBR method allocates code adaptively according to the encoding difficulty, and is therefore known to give better decoded picture quality than the CBR method. Adaptive allocation is realized, for example, by assigning a high bit rate to a scene with high encoding difficulty and a low bit rate to a scene with low encoding difficulty.

Known CBR methods include TM5 (Test Model 5; Test Model Editing Committee: "Test Model 5", ISO/IEC JTC/SC29/WG11/N0400 (Apr. 1993)), which was proposed during the standardization of the MPEG-2 encoding system, and the method of Patent Document 1.

Patent Documents 2 and 3 are known as techniques for realizing the VBR method in real time (that is, in one pass). Patent Document 4 is also known as a technique that attempts to allocate the code amount adaptively according to the scene by using imaging control information. Each of these prior art techniques is described next.

In Patent Documents 2 and 3, as shown in FIG. 2, means for detecting encoding difficulty, called encoding difficulty calculation units (201 and 202), are applied to a picture group consisting of a plurality of pictures and to the picture to be encoded. This realizes a feed-forward VBR method. According to this method, a picture group dividing unit 200 divides the input into picture groups each consisting of a plurality of pictures, and the encoding difficulty information calculation unit 201 calculates the encoding difficulty of each picture group relative to the entire sequence. Variation in decoded picture quality is suppressed by variably assigning the target code amount of the picture group according to the calculated encoding difficulty.

In Patent Document 4, as shown in FIG. 3, the target code amount of a picture is variably assigned by supplying imaging control information to the encoding unit 302 of an imaging apparatus. Specifically, the microcomputer 304 checks the focusing condition obtained from the focus information and the zoom position supplied by the imaging control information calculation unit 301. The microcomputer 304 then realizes the VBR method by assigning a large picture code amount when the picture currently being captured is at the wide end and the image is easily focused. Because the picture target code amount can be controlled using only the imaging control information, this method is said to be simpler to realize than the conventional VBR method.
Japanese Patent No. 3112035
Japanese Patent No. 3265818
Japanese Patent No. 3399472
JP 2003-18521 A

However, Patent Documents 2 to 4 each have the following problems.

First, according to Patent Document 2, the encoding difficulty information calculation units 201 and 202 require encoding means similar to the encoding unit 205, so the processing load is very heavy.

Patent Document 3 further discloses the use of spatial activity as the encoding difficulty, but spatial activity is insufficient for predicting the encoding difficulty in the encoding unit 205. Moreover, because the picture target code amount is controlled only according to the encoding difficulty, the visual information of the picture is not considered at all, and it can hardly be said that the code amount is allocated adaptively according to the scene.

Patent Document 4 allocates the code amount adaptively according to the scene on the basis of zoom information. However, this proposal also ignores the visual information of the picture and merely increases the code amount when the lens is at the wide end and the image is easily focused. Furthermore, the encoding difficulty is not considered at all, which makes it difficult to encode within the sequence target code amount determined from the target bit rate given at the start of imaging.

The present invention has been made in view of the above problems. Specifically, the present invention takes into account the degree of image quality deterioration obtained from the encoding difficulty and the image quality importance of a scene obtained from visual statistical information of its pictures. By controlling the target code amount for a scene on that basis, the present invention provides a technique for obtaining encoded moving image data of good image quality under a given target bit rate.

In order to solve this problem, a moving image encoding apparatus of the present invention has, for example, the following configuration. That is,
a moving image encoding apparatus that encodes continuously input pictures within a sequence target code amount determined from a target bit rate, comprising:
dividing means for dividing a moving image composed of pictures arranged along a time axis into scenes each composed of a preset plurality of pictures;
encoding means for encoding an input picture in units of blocks each composed of a plurality of pixels, in accordance with a given encoding parameter that determines a quantization scale, to generate encoded data;
code amount detection means for detecting the amount of encoded data of the picture generated by the encoding means;
decoding means for decoding the encoded data obtained from a picture of interest;
distortion amount calculation means for calculating, as a picture distortion amount, the amount of distortion at the block boundary positions between the picture obtained by decoding by the decoding means and the picture before encoding;
statistical information calculation means for calculating, from the picture of interest, statistical information on attributes that affect the distortion at the block boundary positions when the picture is encoded; and
setting means for generating an encoding parameter for a picture following the picture of interest on the basis of the sequence target code amount, the encoded data amount detected by the code amount detection means, the statistical information calculated by the statistical information calculation means, and the picture distortion amount calculated by the distortion amount calculation means, and for setting the encoding parameter in the encoding means.

According to the present invention, the target code amount for a scene is controlled according to the degree of image quality deterioration caused by block distortion and the visual statistical information of the pictures, which makes it possible to obtain encoded moving image data of good image quality under a given target bit rate.

Hereinafter, embodiments according to the present invention will be described in detail with reference to the accompanying drawings.

[First Embodiment]
FIG. 1 is a block diagram of a moving image encoding apparatus according to the embodiment, which encodes a moving image composed of pictures arranged along a time axis. The embodiment is described using MPEG-4 as an example of the encoding method.

The encoding unit 105 performs MPEG-4 encoding (DCT, quantization, entropy encoding) on the input picture so that the generated code amount does not exceed the picture target code amount Rp given as an encoding parameter. In other words, the encoding unit 105 generates and outputs an encoded stream in accordance with the given parameter. In general, the generated code amount is controlled through the quantization step value used in the quantization process, so the picture target code amount Rp can also be regarded as a parameter for determining the quantization step (or quantization scale). The local decoding unit 106 receives the encoded stream, performs MPEG-4 decoding, and outputs a locally decoded picture. As will become clear from the description below, the DCT is performed in units of blocks each composed of a plurality of pixels (8×8 pixels). For this reason, the local decoding unit 106 decodes only the pixels on the boundary of each 8×8 pixel block (the 6×6 pixels lying one pixel inside the boundary are not decoded).

The code amount detection unit 107 detects the amount of encoded data for one picture generated by the encoding unit 105 and outputs the result to the first picture target code amount calculation unit 103 described later.

The coding distortion detection unit 104 detects block distortion, which is a typical coding distortion in the MPEG-4 encoding method. The block distortion amount Bp, a scalar value representing the degree of block distortion, is calculated as follows from the pre-encoding picture supplied to the encoding unit 105 and the locally decoded picture output from the local decoding unit 106.

Let the number of pixels of the input picture of the encoding unit 105 be x_size in the horizontal direction and y_size in the vertical direction. As shown in FIG. 4, with J as the horizontal coordinate and I as the vertical coordinate, let CIN(I, J) be the pixel value at coordinates (I, J) before encoding. Similarly, let COUT(I, J) be the pixel value at coordinates (I, J) in the decoded image obtained from the local decoding unit 106. Because the encoding unit 105 basically encodes in units of 8×8 pixel blocks, block distortion occurs at the boundary positions of the 8×8 pixel blocks. The block distortion amount (picture distortion amount) Bp for the entire image can therefore be obtained by the following algorithm.
for (I = 0; I < y_size - 1; I++) {
  for (J = 0; J < x_size - 1; J++) {
    if (J % 8 == 7) {
      EDGEin  = ABS(CIN(J, I) - CIN(J, I+1));
      EDGEout = ABS(COUT(J, I) - COUT(J, I+1));
      MSEblk += POWER(EDGEin - EDGEout);
    } else {
      if (I % 8 == 7) {
        EDGEin  = ABS(CIN(J, I) - CIN(J+1, I));
        EDGEout = ABS(COUT(J, I) - COUT(J+1, I));
        MSEblk += POWER(EDGEin - EDGEout);
      }
    }
  }
}
Bp = MSEblk / MSEall; … (1)
In the above, MSEall is the sum of squared differences between CIN(J, I) and COUT(J, I) over the entire picture. "X % Y" is a function that returns the remainder when the integer X is divided by the integer Y. Because the block distortion amount Bp is calculated by referring only to the pixel values on the block boundaries, the 6×6 pixels inside each block boundary are not referenced. This is why, as described above, the local decoding unit 106 decodes only the boundary pixels of each block.
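The boundary measurement described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `block_distortion`, the row-major list-of-lists picture layout, and the guard against a zero denominator are all hypothetical. POWER is assumed to denote squaring, consistent with MSEall being a sum of squared differences. Following the prose description, the sketch pairs each boundary pixel with its neighbour across the boundary and tests column and row boundaries independently, which may differ from the index order and the if/else nesting of the listing.

```python
def block_distortion(cin, cout, block=8):
    """Return Bp for two same-sized pictures given as lists of rows.

    cin  : picture before encoding, cin[row][col]
    cout : locally decoded picture, cout[row][col]
    """
    height, width = len(cin), len(cin[0])
    mse_blk = 0  # squared edge-difference error accumulated on block boundaries
    mse_all = 0  # squared pixel error accumulated over the whole picture
    for i in range(height):
        for j in range(width):
            mse_all += (cin[i][j] - cout[i][j]) ** 2
            # Boundary between column j and column j + 1 (assumed pairing).
            if j % block == block - 1 and j + 1 < width:
                edge_in = abs(cin[i][j] - cin[i][j + 1])
                edge_out = abs(cout[i][j] - cout[i][j + 1])
                mse_blk += (edge_in - edge_out) ** 2
            # Boundary between row i and row i + 1 (assumed pairing).
            if i % block == block - 1 and i + 1 < height:
                edge_in = abs(cin[i][j] - cin[i + 1][j])
                edge_out = abs(cout[i][j] - cout[i + 1][j])
                mse_blk += (edge_in - edge_out) ** 2
    # Assumption: report 0.0 when the decoded picture is identical to the input.
    return mse_blk / mse_all if mse_all else 0.0
```

For a flat 16×16 picture whose right half is decoded 10 levels too bright, all of the boundary error comes from the step introduced at the column boundary, so Bp grows with the visibility of that artificial edge.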

The above algorithm is now briefly explained. As described earlier, the block boundary positions lie at coordinates that are integral multiples of 8, both horizontally and vertically. Since the upper-left corner of the image is generally taken as the origin (0, 0), the pixels located at the boundary between two adjacent blocks are the pixel whose coordinate leaves a remainder of 7 when divided by 8 and the pixel at that coordinate plus 1. In the above algorithm, the difference between (a) the difference of the two pixels straddling a block boundary in the original (pre-encoding) image and (b) the difference of the same two pixels after decoding serves as an index value representing the distortion between the two adjacent blocks. Because blocks adjoin one another in both the horizontal and vertical directions, accumulating the distortion values in each direction yields the block distortion amount Bp for the entire image.

Accordingly, a large block distortion amount Bp clearly indicates that the decoded image is severely degraded relative to the original image and that the coding distortion is large. Because this embodiment targets the MPEG-4 encoding method, the above algorithm evaluates block distortion at coordinates that are integral multiples of 8; when the block size is other than 8×8, the calculation should be adapted accordingly.

In this way, the coding distortion detection unit 104 calculates the block distortion amount Bp of the picture of interest and outputs the calculated block distortion amount Bp to the first picture target code amount calculation unit 103.

The statistical information calculation unit 101 calculates statistical information on attributes that affect the distortion at the block boundary positions. In this embodiment, the four pieces of statistical information P_h, P_s, P_y, and P_a described below are calculated for each block of 8×8 pixels in the picture. It is assumed that each pixel of the image data to be encoded consists of three components: luminance (Y) and chroma (Cb, Cr). In the case of an MPEG-4 profile with 4:2:0 subsampling, each chroma block is a 4×4 pixel block.

The pieces of information P_h, P_s, P_y, and P_a that make up the statistical information calculated by the statistical information calculation unit 101 of this embodiment are described below.

[Statistical information P_h]
Information P_h indicates whether or not the block is skin-colored.

Human vision is known to be very sensitive to skin-color hues. Therefore, as one piece of information used by the picture image quality importance calculation unit 102 to calculate the picture image quality importance, the number of blocks in the input picture whose hue corresponds to skin color is determined.

Whether or not a block is skin-colored is determined using the two-dimensional Cb-Cr coordinates shown in FIG. 5. Let Cb' be the average value of Cb and Cr' the average value of Cr within a block of the input picture, and let Pbr(Cb', Cr') be the corresponding point in FIG. 5. The hue Hθ of the block is then given by the following equation.
Hθ = tan⁻¹(Cb' / Cr') … (2)

The hue of skin color is known to be around 123 degrees in the Cb-Cr space. In this embodiment, an interval (angle range) of 100 to 150 degrees is defined in advance as the skin-color hue, and it is determined whether the hue Hθ falls within this interval.

The determination result is denoted P_h: P_h = 1 if the block is skin-colored, and P_h = 0 otherwise.
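The skin-color test can be illustrated with a short Python sketch. Everything beyond the 100 to 150 degree interval is an assumption: the function name `is_skin_block` is hypothetical, the chroma averages are assumed to be centered on zero, and the hue is computed with the two-argument arctangent measured counter-clockwise from the positive Cb axis. That angle convention is chosen here only so that angles above 90 degrees are representable; the source writes equation (2) as an inverse tangent of the ratio and does not confirm this convention.

```python
import math


def is_skin_block(cb_mean, cr_mean, lo_deg=100.0, hi_deg=150.0):
    """Return P_h (1 or 0) for one block from its mean chroma values.

    Assumption: cb_mean and cr_mean are zero-centered, and the hue is the
    angle of the point (cb_mean, cr_mean) measured from the +Cb axis.
    """
    hue = math.degrees(math.atan2(cr_mean, cb_mean)) % 360.0
    return 1 if lo_deg <= hue <= hi_deg else 0
```

Under this convention a block with negative Cb' and positive Cr' falls in the second quadrant, which is where the 100 to 150 degree interval lies.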

[Statistical information P_s]
Information P_s is the saturation information of the block. The saturation information P_s can also be calculated using the two-dimensional Cb-Cr coordinates shown in FIG. 5. Human vision is sensitive to block distortion in regions of relatively low saturation (regions close to achromatic). Therefore, the saturation information P_s of each block in the input picture is calculated as one piece of information used by the picture image quality importance calculation unit 102 to calculate the picture image quality importance.

The saturation information P_s is obtained as the distance of the point Pbr from the origin.
P_s = √(Cb'² + Cr'²) … (3)

[Statistical information P_y]
Information P_y is the average value of the luminance (Y) in the block. Human vision is sensitive to block distortion in regions where the luminance Y is relatively high. Therefore, the average value P_y of the luminance Y of each block in the input picture is calculated as one piece of information used by the picture image quality importance calculation unit 102 to calculate the picture image quality importance. Since one block consists of 8×8 pixels, with Y_i (i = 0, 1, 2, …, 63) as the luminance of each pixel, the average luminance is given by the following equation.
P_y = {ΣY_i} / 64 … (4)

[Statistical information P_a]
Information P_a is variance information obtained from the luminance Y values of the pixels in the block.

Human vision is sensitive to block distortion in regions where the spatial frequency is relatively low. Therefore, the variance of the luminance Y in each block of the input picture is calculated as one piece of information used by the picture image quality importance calculation unit 102 to calculate the picture image quality importance. With Y' as the average luminance of the pixels in the block and Y_i (i = 0, 1, 2, …, 63) as the luminance of each pixel, P_a is given by the following equation.
P_a = Σ(Y_i − Y')² … (5)
Strictly speaking, the variance is equation (5) divided by the number of samples (8 × 8 = 64 in the embodiment). The division is omitted because only an index of the variance is needed.
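Equations (3) to (5) can be sketched together in Python for a single block. The function name `block_statistics` and the flat-list inputs are hypothetical, and the chroma samples are assumed to be zero-centered; the source does not specify a data layout.

```python
import math


def block_statistics(y, cb, cr):
    """Return (P_s, P_y, P_a) for one block.

    y      : luminance samples of the block (64 values for 8x8)
    cb, cr : chroma samples of the block, assumed zero-centered
    """
    cb_mean = sum(cb) / len(cb)           # Cb'
    cr_mean = sum(cr) / len(cr)           # Cr'
    p_s = math.hypot(cb_mean, cr_mean)    # eq. (3): distance of Pbr from the origin
    p_y = sum(y) / len(y)                 # eq. (4): mean luminance
    p_a = sum((v - p_y) ** 2 for v in y)  # eq. (5): sum of squares, not divided by 64
    return p_s, p_y, p_a
```

For example, a block whose luminance is half 10 and half 20, with chroma averages of 3 and 4, gives P_s = 5, P_y = 15, and P_a = 64 × 25 = 1600.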

This completes the description of the four pieces of statistical information generated by the statistical information calculation unit 101 in the embodiment.

For example, if the input picture in FIG. 4 has x_size = 640 pixels horizontally and y_size = 480 pixels vertically, the image contains 4800 blocks (= (640 / 8) × (480 / 8)). FIG. 6 shows an example of dividing a picture into blocks. Numbering the first block as 0 and denoting it block #0, the statistical information calculation unit 101 calculates the above four pieces of statistical information for block #0 through block #4799. The statistical information of the blocks can be expressed as the arrays P_h[N], P_s[N], P_y[N], and P_a[N] (N = 0, 1, 2, …, 4799).

Next, the picture image quality importance calculation unit 102 is described. The picture image quality importance calculation unit 102 calculates the picture image quality importance Pi of the input picture using the statistical information P_h[N], P_s[N], P_y[N], and P_a[N] input from the statistical information calculation unit 101 and the predetermined visual sensitivity table shown in FIG. 16. The visual sensitivity table defines a weighting coefficient Cw[k] and a normalization coefficient Cd[k] (k = 0 to 3) for each of the four pieces of statistical information. That is, Cw[0] and Cd[0] (k = 0) apply to P_h[N], Cw[1] and Cd[1] (k = 1) apply to P_s[N], Cw[2] and Cd[2] (k = 2) apply to P_y[N], and Cw[3] and Cd[3] (k = 3) apply to P_a[N].

For each block in the picture, the picture image quality importance calculation unit 102 multiplies each of the three pieces of statistical information other than P_h[N] by the corresponding normalization coefficient Cd[k] and clips the result to a value of 1 or less. Note that P_h[N] is already normalized to the binary values 0 and 1. For example, the normalized saturation information P_s'[N] is obtained from the saturation information P_s[N] by the following processing, where the loop runs over the 4800 blocks of the example above.
for (N = 0; N < 4800; N++) {
  Ps'[N] = Ps[N] × Cd[1];
  if (Ps'[N] > 1) {
    Ps'[N] = 1;
  }
} … (6)

P_y[N] and P_a[N] are processed in the same way to obtain the normalized statistical information P_y'[N] and P_a'[N].

Next, the picture image quality importance calculation unit 102 obtains the picture image quality importance Pi by performing the following processing on P_h[N] and the normalized statistical information P_s'[N], P_y'[N], and P_a'[N] obtained by processing (6). Pi is initialized to 0 before the calculation.
for (N = 0; N < 4800; N++) {
  Pi = Pi + Ph[N]×Cw[0] + Ps'[N]×Cw[1] + Py'[N]×Cw[2] + Pa'[N]×Cw[3];
} … (7)
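Processing (6) and (7) can be combined into one pass, as in the following Python sketch. The function name `picture_importance` and the list-based arguments are hypothetical, and the coefficient values used in the usage note are placeholders, since the actual contents of the visual sensitivity table in FIG. 16 are not given in the text.

```python
def picture_importance(p_h, p_s, p_y, p_a, cw, cd):
    """Return Pi for one picture.

    p_h, p_s, p_y, p_a : per-block statistics, one entry per block
    cw, cd             : weighting and normalization coefficients, indexed k = 0..3
    """
    pi = 0.0
    for n in range(len(p_h)):
        # Processing (6): scale by Cd[k] and clip to at most 1.
        ps_n = min(p_s[n] * cd[1], 1.0)
        py_n = min(p_y[n] * cd[2], 1.0)
        pa_n = min(p_a[n] * cd[3], 1.0)
        # Processing (7): weighted sum; P_h is already 0 or 1 and is not scaled.
        pi += p_h[n] * cw[0] + ps_n * cw[1] + py_n * cw[2] + pa_n * cw[3]
    return pi
```

With placeholder coefficients cw = [1, 2, 3, 4] and cd = [1, 0.5, 0.25, 0.125], a block with (P_h, P_s, P_y, P_a) = (1, 1, 2, 4) contributes 5.5 and a block with (0, 4, 8, 2) contributes 6, the latter showing the clipping of P_s' and P_y' to 1.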

In other words, the picture image quality importance Pi is the sum, over all blocks of the picture of interest, of the four normalized pieces of statistical information weighted by their respective weighting coefficients.

Next, the first picture target code amount calculation unit 103 is described. Let Ts be the target bit rate given to the moving image encoding apparatus of this embodiment. From the target bit rate Ts, the block distortion amount Bp input from the coding distortion detection unit 104, and the picture image quality importance Pi input from the picture image quality importance calculation unit 102, the first picture target code amount calculation unit 103 calculates the picture target code amount Rp (encoding parameter) of the encoding unit 105 for the next (subsequent) picture. The method of calculating the picture target code amount Rp is described with reference to FIG. 7.

In the moving image encoding apparatus of this embodiment, the target code amount Rs of a scene composed of a preset plurality of pictures is obtained by the following equation. One scene in the embodiment consists of 15 pictures (Ns = 15), and the first picture of a scene is an I picture. With an input frame rate of 30 fps, 15 pictures span half a second, so the scene target code amount Rs (the target code amount for 15 frames) is calculated according to equation (8).
Rs = Ts × 1/2 … (8)

Further, when the target picture input to the moving image encoding apparatus is not the first picture in the target scene (that is, it is the second or a later picture), for example picture P5 in FIG. 7, the initial value Rp' of the picture target code amount of picture P5 is obtained from equation (9).
Rp' = (Rs − Rf) / Nr … (9)
Here, Nr is the number of remaining (not yet encoded) pictures in the scene, including picture P5, and Rf is the total code amount generated in the scene from picture P1 through picture P4.
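Equations (8) and (9) amount to simple bit budgeting, which the following sketch makes concrete. The function names are hypothetical, and `scene_target_bits` generalizes equation (8) to an arbitrary scene length and frame rate, which is an assumption beyond the fixed 15 pictures at 30 fps stated in the text.

```python
def scene_target_bits(ts_bps, ns=15, fps=30.0):
    """Equation (8), generalized: bits available to a scene of ns pictures."""
    return ts_bps * ns / fps


def initial_picture_target(rs, rf, nr):
    """Equation (9): spread the scene's unspent bits over the remaining pictures.

    rs: scene target code amount, rf: bits already generated in the scene,
    nr: number of pictures not yet encoded, including the current one.
    """
    if nr <= 0:
        raise ValueError("nr must count at least the current picture")
    return (rs - rf) / nr
```

For example, with Ts = 4,000,000 bit/s the scene budget is 2,000,000 bits; if 800,000 bits have been spent and 10 pictures remain, each remaining picture starts from an initial target of 120,000 bits.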

Here, the final picture target code amount Rp is calculated by increasing or decreasing the initial picture target code amount Rp' according to the block distortion amount Bp and the picture quality importance Pi, following the processing flow shown in FIG. 8.

In the process of FIG. 8, Bp' and Pi' are the averages of the block distortion amounts Bp and the picture quality importances Pi of all the pictures constituting the immediately preceding scene. In step S800, the block distortion amount Bp is compared with a predetermined threshold Bmin. The threshold Bmin is defined in advance as the limit of block distortion that is difficult to perceive visually in the reconstructed picture. If the comparison in step S800 shows that Bp is smaller than Bmin, step S801 determines whether to decrease the initial value Rp'. In step S801, the picture quality importance Pi of picture P5 is compared with Pi', the average Pi of the immediately preceding scene; if Pi is smaller, the initial value Rp' is decreased according to the formula in step S807. β in the formula of step S807 is a predetermined real-valued constant of 0.0 or more.

On the other hand, if the comparison in step S800 shows that the block distortion amount Bp is greater than or equal to the threshold Bmin, steps S802 and S803 determine whether to increase the initial value Rp'. If the picture quality importance Pi of picture P5 is greater than Pi' (the average Pi of the immediately preceding scene), the block distortion amount Bp is greater than Bp' (the average Bp of the immediately preceding scene), and furthermore Re > 0 in step S804, the initial value Rp' is increased according to the formula in step S805. Here, Re is the code amount obtained by subtracting the code amount by which the initial values Rp' have been increased within the current scene from Rr, the surplus code amount relative to the target bit rate Ts that results from encoding up to the immediately preceding scene. Rr and Re are obtained by evaluating the following equations each time the encoding of a scene is completed.
Rr = Rr + Rs − Rs'
Re = Rr … (10)
Here, Rs' is the code amount actually generated in the scene. The Re obtained by equation (10) is used as the initial value for the first picture of the scene and is updated as needed in step S805.
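The branch structure of FIG. 8 and the bookkeeping of equation (10) can be sketched as below. The exact expressions of steps S805 and S807 appear only in the figure, so the scaling by `alpha` and `beta` here is a placeholder assumption, as are the function and parameter names; only the conditions are taken from the text.

```python
def adjust_picture_target(rp_init, bp, pi, bp_avg, pi_avg, re, b_min, beta, alpha):
    """Return (rp, re) after the decisions of FIG. 8.

    The amounts added or removed are assumed forms, not the patent's formulas.
    """
    if bp < b_min:                                   # S800: distortion below the visible limit
        if pi < pi_avg:                              # S801: less important than the previous scene
            return rp_init * (1.0 - beta), re        # S807 (assumed form)
        return rp_init, re                           # S806: keep the initial value
    if pi > pi_avg and bp > bp_avg and re > 0:       # S802 to S804
        extra = min(rp_init * alpha, re)             # S805 (assumed form), capped by the surplus
        return rp_init + extra, re - extra           # Re is reduced by what was handed out
    return rp_init, re                               # S806: keep the initial value


def update_surplus(rr, rs, rs_generated):
    """Equation (10), evaluated once per completed scene; returns (Rr, Re)."""
    rr = rr + rs - rs_generated
    return rr, rr
```

Capping the increase at Re keeps the extra bits handed out within a scene from exceeding the surplus carried over from earlier scenes, which is the role the text assigns to the Re > 0 test.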

Next, the moving image encoding process of the present embodiment will be described with reference to FIG. 9. FIG. 9 shows the transition of the picture target code amount Rp according to the present embodiment over three consecutive scenes from the beginning of the moving image.

Scene 0 is the first scene of the sequence, so neither the picture importance Pi' (the average Pi of the immediately preceding scene) nor the block distortion amount Bp' (the average Bp of the immediately preceding scene) can be calculated. Therefore, only step S806 in FIG. 8 is performed, and the initial value Rp' of the picture target code amount is supplied unchanged to the encoding unit 105 as the picture target code amount Rp.

Next, in scene 1, in addition to step S806, step S807, which decreases the code amount relative to the initial value Rp' of the picture target code amount, is performed. On the other hand, step S805, which increases the code amount, is not performed in scene 1. This is because, in the encoding result of scene 0 in FIG. 9, the scene target code amount Rs equals the scene generated code amount Rs', so Rr is 0. Of course, even if scene 0 performs no deliberate code amount reduction through step S807, step S805 can be executed in scene 1 whenever equation (10) yields Rr > 0. In scene 1 of FIG. 9, step S807 is performed for pictures P2, P3, and P4.

Next, in scene 2, step S805 is performed in addition to steps S806 and S807. This is because equation (10), evaluated before scene 2 is encoded, yields Rr > 0. This surplus corresponds to the code amount saved by step S807 for pictures P2, P3, and P4 in scene 1. For pictures P0, P1, and P2 of scene 2, and also for pictures P7, P8, and P9, step S805 increases the code amount relative to the initial value Rp' to produce the picture target code amount Rp.

In the present embodiment, the variables Pi' and Bp' are obtained from the picture quality importance Pi and the block distortion amount Bp of the immediately preceding scene, but the values of a plurality of immediately preceding pictures may be used instead. In this case, for example, when the input picture is P5, the picture quality importance Pi and the block distortion amount Bp of picture I and pictures P0 to P4 of the current scene and of pictures P5 to P13 of the immediately preceding scene are used.

[Second Embodiment]
A second embodiment will be described. FIG. 10 is a block diagram of the moving image encoding apparatus according to the second embodiment. Components identical to those of the first embodiment are given the same reference numerals. Accordingly, the encoding unit 105 of the second embodiment performs encoding conforming to MPEG-4, as in the first embodiment.

FIG. 10 can be regarded as the configuration of FIG. 1 with a scene image quality importance calculation unit 1000, a scene dividing unit 1001, a scene target code amount calculation unit 1002, and an encoding parameter calculation unit 1003 added. In FIG. 10, the units other than these four newly added processing units perform the same processing as in the first embodiment, and their detailed description is omitted here.

A comparison of the processing realized in the second embodiment and the first embodiment is shown in the table of FIG. 17. In the second embodiment, the quantity subject to code amount control is the scene target code amount Rs; this differs from the first embodiment, which controls the picture target code amount Rp.

The processing of the scene image quality importance calculation unit 1000 will be described with reference to FIGS. 11 and 12. The scene image quality importance calculation unit 1000 calculates the scene image quality importance from the picture quality importance Pi.

First, the scene image quality importance calculation unit 1000 classifies the picture quality importance Pi, a real-valued scalar obtained from equation (7), into one of a plurality of classes according to its magnitude. The number of classes can be chosen to suit the embodiment; in this embodiment it is assumed to be 5.

In FIGS. 11(a) and 11(b), the horizontal axis shows the picture quality importance Pi input from the picture quality importance calculation unit 102, and the vertical axis shows the resulting picture quality importance class number Cp. To divide the range into five classes, four thresholds T1 to T4 are defined in advance. The picture quality importance Pi need not be divided at equal intervals. FIG. 11(a) shows an example setting of the thresholds T1 to T4 for an encoding mode of the stability-oriented type, which prioritizes stable image quality. FIG. 11(b) shows an example setting of the thresholds T1 to T4 for an encoding mode of the contrast-oriented type, which prioritizes clearly differentiated image quality.
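The threshold classification described above can be sketched with a binary search over the sorted thresholds. The class numbering from 1 to 5, the handling of a value exactly equal to a threshold, and the function name are assumptions; the actual threshold values are given only in FIG. 11.

```python
from bisect import bisect_right


def importance_class(pi, thresholds):
    """Map Pi to a class number Cp in 1..len(thresholds)+1.

    thresholds: ascending values T1..T4 (need not be evenly spaced).
    A value equal to a threshold is assigned to the higher class (assumption).
    """
    return bisect_right(thresholds, pi) + 1
```

Switching between the two encoding modes then only requires swapping the threshold list passed in, not changing the classification logic.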

Further, the scene image quality importance calculation unit 1000 calculates the scene image quality importance Si from the picture quality importance class numbers Cp. Si is obtained as the weighted average of the class numbers Cp of the input picture and of a plurality of previously encoded pictures. The number of past pictures referenced can be chosen to suit the embodiment; in this embodiment it is 5 to simplify the description.

FIG. 12 shows the transition of the scene image quality importance Si and the picture quality importance class number Cp. Let ArrayCp[N] (N = 0 to 4) be an array holding the picture quality importance classes Cp of the past five pictures, where N = 0 holds the class of the immediately preceding picture and N = 4 that of the picture five pictures earlier. The scene image quality importance Si is then given by the following equation, in which Cp is the class number of the input picture.

Si = (C_wp0 × Cp + C_wp1 × ArrayCp[0] + C_wp2 × ArrayCp[1] + C_wp3 × ArrayCp[2] + C_wp4 × ArrayCp[3] + C_wp5 × ArrayCp[4]) / (C_wp0 + C_wp1 + C_wp2 + C_wp3 + C_wp4 + C_wp5) … (11)

Here, C_wp0 to C_wp5 are predetermined weighting coefficients, each an integer of 1 or more. In this embodiment, C_wp0 to C_wp4 are all set to 1. Equation (11) is evaluated for every picture, each time the picture quality importance Pi is input from the picture quality importance calculation unit 102. In other words, the scene image quality importance Si of equation (11) captures, by weighted addition, the trend of the picture quality importance of the most recently encoded pictures. In FIG. 12, the scene image quality importance is 1 for picture numbers 1 to 5. This is because the five preceding pictures are not yet all available; until they are, equation (11) uses the picture quality importance of picture number 0.
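A running evaluation of equation (11) can be sketched with a fixed-length history. This is an illustration only: the class name is hypothetical, C_wp5 = 1 is an assumption (the text fixes only C_wp0 to C_wp4), and the warm-up is modeled by pre-filling the history with the class of the first picture.

```python
from collections import deque


class SceneImportance:
    """Weighted average of the current class Cp and the five previous ones."""

    def __init__(self, weights=(1, 1, 1, 1, 1, 1)):
        self.weights = weights        # C_wp0 .. C_wp5
        self.history = None           # ArrayCp, most recent picture first

    def update(self, cp):
        """Feed the class Cp of the input picture and return Si."""
        if self.history is None:
            # Warm-up assumption: missing past pictures take the first picture's class.
            size = len(self.weights) - 1
            self.history = deque([cp] * size, maxlen=size)
        terms = [cp, *self.history]
        si = sum(w * c for w, c in zip(self.weights, terms)) / sum(self.weights)
        self.history.appendleft(cp)   # becomes ArrayCp[0]; the oldest entry drops out
        return si
```

With equal weights, a single picture of a different class shifts Si by only one sixth of the class difference, so Si changes gradually as the content changes.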

The scene dividing unit 1001 adaptively forms scenes from the output of the scene image quality importance calculation unit 1000. In the second embodiment, code amount control is performed on the scenes produced by the scene dividing unit 1001. A scene consists of a plurality of pictures, and the maximum number of pictures in a scene is defined in advance: the number of pictures Ns in a scene is kept at or below a preset upper limit Nmax. In the second embodiment, Nmax = 15 to simplify the description. The scene dividing unit 1001 treats a run of consecutive pictures with the same scene image quality importance, as obtained from equation (11), as one scene. Specifically, the scene image quality importance Si input from the scene image quality importance calculation unit 1000 is a real-valued scalar; rounding it to the nearest integer at the first decimal place gives the scene image quality importance class Cs. Consecutive pictures with the same class Cs form a scene. FIG. 13 shows the scene image quality importance class Cs and the resulting scene structure.
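The scene division rule can be sketched as a single pass over the per-picture Si values. The function name and the return format are hypothetical; rounding is done half-up with `math.floor(si + 0.5)` because Python's built-in `round` rounds halves to the nearest even integer, which would not match the rounding described in the text.

```python
import math


def split_scenes(si_values, n_max=15):
    """Group consecutive pictures with equal class Cs, at most n_max per scene.

    Returns a list of (cs, picture_count) tuples in display order.
    """
    scenes = []
    for si in si_values:
        cs = math.floor(si + 0.5)     # round half up to obtain the class Cs
        if scenes and scenes[-1][0] == cs and scenes[-1][1] < n_max:
            scenes[-1] = (cs, scenes[-1][1] + 1)
        else:
            scenes.append((cs, 1))    # class changed, or the scene reached n_max pictures
    return scenes
```

A long run of pictures with the same class is therefore cut into several scenes of at most Nmax pictures each.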

Next, the scene target code amount calculation unit 1002 will be described. The scene target code amount calculation unit 1002 calculates the scene target code amount Rs from the scene image quality importance Si input from the scene image quality importance calculation unit 1000, the scene-change information input from the scene dividing unit 1001, and the block distortion amount Bp input from the coding distortion calculation unit 104. The scene target code amount Rs is calculated only once per scene, before the first picture of the scene is encoded. Rs is obtained by increasing or decreasing, as a baseline, the CBR code amount Rcbr given by equation (12) from the target bit rate Ts, the frame rate Fr, and Nmax.
Rcbr = Ts × Nmax × 1/Fr … (12)

The scene target code amount is decreased from the CBR code amount Rcbr when the scene has both a small block distortion amount Bp and a small scene image quality importance Si. However, because the scene target code amount Rs must be calculated before the scene is encoded, the block distortion amount Bp and the scene image quality importance Si of the scene about to be encoded must be predicted. In the second embodiment, this prediction is realized by the following method.

The block distortion amount of the scene is predicted as the weighted average of the block distortion amounts Bp of the most recently encoded pictures. Let ArrayBp[N] (N = 0 to 4) be an array holding the block distortion amounts Bp of the past five pictures. The predicted block distortion amount Bs is then given by equation (13).
Bs = (C_wb0 × ArrayBp[0] + C_wb1 × ArrayBp[1] + C_wb2 × ArrayBp[2] + C_wb3 × ArrayBp[3] + C_wb4 × ArrayBp[4]) / (C_wb0 + C_wb1 + C_wb2 + C_wb3 + C_wb4) … (13)
Here, C_wb0 to C_wb4 are predetermined weighting coefficients, with C_wb0 = 4 and C_wb1 to C_wb4 = 1.
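Equation (13) is a plain weighted mean, sketched below with the weights stated in the text as defaults. The function name is hypothetical, and the ordering of `array_bp` (most recent picture first) is assumed to mirror that of ArrayCp.

```python
def predict_block_distortion(array_bp, weights=(4, 1, 1, 1, 1)):
    """Equation (13): weighted mean of the last five block distortion amounts.

    array_bp: ArrayBp[0..4], assumed ordered with the most recent picture first.
    weights: C_wb0..C_wb4; the default weights the most recent picture by 4.
    """
    if len(array_bp) != len(weights):
        raise ValueError("array_bp and weights must have the same length")
    return sum(w * b for w, b in zip(weights, array_bp)) / sum(weights)
```

With C_wb0 = 4 the most recent picture contributes half of the total weight (4 out of 8), so the prediction follows recent distortion closely while still smoothing over older pictures.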

Next, the scene image quality importance Si can be predicted in line with the scene dividing method of the scene dividing unit 1001. That is, because pictures whose rounded scene image quality importance, the class Cs, is the same are grouped into one scene, the scene image quality importance of a scene can simply be taken to be the corresponding class Cs.

Next, the method of calculating the scene target code amount Rs will be described with reference to FIG. 14.

First, step S1400 determines whether the predicted block distortion amount Bs is smaller than a predetermined constant Bmin. Then, step S1401 determines whether the predicted scene image quality importance class Cs is smaller than a predetermined constant CSmin. If both determinations are true (Yes), the scene target code amount Rs is set to the CBR code amount Rcbr reduced according to the predicted scene image quality importance class Cs and block distortion amount Bs.

On the other hand, if step S1402 finds that the predicted block distortion amount Bs is larger than the predetermined constant Bmin, and step S1403 finds that Rr, the surplus code amount relative to the target bit rate Ts resulting from encoding the immediately preceding scene, is greater than 0, the scene target code amount Rs is set to the CBR code amount Rcbr increased by an additional code amount.

Note that γ in step S1406 and θ in step S1404 are predetermined constants.
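Equation (12) and the branch structure of FIG. 14 can be sketched as follows. Only the conditions come from the text. The expressions of steps S1404 and S1406 appear only in the figure, so the forms below, the assignment of γ to the decrease and θ to the increase, the fall-through to Rcbr, and all names are assumptions.

```python
def cbr_scene_bits(ts_bps, n_max, fr):
    """Equation (12): bits a scene of n_max pictures receives at constant bit rate."""
    return ts_bps * n_max / fr


def scene_target_code_amount(rcbr, bs, cs, rr, b_min, cs_min, gamma, theta):
    """Return Rs following the decisions of FIG. 14 (adjustment amounts are assumed)."""
    if bs < b_min and cs < cs_min:               # S1400 and S1401 both true
        return rcbr * (1.0 - gamma)              # decrease (assumed form)
    if bs > b_min and rr > 0:                    # S1402 and S1403 both true
        return rcbr + min(theta * rcbr, rr)      # increase, capped by the surplus (assumed form)
    return rcbr                                  # assumed default: plain CBR allocation
```

As in the first embodiment, an increase is only possible when earlier scenes have left a surplus Rr, which keeps the long-term average at the target bit rate Ts.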

Finally, the processing of the second picture target code amount calculation unit will be described. It takes the scene target code amount Rs as input and outputs the picture target code amount Rp to the encoding unit 105. In the second embodiment, the calculation of the picture target code amount Rp can be realized with conventional techniques; for example, it can be realized with the TM5 algorithm mentioned above, as follows.

For each of the I, P, and B picture types, the picture complexities X_i, X_p, and X_b are obtained from the encoding results of the encoding unit 105 by the following equations.
X_i = RA_i × Q_i
X_p = RA_p × Q_p
X_b = RA_b × Q_b … (14)
Here, RA_i, RA_p, and RA_b are the code amounts obtained by encoding an I, P, and B picture, respectively, and Q_i, Q_p, and Q_b are the averages of the Q scale over all macroblocks in the I, P, and B picture, respectively. From equation (14), the picture target code amounts T_i, T_p, and T_b are obtained for the I, P, and B pictures using the following equation (15); the one matching the picture type to be encoded by the encoding unit 105 is selected from T_i, T_p, and T_b and output to the encoding unit 105.

Here, Kp = 1.0 and Kb = 1.4.
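Equation (15) is not reproduced in this text. As a hedged illustration, the sketch below uses the target bit allocation as published for MPEG-2 Test Model 5, on the assumption that R is the number of bits still unspent from the scene target code amount Rs and that N_p and N_b count the P and B pictures remaining in the scene; whether equation (15) matches this form exactly cannot be confirmed from the text. The function name and parameter names are hypothetical.

```python
def tm5_picture_targets(r, x_i, x_p, x_b, n_p, n_b, bit_rate, picture_rate,
                        k_p=1.0, k_b=1.4):
    """Return (T_i, T_p, T_b) following the TM5 target allocation.

    x_i, x_p, x_b: complexities from equation (14), all assumed positive.
    """
    floor_bits = bit_rate / (8.0 * picture_rate)   # TM5 lower bound per picture

    def target(denominator):
        # A zero denominator means no picture of that type remains.
        if denominator <= 0:
            return floor_bits
        return max(r / denominator, floor_bits)

    t_i = target(1.0 + n_p * x_p / (x_i * k_p) + n_b * x_b / (x_i * k_b))
    t_p = target(n_p + n_b * k_p * x_b / (k_b * x_p))
    t_b = target(n_b + n_p * k_b * x_p / (k_p * x_b))
    return t_i, t_p, t_b
```

With Kb = 1.4, a B picture of the same complexity as a P picture receives a smaller target, reflecting that B pictures are quantized more coarsely.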

[Third Embodiment]
FIG. 15 is a block diagram of the moving image encoding apparatus according to the third embodiment. The apparatus of FIG. 15 is the moving image encoding apparatus of the second embodiment with an imaging control information calculation unit 1500 and a motion information calculation unit 1501 newly added. First, the imaging control information (AE information P_ae and AF information P_af) calculated by the imaging control information calculation unit 1500 will be described.

・Imaging control information: AE information P_ae
This is information for adjusting the exposure and shutter speed of the imaging means, which captures an image of a subject (not shown) and supplies input pictures to the moving image encoding apparatus of the third embodiment. The imaging control information calculation unit 1500 derives the AE information P_ae from the luminance (Y) of the input picture. When the unit determines that the input picture is overexposed or underexposed, P_ae > 0, and its value expresses the degree of overexposure or underexposure: the larger P_ae is, the more severe the exposure error. Otherwise, P_ae = 0 is output to the picture quality importance calculation unit 102.

・Imaging control information: AF information P_af
This is information for adjusting the focal length by controlling the lens position of the imaging means (not shown). The imaging control information calculation unit 1500 derives the AF information P_af from the luminance (Y) of the input picture. When the unit determines that the input picture was captured while the focal length was being adjusted, it outputs P_af = 1 to the picture quality importance calculation unit 102; otherwise it outputs P_af = 0.

Next, the processing of the picture quality importance calculation unit 102 will be described. By adding the AE information P_ae and the AF information P_af to the right-hand sides of equations (6) and (7) of the first embodiment, the picture target code amount Rp of the first embodiment and the scene target code amount Rs of the second embodiment can be calculated in the same way as before. The visual sensitivity table of the first embodiment held the weighting coefficients Cw[k] and the normalization coefficients Cd[k] for k = 0 to 3. Here the range of k is extended to 0 to 5: Cw[4] and Cd[4] are defined as the values for the AE information P_ae, and Cw[5] and Cd[5] as the values for the AF information P_af.

Furthermore, the weighting coefficients Cw[k] (k = 0 to 5) can be adaptively selected and changed from a plurality of predetermined combinations according to the imaging mode. FIG. 18 shows an example setting of the weighting coefficients Cw[k] (k = 0 to 5). This can also be combined with the method of calculating the picture quality importance class Cp for the stability-oriented and contrast-oriented types shown in FIG. 11 and described in the second embodiment.

As described above, according to the present embodiment, the target code amount for a scene is controlled according to the degree of image quality degradation caused by block distortion and the image quality importance of the scene obtained from visual statistics of its pictures. As a result, encoded moving image data of good image quality can be obtained under a given target bit rate.

Furthermore, by taking the imaging control information of the imaging means into account in the image quality importance of a scene, the target code amount for the scene is controlled according to the shooting conditions, so that encoded moving image data of good image quality can be obtained under a given target bit rate.

FIG. 1 is a block diagram of the moving image encoding apparatus according to the first embodiment.
FIG. 2 is a block diagram of a conventional moving image encoding apparatus.
FIG. 3 is a block diagram of a conventional moving image encoding apparatus.
FIG. 4 is a diagram for explaining the operation of the coding distortion detection unit 104.
FIG. 5 is a diagram for explaining the principle of skin color detection using Cb-Cr coordinates.
FIG. 6 is a diagram showing an example of an image divided into blocks.
FIG. 7 is a diagram explaining the structure of a scene.
FIG. 8 is a flowchart showing the processing of the first picture target code amount calculation unit.
FIG. 9 is a diagram showing the transition of the picture target code amount.
FIG. 10 is a block diagram of the moving image encoding apparatus according to the second embodiment.
FIG. 11 is a diagram showing an example of the class classification of the picture quality importance in the second embodiment.
FIG. 12 is a diagram showing the transition of the picture quality importance class.
FIG. 13 is a diagram showing the transition of the scene image quality importance class.
FIG. 14 is a flowchart showing the processing of the scene target code amount calculation unit.
FIG. 15 is a block diagram of the moving image encoding apparatus according to the third embodiment.
FIG. 16 is a diagram showing an example of the visual sensitivity table in the first embodiment.
FIG. 17 is a diagram showing a table of the correspondence of the information used in the encoding processing of the first and second embodiments.
FIG. 18 is a diagram showing an example of a table for setting the weighting coefficients Cw in the third embodiment.

Claims

A moving image encoding apparatus that encodes continuously input pictures within a sequence target code amount determined from a target bit rate, the apparatus comprising:
dividing means for dividing a moving image composed of pictures arranged along a time axis into scenes each composed of a preset number of pictures;
encoding means for encoding an input picture in units of blocks each composed of a plurality of pixels, in accordance with a given encoding parameter that determines a quantization scale, and generating encoded data;
code amount detection means for detecting the amount of encoded data of the picture generated by the encoding means;
decoding means for decoding the encoded data obtained from a picture of interest;
distortion amount calculating means for calculating, as a picture distortion amount, the amount of distortion at the block boundary positions between the picture obtained by decoding by the decoding means and the picture before encoding;
statistical information calculating means for calculating, from the picture of interest, statistical information on attributes that affect the distortion at the block boundary positions when the picture of interest is encoded; and
setting means for generating an encoding parameter for a picture following the picture of interest, based on the sequence target code amount, the encoded data amount detected by the code amount detection means, the statistical information calculated by the statistical information calculating means, and the picture distortion amount calculated by the distortion amount calculating means, and setting the encoding parameter in the encoding means.
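
As a non-limiting sketch of what the distortion amount calculating means might compute, the following example measures how much the pixel step across each block boundary changes between the picture before encoding and the decoded picture. The 8-pixel block size, the use of luminance samples only, the mean-absolute-change metric, and the function name `boundary_distortion` are assumptions made for illustration; the claim does not fix any of them.

```python
def boundary_distortion(original, decoded, block=8):
    """Mean absolute change in the pixel step across block boundaries.

    original, decoded: 2-D lists of luminance samples with equal dimensions.
    block: hypothetical block size (assumed to be 8; not fixed by the claim).
    """
    height, width = len(original), len(original[0])
    total, count = 0, 0

    # Vertical boundaries: compare the two columns on either side of each edge.
    for y in range(height):
        for x in range(block, width, block):
            step_orig = abs(original[y][x] - original[y][x - 1])
            step_dec = abs(decoded[y][x] - decoded[y][x - 1])
            total += abs(step_dec - step_orig)
            count += 1

    # Horizontal boundaries: compare the two rows on either side of each edge.
    for y in range(block, height, block):
        for x in range(width):
            step_orig = abs(original[y][x] - original[y - 1][x])
            step_dec = abs(decoded[y][x] - decoded[y - 1][x])
            total += abs(step_dec - step_orig)
            count += 1

    # A picture smaller than one block has no interior boundaries.
    return total / count if count else 0.0
```

A picture whose decoded version is identical to the original yields 0.0 under this measure, and the value grows as encoding introduces steps at block edges that were not present in the original.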

The moving image encoding apparatus according to claim 1, wherein the statistical information calculating means calculates, for each block, information indicating whether the block is skin-colored, saturation information, luminance information, and information indicating luminance variance, and further calculates, from the calculated information, information indicating a picture importance for the entire picture.
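
As a non-limiting sketch, the per-block statistics named in this claim could be gathered as follows. The Cb-Cr ranges used for the skin-color test, the definition of saturation as the distance of the mean chroma from the neutral point (128, 128), and the names `SKIN_CB_RANGE`, `SKIN_CR_RANGE`, and `block_stats` are assumptions for illustration and are not taken from the specification. How the per-block values are combined into a picture importance is left open here.

```python
from statistics import mean, pvariance

# Hypothetical skin-tone region in the Cb-Cr plane for 8-bit samples.
# These bounds are illustrative assumptions, not values from the source.
SKIN_CB_RANGE = (77, 127)
SKIN_CR_RANGE = (133, 173)


def block_stats(y, cb, cr):
    """Return the four per-block statistics for flat lists of Y, Cb, Cr samples."""
    mean_cb, mean_cr = mean(cb), mean(cr)
    # Skin-color flag: does the mean chroma fall inside the assumed region?
    is_skin = (SKIN_CB_RANGE[0] <= mean_cb <= SKIN_CB_RANGE[1]
               and SKIN_CR_RANGE[0] <= mean_cr <= SKIN_CR_RANGE[1])
    # Saturation approximated as the distance from neutral chroma (128, 128).
    saturation = ((mean_cb - 128) ** 2 + (mean_cr - 128) ** 2) ** 0.5
    return {
        "skin": is_skin,
        "saturation": saturation,
        "luminance": mean(y),      # average luminance of the block
        "variance": pvariance(y),  # population variance of the luminance
    }
```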

The moving image encoding apparatus according to claim 2, wherein the setting means
determines the encoding parameter for the first picture of a scene of interest using the average values of the block distortion amount and the picture importance of the pictures constituting the scene preceding the scene of interest, and
determines the encoding parameter for the second and subsequent pictures of the scene of interest based on the preset scene target code amount, the encoded data amount of the pictures already encoded in the scene of interest, and the number of unencoded pictures in the scene of interest.
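
As a non-limiting sketch of the second branch above, one simple way to use the three named quantities is to spread the scene's remaining budget evenly over its unencoded pictures. The even split, the clamping at zero, and the function name `picture_target_bits` are assumptions for illustration; the claim states only which quantities the determination is based on.

```python
def picture_target_bits(scene_target_bits, bits_used_in_scene, pictures_remaining):
    """Spread the scene's remaining bit budget evenly over its unencoded pictures.

    scene_target_bits: preset scene target code amount.
    bits_used_in_scene: encoded data amount of pictures already encoded in the scene.
    pictures_remaining: number of unencoded pictures in the scene.
    """
    if pictures_remaining <= 0:
        raise ValueError("no unencoded pictures remain in the scene")
    # Clamp at zero so an overspent scene never yields a negative target.
    remaining = max(scene_target_bits - bits_used_in_scene, 0)
    return remaining / pictures_remaining
```

For example, with a scene target of 1,500,000 bits, 400,000 bits already spent, and 11 pictures left, each remaining picture would be given a target of 100,000 bits under this assumption.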

A method for controlling a moving image encoding apparatus that encodes continuously input pictures within a sequence target code amount determined from a target bit rate, the method comprising:
a dividing step of dividing a moving image composed of pictures arranged along a time axis into scenes each composed of a preset number of pictures;
an encoding step of encoding an input picture in units of blocks each composed of a plurality of pixels, in accordance with a given encoding parameter that determines a quantization scale, and generating encoded data;
a code amount detection step of detecting the amount of encoded data of the picture generated in the encoding step;
a decoding step of decoding the encoded data obtained from a picture of interest;
a distortion amount calculating step of calculating, as a picture distortion amount, the amount of distortion at the block boundary positions between the picture obtained by decoding in the decoding step and the picture before encoding;
a statistical information calculating step of calculating, from the picture of interest, statistical information on attributes that affect the distortion at the block boundary positions when the picture of interest is encoded; and
a setting step of generating an encoding parameter for a picture following the picture of interest, based on the sequence target code amount, the encoded data amount detected in the code amount detection step, the statistical information calculated in the statistical information calculating step, and the picture distortion amount calculated in the distortion amount calculating step, and setting the encoding parameter for use in the encoding step.

A moving image encoding apparatus that encodes continuously input pictures within a sequence target code amount determined from a target bit rate, the apparatus comprising:
encoding means for encoding an input picture in units of blocks each composed of a plurality of pixels, in accordance with a given encoding parameter that determines a quantization scale, and generating encoded data;
code amount detection means for detecting the code amount of the encoded data generated by the encoding means;
decoding means for decoding the encoded data generated by the encoding means;
distortion amount calculating means for calculating, as a picture distortion amount, the amount of distortion at the block boundary positions between the picture obtained by decoding by the decoding means and the picture before encoding;
statistical information calculating means for calculating, from a picture of interest, statistical information on attributes that affect the distortion at the block boundary positions when the picture of interest is encoded;
picture image quality importance calculating means for calculating an importance of the picture of interest based on the statistical information calculated by the statistical information calculating means, classifying the calculated importance into one of a plurality of class values by comparing it with a plurality of preset thresholds, and calculating, as a picture image quality importance, a weighted average of the class values of the picture of interest and of a preset number of pictures preceding the picture of interest;
scene dividing means for dividing, as a scene, one or more consecutive pictures that have the same picture image quality importance calculated by the picture image quality importance calculating means and whose number is equal to or less than a preset upper limit;
scene image quality importance calculating means for calculating a scene image quality importance of the scene from the picture image quality importance of each picture calculated by the picture image quality importance calculating means;
scene target code amount calculating means for calculating a scene target code amount according to the sequence target code amount, the amount of encoded data detected by the code amount detection means, the picture distortion amount calculated by the distortion amount calculating means, and the scene image quality importance calculated by the scene image quality importance calculating means; and
setting means for generating an encoding parameter for a picture following the picture of interest based on the scene target code amount calculated by the scene target code amount calculating means, and setting the encoding parameter in the encoding means.
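
As a non-limiting sketch of the classification, weighted averaging, and scene division recited above, the following example maps an importance value to a class by threshold comparison, averages class values over a window of pictures, and groups consecutive pictures of equal importance into scenes of bounded length. The function names, the rule that a value equal to a threshold falls into the higher class, and the assumption that the importance values passed to `split_scenes` have already been quantized so that equality is meaningful are all illustrative and are not taken from the specification.

```python
from bisect import bisect_right


def classify(importance, thresholds):
    """Map an importance value to a class index using ascending thresholds.

    A value equal to a threshold is placed in the higher class (an assumption).
    """
    return bisect_right(thresholds, importance)


def picture_quality_importance(class_values, weights):
    """Weighted average of class values, ordered oldest to newest.

    The last entry corresponds to the picture of interest.
    """
    return sum(c * w for c, w in zip(class_values, weights)) / sum(weights)


def split_scenes(importances, max_len):
    """Group consecutive equal importance values into scenes of at most max_len."""
    scenes = []
    for value in importances:
        # Extend the current scene only if the value matches and there is room.
        if scenes and scenes[-1][0] == value and len(scenes[-1]) < max_len:
            scenes[-1].append(value)
        else:
            scenes.append([value])
    return scenes
```

With an upper limit of three pictures, the importance sequence 2, 2, 2, 2, 1, 1, 3 is divided into four scenes under this sketch: three pictures of importance 2, one further picture of importance 2, two pictures of importance 1, and one picture of importance 3.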

The moving image encoding apparatus according to claim 5, wherein the statistical information calculating means calculates, for each block, information indicating whether the block is skin-colored, saturation information, luminance information, and information indicating luminance variance, and further calculates, from the calculated information, information indicating a picture importance for the entire picture.

The moving image encoding apparatus according to claim 5, further comprising:
imaging means; and
imaging control information calculating means for calculating imaging control information for image capture by the imaging means,
wherein the picture image quality importance calculating means calculates the picture image quality importance according to the imaging control information calculated by the imaging control information calculating means.

The moving image encoding apparatus according to claim 7, wherein the picture image quality importance calculating means calculates the picture image quality importance by changing the weighting of the imaging control information, the statistical information, and the distortion amount according to the imaging mode of the imaging means.

A method for controlling a moving image encoding apparatus that encodes continuously input pictures within a sequence target code amount determined from a target bit rate, the method comprising:
an encoding step of encoding an input picture in units of blocks each composed of a plurality of pixels, in accordance with a given encoding parameter that determines a quantization scale, and generating encoded data;
a code amount detection step of detecting the code amount of the encoded data generated in the encoding step;
a decoding step of decoding the encoded data generated in the encoding step;
a distortion amount calculating step of calculating, as a picture distortion amount, the amount of distortion at the block boundary positions between the picture obtained by decoding in the decoding step and the picture before encoding;
a statistical information calculating step of calculating, from a picture of interest, statistical information on attributes that affect the distortion at the block boundary positions when the picture of interest is encoded;
a picture image quality importance calculating step of calculating an importance of the picture of interest based on the statistical information calculated in the statistical information calculating step, classifying the calculated importance into one of a plurality of class values by comparing it with a plurality of preset thresholds, and calculating, as a picture image quality importance, a weighted average of the class values of the picture of interest and of a preset number of pictures preceding the picture of interest;
a scene dividing step of dividing, as a scene, one or more consecutive pictures that have the same picture image quality importance calculated in the picture image quality importance calculating step and whose number is equal to or less than a preset upper limit;
a scene image quality importance calculating step of calculating a scene image quality importance of the scene from the picture image quality importance of each picture calculated in the picture image quality importance calculating step;
a scene target code amount calculating step of calculating a scene target code amount according to the sequence target code amount, the amount of encoded data detected in the code amount detection step, the picture distortion amount calculated in the distortion amount calculating step, and the scene image quality importance calculated in the scene image quality importance calculating step; and
a setting step of generating an encoding parameter for a picture following the picture of interest based on the scene target code amount calculated in the scene target code amount calculating step, and setting the encoding parameter for use in the encoding step.