JP2006311078A - High efficiency coding recorder

Info

Publication number: JP2006311078A
Application number: JP2005129723A
Authority: JP
Inventors: Motoharu Ueda; 基晴上田
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2005-04-27
Filing date: 2005-04-27
Publication date: 2006-11-09

Abstract

PROBLEM TO BE SOLVED: To provide a high-efficiency coding recorder that performs optimum coding with a small code amount by controlling the coding for each scene.

SOLUTION: A database management circuit 108 acquires, from a scene information database 104, a correction value for a parameter that controls the coding process. The correction value is acquired on the basis of (a) information on the genre of the video to be coded, obtained by a genre/keyword searching circuit 103; (b) scene characteristic information representing the characteristics of each scene, the scenes being separated by a scene separation circuit 107 on the basis of image characteristic information that an image characteristic calculation circuit 105 obtains from the digital image data of the video; and (c) coding result information on the code amount, obtained by a code amount control circuit 219 when the coding process is applied to the digital image data. A coded parameter correction circuit 109 then corrects the parameter in the high-efficiency coding recorder 10 on the basis of the acquired correction value.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a high-efficiency encoding and recording apparatus capable of recording digital image data efficiently.

In recent years, many services have come into use that apply high-efficiency encoding to digitized video image data to compress the information, either to record long-duration content on a recording medium or to send and receive it over transmission paths such as satellite links, terrestrial links, and telephone lines. These services use international standards such as MPEG2, MPEG4-ASP, and MPEG4-AVC as high-efficiency encoding methods for moving images and audio. These standards use image coding methods that compress the amount of information by exploiting the correlation between adjacent pixels (spatial direction) and the correlation between adjacent frames or adjacent fields (temporal direction).

For example, one image encoding and recording apparatus conforming to the MPEG2 standard performs encoding with the following algorithm. Of 12 temporally consecutive frames to be encoded, one frame is treated as a still image serving as the reference frame and is encoded using only spatial correlation. The encoded data of this reference frame can be reconstructed from that frame's encoded data alone.

For the 11 frames other than the reference frame, a predicted frame is first created by predicting the image with motion vectors detected from the motion of the subject relative to a reference image frame, and the difference from this predicted frame is obtained. Because this difference is encoded using both spatial and temporal correlation, these frames can be encoded with higher coding efficiency than the reference frame. Data encoded using a predicted frame is reconstructed from the reference frame data, the motion vector data, and the encoded difference from the predicted frame.

Image encoding under the MPEG2 standard is described concretely with reference to FIG. 9. FIG. 9 shows consecutive reference frames and predicted frames laid out in a plane. The I picture (intra-coded picture, or I frame), labeled I, is an input frame to be encoded; it is used periodically in the encoding process and serves as the reference frame in decoding. The P pictures (predictive-coded pictures, or P frames), labeled P1 to P3, are predicted frames created using only a temporally preceding (past) reference frame, such as the I picture. The B pictures (bi-directionally predictive-coded pictures, or B frames), labeled B1 to B8, are predicted frames created from two reference frames, one temporally preceding and one following (past and future). A P picture is itself a predicted frame, and it also serves as a reference frame for the B pictures and P pictures created after it.

The arrows in FIG. 9 indicate the prediction direction. For example, the P1 picture is predicted from the temporally preceding I picture; the B1 and B2 pictures are predicted from the I picture and the P1 picture; and the B3 and B4 pictures are predicted from the P1 and P2 pictures.

The image data of an I picture is divided into processing units called macroblocks, each covering 16 horizontal × 16 vertical pixels of the luminance signal. Each macroblock is further divided into two-dimensional blocks of 8 × 8 pixels, and each block undergoes a DCT (Discrete Cosine Transform), which is a kind of orthogonal transform, followed by quantization.

The data obtained by the DCT represents the frequency components of the two-dimensional block, and in typical images these components are concentrated in the low-frequency region. Loss of information in the low-frequency components is visually more noticeable than in the high-frequency components. Quantization is therefore fine in the low-frequency region and coarse in the high-frequency region, and the amount of information is compressed by variable-length coding of each nonzero coefficient together with the length of the run of zero-valued coefficients.
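
The transform, quantization, and zero-run steps described above can be sketched as follows. This is a simplified illustration, not the MPEG2 procedure itself: the step sizes in `quantize`, the `qscale` default, and all function names are assumptions chosen for readability, and the final variable-length code assignment is omitted.

```python
import math

N = 8  # block size used for the DCT

def dct_2d(block):
    """Forward 2-D DCT-II of an 8x8 block (list of 8 rows of 8 samples)."""
    def scale(k):
        return math.sqrt(0.5) if k == 0 else 1.0
    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * scale(u) * scale(v) * total
    return coeffs

def quantize(coeffs, qscale=2):
    """Divide by a step that grows with frequency, so high frequencies are coarser."""
    # Illustrative step sizes only; not the MPEG2 default quantization matrix.
    return [[round(coeffs[u][v] / ((8 + 4 * (u + v)) * qscale))
             for v in range(N)] for u in range(N)]

def zigzag_order():
    """Scan positions from low to high frequency along anti-diagonals."""
    cells = [(u, v) for u in range(N) for v in range(N)]
    return sorted(cells, key=lambda c: (c[0] + c[1],
                                        c[0] if (c[0] + c[1]) % 2 else -c[0]))

def run_length(levels):
    """Return (zero_run, level) pairs; trailing zeros are implied by end of block."""
    pairs, run = [], 0
    for u, v in zigzag_order():
        if levels[u][v] == 0:
            run += 1
        else:
            pairs.append((run, levels[u][v]))
            run = 0
    return pairs
```

A flat block has energy only in the lowest-frequency (DC) coefficient, so `run_length(quantize(dct_2d(flat)))` yields a single pair, which is why smooth image areas compress well.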

The process of compressing a frame to be encoded using a P picture is described with reference to FIG. 9. The frames to be encoded that correspond in time to the P1, P2, and P3 pictures in FIG. 9 are likewise divided into macroblocks of 16 horizontal × 16 vertical pixels, and for each macroblock a motion vector is detected relative to the I picture or P picture that serves as the reference frame. Motion vectors are generally obtained by block matching. In block matching, the sum of absolute differences (or the sum of squared differences) is computed between the pixels of a macroblock of the frame to be encoded and the pixels of a candidate macroblock in the reference frame, and the displacement for which this value is smallest is output as the detected motion vector.
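
A minimal sketch of block matching with the sum of absolute differences follows. The exhaustive search, the `search` range, and the function names are assumptions for illustration; the text does not specify the search strategy used by an actual encoder.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))

def find_motion_vector(cur, ref, bx, by, size=16, search=7):
    """Full search over +/-search pixels; returns (dx, dy, best_sad)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + size > height or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Replacing `abs(...)` with a squared difference gives the sum-of-squared-differences variant mentioned in the text.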

The image created by shifting the reference frame image by the motion vector detected for each macroblock is the P picture. Like the I picture, the image signal of the P picture is divided into macroblocks of 16 horizontal × 16 vertical luminance pixels. Difference block data is then computed between each pixel of the P picture's macroblock data and the corresponding pixel of the macroblock data of the frame to be encoded, and this difference block data is encoded. When accurate motion vectors are detected, the difference block data carries far less information than the original macroblock data. Data encoded using a P picture can therefore be quantized more coarsely than data encoded as an I picture. In practice, a prediction mode decision selects whether to encode the difference block data or the non-difference block data (the intra block data of the frame to be encoded), and the selected block data is compressed with the same DCT and quantization as for an I picture.

The process of compressing a frame to be encoded using a B picture is described next. The frames to be encoded that correspond in time to the B1, B2, ..., B8 pictures in FIG. 9 are processed in the same way as with a P picture, but because reference frames (I pictures and P pictures) exist both before and after them in time, motion vectors are detected between the frame to be encoded and each of these reference frames. Motion vector detection follows the prediction mode selected for each macroblock. There are three prediction modes: forward prediction, in which the block data is predicted from the temporally preceding reference frame; backward prediction, in which it is predicted from the temporally following reference frame; and average prediction, in which it is predicted from the pixel-by-pixel average of those two predicted blocks. A decision selects one of four kinds of block data: the three difference blocks between the B picture macroblock data obtained in each of these modes and the macroblock data of the frame to be encoded, plus the intra block data of the frame to be encoded. The selected block data is then compressed with the same DCT and quantization as for I pictures and P pictures.
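
The four-way choice for a B-picture macroblock can be sketched as below. The text does not state the decision criterion, so the costs used here (residual sum of absolute differences for the three predicted modes, and deviation from the block mean for intra) are assumptions, as are the function names.

```python
def average_prediction(fwd, bwd):
    """Pixel-by-pixel average of the forward and backward predictions (rounded)."""
    return [[(f + b + 1) // 2 for f, b in zip(frow, brow)]
            for frow, brow in zip(fwd, bwd)]

def residual_cost(cur, pred):
    """Sum of absolute differences between the current block and a prediction."""
    return sum(abs(c - p) for crow, prow in zip(cur, pred)
               for c, p in zip(crow, prow))

def intra_cost(cur):
    """Assumed intra cost: total absolute deviation from the block mean."""
    pixels = [p for row in cur for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)

def select_b_mode(cur, fwd, bwd):
    """Return (mode, costs) for the cheapest of the four candidate block types."""
    costs = {
        "forward": residual_cost(cur, fwd),
        "backward": residual_cost(cur, bwd),
        "average": residual_cost(cur, average_prediction(fwd, bwd)),
        "intra": intra_cost(cur),
    }
    return min(costs, key=costs.get), costs
```

When the current block lies midway between the two references, as in a fade or steady motion, the average mode tends to give the smallest residual.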

Because a B picture can be predicted from both the temporally preceding and following reference frames, its prediction efficiency is higher than that of a P picture. B pictures are therefore generally quantized even more coarsely than P pictures.

Because encoding with B pictures also involves prediction from a temporally later reference frame, the reference frame must be encoded before the B pictures that use it. As shown in FIG. 10, the input image signal is therefore reordered so that each frame to be encoded as a B picture is placed after the I picture or P picture that serves as its reference frame, and is then encoded. On decoding, as shown in FIG. 11, the reverse of the reordering in FIG. 10 is applied before output, so that the decoded images are reproduced in the order of the input image signal.
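
The reordering can be sketched as follows. The frame labels follow the description of FIG. 9; the exact layouts of FIG. 10 and FIG. 11 are not reproduced here, and the sketch assumes that the sequence ends on a reference frame.

```python
def is_reference(frame):
    """I and P pictures serve as reference frames; B pictures do not."""
    return not frame.startswith("B")

def to_coding_order(display_order):
    """Move each reference frame ahead of the B pictures that depend on it."""
    coded, pending_b = [], []
    for frame in display_order:
        if is_reference(frame):
            coded.append(frame)      # the reference is coded first
            coded.extend(pending_b)  # then the B pictures shown before it
            pending_b = []
        else:
            pending_b.append(frame)
    # Assumption: trailing B pictures would really wait for the next I picture.
    return coded + pending_b

def to_display_order(coding_order):
    """Inverse reordering: hold each reference until the next one arrives."""
    shown, held = [], None
    for frame in coding_order:
        if is_reference(frame):
            if held is not None:
                shown.append(held)
            held = frame
        else:
            shown.append(frame)
    if held is not None:
        shown.append(held)
    return shown
```

For the display sequence `I, B1, B2, P1, B3, B4, P2`, the coding order becomes `I, P1, B1, B2, P2, B3, B4`, and `to_display_order` restores the original sequence.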

Next, the configuration of a conventional image encoding and recording apparatus that implements MPEG2 image encoding is described. FIG. 12 is a block diagram of the conventional image encoding and recording apparatus 20. The apparatus 20 has a target image input terminal 201, an input image memory 202, a two-dimensional block data conversion circuit 203, a subtractor 204, an orthogonal transform circuit 205, a quantization circuit 206, an encoding circuit 207, an encoding table 208, a multiplexer 209, an image bitstream buffer 210, an inverse quantization circuit 212, an inverse orthogonal transform circuit 213, an adder 214, a deblocking circuit 215, a reference image memory 216, a motion vector detection circuit 217, a motion compensation prediction circuit 218, and a code amount control circuit 219.

The target image input terminal 201 receives the digital image data to be encoded.

The input image memory 202 stores and delays the digital image data received at the target image input terminal 201, rearranges the frames into encoding order, and sends them to the two-dimensional block data conversion circuit 203.

The two-dimensional block data conversion circuit 203 divides each frame of the received digital image data into macroblock data.

When the frame to be encoded is an I picture, the subtractor 204 passes the macroblock data unchanged to the orthogonal transform circuit 205. Otherwise, it sends the orthogonal transform circuit 205 the difference between the prediction block data described later and the macroblock data of the frame to be encoded.

The orthogonal transform circuit 205 applies the DCT to the received macroblock data and sends the resulting DCT coefficients to the quantization circuit 206.

The quantization circuit 206 quantizes the received DCT coefficients by dividing them by values derived from a quantization matrix.

The encoding circuit 207 encodes the quantized DCT coefficients with variable-length or fixed-length codes at the encoding rate obtained by referring to the encoding table 208, and sends the result to the multiplexer 209.

The encoding table 208 stores the encoding rates corresponding to the DCT coefficients.

The multiplexer 209 multiplexes the encoded data received from the encoding circuit 207 with additional information received from the two-dimensional block data conversion circuit 203, such as the position of the macroblock data within the frame, to form the output image bitstream, and sends it to the image bitstream buffer 210 and the code amount control circuit 219.

The image bitstream buffer 210 stores the output image bitstream received from the multiplexer 209 and sends it to the recording medium or transmission path 211 as required.

The inverse quantization circuit 212 inversely quantizes the quantized DCT coefficients received from the quantization circuit 206 and sends the resulting DCT coefficients to the inverse orthogonal transform circuit 213.

The inverse orthogonal transform circuit 213 applies the inverse DCT to the received DCT coefficients and sends the resulting macroblock data to the adder 214.

The adder 214 adds the prediction block data obtained from the motion compensation prediction circuit 218, described later, to the macroblock data received from the inverse orthogonal transform circuit 213, and sends the result to the deblocking circuit 215.

The deblocking circuit 215 receives the macroblock data to which the prediction block data has been added, decodes it, and sends the resulting reference image data to the reference image memory 216.

The reference image memory 216 stores the received reference image data and sends it to the motion vector detection circuit 217 and the motion compensation prediction circuit 218 as the reference frame for a P picture or B picture.

The motion vector detection circuit 217 detects the motion vector between the macroblock data of the image to be encoded, received from the two-dimensional block data conversion circuit 203, and the macroblock data of the reference image, received from the reference image memory 216.

The motion compensation prediction circuit 218 creates prediction block data by shifting the macroblock data of the reference frame, received from the reference image memory 216, by the motion vector obtained by the motion vector detection circuit 217, and sends it to the subtractor 204 and the adder 214.

The code amount control circuit 219 compares the code amount of the output image bitstream sent from the multiplexer 209 with a preset target code amount, calculates the fineness of quantization (the quantization scale) needed to approach the target code amount, and controls the quantization circuit 206 so that quantization is performed with the calculated quantization scale.

Next, the operation of the conventional image encoding and recording apparatus 20 is described.

First, the digital image data of the video to be encoded is input from the target image input terminal 201 and sent to the input image memory 202. The input image memory 202 stores and delays the received digital image data, rearranges the frames into encoding order according to the encoding syntax of FIG. 10, and sends them to the two-dimensional block data conversion circuit 203, which divides the received digital image data into macroblock data.

Next, the encoding process when the digital image data input from the input image memory 202 is an I picture is described. First, the I picture image data, divided into macroblock data, is sent through the subtractor 204 to the orthogonal transform circuit 205. The orthogonal transform circuit 205 further divides it into units of 8 horizontal × 8 vertical pixels, applies the DCT, and outputs DCT coefficients. The output DCT coefficients are grouped into macroblock units of 16 horizontal × 16 vertical luminance pixels and sent to the quantization circuit 206. The quantization circuit 206 quantizes the DCT coefficients by dividing them by values derived from a quantization matrix that holds a different value for each frequency component. The encoding circuit 207 then encodes the quantized DCT coefficients with variable-length or fixed-length codes by referring to the address in the encoding table 208 that corresponds to each DCT coefficient, and sends the resulting encoded data to the multiplexer 209.

The multiplexer 209 multiplexes the encoded data received from the encoding circuit 207 with the additional information received from the two-dimensional block data conversion circuit 203, such as the position of the corresponding macroblock data within the frame, and the result is stored in the image bitstream buffer 210. The multiplexed data is output to the recording medium or transmission path 211 as the output image bitstream.

Meanwhile, the DCT coefficients quantized by the quantization circuit 206 undergo inverse quantization in the inverse quantization circuit 212 and the inverse DCT in the inverse orthogonal transform circuit 213, so that the quantized DCT coefficients are decoded into data for each macroblock. This per-macroblock data is sent through the adder 214 to the deblocking circuit 215, which deblocks it to produce decoded reference image data. The decoded reference image data is supplied to the reference image memory 216 and stored there. The image data stored in the reference image memory 216 is used as the reference frame when encoding with P pictures or B pictures, which are predicted frames.

Next, the case where the digital image data output from the input image memory 202 is encoded using a P picture or B picture, which are predicted frames, is described. First, the motion vector detection circuit 217 obtains the motion vector between the macroblock data of the frame to be encoded, divided by the two-dimensional block data conversion circuit 203, and the macroblock data of the reference image stored in the reference image memory 216. The motion vector data is sent to the motion compensation prediction circuit 218, which creates prediction block data by shifting the reference frame's macroblock data, obtained from the reference image memory 216, by that motion vector. The motion compensation prediction circuit 218 also selects, from among multiple prediction modes, the prediction block data created in the optimum mode. The subtractor 204 then calculates the difference between the macroblock data of the image to be encoded and the selected prediction block data, and sends it to the orthogonal transform circuit 205. This difference data undergoes the DCT and quantization in the same way as an I picture, and is output from the image bitstream buffer 210 to the recording medium or transmission path 211 as the output image bitstream, together with the motion vector data and the prediction block data.

For code amount control, the code amount control circuit 219 compares the code amount of the image bitstream output from the multiplexer 209 with the target code amount and calculates the quantization scale (the fineness of quantization) needed to approach the target. The quantization circuit 206 is then controlled so that quantization is performed with the calculated quantization scale.

Because this apparatus encodes with the three picture types described above, each yielding a different amount of information, the target code amount for each picture type is calculated from the nature of that picture type and how often it occurs.

In general, the target code amount for each image frame is calculated from the amount of information in images encoded with each picture type and is allocated out of the target code amount for a fixed period. Specifically, let Bits be the code amount required the last time each picture type was encoded and AvgQ the average quantization scale for that picture type. An approximate value C of the encoding complexity of each picture type (hereinafter "encoding difficulty") is then calculated by Equation (1).

The value C is larger for more complex scenes and for scenes with more motion. Here, A is a weight set for each picture type to reflect the importance of the picture type and the expected level of degradation in encoding. In general, the weights satisfy A(I) > A(P) > A(B).
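
Equation (1) is not reproduced in this text. One form consistent with the definitions of Bits, AvgQ, and A given here, stated as an assumption rather than as the patent's exact expression, is:

```latex
C = A \times \mathrm{Bits} \times \mathrm{AvgQ}
```

Under this assumed form, a picture type that needed many bits even at a coarse quantization scale receives a large C, which matches the statement that C grows with scene complexity and motion.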

Given the target code amount TotalBits allotted to the F frames contained in a fixed period, the target code amount Budget allotted to the Fnum frames that use each picture type is calculated by Equations (2) to (4).
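
Equations (2) to (4) are not reproduced in this text. The sketch below assumes a proportional split, in which each picture type receives a share of TotalBits in proportion to its encoding difficulty C multiplied by its frame count Fnum. That rule, and the function and argument names, are assumptions for illustration only.

```python
def allocate_budgets(total_bits, complexities, frame_counts):
    """Split total_bits across picture types in proportion to C * Fnum (assumed rule)."""
    weighted = {t: complexities[t] * frame_counts[t] for t in frame_counts}
    denom = sum(weighted.values())
    return {t: total_bits * weighted[t] / denom for t in weighted}

# Hypothetical values: one I, three P, and eight B pictures in a 12-frame period.
budgets = allocate_budgets(
    total_bits=1200,
    complexities={"I": 4, "P": 2, "B": 1},
    frame_counts={"I": 1, "P": 3, "B": 8},
)
```

With this rule the budgets always sum to TotalBits, and a picture type with higher encoding difficulty receives more bits per frame.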

The code amount control circuit 219 restricts the target code amount Budget set as described above so that the VBV (Video Buffering Verifier) buffer, a stream buffer that virtually simulates the decoder, neither overflows nor underflows.
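
The restriction can be sketched with a simplified constant-rate buffer model. This is not the normative VBV definition; the parameter names and the model, in which the buffer gains `bits_per_frame` each frame period and loses one frame's bits at decode time, are assumptions.

```python
def clamp_budget(budget, fullness, vbv_size, bits_per_frame):
    """Limit a frame's target bits so the modeled decoder buffer stays in range."""
    # A frame larger than the current fullness would not have arrived in time (underflow).
    max_bits = fullness
    # A frame this small or smaller would let the refill exceed the buffer size (overflow).
    min_bits = max(0, fullness + bits_per_frame - vbv_size)
    return min(max(budget, min_bits), max_bits)

def update_fullness(fullness, frame_bits, bits_per_frame):
    """Remove the decoded frame's bits, then add the bits received during one frame period."""
    return fullness - frame_bits + bits_per_frame
```

For example, with a 1000-bit buffer holding 600 bits and 100 bits arriving per frame, a requested budget of 800 bits is reduced to 600.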

The quantization scale and the output code amount are, in general, roughly inversely proportional. Using this relationship, the quantization scale value for each macroblock in a picture is calculated from the target code amount Budget for each picture type, and quantization is performed. The quantization scale is then varied block by block to approach the target code amount, which keeps the output image bitstream within the target code amount.
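
The inverse relationship can be sketched as follows: if the product of bits and quantization scale is roughly constant for a picture type, a starting scale follows from the budget, and the scale can then be nudged as macroblocks are coded. The clamping range of 1 to 31, the proportional update, and the function names are assumptions, not the control law of the apparatus.

```python
def initial_qscale(prev_bits, prev_avg_q, budget, q_min=1, q_max=31):
    """Pick a starting scale assuming bits * scale stays roughly constant."""
    q = prev_bits * prev_avg_q / budget
    return min(max(round(q), q_min), q_max)

def adjust_qscale(q, bits_so_far, blocks_done, total_blocks, budget,
                  q_min=1, q_max=31):
    """Coarsen the scale when running over budget, refine it when under."""
    if blocks_done == 0:
        return q
    expected = budget * blocks_done / total_blocks  # bits we planned to spend so far
    scaled = q * bits_so_far / expected
    return min(max(round(scaled), q_min), q_max)
```

For instance, a picture type that last used 400000 bits at an average scale of 10 would start at scale 20 for a budget of 200000 bits.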

As described above, a method that compresses information by exploiting the spatial and temporal correlation of the image signal cannot achieve high coding efficiency in scenes with high encoding difficulty. To keep the amount of information within a fixed target code amount, such scenes must be quantized with a coarse quantization scale, which increases image degradation when the image signal is reconstructed.

To suppress such degradation, the MPEG2 standard permits variable transfer rate (VBR) encoding: when an image whose encoded information amount is known is stored on a recording medium, the encoding rate is varied according to the encoding difficulty, within the maximum transfer rate of the recording medium.

また、符号化された情報量が既知ではなく、将来の場面（シーン）が予測できないリアルタイムの入力信号を記録媒体に格納するときのＶＢＲ符号化処理の制御方法に関しては、例えば特許文献１に挙げられている方法がある。この方法では、設定された平均符号化レートに対し、復元された画像信号の劣化が目立たないレベルの最低量子化スケールが設定され、符号化情報の出力が平均符号化レートに満たない部分は以降の符号化処理に割り当てられる。この処理により、符号化難易度の高いシーンに対する耐性が高められ、画像の劣化が抑えられる。 For example, Patent Document 1 discloses a control method of VBR encoding processing when a real-time input signal in which the amount of encoded information is not known and a future scene (scene) cannot be predicted is stored in a recording medium. There is a method that has been. In this method, a minimum quantization scale is set to a level at which deterioration of the restored image signal is not conspicuous with respect to the set average encoding rate, and portions where the output of the encoding information does not satisfy the average encoding rate are Assigned to the encoding process. By this processing, resistance to a scene with a high degree of difficulty in encoding is increased, and image degradation is suppressed.

Patent Document 2 describes a method of varying the encoding rate program by program when television programs are recorded on a recording medium. This method exploits the fact that encoding difficulty tends to follow the program content in a broad way. During processing, program genre information is acquired from an electronic program guide (hereinafter "EPG"), a preferred encoding rate is looked up in a table on the basis of this information, and encoding is performed at that rate.

These methods make it possible to omit wasteful coded information relative to the set encoding rate and to use content information that can be recognized in advance, so that global control can be performed.
Patent Document 1: Japanese Patent No. 3307367
Patent Document 2: JP 2002-44604 A

However, the methods described in Patent Document 1 and Patent Document 2 do not improve the encoding of scenes whose encoding difficulty rises momentarily. They have the drawback that neither a method of raising the temporal correlation according to the scene nor a method of reducing information according to the image characteristics is considered.

Specifically, a high-efficiency encoding / recording apparatus such as that of Patent Document 1 is configured so that it cannot know in advance the data that will be input in the future. When a scene with momentarily high encoding difficulty is input, it is therefore difficult to raise the compression ratio while limiting the loss of image quality.

A high-efficiency encoding / recording apparatus such as that of Patent Document 2 makes one setting per program depending on the program's genre information, and is therefore not effective when programs of a single genre are recorded over an entire recording medium. Furthermore, because each scene within a program classified by genre has its own characteristics, applying the same control throughout one program does not give a sufficient effect; for example, when a scene with momentarily high encoding difficulty is input, control matched to that scene is not possible. In addition, genre information cannot be acquired for an input source that has no EPG information, so effective processing is not possible in that case either.

The present invention has been made in view of the above circumstances, and its object is to provide a high-efficiency encoding / recording apparatus that performs optimum encoding with a small code amount by controlling encoding scene by scene.

To achieve the above object, the high-efficiency encoding / recording apparatus according to claim 1 compresses the digital image data of an input video and stores it on a recording medium by applying encoding processing controlled by encoding control parameters, and comprises: genre information acquisition means for acquiring information on the genre of the video to be encoded, based on electronic program guide information attached to the digital image data or on genre prediction information calculated from the digital image data; image characteristic calculation means for calculating from the digital image data, as image characteristic information, at least one of information on the spatial correlation of the video information, information on its temporal correlation, information on the luminance level, and information on the color difference level; scene change point detection means for detecting from the digital image data, based on the calculated image characteristic information, scene delimiter information on scene change points at which the scene of the video changes; scene classification means for dividing the digital image data into scenes according to the detected scene delimiter information and calculating scene characteristic information based on the image characteristic information of each divided scene; encoding result information acquisition means for acquiring encoding result information on the code amount of the digital image data of the video that has already been encoded; encoding control parameter correction means for calculating a correction value for the encoding control parameters based on the encoding result information and a scene identification signal calculated from the genre information and the scene characteristic information, and correcting the encoding control parameters with the calculated correction value; and encoding means for encoding the digital image data scene by scene according to the encoding control parameters set for each divided scene.

Claim 2 is the high-efficiency encoding / recording apparatus according to claim 1, wherein the image characteristic calculation means calculates, as the image characteristic information, at least one of: an average luminance level value calculated for each frame of the digital image data; an average color difference level value calculated for each frame of the digital image data; a total of the absolute differences between adjacent pixels within a frame, calculated for each frame of the digital image data; and a total of the absolute differences between pixels at the same position in two consecutive frames of the digital image data.

Claim 3 is the high-efficiency encoding / recording apparatus according to claim 1 or 2, wherein the scene change point detection processing means detects an instantaneous scene change of the video or a scene change that takes place over a continuous period.

Claim 4 is the high-efficiency encoding / recording apparatus according to any one of claims 1 to 3, wherein the scene classification means calculates, as the scene characteristic information, at least one of: an average luminance level value calculated for each frame of the digital image data; an average color difference level value calculated for each frame of the digital image data; a total of the absolute differences between adjacent pixels within a frame, calculated for each frame of the digital image data; and a total of the absolute differences between pixels at the same position in two consecutive frames of the digital image data; and wherein the encoding result information acquisition means calculates, as the encoding result information, at least one of: a total of the motion vector distances calculated for each frame on which motion compensated prediction was performed; and the information amount of the encoded digital image data calculated for each frame of the digital image data.

Claim 5 is the high-efficiency encoding / recording apparatus according to any one of claims 1 to 4, wherein the encoding control parameter correction means outputs a correction value for at least one of the following parameters that control the encoding processing: a value indicating the motion vector detection range; a value specifying the interval at which frames serving as reference frames for motion vector detection are inserted; a value controlling the target code amount for each frame; a value controlling the maximum instantaneous encoding rate; quantization matrix values for quantizing the luminance signal; and quantization matrix values for quantizing the color difference signals.

Claim 6 is the high-efficiency encoding / recording apparatus according to any one of claims 1 to 5, wherein the genre information acquired by the genre information acquisition means includes information acquired from the electronic program guide information attached to the digital image data, and information acquired by searching text data obtained from the electronic program guide information for keyword information set in advance to identify the genre.

Claim 7 is the high-efficiency encoding / recording apparatus according to any one of claims 1 to 5, further comprising frequency recording means for cumulatively adding and recording the appearance frequency of scene identification signals, wherein, when information on the genre of the video cannot be acquired from the electronic program guide information, the genre information acquisition means calculates a scene identification signal for every genre based on the scene characteristic information acquired from the scene classification means, selects from the calculated scene identification signals the one with the highest cumulative appearance frequency, and thereby calculates the genre prediction information.

According to the high-efficiency encoding / recording apparatus of the present invention, when the digital image data of a video is recorded, appropriate encoding control can be applied to momentarily input scenes that have high encoding difficulty or distinctive characteristics. Digital image data can thus be recorded efficiently on a recording medium of limited capacity while the coding quality is maintained.

<Configuration of high-efficiency encoding / recording apparatus 10>
The configuration of the high-efficiency encoding / recording apparatus 10 according to the first embodiment of the present invention will be described with reference to FIG. 1. The high-efficiency encoding / recording apparatus 10 of this embodiment has an EPG information input terminal 101, a program information acquisition circuit 102, a genre / keyword search circuit 103, a scene information database (recording medium) 104, an image characteristic calculation circuit 105, a scene detection circuit 106, a scene classification circuit 107, a database management circuit 108, an encoding control parameter correction circuit 109, an encoding syntax control circuit 110, a genre prediction circuit 111, a target image input terminal 201, an input image memory 202, a two-dimensional block data conversion circuit 203, a subtractor 204, an orthogonal transform circuit 205, a quantization circuit 206, an encoding circuit 207, an encoding table 208, a multiplexer 209, an image bitstream buffer 210, a recording medium or transmission path 211, an inverse quantization circuit 212, an inverse orthogonal transform circuit 213, an adder 214, a deblocking circuit 215, a reference image memory 216, a motion vector detection circuit 217, a motion compensation prediction circuit 218, and a code amount control circuit 219. Of these, the components listed from the target image input terminal 201 onward are the same as those of the conventional image encoding / recording apparatus shown in FIG. 12, and their description is omitted.

The EPG information input terminal 101 receives EPG information and sends it to the program information acquisition circuit 102. EPG information can be obtained from information transmitted together with synchronization data in the blanking interval of the TV video signal on a specific channel at a specific time in terrestrial analog broadcasting, or from the SI (program arrangement information) that is sent periodically, with specific identification information, within the TS (transport stream), which is the packetized coded data of terrestrial and BS digital broadcasting.

The program information acquisition circuit 102 acquires, from the received EPG information, the genre information of the program being processed and text data describing the program content, and sends them to the genre / keyword search circuit 103.

The genre / keyword search circuit 103 receives the genre information and the text data describing the program content, reads the keyword information stored in the scene information database 104 described later, and searches the text data for the keywords contained in the keyword information. The genre information ID of the keyword information extracted by the search and the genre information received from the program information acquisition circuit 102 are sent to the scene classification circuit 107. If neither the genre information nor data from keyword extraction can be acquired, for example because EPG information is unavailable, information indicating that acquisition was not possible is sent to the scene classification circuit 107 and the genre prediction circuit 111.

The scene information database 104 stores keyword information consisting of main genre information, sub-genre information serving as keywords that indicate program details for subdividing the main genre information, and the genre information ID corresponding to each item of sub-genre information. For each genre information ID it also stores scene identification signals, which are data used to set the encoding control parameters preset for encoding control. For each genre ID it further stores data in which the appearance frequency has been cumulatively added.

The image characteristic calculation circuit 105 acquires the digital image data to be encoded from the input image memory 202, calculates for each frame the image characteristic information on the spatial or temporal correlation of the image information and on the luminance and color difference levels, and sends it to the scene detection circuit 106.

In this embodiment, the following are calculated as the image characteristic information: the average luminance level value calculated for each frame of the digital image data; the average color difference level value calculated for each frame of the digital image data; the total of the absolute differences between adjacent pixels within a frame, calculated for each frame of the digital image data; and the total of the absolute differences between pixels at the same position in two consecutive frames of the digital image data.

The scene detection circuit 106 uses the received image characteristic information to detect scene change points. Based on the result, it creates scene delimiter information indicating whether the frame is one at which a scene change point was detected, and sends it to the scene classification circuit 107. The image characteristic information is sent to the scene classification circuit 107 at the same time.

The scene classification circuit 107 recognizes scene boundaries from the scene delimiter information received from the scene detection circuit 106. It also calculates the average, over one scene interval, of the image characteristic information received from the scene detection circuit 106 (hereinafter "scene characteristic information") and sends it, together with the genre information ID received from the genre / keyword search circuit 103, to the database management circuit 108.
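The per-scene averaging performed by the scene classification circuit 107 can be sketched as follows. This is a minimal illustration in Python, not the circuit's actual implementation: the function name `scene_characteristics`, the dictionary representation of the per-frame image characteristic information, and the use of a boolean flag per frame for the scene delimiter information are all assumptions made for the example.

```python
def scene_characteristics(frame_features, scene_breaks):
    """Average per-frame features over each scene.

    frame_features: list of dicts, one per frame (hypothetical layout),
                    e.g. {"LDC": 118.0, "FAct": 5200.0}
    scene_breaks:   list of bools, True on a frame where a new scene starts
    """
    scenes, current = [], []
    for features, is_break in zip(frame_features, scene_breaks):
        if is_break and current:      # close the scene that just ended
            scenes.append(current)
            current = []
        current.append(features)
    if current:                       # close the final scene
        scenes.append(current)
    # One averaged feature dict per scene.
    return [
        {key: sum(f[key] for f in scene) / len(scene) for key in scene[0]}
        for scene in scenes
    ]
```

Each returned dictionary plays the role of the scene characteristic information for one scene interval.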

The database management circuit 108 creates a scene identification signal from the scene characteristic information and from the encoding result information on encoding difficulty and motion vector distance acquired from the code amount control circuit 219. Using this scene identification signal and the genre information ID received from the scene classification circuit 107, it accesses the scene information database 104, acquires encoding control parameter correction data, and sends the data to the encoding control parameter correction circuit 109.

The encoding control parameter correction circuit 109 compares the encoding parameters currently in use with the received encoding control parameter correction data. If it judges that they differ, it sends the encoding control parameter correction data to the encoding syntax control circuit 110 so that the relevant processing modules are corrected.
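The comparison step can be pictured with the short Python sketch below. It assumes, purely for illustration, that the current parameters and the correction data are key/value maps; the parameter names are hypothetical labels loosely based on the parameters listed for claim 5 and do not come from the source.

```python
def parameters_to_correct(current: dict, correction: dict) -> dict:
    """Return only the correction entries that differ from the current settings."""
    return {
        name: value
        for name, value in correction.items()
        if current.get(name) != value
    }


# Hypothetical parameter names and values, for illustration only.
current = {"motion_search_range": 16, "reference_frame_interval": 3}
correction = {"motion_search_range": 32, "reference_frame_interval": 3}
changed = parameters_to_correct(current, correction)
# changed == {"motion_search_range": 32}; an empty dict means nothing is forwarded
```

Forwarding only the entries that differ keeps the processing modules untouched when a new scene calls for the same settings as the previous one.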

The encoding syntax control circuit 110 sends the encoding control parameter correction data to the processing modules to be controlled, according to the received data.

The genre prediction circuit 111 acquires the scene delimiter information from the scene detection circuit 106 and, when a genre information non-acquisition flag is present, creates predicted genre information and sends it to the scene classification circuit 107.
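One way to read the prediction described for claim 7 (select, among the scene identification signals calculated for all genres, the one with the highest cumulative appearance frequency) is sketched below in Python. The data layout is an assumption: `candidate_signals` maps each genre information ID to the scene identification signal calculated for it, and `frequency` stands in for the cumulative counts held in the scene information database 104. Neither name appears in the source.

```python
def predict_genre(candidate_signals: dict, frequency: dict):
    """Pick the genre ID whose candidate signal has the highest cumulative count.

    candidate_signals: {genre_id: scene_identification_signal}   (assumed layout)
    frequency:         {(genre_id, signal): cumulative_count}    (assumed layout)
    """
    if not candidate_signals:
        return None
    return max(
        candidate_signals,
        key=lambda gid: frequency.get((gid, candidate_signals[gid]), 0),
    )
```

A signal that has never been observed counts as zero, so genres with recorded history are preferred over unseen ones.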

<Operation of high-efficiency encoding / recording apparatus 10>
The operation of the high-efficiency encoding / recording apparatus 10 according to the first embodiment of the present invention will now be described.

Of the operations of the high-efficiency encoding / recording apparatus 10 in this embodiment, the flow of the digital image data of the video to be encoded is the same as in the conventional image encoding / recording apparatus 20, and its description is omitted.

First, separately from the input of the digital image data to be encoded, EPG information is input from the EPG information input terminal 101. As an example of the acquired EPG information, FIG. 2 shows the general content of the EPG information transmitted in digital broadcasting.

Next, from the EPG information shown in FIG. 2, the program information acquisition circuit 102 acquires the "program name", "program description", "genre", and "program detailed information" of the program being encoded as information for identifying the genre, and sends them to the genre / keyword search circuit 103.

The genre / keyword search circuit 103 creates genre information from the "genre" item of the information received from the program information acquisition circuit 102. It also acquires the keyword information from the scene information database 104 and searches the "program name", "program description", and "program detailed information" received from the program information acquisition circuit 102 for the keywords contained in the keyword information.

FIG. 3 shows an example of the data structure of the keyword information stored in the scene information database 104. The keyword information consists of main genre information, sub-genre information serving as keywords that indicate program details for subdividing the main genre information, and the genre information ID corresponding to each item of sub-genre information.

The genre information created by the genre / keyword search circuit 103 and the genre information ID of the keyword information extracted by the search are sent to the scene classification circuit 107. If the genre / keyword search circuit 103 can acquire neither the genre information nor a genre information ID from keyword extraction, for example because EPG information is unavailable, it sends information indicating that genre information could not be acquired to the scene classification circuit 107.
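The keyword search flow above can be sketched in Python as follows. The table contents are invented for illustration, since FIG. 3 is not reproduced in this text; the names `KEYWORD_TABLE` and `search_genre` are likewise hypothetical, and matching by case-insensitive substring is an assumption about how the search is carried out.

```python
from typing import List, Optional, Tuple

# Hypothetical keyword information: (main genre, sub-genre keyword, genre information ID).
KEYWORD_TABLE = [
    ("sports", "soccer", 101),
    ("sports", "baseball", 102),
    ("music", "live", 201),
]


def search_genre(genre: Optional[str],
                 texts: List[str]) -> Tuple[Optional[str], Optional[int], bool]:
    """Search EPG text fields for a sub-genre keyword.

    Returns (genre, genre_id, unavailable). `unavailable` is True only when
    neither the EPG genre nor a keyword match could be obtained.
    """
    joined = " ".join(texts).lower()
    genre_id = next(
        (gid for _main, keyword, gid in KEYWORD_TABLE if keyword in joined),
        None,
    )
    unavailable = genre is None and genre_id is None
    return genre, genre_id, unavailable
```

Here `texts` would hold the "program name", "program description", and "program detailed information" fields, and the third return value corresponds to the information indicating that genre information could not be acquired.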

Meanwhile, the digital image data to be encoded that has been input to the input image memory 202 is sent to the image characteristic calculation circuit 105, which calculates the image characteristic information, that is, information on the spatial or temporal correlation of the image information and on the luminance and color difference levels.

The calculation of the image characteristic information in the image characteristic calculation circuit 105 will now be described. Specifically, the calculated image characteristic information consists of the frame average luminance information LDC, the frame average color difference information CBDC and CRDC, the frame total FAct of the absolute differences between adjacent pixels within a frame, and the total FDfiff of the absolute differences between pixels at the same screen position in consecutive frames.

The frame average luminance information LDC (hereinafter "frame luminance DC") is calculated by equation (5), where luma( ) denotes the level of the luminance signal.

The frame average color difference information CBDC and CRDC is calculated by equations (6) and (7), where cb( ) and cr( ) denote the levels of the color difference signals.

For the frame total FAct of the absolute differences between adjacent pixels within a frame (hereinafter "frame activity"), the in-plane correlation Activity between the pixels joined by arrows in FIG. 4 is first calculated for each 8 × 8 pixel unit on which DCT processing is performed. The frame total of the frame activity is then obtained by summing this in-plane correlation Activity over the frame.

This in-plane correlation Activity is calculated by equation (8).

The frame activity FAct, the frame total of this in-plane correlation Activity, is calculated by equation (9).

The total FDfiff of the absolute differences between pixels at the same screen position in consecutive frames (hereinafter "frame difference") is calculated by equation (10), where prev_luma( ) denotes the luminance component of the frame immediately preceding the input image frame to be encoded, and prev_cb( ) and prev_cr( ) denote the color difference components of that preceding frame.
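Equations (5) to (10) are referred to above but are not reproduced in this text. The Python sketch below shows one plausible reading of the verbal definitions, under stated assumptions: each plane is a list of rows of integer samples; the frame averages are plain means over the plane; the in-plane Activity of an 8 × 8 block sums the absolute differences of horizontally and vertically adjacent samples inside the block (the exact pixel pairs are defined by FIG. 4, which is not available here); and the frame difference sums over the luminance and both color difference planes. All function names are hypothetical.

```python
def frame_mean(plane):
    """Average sample level of one plane (assumed reading of LDC, CBDC, CRDC)."""
    total = sum(sum(row) for row in plane)
    count = sum(len(row) for row in plane)
    return total / count


def block_activity(plane, top, left, size=8):
    """Sum of |differences| between adjacent samples inside one block.

    Assumption: horizontal and vertical neighbours are both used.
    """
    act = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            if x + 1 < left + size:
                act += abs(plane[y][x] - plane[y][x + 1])
            if y + 1 < top + size:
                act += abs(plane[y][x] - plane[y + 1][x])
    return act


def frame_activity(plane, size=8):
    """Frame total of the block activities (assumed reading of FAct)."""
    height, width = len(plane), len(plane[0])
    return sum(
        block_activity(plane, top, left, size)
        for top in range(0, height - size + 1, size)
        for left in range(0, width - size + 1, size)
    )


def frame_difference(cur_planes, prev_planes):
    """Sum of |cur - prev| over co-located samples of all planes (assumed FDfiff)."""
    total = 0
    for cur, prev in zip(cur_planes, prev_planes):
        for cur_row, prev_row in zip(cur, prev):
            total += sum(abs(c - p) for c, p in zip(cur_row, prev_row))
    return total
```

For a real frame, `cur_planes` would be the luma, cb, and cr planes of the input frame and `prev_planes` those of the preceding frame.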

The image characteristic information calculated in this way by the image characteristic calculation circuit 105 is sent to the scene detection circuit 106. The scene detection circuit 106 uses the received image characteristic information to detect scene change points at which the scene switches. Two kinds of scene change point are detected: scene changes, in which the picture switches instantaneously, and states in which the picture changes through an overlap over a continuous period (including fade-in, in which the picture gradually becomes brighter, and fade-out, in which the picture gradually disappears).

These detection processes will be described with reference to FIG. 5. The processing uses the frame total FAct of the frame activity and the frame luminance DC component from the input image characteristic information.

Let FAct(N) be the most recently stored frame total of the frame activity, FAct(N-1) the value one frame earlier, and FAct(0) the value N frames earlier. Likewise, let LDC(N) be the latest frame luminance DC component, LDC(N-1) the value one frame earlier, and LDC(0) the value N frames earlier. I, J, K, and L are variables.

First, to detect a scene change, the inter-frame absolute differences of the frame activity totals and of the frame luminance DC components are calculated over the N frames. Because the inter-frame absolute differences for frames I = 0 to N-2 remain from the calculation made for the previous frame, it is first determined whether I is in the range 0 to N-2 (S1). If it is ("Yes" in S1), the processing of equations (11) and (12) is performed (S2).

When I = N-1 in step S1 ("No" in S1), the frame activity difference between frame I = N-1 and frame I = N is calculated by equation (13), and the inter-frame difference of the frame luminance DC component is likewise calculated by equation (14) (S3).

Next, to determine whether frame J (1 ≤ J ≤ N-2) is a scene switching point, the total of the frame activity values from I = J-K (0 ≤ K ≤ J) to I = N-2 is compared with the frame activity value of frame J using equation (15) (S4). If equation (15) is satisfied ("Yes" in S4), a scene change point is judged to have been detected, and frame J is output as the scene change point (S5).

If equation (15) is not satisfied ("No" in S4), the total of the difference values of the frame luminance DC component from I = J-K (0 ≤ K ≤ J) to I = N-2 is compared with the difference value of the frame luminance DC component of frame J using equation (16) (S6). If equation (16) is satisfied ("Yes" in S6), it is determined that a scene change point has been detected, and frame J is output as the scene change point (S6).

Here, the threshold is normally a value sufficiently larger than 0.5 and close to 1. If the condition of equation (16) is not satisfied either ("No" in S6), information indicating that no scene change point was detected is output (S7).
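Equations (15) and (16) themselves are not reproduced here. One reading consistent with a threshold "close to 1" is that frame J counts as a change point when its difference accounts for nearly all of the total difference in the window. The Python sketch below implements that reading; the ratio test, the function names, and the default threshold of 0.9 are assumptions, not the patent's formulas.

```python
def is_change_point(diffs, j, k, threshold):
    """Return True if the difference at frame j dominates its window.

    diffs     -- inter-frame absolute differences, diffs[i] for I = 0..N-1
    j, k      -- candidate frame J and look-back K, with 0 <= k <= j
    threshold -- ratio close to 1 (the text says well above 0.5)
    """
    n = len(diffs)
    window = diffs[j - k : n - 1]      # I = J-K .. N-2 inclusive
    total = sum(window)
    if total == 0:
        return False                   # flat window: nothing to detect
    return diffs[j] > threshold * total


def detect_scene_change(fact_diffs, ldc_diffs, j, k, threshold=0.9):
    """S4 then S6: test activity first, fall back to luminance DC."""
    if is_change_point(fact_diffs, j, k, threshold):    # S4 "Yes"
        return True
    return is_change_point(ldc_diffs, j, k, threshold)  # S6
```

Testing the luminance DC only when the activity test fails mirrors the S4 to S6 ordering in the text: a cut between two scenes of similar texture can still be caught by a jump in brightness.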

Next, to detect overlap scenes, in which one scene changes into the next while overlapping it, the change in frame activity over the N frames is measured, and it is checked whether the value increases or decreases uniformly. The specific processing is as follows.

Once scene change point detection is finished, overlap scene detection is performed for frames J = 0 to N-1 (S8) with L = J+1 to N-1 (S9). First, FAct(J) and FAct(N) are compared (S10) to determine whether equation (17) is satisfied.

If equation (17) is satisfied ("Yes" in S10), it is further determined whether equation (18) is satisfied (S11).

If, after the determination has been repeated up to N-1 (S12, S13), equation (18) is satisfied ("Yes" in S12), it is judged that the frames may be in a fade-in overlap state, and a temporary fade-in state is determined (S14).

If equation (17) is not satisfied ("No" in S10), it is determined whether equation (19) is satisfied (S15).

If, after the determination has been repeated up to N-1 (S16, S17), equation (19) is satisfied ("Yes" in S16), it is judged that the frames may be in a fade-out overlap state, and a temporary fade-out state is determined (S18).

If equation (18) is not satisfied ("No" in S11) or equation (19) is not satisfied ("No" in S15), the processing of steps S9 to S18 is repeated for frames J = 1 to M (M < N-1) (S19, "No" in S20). If, after repeating up to frame J = N-1, neither the temporary fade-in state nor the temporary fade-out state has been determined ("Yes" in S20), it is judged that no overlap scene was detected (S21).

If the temporary fade-in state was determined in step S14, or the temporary fade-out state was determined in step S18, the transition of the luminance DC component over the same interval is measured. For the luminance DC component, the determination takes an error γ into account through the following processing. The error γ is preferably about one tenth of the luminance DC difference across the N-J frames, LDC(N) - LDC(J).

The processing is repeated for L = J+1 to N-1 (S22). First, LDC(J) and LDC(N) are compared (S23) to determine whether equation (20) is satisfied.

If equation (20) is satisfied ("Yes" in S23), it is further determined whether equation (21) is satisfied (S24).

If, after the determination has been repeated up to N-1 (S25, S26), equation (21) is satisfied ("Yes" in S25) and the temporary fade-in state was determined in step S14, the frames are judged to be in a fade-in overlap state, and a fade-in is determined to have been detected (S27). If equation (21) is satisfied ("Yes" in S25) and the temporary fade-out state was determined in step S18, the frames are judged to be in a fade-out overlap state, and a fade-out is determined to have been detected (S28).

If equation (20) is not satisfied ("No" in S23), it is determined whether equation (22) is satisfied (S29).

This determination is repeated for frames L = J+1 to N-1 (S30, S31). If, after the determination has been repeated up to N-1, equation (22) is satisfied ("Yes" in S30) and the temporary fade-in state was determined in step S14, the frames are judged to be in a fade-in overlap state, and a fade-in is determined to have been detected (S27). If equation (22) is satisfied ("Yes" in S30) and the temporary fade-out state was determined in step S18, the frames are judged to be in a fade-out overlap state, and a fade-out is determined to have been detected (S28).

If equation (21) is not satisfied ("No" in S24) or equation (22) is not satisfied ("No" in S29), it is judged that no overlap scene was detected (S21).
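The two-stage overlap test can be sketched as follows. Equations (17) to (22) are not reproduced in this text, so the sketch assumes they check that the activity total moves steadily in one direction from frame J to frame N, and that the luminance DC also moves steadily in one direction, with steps against the trend tolerated up to γ. The function names are hypothetical, and γ is taken as one tenth of |LDC(N) - LDC(J)| as the text suggests.

```python
def monotonic_trend(values, tol=0.0):
    """Classify a sequence as 'up', 'down', or None.

    A step may move against the trend by at most `tol` and still count.
    """
    if len(values) < 2 or values[0] == values[-1]:
        return None
    if values[0] < values[-1]:
        ok = all(b >= a - tol for a, b in zip(values, values[1:]))
        return "up" if ok else None
    ok = all(b <= a + tol for a, b in zip(values, values[1:]))
    return "down" if ok else None


def detect_fade(fact, ldc, j):
    """Return 'fade-in', 'fade-out', or None for frames J..N.

    fact, ldc -- per-frame activity totals and luminance DC values,
                 index 0 oldest, last index (N) latest
    """
    trend = monotonic_trend(fact[j:])            # S10 to S18 (assumed form)
    if trend is None:
        return None                              # S21: no overlap scene
    gamma = abs(ldc[-1] - ldc[j]) / 10.0         # tolerance suggested in the text
    if monotonic_trend(ldc[j:], tol=gamma) is None:    # S22 to S31
        return None
    return "fade-in" if trend == "up" else "fade-out"  # S27 / S28
```

Rising activity is mapped to a fade-in and falling activity to a fade-out, matching the flag described for the scene break information; the direction of the luminance change is not used to pick between the two, only its steadiness.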

From the above results, the scene detection circuit 106 creates scene break information containing a flag indicating whether frame J is a frame in which a scene change point was detected, a flag indicating whether it is a frame in which an overlap scene was detected, and a flag indicating whether, in a frame in which an overlap scene was detected, the activity is rising (fade-in tendency) or falling (fade-out tendency), and sends it to the scene classification circuit 107. The scene detection circuit 106 likewise sends the image characteristic information to the scene classification circuit 107.

The scene classification circuit 107 recognizes scene breaks from the scene break information received from the scene detection circuit 106. When the information contains the flag indicating a frame in which an overlap scene was detected, the moment the flag disappears is recognized as the scene break, after which a new scene begins.

The scene classification circuit 107 also receives per-frame image characteristic information from the scene detection circuit 106 and the genre information ID from the genre/keyword search circuit 103. From the image characteristic information, the average over the interval in which one scene continues (hereinafter, "scene characteristic information") is calculated, combined with the genre information ID received from the genre/keyword search circuit 103, and sent to the database management circuit 108.

With P denoting the number of frames since the start of the scene, the scene characteristic information is calculated by equations (23) to (27).

While the scene is judged to be continuing, the scene characteristic information calculated above is updated with the image characteristic information of each newly input frame, and it is reset at a scene break.
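Equations (23) to (27) are not reproduced here, but a per-scene average that is refined frame by frame and reset at a scene break is naturally written as a running mean. The Python sketch below assumes exactly that; the class name and the per-frame keys `CBDC`, `CRDC`, and `FDiff` are hypothetical labels chosen to match the averaged names AvgCBDC, AvgCRDC, and AvgFDiff used later in the text.

```python
class SceneAverager:
    """Running per-scene averages of the per-frame image characteristics."""

    FIELDS = ("FAct", "LDC", "CBDC", "CRDC", "FDiff")

    def __init__(self):
        self.reset()

    def reset(self):
        """Call at a scene break: forget the previous scene."""
        self.p = 0                                   # frames since scene start
        self.avg = {f: 0.0 for f in self.FIELDS}

    def update(self, frame):
        """Fold one frame's characteristics into the averages."""
        self.p += 1
        for f in self.FIELDS:
            # Incremental mean: Avg_P = Avg_(P-1) + (x_P - Avg_(P-1)) / P
            self.avg[f] += (frame[f] - self.avg[f]) / self.p
        return {"Avg" + f: v for f, v in self.avg.items()}
```

The incremental form needs only the current average and the frame count P, so no per-frame history has to be kept for the length of the scene.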

Meanwhile, the code amount control circuit 219 uses equation (28) to calculate Complex, the average over the past M frames of the coding difficulty that is computed for each frame from the code amount Bitused of the output image bitstream and the average quantization scale AvgQ.

The code amount control circuit 219 likewise uses equation (29) to calculate AvgMV, the average over the past M frames of SumMV, the per-frame total of motion vector distances for predicted frames.

Complex and AvgMV, calculated in the code amount control circuit 219 by equations (28) and (29), are sent to the database management circuit 108 as encoding result information.
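Equations (28) and (29) are not reproduced in this text. A minimal sketch, assuming the per-frame coding difficulty is the product Bitused × AvgQ (the form used for picture complexity in MPEG-2 Test Model 5 rate control, which may differ from the patent's definition) and that both outputs are plain averages over the last M frames. The class and method names are hypothetical.

```python
from collections import deque


class EncodingResultStats:
    """Sliding averages over the last M frames."""

    def __init__(self, m):
        self.difficulty = deque(maxlen=m)   # oldest entries drop automatically
        self.sum_mv = deque(maxlen=m)

    def add_frame(self, bit_used, avg_q, sum_mv):
        # Assumed per-frame difficulty: bits spent times average quantizer scale.
        self.difficulty.append(bit_used * avg_q)
        self.sum_mv.append(sum_mv)

    def complexity(self):
        """Average difficulty (Complex in the text); 0.0 before any frame."""
        if not self.difficulty:
            return 0.0
        return sum(self.difficulty) / len(self.difficulty)

    def avg_mv(self):
        """Average motion vector total (AvgMV in the text)."""
        if not self.sum_mv:
            return 0.0
        return sum(self.sum_mv) / len(self.sum_mv)
```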

Based on the scene characteristic information received from the scene classification circuit 107 and the encoding result information received from the code amount control circuit 219, the database management circuit 108 acquires from the scene information database 104 the encoding control parameter correction data used to control encoding.

The operation by which the database management circuit 108 acquires parameter correction values from the scene information database 104 is described with reference to FIG. 6. FIG. 6 is a flowchart of the algorithm for this operation; R is a variable.

First, the database management circuit 108 reads from the scene information database 104, based on the genre information ID, the thresholds used to classify the scene characteristic information and the encoding result information. For a classification into N classes, (N-1) thresholds are stored in the scene information database 104 for each genre information ID. Thresholds are prepared for seven quantities of the scene characteristic information and encoding result information: AvgFAct, AvgLDC, AvgCBDC, AvgCRDC, AvgFDiff, Complex, and AvgMV. Each quantity is compared with its thresholds and classified. In this embodiment, AvgCBDC, AvgCRDC, and AvgMV are each classified into two classes, and AvgFAct, AvgLDC, AvgFDiff, and Complex into four classes.

Of these quantities, AvgFAct is classified first. The database management circuit 108 accesses the scene information database 104 and reads the three thresholds ε(R) (R = 0 to 2) for AvgFAct based on the genre information ID (S41). AvgFAct is then compared with ε(R), starting at R = 0 (S42, S43). If AvgFAct < ε(R) ("Yes" in S43), "R" is output (S44). This is repeated until R = 2 (S45, S46); if AvgFAct is still not below ε(2), that is, AvgFAct ≥ ε(2), "3" is output (S47).

The same classification is performed for AvgLDC, AvgCBDC, AvgCRDC, AvgFDiff, Complex, and AvgMV (S48 to S53). The output values are then bundled into a signal of 11 bits in total (hereinafter, "scene identification signal") (S54).
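The 11-bit total follows from the class counts: four quantities with four classes need 2 bits each (8 bits) and three quantities with two classes need 1 bit each (3 bits). The Python sketch below shows the threshold comparison of S41 to S47 and one possible packing. The bit order (first quantity in the most significant bits) and the function names are assumptions; the text states only the total width.

```python
# Order and class counts follow the text; the bit layout itself is assumed.
FEATURES = (
    ("AvgFAct", 4), ("AvgLDC", 4), ("AvgCBDC", 2), ("AvgCRDC", 2),
    ("AvgFDiff", 4), ("Complex", 4), ("AvgMV", 2),
)


def classify(value, thresholds):
    """Return the first R with value < thresholds[R], else len(thresholds)."""
    for r, eps in enumerate(thresholds):
        if value < eps:
            return r
    return len(thresholds)


def scene_id(values, thresholds):
    """Pack the seven class indices into one integer."""
    signal = 0
    for name, classes in FEATURES:
        bits = (classes - 1).bit_length()   # 4 classes -> 2 bits, 2 -> 1 bit
        signal = (signal << bits) | classify(values[name], thresholds[name])
    return signal
```

Here `thresholds[name]` is expected to hold one value fewer than the number of classes, in ascending order, matching the (N-1) thresholds stored per genre information ID.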

Using this scene identification signal and the genre information ID, the database management circuit 108 acquires the table data stored in the scene information database 104 as parameters for controlling the encoding process (hereinafter, "encoding control parameter correction data") (S55). FIG. 7 shows an example of the structure of this data. When the database management circuit 108 accesses this table in the scene information database 104, the following encoding parameters are acquired: MVMax, the motion vector detection range; SyntaxM, the value specifying the interval at which reference frames are inserted; A(T), the weighting multiplier of the target code length for each frame type; MaxRate, the maximum allocated rate in VBR encoding; TargetVBV, the parameter indicating the VBV buffer fullness toward which control is directed; QmatL, the quantization matrix values for the luminance signal; and QmatC, the quantization matrix values for the color difference signals.

The case in which "sports/soccer" is selected as the genre information in the above encoding parameter acquisition is described next.

When "sports/soccer" is selected as the genre information, the turf is recognized in each detected scene from its characteristic AvgLDC, AvgCBDC, and AvgCRDC values, and the degree of zoom is gauged from the magnitude of AvgFAct. If AvgFAct is large while the turf is not recognized, the scene is recognized as showing the spectator stands.

When the turf is recognized, the point of interest in the program is the movement of the players. In a long shot, the motion vector detection range MVMax is set wide in the horizontal direction so that the movement of players who appear small on screen is captured accurately. In a close-up, MVMax is set equally in the horizontal and vertical directions to handle momentary fast motion, MaxRate is set large, TargetVBV is set high, and the reference frame insertion interval SyntaxM is set short, which improves prediction efficiency.

When the spectator stands are recognized, on the other hand, MVMax is set smaller and MaxRate is set small so that a large code amount is not allocated momentarily. In addition, the allocation to I pictures is increased, and control is performed so that, for the finely detailed spectator stands, coding noise such as block noise is less likely to appear, with this taking priority over smoothness of motion.

Even when the scene characteristic information indicates turf as above, if, for example, "music/live" is selected as the keyword information and AvgFAct is high, the same control as for the spectator stands is applied. Because the audience is of low importance in this case, control is applied through the high-frequency entries of the luminance quantization matrix QmatL and the color difference quantization matrix QmatC, and coarse quantization is permitted.

Such encoding control parameter correction data is set for the many kinds of scenes that can be predicted from characteristic transitions of the scene identification signal, and is stored in the scene information database.

When a scene change point is detected because the scene has changed (S56), data for incrementing by 1 the appearance count Times of the entry managed under the last genre information ID and scene identification signal before the scene change is sent to the scene information database 104 and recorded (S57, S58).

Next, the encoding control parameter correction data is sent, together with the scene break information, from the database management circuit 108 to the encoding control parameter correction circuit 109. The encoding control parameter correction circuit 109 compares the encoding parameters currently in effect with the received correction data. If they are judged to differ, control signals for the processing modules concerned are created and sent to the encoding syntax control circuit 110. Specifically, when the scene changes, control signals reflecting the changed encoding parameters are sent to the encoding syntax control circuit 110, but when the scene has not changed and the encoding parameters do not vary greatly, no control signal is sent. In this embodiment, no control signal is sent when only one of the encoding parameters has changed.
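The comparison in the encoding control parameter correction circuit 109 can be sketched as a dictionary diff. The text does not say whether the "only one parameter changed" rule also applies at a scene change; this sketch assumes it applies only while the scene continues. The function name and the dict representation are hypothetical.

```python
def control_signals(current, received, scene_changed):
    """Return the parameters to push to the encoder, or {} for no signal.

    current, received -- dicts keyed by parameter name (MVMax, SyntaxM, ...)
    scene_changed     -- True when the scene break information marks a new scene
    """
    changed = {k: v for k, v in received.items() if current.get(k) != v}
    if scene_changed:
        return changed
    # Without a scene change, a single differing parameter is treated as
    # too small a variation to act on (the rule stated for this embodiment).
    if len(changed) <= 1:
        return {}
    return changed
```

Suppressing single-parameter updates within a scene avoids reconfiguring the encoder for small fluctuations in the classified scene characteristics.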

The encoding syntax control circuit 110 forwards each received control signal to the corresponding processing module. Specifically, MVMax is sent to the motion vector detection circuit 217; SyntaxM to the encoding circuit 207, the motion vector detection circuit 217, the motion compensation prediction circuit 218, and the code amount control circuit 219; A(T), MaxRate, and TargetVBV to the code amount control circuit 219; and QmatL and QmatC to the quantization circuit 206, the encoding circuit 207, and the inverse quantization circuit 212, where the control is carried out.
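The routing performed by the encoding syntax control circuit 110 amounts to a fixed table from parameter to destination circuits. In the Python sketch below the circuit reference numerals come from the text, while the dict layout and the `dispatch` helper are illustrative only.

```python
# Destination circuits per parameter, using the reference numerals in the text.
ROUTES = {
    "MVMax": (217,),
    "SyntaxM": (207, 217, 218, 219),
    "A(T)": (219,),
    "MaxRate": (219,),
    "TargetVBV": (219,),
    "QmatL": (206, 207, 212),
    "QmatC": (206, 207, 212),
}


def dispatch(signals):
    """Group parameter updates by destination circuit number."""
    per_circuit = {}
    for name, value in signals.items():
        for circuit in ROUTES[name]:
            per_circuit.setdefault(circuit, {})[name] = value
    return per_circuit
```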

Thus, according to this embodiment, because encoding control parameter correction data is set for the many kinds of scenes that can be predicted from characteristic transitions of the scene identification signal, encoding control can be switched to suitable settings when the scene changes. This makes possible dynamic control adapted to each scene, which was previously difficult.

The above describes the operation of the high-efficiency encoding/recording apparatus 10 when the program information acquisition circuit 102 was able to acquire EPG information. The case in which EPG information could not be acquired is described next.

When, for example, the digital image data to be encoded is not a TV program and is input from a source other than the tuner, the genre information and keyword information created from EPG information cannot be acquired. In such a case, genre information prediction is performed.

The genre information prediction is described with reference to FIG. 8. FIG. 8 is a flowchart showing the algorithm of the genre information prediction; H is a variable.

First, when the information contained in the EPG information is not acquired from the program information acquisition circuit 102, the genre/keyword search circuit 103 sends the genre prediction circuit 111 a genre-information-not-acquired flag to notify it that the information could not be acquired.

When scene break information is present in the scene detection circuit 106, the genre prediction circuit 111 receives it from the scene detection circuit 106 (S61). If the genre-information-not-acquired flag is set in the genre prediction circuit 111 at that time ("Yes" in S62), the genre prediction circuit 111 sends a genre acquisition request to the database management circuit 108 (S63).

The database management circuit 108 generates scene identification signals for all genres from the scene characteristic information acquired from the scene classification circuit 107. It then acquires the appearance frequency of each genre information ID associated with each generated scene identification signal (S64, S65, S66). The acquired appearance frequencies are then compared (S67), and the most frequently detected genre information ID and its appearance frequency are sent to the genre prediction circuit 111 (S68, "Yes" in S65).

If the received appearance frequency is Λ or more (S69), the genre prediction circuit 111 judges the genre information ID to be valid and sends it to the scene classification circuit 107 as predicted genre information (S70).
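The prediction in S64 to S70 can be read as an arg-max over the stored appearance counts followed by a threshold check. The Python sketch below assumes the counts are held in a dictionary keyed by (genre information ID, scene identification signal); that layout, the function name, and the parameter names are hypothetical.

```python
def predict_genre(times, scene_ids, min_count):
    """Pick the genre whose stored scene entry has appeared most often.

    times     -- dict mapping (genre_id, scene_id) to its appearance count Times
    scene_ids -- dict mapping genre_id to the scene identification signal
                 computed with that genre's thresholds
    min_count -- the acceptance threshold (written as Λ in the text)
    Returns the predicted genre_id, or None if no genre is frequent enough.
    """
    best_genre, best_count = None, -1
    for genre_id, scene_id in scene_ids.items():         # S64 to S66
        count = times.get((genre_id, scene_id), 0)
        if count > best_count:                           # S67
            best_genre, best_count = genre_id, count
    if best_genre is not None and best_count >= min_count:    # S69
        return best_genre                                     # S70
    return None
```

Because the counts are incremented each time a scene ends during normal recording, genres the user records often accumulate higher counts and are favored by this selection.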

The predicted genre information received by the scene classification circuit 107 is sent to the database management circuit 108, and the encoding control parameters are corrected based on it.

Thus, according to this embodiment, comparing the appearance frequencies of genre information IDs makes it possible, even when EPG information cannot be acquired, to predict the genre from the appearance probability of the genres associated with the relevant scene identification signal. In addition, the user's genre preferences are reflected in the selection, enabling effective predictive control.

In this embodiment, scene change points are detected using only the frame activity and the frame luminance DC component, but the frame color difference DC components can also be used.

Also, in this embodiment, the average of the image characteristic information for each scene is calculated as the scene characteristic information, but the calculation is not limited to this; a histogram of the image characteristic information within the scene may be taken and its representative value used as the scene characteristic information.

Also, this embodiment uses seven kinds of encoding parameters, but because scenes with strong features can be identified with fewer, the encoding parameters used for identification can be managed in a form specialized to the scenes that matter at encoding time. In that case, the amount of data stored in the database can be reduced while the necessary encoding control is still achieved. The kinds of image characteristic information and scene characteristic information to be calculated can be reduced accordingly.

Also, this embodiment has been described using a block diagram of the circuit configuration of the high-efficiency encoding/recording apparatus, but the same effects are obtained when these circuits are implemented in software on a computer or the like using the same processing algorithms.

Also, this embodiment describes an encoding apparatus conforming to the MPEG2 standard, but the invention is equally applicable to encoding/recording apparatuses using MPEG4 ASP or MPEG4 AVC, which likewise compress the amount of information by exploiting the correlation between adjacent pixels of the image signal (spatial direction) and the correlation between adjacent frames or adjacent fields (temporal direction), and the same effects are obtained. With MPEG4 AVC, the fineness of quantization can be set differently for the luminance signal and the color difference signals, so providing, as an encoding parameter, a value that controls the ratio between luminance and color difference quantization enables effective control.

FIG. 1 is a block diagram showing the high-efficiency encoding/recording apparatus according to the first embodiment of the present invention.
FIG. 2 is an explanatory diagram showing a display example of program information transmitted as EPG information and used by the high-efficiency encoding/recording apparatus according to the first embodiment.
FIG. 3 is an explanatory diagram showing an example of the data structure of the keyword information used by the high-efficiency encoding/recording apparatus according to the first embodiment.
FIG. 4 is an explanatory diagram showing the relationship between the region over which activity is calculated and the target pixels in the high-efficiency encoding/recording apparatus according to the first embodiment.
FIG. 5 is a flowchart showing the algorithm by which scene change points are detected in the high-efficiency encoding/recording apparatus according to the first embodiment.
FIG. 6 is a flowchart showing the algorithm by which encoding control parameter correction data is acquired in the high-efficiency encoding/recording apparatus according to the first embodiment.
FIG. 7 is an explanatory diagram showing the table contents of the encoding control parameter correction data stored in the scene information database of the high-efficiency encoding/recording apparatus according to the first embodiment.
FIG. 8 is a flowchart showing the algorithm by which predicted genre information is acquired in the high-efficiency encoding/recording apparatus according to the first embodiment.
FIG. 9 is a schematic diagram showing the coding structure used in conventional MPEG2 image encoding.
FIG. 10 is a schematic diagram showing the timing of reordering into encoding order during encoding in conventional MPEG2 image encoding.
FIG. 11 is a schematic diagram showing the timing of reordering from stream arrival order to decoded image output order during decoding in conventional MPEG2 image encoding.
FIG. 12 is a block diagram showing a conventional MPEG2 image encoding/recording apparatus.

Explanation of symbols

10 ... High-efficiency encoding / recording apparatus
20 ... Image encoding / recording apparatus
101 ... EPG information input terminal
102 ... Program information acquisition circuit
103 ... Genre / keyword search circuit
104 ... Scene information database
105 ... Image characteristic calculation circuit
106 ... Scene detection circuit
107 ... Scene classification circuit
108 ... Database management circuit
109 ... Encoding control parameter correction circuit
110 ... Encoding syntax control circuit
111 ... Genre prediction circuit
201 ... Target image input terminal
202 ... Input image memory
203 ... Two-dimensional block data conversion circuit
204 ... Subtractor
205 ... Orthogonal transform circuit
206 ... Quantization circuit
207 ... Encoding circuit
208 ... Encoding table
209 ... Multiplexer
210 ... Image bitstream buffer
211 ... Recording medium or transmission path
212 ... Inverse quantization circuit
213 ... Inverse orthogonal transform circuit
214 ... Adder
215 ... Deblocking circuit
216 ... Reference image memory
217 ... Motion vector detection circuit
218 ... Compensated prediction circuit
219 ... Code amount control circuit

Claims

A high-efficiency encoding / recording apparatus that compresses input digital image data of a video by performing an encoding process controlled by an encoding control parameter and stores the compressed data in a recording medium, the apparatus comprising:
genre information acquisition means for acquiring information related to the genre of the video to be encoded, based on electronic program guide information added to the digital image data or based on genre prediction information calculated from the digital image data;
image characteristic calculation means for calculating from the digital image data, as image characteristic information, at least one of information relating to spatial correlation of the video, information relating to temporal correlation, information relating to luminance level, and information relating to color difference level;
scene change point detection means for detecting from the digital image data, based on the calculated image characteristic information, scene break information about the scene change points at which the scene of the video changes;
scene classification means for dividing the digital image data into scenes according to the detected scene break information and calculating scene characteristic information for each divided scene based on the image characteristic information;
encoding result information acquisition means for acquiring encoding result information relating to the code amount of the digital image data that has already been encoded among the digital image data of the video;
encoding control parameter correction means for calculating a correction value of the encoding control parameter based on a scene identification signal calculated from the information related to the genre and the scene characteristic information and the encoding result information, and for correcting the encoding control parameter for each scene based on the calculated correction value; and
encoding means for encoding the digital image data for each of the divided scenes according to the encoding control parameter corrected for that scene.

The high-efficiency encoding / recording apparatus according to claim 1, wherein the image characteristic calculation means calculates, as the image characteristic information, at least one of:
an average luminance level value calculated for each frame of the digital image data;
an average color difference level value calculated for each frame of the digital image data;
a sum of absolute differences between adjacent pixels within a frame, calculated for each frame of the digital image data; and
a sum of absolute differences between pixels at the same position, calculated between two consecutive frames of the digital image data.
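
As an illustration only, the four per-frame values listed in this claim could be computed along the following lines. This is a minimal sketch, not the circuit described in the specification: the function names are hypothetical, a plane is assumed to be a list of rows of integer sample values, and "adjacent pixels" is assumed here to mean horizontal and vertical neighbours.

```python
from typing import Sequence

# Assumption: one plane (luminance or colour difference) is a list of rows of ints.
Plane = Sequence[Sequence[int]]


def mean_level(plane: Plane) -> float:
    """Average sample level of one plane (luminance or colour difference)."""
    total = sum(sum(row) for row in plane)
    count = sum(len(row) for row in plane)
    return total / count


def intra_frame_activity(plane: Plane) -> int:
    """Sum of absolute differences between adjacent pixels within one frame.

    Assumption: adjacency covers horizontal and vertical neighbours.
    """
    horizontal = sum(
        abs(row[x] - row[x + 1]) for row in plane for x in range(len(row) - 1)
    )
    vertical = sum(
        abs(plane[y][x] - plane[y + 1][x])
        for y in range(len(plane) - 1)
        for x in range(len(plane[y]))
    )
    return horizontal + vertical


def inter_frame_difference(previous: Plane, current: Plane) -> int:
    """Sum of absolute differences between co-located pixels of two frames."""
    return sum(
        abs(a - b)
        for prev_row, cur_row in zip(previous, current)
        for a, b in zip(prev_row, cur_row)
    )
```

A flat frame gives an activity of zero, so `intra_frame_activity` acts as a rough measure of spatial correlation, while `inter_frame_difference` plays the same role for temporal correlation.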

3. The high-efficiency encoding / recording apparatus according to claim 1, wherein the scene change point detection means detects an instantaneous scene change of the video or a scene change that takes place over a continuous period.
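
The distinction this claim draws between an instantaneous change and a change spread over a continuous period can be sketched as follows. The claim does not state the detection rule, so everything here is an assumption made for illustration: the function name, the two thresholds, the minimum run length, and the use of a per-transition inter-frame difference as the input.

```python
from typing import List, Sequence, Tuple

Change = Tuple[str, int, int]  # (kind, first index, last index)


def detect_scene_changes(
    diffs: Sequence[int],
    cut_threshold: int,
    gradual_threshold: int,
    min_run: int,
) -> List[Change]:
    """Classify frame transitions from a sequence of inter-frame differences.

    Hypothetical rule: a single value at or above cut_threshold is an
    instantaneous change; at least min_run consecutive values at or above
    gradual_threshold (but below cut_threshold) form a gradual change.
    """
    changes: List[Change] = []
    run_start = None  # index where the current run of moderate differences began

    def close_run(end: int) -> None:
        nonlocal run_start
        if run_start is not None and end - run_start + 1 >= min_run:
            changes.append(("gradual", run_start, end))
        run_start = None

    for i, d in enumerate(diffs):
        if d >= cut_threshold:
            close_run(i - 1)
            changes.append(("instant", i, i))
        elif d >= gradual_threshold:
            if run_start is None:
                run_start = i
        else:
            close_run(i - 1)
    close_run(len(diffs) - 1)
    return changes
```

Under these assumptions, a hard cut shows up as one isolated large difference, while a fade or dissolve shows up as a run of moderate differences.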

The high-efficiency encoding / recording apparatus according to claim 1, wherein
the scene classification means calculates, as the scene characteristic information, at least one of:
an average luminance level value calculated for each frame of the digital image data;
an average color difference level value calculated for each frame of the digital image data;
a sum of absolute differences between adjacent pixels within a frame, calculated for each frame of the digital image data; and
a sum of absolute differences between pixels at the same position, calculated between two consecutive frames of the digital image data;
and the encoding result information acquisition means calculates, as the encoding result information, at least one of:
a sum of motion vector distances calculated for each frame for which motion compensation prediction has been performed; and
the information amount of the encoded digital image data, calculated for each frame of the digital image data.

5. The high-efficiency encoding / recording apparatus according to claim 1, wherein the encoding control parameter correction means outputs a correction value for at least one of the following parameters for controlling the encoding process:
a value indicating the motion vector detection range;
a value specifying the interval at which frames serving as reference frames for motion vector detection are inserted;
a value for controlling the target code amount for each frame;
a value for controlling the maximum instantaneous encoding rate;
a quantization matrix value for quantizing the luminance signal; and
a quantization matrix value for quantizing the color difference signal.

The high-efficiency encoding / recording apparatus according to claim 1, wherein the information on the genre acquired by the program information acquisition means comprises:
information obtained from the electronic program guide information added to the digital image data; and
information obtained by searching text data obtained from the electronic program guide information with keyword information set in advance to specify a genre.

6. The high-efficiency encoding / recording apparatus according to the item, further comprising frequency recording means for accumulating and recording the appearance frequency of the scene identification signal,
wherein, when information about the genre of the video cannot be acquired from the electronic program guide information, the genre information acquisition means calculates scene identification signals for all genres based on the scene characteristic information acquired from the scene classification means, and calculates the genre prediction information by selecting, among the calculated scene identification signals, the scene identification signal with the highest cumulative appearance frequency.
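
The genre prediction described in this claim can be sketched as a frequency table lookup. This is a rough illustration under stated assumptions, not the circuit of the specification: the class name `GenrePredictor`, its methods, and the representation of a scene identification signal as a `(genre, scene_class)` pair are all hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple

# Assumption: a scene identification signal is modelled as (genre, scene_class).
SceneId = Tuple[str, Hashable]


class GenrePredictor:
    """Accumulates how often each scene identification signal has appeared."""

    def __init__(self, genres: Iterable[str]) -> None:
        self.genres = list(genres)
        self.frequency: Counter = Counter()

    def record(self, genre: str, scene_class: Hashable) -> None:
        """Count one appearance of a scene identification signal."""
        self.frequency[(genre, scene_class)] += 1

    def predict(self, scene_class: Hashable) -> str:
        """Predict a genre for a scene whose genre is unknown.

        A candidate signal is formed for every genre, and the genre of the
        candidate with the highest cumulative count is returned. Ties go to
        the genre listed first.
        """
        candidates = [(genre, scene_class) for genre in self.genres]
        best = max(candidates, key=lambda signal: self.frequency[signal])
        return best[0]
```

For example, if scenes with a given characteristic have mostly been recorded under one genre, a new scene with the same characteristic and no EPG genre is assigned that genre.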