JP2005295064A - Compression encoding method

Info

Publication number: JP2005295064A
Application number: JP2004105235A
Authority: JP
Inventors: Toru Toma; 當麻　　徹
Original assignee: Pioneer Electronic Corp
Current assignee: Pioneer Corp
Priority date: 2004-03-31
Filing date: 2004-03-31
Publication date: 2005-10-20

Abstract

PROBLEM TO BE SOLVED: To match the code amount generated in the second and subsequent passes to a target code amount with high precision in multi-pass compression encoding of two or more passes. SOLUTION: The compression encoding method comprises a step (a) of generating an encoded stream by applying an orthogonal transform, quantization, and entropy encoding to a moving image signal; a step (b) of setting a target code amount for each unit section based on the encoding result of step (a); and a step (c) of determining a quantization width corresponding to the target code amount with reference to a prediction graph of generated code amount versus quantization width, and then executing the processing of step (a). COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a high-efficiency encoding technique for compressing and encoding a moving image signal.

In recent years, "H.264" (ISO/IEC 14496-10) has attracted attention as a next-generation high-efficiency video coding standard. It was jointly developed by two standardization bodies: the VCEG (Video Coding Experts Group) of the ITU-T (International Telecommunication Union-Telecommunication Sector) and the MPEG (Moving Picture Experts Group) of ISO/IEC (International Organization for Standardization/International Electrotechnical Commission).

Two techniques for encoding moving image signals are known: variable bit rate encoding, which controls the bit rate of the encoded stream variably, and fixed bit rate encoding, which holds that bit rate constant. A two-pass variable bit rate encoding method determines a target code amount for each unit section based on the result of the first encoding (first pass). During the second compression encoding (second pass), it changes the quantization width according to the target code amount so that the code amount generated in each unit section approaches the target. A two-pass variable bit rate encoding technique is disclosed in, for example, Patent Document 1 (Japanese Patent Application Laid-Open No. 11-136673).
Patent Document 1: JP 11-136673 A

However, conventional two-pass compression encoding methods have the problem that the gap between the code amount generated by the second encoding pass and the target code amount determined from the first pass tends to widen. In particular, two-pass compression encoding based on H.264 requires more computation than conventional MPEG-based compression encoding, so it is difficult to predict the quantization width for a given target code amount, and the above problem is likely to occur.

In view of the above circumstances, the main object of the present invention is to provide a compression encoding method that, in multi-pass compression encoding of two or more passes, makes it possible to match the code amount generated in the second and subsequent passes to the target code amount with high accuracy.

To achieve the above object, the invention according to claim 1 is a multi-pass compression encoding method of two or more passes, comprising: (a) a step of generating an encoded stream by orthogonally transforming, quantizing, and entropy-encoding a moving image signal; (b) a step of setting a target code amount for each unit section based on the encoding result obtained in step (a); and (c) a step of determining a quantization width corresponding to the target code amount with reference to a prediction graph that predicts the generated code amount for a given quantization width, and then executing the processing of step (a).
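The three steps can be sketched as a loop. This is a minimal illustration rather than the patented implementation: the function names, the toy encoder (code amount inversely proportional to the quantization width), and the proportional rule used to set targets in step (b) are all assumptions introduced here.

```python
from typing import Callable, List


def multipass_encode(
    sections: List[float],
    encode: Callable[[float, float], float],
    choose_q: Callable[[float, float, float], float],
    total_budget: float,
    initial_q: float,
    passes: int = 2,
) -> List[float]:
    """Return the per-section code amounts produced by the final pass."""
    # Step (a), first pass: encode every unit section with the initial width.
    q_values = [initial_q] * len(sections)
    amounts = [encode(s, q) for s, q in zip(sections, q_values)]
    for _ in range(passes - 1):
        # Step (b): set per-section targets from the previous results.
        # Assumption: scale the previous amounts so they sum to the budget.
        scale = total_budget / sum(amounts)
        targets = [a * scale for a in amounts]
        # Step (c): pick a width per section from the prediction model,
        # then run step (a) again with those widths.
        q_values = [choose_q(q, a, t) for q, a, t in zip(q_values, amounts, targets)]
        amounts = [encode(s, q) for s, q in zip(sections, q_values)]
    return amounts


# Hypothetical stand-ins: "complexity" is an abstract per-section number.
def toy_encode(complexity: float, q: float) -> float:
    return complexity / q


def toy_choose_q(prev_q: float, prev_amount: float, target: float) -> float:
    # If the code amount is proportional to 1/q, the width scales by
    # prev_amount / target.
    return prev_q * prev_amount / target


result = multipass_encode([100.0, 300.0, 200.0], toy_encode, toy_choose_q,
                          total_budget=150.0, initial_q=2.0)
```

In this toy setup the second pass lands exactly on the budget because the encoder matches the prediction model perfectly. A real encoder would not, which is why the quality of the prediction graph and the option of further passes matter.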

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a block diagram schematically showing the configuration of an encoder 1 conforming to H.264. The encoder 1 includes a subtractor 10, a transform unit 11, a quantization unit 12, an entropy encoding unit 13, an inverse quantization unit 14, an inverse transform unit 15, an adder 16, a loop filter 17, a frame memory 18, an intra-frame prediction unit 19, a motion compensation unit 20, a motion prediction unit 21, a switch unit 22, and an encoding control unit 23. Before the multi-pass compression encoding method is described, the operation of the encoder 1 is first outlined.

The moving image signal consists of a plurality of temporally consecutive frame images and is input to the encoder 1. The subtractor 10 outputs the difference between the input frame image and the frame image supplied from the switch unit 22 to the transform unit 11. Under the control of the encoding control unit 23, the switch unit 22 connects the intra-frame prediction unit 19 to the subtractor 10 when intra-frame predictive coding is performed on an I picture (Intra-coded picture), and connects the motion compensation unit 20 to the subtractor 10 when inter-frame predictive coding is performed on a B picture (Bidirectionally predictive-coded picture) or a P picture (Predictive-coded picture).

The transform unit 11 applies an integer transform to the input signal in units of 4 × 4 pixels and outputs the transform coefficients to the quantization unit 12. The quantization unit 12 quantizes the input transform coefficients with the quantization width (quantization step width) Q supplied from the encoding control unit 23, and outputs the resulting quantized coefficients to both the entropy encoding unit 13 and the inverse quantization unit 14. Here, the larger the quantization width Q, the smaller the generated code amount, and the smaller the quantization width Q, the larger the generated code amount. The motion prediction unit 21 generates motion vectors with reference to two to five frame images input to the encoder 1 and supplies them to the entropy encoding unit 13, which generates an encoded bit stream by applying variable-length coding to the quantized coefficients based on the motion vectors.
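The inverse relationship between the quantization width Q and the generated code amount can be illustrated with a plain uniform quantizer. This is only a sketch: H.264 itself uses integer arithmetic driven by a quantization parameter, and the coefficient values and the use of the nonzero-level count as a stand-in for code amount are assumptions made for illustration.

```python
from typing import List


def quantize(coeffs: List[float], q: float) -> List[int]:
    """Uniform quantization: divide by the step width and round to nearest."""
    return [int(round(c / q)) for c in coeffs]


def nonzero_count(levels: List[int]) -> int:
    """Count the levels an entropy coder would actually have to represent."""
    return sum(1 for v in levels if v != 0)


# Hypothetical transform coefficients of one block, in scan order.
coeffs = [52.0, -31.0, 14.0, -9.0, 6.0, -4.0, 2.0, 1.0]

fine = quantize(coeffs, 2.0)     # small Q: most coefficients survive
coarse = quantize(coeffs, 16.0)  # large Q: small coefficients collapse to zero
```

With the larger step, `coarse` is `[3, -2, 1, -1, 0, 0, 0, 0]`: fewer nonzero levels with smaller magnitudes, which is what lets the entropy coder emit fewer bits.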

The inverse quantization unit 14 inversely quantizes the quantized coefficients input from the quantization unit 12, and the inverse transform unit 15 inversely transforms the transform coefficients input from the inverse quantization unit 14 to decode them into a frame image. This frame image is added to the frame image supplied from the switch unit 22, has its block noise reduced by the loop filter 17, and is then stored in the frame memory 18. The frame memory 18 supplies delayed frame images to the intra-frame prediction unit 19 and the motion compensation unit 20.

Next, a multi-pass encoding method using the encoder 1 is described. As shown in FIG. 2, in the first encoding process (first pass), the optimal encoding prediction unit 30 supplies the encoder 1 with a control signal that includes the quantization width Q and other parameters. Under the control of the optimal encoding prediction unit 30, the encoder 1 compresses and encodes the moving image signal to generate an encoded stream. In this pass, the optimal encoding prediction unit 30 specifies the quantization width Q for each frame image or each GOP (Group of Pictures) based on initial information.

The optimal encoding prediction unit 30 acquires feature amounts from the first encoding result. Examples of feature amounts include the coding type (intra-frame or inter-frame) of each frame image, the code amount generated for each frame image, the error between the original moving image signal before encoding and the moving image signal obtained by encoding and then decoding it, and the number of pixel blocks that were intra-frame coded.

Next, in the second encoding process (second pass), as shown in FIG. 3, the feature amount analysis unit 31 analyzes the feature amounts acquired in the first pass and divides the input moving image signal into scenes. Here, a scene means the group of frame images between one scene change and the next, a scene change being a point at which the correlation between frame images becomes markedly small. The feature amount analysis unit 31 also sets a weighting coefficient W, described later, based on the code amount generated for each video shot. The feature amount analysis unit 31 supplies the weighting coefficient W, video shot information, and the like to the prior code amount prediction unit 32.
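Scene division from first-pass features can be sketched as follows. The text does not specify the detection rule, so this example assumes one: a frame whose fraction of intra-coded blocks reaches a threshold is treated as the start of a new scene, on the reasoning that inter-frame prediction fails when correlation with earlier frames is low. The function name, the threshold value, and the input format are hypothetical.

```python
from typing import List, Tuple


def split_into_scenes(intra_ratio: List[float],
                      threshold: float = 0.6) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, end exclusive, one per scene.

    intra_ratio[i] is the fraction of blocks in frame i that the first pass
    coded as intra blocks (assumed to be measured on inter-coded frames).
    """
    if not intra_ratio:
        return []
    boundaries = [0]
    for i in range(1, len(intra_ratio)):
        # Assumed rule: many intra blocks means low correlation with the
        # preceding frames, i.e. a scene change at frame i.
        if intra_ratio[i] >= threshold:
            boundaries.append(i)
    boundaries.append(len(intra_ratio))
    return [(boundaries[k], boundaries[k + 1]) for k in range(len(boundaries) - 1)]


scenes = split_into_scenes([1.0, 0.1, 0.2, 0.9, 0.1, 0.05, 0.8, 0.1])
```

For the sample input this yields three scenes, `[(0, 3), (3, 6), (6, 8)]`, each of which would then serve as a unit section for target setting.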

The prior code amount prediction unit 32 predicts, based on the previous encoding result, the occupation amount (amount of accumulated data) of the buffer memory on the side that receives the encoded stream. FIG. 4 schematically illustrates the occupation amount of the receiving-side buffer memory before the feature amount analysis unit 31 analyzes the feature amounts. FIG. 5 schematically illustrates the occupation amount predicted by the prior code amount prediction unit 32. In the case of FIG. 4, the occupation amount exceeds the upper limit of the buffer memory (buffer upper limit), so the buffer memory fails. To avoid this, the occupation amount is predicted for each scene as shown in FIG. 5. To do so, the prior code amount prediction unit 32 calculates a predicted code amount for each scene using the prediction graph shown in FIG. 6. The horizontal axis of this prediction graph represents the quantization width x, and the vertical axis represents the generated code amount f(x) for the quantization width x. The predicted code amount P(n+1) is obtained by multiplying the previously generated code amount P(n) by the weighting coefficient W and by (f(Q) − f(Q+dQ)) / f(Q). That is, P(n+1) = P(n) × W × (f(Q) − f(Q+dQ)) / f(Q). The prediction result of the prior code amount prediction unit 32 is supplied to the optimal encoding prediction unit 30. The prediction graph is obtained by statistically processing the code amounts generated by encoding, which yields a relationship between quantization width and generated code amount such as that shown in FIG. 5.
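The prediction formula and the buffer check can be sketched in a few lines. The formula is implemented exactly as stated in the text. Everything else is an assumption made for illustration: the curve f(x) = 1000 / x, the numeric inputs, and the buffer model, in which each unit section adds its code amount and a fixed amount is drained.

```python
from typing import Callable, List


def predicted_code_amount(p_prev: float, w: float,
                          f: Callable[[float], float],
                          q: float, dq: float) -> float:
    """P(n+1) = P(n) * W * (f(Q) - f(Q + dQ)) / f(Q), as stated in the text."""
    return p_prev * w * (f(q) - f(q + dq)) / f(q)


def occupancy_trace(amounts: List[float], drain_per_section: float,
                    start: float = 0.0) -> List[float]:
    """Assumed buffer model: add the received code, subtract a fixed drain."""
    trace = []
    level = start
    for a in amounts:
        level = max(0.0, level + a - drain_per_section)
        trace.append(level)
    return trace


def exceeds_limit(trace: List[float], upper_limit: float) -> bool:
    """True if the occupation amount ever passes the buffer upper limit."""
    return any(level > upper_limit for level in trace)


def f(x: float) -> float:
    # Hypothetical prediction curve: code amount falls as the width grows.
    return 1000.0 / x


p_next = predicted_code_amount(p_prev=800.0, w=1.0, f=f, q=10.0, dq=10.0)
trace = occupancy_trace([300.0, 500.0, 200.0], drain_per_section=250.0)
```

Here `p_next` is 400.0 and `trace` is `[50.0, 300.0, 250.0]`, so a buffer with an upper limit of 280 would fail in the second section while one with a limit of 400 would not. Predicting this per scene before encoding is what allows the quantization width to be adjusted in advance.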

Based on the prediction result of the prior code amount prediction unit 32, the optimal encoding prediction unit 30 uses the prediction graph shown in FIG. 6 to determine the quantization width Q that matches the target code amount of each scene. The determined quantization width Q is supplied to the encoder 1. The optimal encoding prediction unit 30 also acquires feature amounts from the second encoding result. These feature amounts can be used in a third encoding process.
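Determining Q from the prediction graph amounts to reading the curve in reverse: given a target code amount, find the width that the curve maps to it. The sketch below assumes the graph is stored as a table of (quantization width, code amount) points and uses linear interpolation between them. The table format, the interpolation, and the clamping at the ends are illustrative choices, not details from the text.

```python
from typing import List, Tuple


def q_for_target(curve: List[Tuple[float, float]], target: float) -> float:
    """Invert a monotonically decreasing (q, code amount) table."""
    pts = sorted(curve)  # ascending q, so code amounts descend
    # Clamp when the target lies outside the tabulated range.
    if target >= pts[0][1]:
        return pts[0][0]
    if target <= pts[-1][1]:
        return pts[-1][0]
    for (q0, f0), (q1, f1) in zip(pts, pts[1:]):
        if f1 <= target <= f0:
            if f0 == f1:  # flat segment: any width in it gives the target
                return q0
            # Linear interpolation between the two neighbouring points.
            return q0 + (q1 - q0) * (f0 - target) / (f0 - f1)
    raise ValueError("curve must be monotonically decreasing")


# Hypothetical prediction graph samples.
graph = [(10.0, 1000.0), (20.0, 500.0), (40.0, 250.0)]
q = q_for_target(graph, 750.0)
```

For a target of 750 the lookup returns 15.0, halfway between the tabulated widths 10 and 20. Targets above or below the table are clamped to the smallest or largest tabulated width.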

As described above, according to the multi-pass compression encoding of this embodiment, the target code amount is calculated based on the prediction graph (FIG. 6), so the code amount generated in the second and subsequent passes can be matched to the target code amount with high accuracy. Furthermore, by repeating the above compression encoding process for three or more passes, the output can be made to converge quickly to a constant bit rate, so fixed bit rate encoding can be executed efficiently.

FIG. 1 is a block diagram schematically showing the configuration of an encoder conforming to H.264.
FIG. 2 is a diagram for explaining the first (first-pass) encoding process.
FIG. 3 is a diagram for explaining the second (second-pass) and subsequent encoding processes.
FIG. 4 is a diagram schematically illustrating the occupation amount of the buffer memory on the receiving side.
FIG. 5 is a diagram schematically illustrating the occupation amount predicted by the prior code amount prediction unit.
FIG. 6 is a diagram schematically showing a prediction graph that predicts the generated code amount for a given quantization width.

Explanation of symbols

1 Encoder
10 Subtractor
11 Transform unit
12 Quantization unit
13 Entropy encoding unit
14 Inverse quantization unit
15 Inverse transform unit
16 Adder
17 Loop filter
18 Frame memory
19 Intra-frame prediction unit
20 Motion compensation unit
21 Motion prediction unit
22 Switch unit
23 Encoding control unit

Claims

1. A multi-pass compression encoding method of two or more passes, comprising:
(a) a step of generating an encoded stream by orthogonally transforming, quantizing, and entropy-encoding a moving image signal;
(b) a step of setting a target code amount for each unit section based on the encoding result obtained in step (a); and
(c) a step of determining a quantization width corresponding to the target code amount with reference to a prediction graph that predicts the generated code amount for a given quantization width, and executing the processing of step (a).

2. The compression encoding method according to claim 1, further comprising:
a step of obtaining a feature amount based on the encoding result obtained in step (a); and
a step of determining the unit section based on the feature amount.

3. The compression encoding method according to claim 2, further comprising:
a step of predicting, based on the encoding result obtained in step (a), the occupation amount of a buffer memory on the receiving side that receives the encoded stream,
wherein step (c) includes a step of determining the quantization width according to the occupation amount.