WO2005081541A1

WO2005081541A1 - Image information encoding device and image information encoding method

Info

Publication number: WO2005081541A1
Application number: PCT/JP2005/001560
Authority: WO
Inventors: Toshiharu Tsuchiya; Kazushi Sato; Toru Wada; Yoichi Yagasaki; Makoto Yamada
Original assignee: Sony Corporation
Priority date: 2004-02-25
Filing date: 2005-01-27
Publication date: 2005-09-01
Also published as: US20070286281A1; KR20060127155A; CN1910933A; EP1746842A1; JP2005244503A; JP3879741B2

Abstract

There is provided an image information encoding device that outputs image compression information based on an image encoding method such as MPEG4/AVC. To judge whether a predetermined block should be coded in the skip mode or the spatial direct mode, motion vector information and the like must be calculated for all of the predetermined adjacent blocks. However, when parallel processing is performed to increase the overall processing speed, the motion vector information and the like may not yet be available for all of the predetermined adjacent blocks. In this case, so that the mode judgment can proceed without waiting for the motion vector information and the like of the adjacent blocks to be calculated, the judgment is performed using, in a pseudo manner, the motion vector information and the like of other nearby blocks in place of the adjacent blocks.

Description

Specification

TECHNICAL FIELD The present invention relates to an image information encoding apparatus and an image information encoding method.

More specifically, the present invention relates to an image information encoding device used when image information (a bit stream) compressed by an orthogonal transform, such as the discrete cosine transform or the Karhunen-Loève transform, and by motion compensation, as in MPEG (Moving Picture Experts Group) or H.26x, is received over a network medium such as satellite broadcasting, cable television, the Internet, or a mobile phone, or is processed on a storage medium such as an optical disk, a magnetic disk, or flash memory.

Background art

In recent years, image information has come to be handled digitally, and image information encoding and decoding devices compliant with methods such as MPEG, which exploit the redundancy inherent in image information and compress it by an orthogonal transform such as the discrete cosine transform and by motion compensation, are becoming widespread both in information distribution by broadcasting stations and in information reception in ordinary households.

In particular, MPEG2 (ISO (International Organization for Standardization) / IEC (International Electrotechnical Commission) 13818-2) is defined as a general-purpose image coding method. MPEG2 is a standard that covers both interlaced and progressive scan images, as well as standard-resolution and high-definition images, and is currently used in a wide range of professional and consumer applications. With the MPEG2 compression method, a high compression rate and good image quality can be realized by allocating a code rate (bit rate) of, for example, 4 to 8 Mbps (megabits per second) to a standard-resolution interlaced image of 720 × 480 pixels, and 18 to 22 Mbps to a high-resolution interlaced image of 1920 × 1088 pixels.

MPEG2 mainly targets high-quality coding suitable for broadcasting and does not support code amounts (bit rates) lower than those of MPEG1, that is, coding methods with a higher compression rate. With the spread of mobile terminals, the need for such a coding method is expected to grow, and the MPEG4 coding method was standardized in response. For its image coding method, the standard ISO/IEC 14496-2 was approved as an international standard in February 1998.

Furthermore, in recent years, standardization of H.26L (ITU (International Telecommunication Union)-T Q6/16 VCEG), originally intended for video coding for videoconferencing, has been progressing. H.26L is known to require more computation for encoding and decoding than conventional coding methods such as MPEG2 and MPEG4, but to achieve higher coding efficiency. In addition, as part of the MPEG4 activities, standardization that builds on H.26L and incorporates functions not supported by H.26L to achieve still higher coding efficiency has been carried out as the Joint Model of Enhanced-Compression Video Coding. In March 2003, the H.264/AVC (Advanced Video Coding) standard was recognized as an international standard. This standard is also referred to as MPEG-4 Part 10. In this specification, it is hereinafter referred to as AVC (the AVC standard) as appropriate. Reference 1 below describes the processing based on this standard.

"Draft Errata List with Revision-Marked Corrections for H.264/AVC", JVT-1050, Thomas Wiegand et al., Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG, 2003

Here, a conventional image information encoding device based on the AVC standard will be described with reference to the block diagram of FIG. 1. The image information encoding device 100 in FIG. 1 includes an A/D conversion unit 101, a screen rearrangement buffer 102, an adder 103, an orthogonal transformation unit 104, a quantization unit 105, a lossless encoding unit 106, an accumulation buffer 107, an inverse quantization unit 108, an inverse orthogonal transform unit 109, a deblocking filter 110, a frame memory 111, an intra prediction unit 112, a motion prediction / compensation unit 113, and a rate control unit 114.

The input signal (image signal) is first provided to the A / D converter 101, where it is converted to a digital signal. Next, the frames are rearranged in the screen rearrangement buffer 102 according to the GOP (Group of Pictures) structure of the image compression information to be output.

For an image to be encoded using intra coding, that is, encoding using a single frame, difference information between the input image and the pixel values generated by the intra prediction unit 112 is input to the orthogonal transformation unit 104, where an orthogonal transform such as the discrete cosine transform or the Karhunen-Loève transform is performed. The transform coefficient output from the orthogonal transformation unit 104 is provided to the quantization unit 105, where a quantization process is performed. The quantized transform coefficient output from the quantization unit 105 is sent to the lossless encoding unit 106, where lossless encoding such as variable-length encoding or arithmetic encoding is performed. Thereafter, the data is accumulated in the accumulation buffer 107 and output from the image information encoding device 100 as image compression information.

The behavior of the quantization unit 105 is controlled by the rate control unit 114. The quantized transform coefficient output from the quantization unit 105 is also input to the inverse quantization unit 108 and then subjected to inverse orthogonal transform processing in the inverse orthogonal transform unit 109 to become decoded image information. After block distortion is removed by the deblocking filter 110, the information is stored in the frame memory 111. In the intra prediction unit 112, information on the intra prediction mode applied to the block/macroblock is transmitted to the lossless encoding unit 106 and encoded as part of the header information in the image compression information.

On the other hand, for an image to be coded using inter coding, that is, coding using image information of a plurality of frames, the information of the image to be coded is first input to the motion prediction / compensation unit 113. At the same time, the image information of another frame to be referred to is input to the motion prediction / compensation unit 113 from the frame memory 111, where motion prediction / compensation processing is performed to generate reference image information. The adder 103 adds the reference image information, with its phase inverted, to the image information to obtain a difference signal. The motion prediction / compensation unit 113 simultaneously outputs the motion vector information to the lossless encoding unit 106, where it is also subjected to lossless encoding such as variable-length encoding or arithmetic encoding and inserted into the header of the image compression information. Other processing is the same as for intra coding.

Next, with reference to the block diagram of FIG. 2, an image information decoding device 120 for image information compressed by an orthogonal transform, such as the discrete cosine transform or the Karhunen-Loève transform, and by motion compensation will be described. The image information decoding device 120 is composed of an accumulation buffer 121, a lossless decoding unit 122, an inverse quantization unit 123, an inverse orthogonal transform unit 124, an adder 125, a screen rearrangement buffer 126, a D/A conversion unit 127, a frame memory 128, a motion prediction / compensation unit 129, an intra prediction unit 130, and a deblocking filter 131.

The input information (image compression information) is first stored in the accumulation buffer 121 and then transferred to the lossless decoding unit 122, where processing such as variable-length decoding or arithmetic decoding is performed based on the defined format of the image compression information. At the same time, if the frame is intra-coded, the lossless decoding unit 122 also decodes the intra prediction mode information stored in the header of the image compression information and transmits it to the intra prediction unit 130. If the frame is inter-coded, the lossless decoding unit 122 also decodes the motion vector information stored in the header of the image compression information and transfers it to the motion prediction / compensation unit 129.

The quantized transform coefficient output from the lossless decoding unit 122 is input to the inverse quantization unit 123, where it is output as a transform coefficient. The transform coefficient is subjected to a fourth-order inverse orthogonal transform in the inverse orthogonal transform unit 124 based on a predetermined method. If the frame is intra-coded, the image information that has undergone the inverse orthogonal transform processing and the predicted image generated in the intra prediction unit 130 are added by the adder 125. After block distortion is removed by the deblocking filter 131, the result is stored in the screen rearrangement buffer 126 and output after D/A conversion processing by the D/A conversion unit 127.

If the frame is inter-coded, the motion prediction / compensation unit 129 generates a reference image based on the motion vector information losslessly decoded by the lossless decoding unit 122 and the image information stored in the frame memory 128, and the reference image and the output of the inverse orthogonal transform unit 124 are combined in the adder 125. Other processing is the same as for an intra-coded frame.

In the image information encoding device 100 shown in FIG. 1, the motion prediction / compensation unit 113 plays an important role in realizing high compression efficiency. By introducing the following three methods, the AVC coding method achieves higher compression efficiency than conventional image coding methods such as MPEG2 and MPEG4.

That is, the first method is reference to multiple frames (Multiple Reference Frame), the second is motion prediction and compensation using a variable block size, and the third is motion compensation with 1/4-pixel accuracy.

The first method is reference to multiple frames. In the AVC coding method, one or more preceding frames can be referred to for motion prediction and compensation. In MPEG2 and MPEG4, only the immediately preceding frame is referred to during motion prediction and compensation. By referring to the immediately preceding frame, the frame to be encoded can be reproduced using only the motion vector representing the movement of a moving object and the difference data of the object image, so the compression ratio of the encoded data can be increased. However, if a plurality of frames are referred to, as in the AVC coding method, the difference data can be expected to be reduced further, and the compression ratio improves further. As shown in FIG. 3, multiple frames can be referred to in processing a macroblock belonging to one (current) frame. Such processing can be realized by storing preceding frames in the frame memory 111 for the motion prediction / compensation unit 113 of the image information encoding device 100, and in the frame memory 128 for the motion prediction / compensation unit 129 of the image information decoding device 120.

The second method is motion prediction and compensation using a variable block size. In the AVC coding method, as shown in FIG. 4, one macroblock can be divided into motion compensation blocks as small as 8 (pixels) × 8 (pixels). Furthermore, an 8 × 8 motion compensation block can be divided into sub-macroblocks (partitions) as small as 4 × 4. Within each macroblock, the individual motion compensation blocks can have separate motion vector information.

Here, the hierarchy of the video sequence generated by the AVC coding method is, in descending order, frame (picture) > slice > macroblock > sub-macroblock > pixel. A 4 × 4 sub-macroblock is sometimes simply referred to as a block. In this specification, macroblocks and sub-macroblocks are referred to as "blocks" as appropriate.

The third method is motion compensation processing with 1/4-pixel accuracy. This processing will be described with reference to FIG. 5. First, pixel values with 1/2-pixel accuracy are generated, and then pixel values with 1/4-pixel accuracy are calculated. For generating pixel values with 1/2-pixel accuracy, the following 6-tap FIR (Finite Impulse Response) filter is defined.

{1, -5, 20, 20, -5, 1} (Equation 1)

In FIG. 5, the upper-case letters represent integer pixels (Integer Sample), and the lower-case letters represent fractional pixels (Fractional Sample, for example 1/2 pixels or 1/4 pixels). The pixel values b and h with 1/2-pixel precision are obtained as follows, using the neighboring pixel values with integer-pixel precision and the above filter.

$b_1 = E - 5F + 20G + 20H - 5I + J$ (Equation 2)

$h_1 = A - 5C + 20G + 20M - 5R + T$ (Equation 3)

In addition, clip processing is performed as follows to obtain b and h.

$b = \mathrm{Clip1}((b_1 + 16) \gg 5)$ (Equation 4)

$h = \mathrm{Clip1}((h_1 + 16) \gg 5)$ (Equation 5)

Here, $\mathrm{Clip1}(x) = \mathrm{Clip3}(0, 255, x)$, and Clip3 is defined as follows.

$\mathrm{Clip3}(x, y, z) = \begin{cases} x & z < x \\ y & z > y \\ z & \text{otherwise} \end{cases}$ (Equation 6)

Also, "x >> y" indicates that x, a binary number in two's complement notation, is shifted to the right by y bits.
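As an illustration of Equations 1 to 6, the half-pixel computation can be sketched in Python as follows. This is only a minimal sketch: the function names `clip3`, `clip1`, and `half_pel` are chosen here for illustration and do not appear in the specification, and 8-bit samples are assumed.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # 6-tap FIR filter of Equation 1


def clip3(lo, hi, z):
    """Clip3 of Equation 6: clamp z to the range [lo, hi]."""
    return lo if z < lo else hi if z > hi else z


def clip1(x):
    """Clip1(x) = Clip3(0, 255, x), assuming 8-bit samples."""
    return clip3(0, 255, x)


def half_pel(samples):
    """Half-pixel value from six integer samples in a row or column.

    For b the samples are (E, F, G, H, I, J); for h they are (A, C, G, M, R, T).
    """
    if len(samples) != 6:
        raise ValueError("exactly six integer samples are required")
    acc = sum(t * s for t, s in zip(TAPS, samples))  # b1 or h1 (Equations 2, 3)
    return clip1((acc + 16) >> 5)                    # Equations 4, 5
```

Because the taps sum to 32, a flat area is reproduced unchanged, while strong edges can overshoot the sample range and are then clamped by the clip step.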

For the pixel value j with 1/2-pixel precision, aa, bb, cc, dd, ee, ff, gg, and hh are generated in the same way as b and h described above, and j is obtained by Equation 9 from $j_1$ given by either Equation 7 or Equation 8.

$j_1 = cc - 5dd + 20h + 20m - 5ee + ff$ (Equation 7)

$j_1 = aa - 5bb + 20b + 20s - 5gg + hh$ (Equation 8)

$j = \mathrm{Clip1}((j_1 + 512) \gg 10)$ (Equation 9)

The pixel values a, c, d, n, f, i, k, and q with 1/4-pixel precision are obtained by linear interpolation of the pixel values with integer-pixel precision and the pixel values with 1/2-pixel precision.

$a = (G + b + 1) \gg 1$ (Equation 10)

$c = (H + b + 1) \gg 1$ (Equation 11)

$d = (G + h + 1) \gg 1$ (Equation 12)

$n = (M + h + 1) \gg 1$ (Equation 13)

$f = (b + j + 1) \gg 1$ (Equation 14)

$i = (h + j + 1) \gg 1$ (Equation 15)

$k = (j + m + 1) \gg 1$ (Equation 16)

$q = (j + s + 1) \gg 1$ (Equation 17)

Also, the pixel values e, g, p, and r with 1/4-pixel precision are obtained by linear interpolation of the pixel values with 1/2-pixel precision, as shown in Equations 18 to 21 below.

$e = (b + h + 1) \gg 1$ (Equation 18)

$g = (b + m + 1) \gg 1$ (Equation 19)

$p = (h + s + 1) \gg 1$ (Equation 20)

$r = (m + s + 1) \gg 1$ (Equation 21)
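All of the quarter-pixel values above share a single operation, a rounded average of two neighboring samples. A minimal Python sketch follows; the helper name `quarter_pel` and the sample values are assumptions made for illustration only.

```python
def quarter_pel(p, q):
    """Rounded average of two neighbouring samples, as in Equations 10 to 21."""
    return (p + q + 1) >> 1


# Illustrative values only: one integer pixel G and two half-pixel values b and h.
G, b, h = 100, 103, 96
a = quarter_pel(G, b)  # between G and b
d = quarter_pel(G, h)  # between G and h
e = quarter_pel(b, h)  # between the two half-pixel values b and h
```

The "+ 1" before the shift rounds halves upward, so the result never needs clipping when both inputs are already within the sample range.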

Next, the motion vector coding method defined in the AVC coding method will be described with reference to FIG. 6. FIG. 6 shows block E and its surrounding blocks A, B, C, and D. Here, blocks A through E may be macroblocks or sub-macroblocks. To generate a predicted value of the motion vector for block E, the current block (that is, the target of the motion compensation processing), the motion vector information of the adjacent blocks A, B, and C is used in principle. This process is called median prediction (Median Prediction).

When block C does not exist in the picture (frame) or the slice, or when the motion vector information and the reference frame of block C cannot be used because of the processing order, the motion compensation processing uses the motion vector information and reference frame of block D in their place. Furthermore, when none of blocks B, C, and D exists in the picture or the slice, the motion vector information and the reference frame of block A are used.

In addition, if a block is intra-coded, or if its motion compensation information cannot be used for coding because the block does not exist in the picture or slice, its motion vector value is 0 and the value of its reference index (refIdx) is -1.
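The substitution rules and the median operation described above can be sketched in Python as follows. This is a simplified illustration that covers only the rules stated in this section; the names `Neighbour`, `UNAVAILABLE`, and `predict_mv` are hypothetical, and `None` is used here to mark a block that cannot be used.

```python
from typing import NamedTuple


class Neighbour(NamedTuple):
    mv: tuple      # (x, y) motion vector
    ref_idx: int   # reference index; -1 when no motion information can be used


# Stand-in for an intra-coded or missing block: motion vector 0, reference index -1.
UNAVAILABLE = Neighbour((0, 0), -1)


def median3(a, b, c):
    """Median of three scalar values."""
    return sorted((a, b, c))[1]


def predict_mv(a, b, c, d):
    """Predicted motion vector for current block E from neighbours A, B, C, D."""
    if c is None:                  # block C cannot be used: fall back to block D
        c = d
    if b is None and c is None:    # none of B, C, D exists: use block A alone
        return (a or UNAVAILABLE).mv
    a, b, c = (n or UNAVAILABLE for n in (a, b, c))
    return (median3(a.mv[0], b.mv[0], c.mv[0]),
            median3(a.mv[1], b.mv[1], c.mv[1]))
```

The median is taken separately for the horizontal and vertical components, so the predicted vector need not equal any single neighbor's vector.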

Next, the skip mode in P pictures (frames) will be described. In AVC, a special coding method called “skip mode” is defined for P pictures. In this mode, the motion vector information and coefficient information are not embedded in the bit stream, and at decoding time the motion vector information is restored according to a fixed rule. This saves coded bits and realizes higher coding efficiency.

This skip mode is a special mode only for blocks with a block size of 16 × 16. As for the motion vector information and the like in the skip mode, the value of the reference index (refIdxL0) is 0. If at least one of the following three conditions is satisfied, both components (x, y) of the motion vector value are 0; otherwise, the result of the median prediction described above is used as the motion vector value. Here, the current block is assumed to be block E.

Condition 1: When block A or block B cannot be used.

Condition 2: When the value of the reference index (refIdxL0A) of block A is 0 and its motion vector value is 0.

Condition 3: When the value of the reference index (refIdxL0B) of block B is 0 and its motion vector value is 0.
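Conditions 1 to 3 translate directly into a small decision function. The following Python sketch is illustrative only: the name `skip_mode_mv` and the `(mv, ref_idx)` tuple layout are assumptions, and the median-predicted vector is taken as an input rather than recomputed.

```python
ZERO_MV = (0, 0)


def skip_mode_mv(block_a, block_b, median_mv):
    """Motion vector used for skip mode on the current 16x16 block E.

    block_a, block_b: (mv, ref_idx) tuples for blocks A and B, or None when
    the block cannot be used. median_mv: result of median prediction for E.
    """
    if block_a is None or block_b is None:          # Condition 1
        return ZERO_MV
    mv_a, ref_a = block_a
    mv_b, ref_b = block_b
    if ref_a == 0 and mv_a == ZERO_MV:              # Condition 2
        return ZERO_MV
    if ref_b == 0 and mv_b == ZERO_MV:              # Condition 3
        return ZERO_MV
    return median_mv
```

Note that every branch depends on blocks A and B having already been processed, which is the dependency that becomes a problem under pipelined operation.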

FIG. 7A shows an example in which the block sizes of blocks A to E described in FIG. 6 are all 16 × 16.

FIG. 7B shows a case where the current block E has a block size of 16 × 16, block A is 8 × 4, block B is 4 × 8, and block C is 16 × 8. In this case as well, the skip mode is determined as described above. When the blocks adjacent to block E are small, a plurality of blocks are in contact with block E. In that case, the blocks in contact with the upper left corner of block E are taken as blocks A, D, and B, and the block in contact with the upper right corner of block E is taken as block C.

Next, the direct mode of B pictures will be described. The direct mode is a special mode for block sizes of 16 × 16 or 8 × 8 and does not apply to P pictures. As in the skip mode described above, motion vector information is not transmitted, so at decoding time the motion vector information is generated from information of adjacent blocks; however, the coefficient information resulting from the motion compensation processing in encoding is transmitted. In the direct mode, if the coefficient information of a 16 × 16 block becomes 0 as a result of the quantization processing, the block can be treated as being in a skip mode having no coefficient information.

The direct mode includes a spatial direct mode and a temporal direct mode, as described later. Which of the two is used can be specified for each slice.

First, the spatial direct mode will be described. Before the spatial direct mode prediction is performed, the value of a predetermined flag (for example, “colZeroFlag”) is set as follows.

That is, when all of the following are “true”, the value of the flag “colZeroFlag” is set to 1 in units of 4 × 4 blocks or in units of 8 × 8 blocks, and is set to 0 otherwise.

(a) The reference frame (picture) referenced by RefPicList1[0] is marked as a short-term reference picture.

(b) The value of the reference index for the co-located macroblock is 0.

(c) Both components mvCol[0] and mvCol[1] of the motion vector information of the co-located block are values between -1 and 1 with 1/4-pixel precision (if the co-located macroblock is a field macroblock, the vertical direction has 1/4-pixel precision in field units).

If the value of the flag “colZeroFlag” is 1, or if a motion vector predictor (pmv) cannot be generated for the block, for example because all neighboring blocks are intra-coded, a motion vector (mv) of 0 is applied to the block. Otherwise, the motion vector value generated by median prediction is applied to the block.
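The flag setting and the resulting motion vector choice can be sketched as follows. This Python fragment follows only the conditions stated in this section; the function names and argument layout are hypothetical, and a missing predictor is represented by `None`.

```python
def col_zero_flag(ref_is_short_term, col_ref_idx, mv_col):
    """Return 1 when conditions (a) to (c) all hold, otherwise 0.

    ref_is_short_term: condition (a), RefPicList1[0] is a short-term reference.
    col_ref_idx: reference index of the co-located macroblock, condition (b).
    mv_col: (mvCol[0], mvCol[1]) in quarter-pixel units, condition (c).
    """
    return int(bool(ref_is_short_term)
               and col_ref_idx == 0
               and all(-1 <= comp <= 1 for comp in mv_col))


def spatial_direct_mv(flag, median_mv):
    """Motion vector applied to the block in spatial direct mode.

    median_mv is None when no predictor can be generated, for example
    because all neighbouring blocks are intra-coded.
    """
    if flag == 1 or median_mv is None:
        return (0, 0)
    return median_mv
```

As with skip mode, `median_mv` presupposes that the neighboring blocks' motion information is already known when the decision is made.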

For the reference index, the minimum value among the neighboring blocks A, B, and C (or D) shown in FIG. 7 is used for both List 0 and List 1.

Next, the temporal direct mode will be described. The forward motion vector MV0 and the backward motion vector MV1 are obtained from the motion vector MVC used in the co-located block of the subsequent frame (picture) RL1. In FIG. 8, for the predetermined block 151 of frame B, the forward motion vector information with respect to the previous frame RL0 is MV0, the motion vector information with respect to the subsequent frame RL1 is MV1, and the motion vector information of the co-located block 150 of frame RL1 is MVC. In the temporal direct mode, MV0 and MV1 are generated from MVC and the distances TDB and TDD on the time axis between frame B and the reference frames RL0 and RL1 by Equations 22 and 23 below.

$MV_0 = (TD_B / TD_D) \cdot MV_C$ (Equation 22)

$MV_1 = ((TD_D - TD_B) / TD_D) \cdot MV_C$ (Equation 23)

As described above, many motion compensation modes are defined in AVC. In the conventional image information encoding device 100 shown in FIG. 1, selecting the optimum mode for each macroblock is an important technique for generating image compression information with a high compression ratio.
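Returning briefly to the temporal direct mode, the scaling of Equations 22 and 23 can be illustrated with a short Python sketch. Exact rational arithmetic is used here as an assumption for clarity; the rounding and fixed-point rules of an actual codec are not reproduced, and the function name `temporal_direct` is hypothetical.

```python
from fractions import Fraction


def temporal_direct(mvc, tdb, tdd):
    """Scale the co-located vector MVC into MV0 and MV1 (Equations 22 and 23).

    mvc: (x, y) motion vector of the co-located block.
    tdb, tdd: temporal distances TDB and TDD as integers, with tdd != 0.
    """
    s0 = Fraction(tdb, tdd)
    s1 = Fraction(tdd - tdb, tdd)
    mv0 = tuple(s0 * comp for comp in mvc)
    mv1 = tuple(s1 * comp for comp in mvc)
    return mv0, mv1
```

For example, when frame B lies exactly halfway between the two reference frames (TDB = 1, TDD = 2), both scaled vectors have half the magnitude of MVC.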

Reference 2 below discloses a motion vector search method related to this technique that was used in AVC standardization.

"Rate-Distortion Optimization for Video Compression", G. Sullivan and T. Wiegand, IEEE Signal Processing Magazine, Nov. 1998

In this method (also called RD (Rate-Distortion) optimization), the motion vector that minimizes the following value is output as the search result in the motion search at every pixel accuracy.

$J(m, \lambda_{MOTION}) = SA(T)D(s, c(m)) + \lambda_{MOTION} \cdot R(m - p)$ (Equation 24)

Here, $m = (m_x, m_y)^T$ is the motion vector, $p = (p_x, p_y)^T$ is the predicted motion vector, and $\lambda_{MOTION}$ is the Lagrange multiplier for the motion vector. $R(m - p)$ is the amount of information generated for the motion vector difference, obtained by table lookup. In the AVC coding method, two entropy coding methods are specified, one based on UVLC (Universal Variable Length Code) and one based on CABAC (Context-based Adaptive Binary Arithmetic Coding), but the amount of generated information is calculated on the basis of UVLC even when CABAC is used. The distortion is obtained by Equation 25 below.

$SAD(s, c(m)) = \sum_{x=1, y=1}^{B, B} \left| s[x, y] - c[x - m_x, y - m_y] \right|$ (Equation 25)

In Equation 25, s represents the image signal of the current frame, and c represents the image signal of the reference frame. When motion vectors are refined at an accuracy finer than 1 pixel, SATD (Sum of Absolute Transform Difference), which is obtained using the Hadamard transform instead of the discrete cosine transform, is used. The Lagrange multiplier $\lambda_{MOTION}$ is given as follows: by Equation 26 for I and P frames, and by Equation 27 for B frames.

$\lambda_{MOTION,P} = (0.85 \cdot 2^{QP/3})^{1/2}$ (Equation 26)

$\lambda_{MOTION,B} = (4 \cdot 0.85 \cdot 2^{QP/3})^{1/2}$ (Equation 27)

Here, QP means the quantization parameter.
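The motion search cost of Equation 24 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the specification only says that R(m - p) is obtained by table lookup based on UVLC, so the helper `se_bits`, which returns the length of a signed Exp-Golomb code word, is a stand-in rate model chosen here. Frames are plain lists of rows, and the caller is assumed to keep every accessed sample inside the frame.

```python
def sad(cur, ref, mv, x0, y0, size):
    """Equation 25 over a size x size block whose top-left sample is (x0, y0)."""
    mx, my = mv
    return sum(abs(cur[y][x] - ref[y - my][x - mx])
               for y in range(y0, y0 + size)
               for x in range(x0, x0 + size))


def lambda_motion(qp, b_frame=False):
    """Lagrange multiplier of Equations 26 and 27."""
    return ((4 if b_frame else 1) * 0.85 * 2 ** (qp / 3)) ** 0.5


def se_bits(v):
    """Assumed rate model: bit length of the signed Exp-Golomb code for v."""
    code_num = 2 * abs(v) - (1 if v > 0 else 0)
    return 2 * (code_num + 1).bit_length() - 1


def motion_cost(cur, ref, mv, pred, qp, x0, y0, size, b_frame=False):
    """J = SAD + lambda_MOTION * R(m - p), as in Equation 24."""
    rate = se_bits(mv[0] - pred[0]) + se_bits(mv[1] - pred[1])
    return sad(cur, ref, mv, x0, y0, size) + lambda_motion(qp, b_frame) * rate
```

A motion search would evaluate `motion_cost` for each candidate vector in the search range and keep the candidate with the smallest value.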

As the reference frame, a frame that minimizes the value of Equation 28 below is selected.

$J(REF \mid \lambda_{MOTION}) = SATD(s, c(REF, m(REF))) + \lambda_{MOTION} \cdot (R(m(REF) - p(REF)) + R(REF))$ (Equation 28)

Here, R (REF) is the amount of information generated in the reference frame obtained by UVLC.

The prediction direction of an N × M block in a B frame is selected so as to minimize the value of Equation 29 below.

$J(PDIR \mid \lambda_{MOTION}) = SATD(s, c(PDIR, m(PDIR))) + \lambda_{MOTION} \cdot (R(m(PDIR) - p(PDIR)) + R(REF(PDIR)))$ (Equation 29)

The macroblock mode is selected so as to minimize the value of Equation 30 below.

$J(s, c, MODE \mid QP, \lambda_{MODE}) = SSD(s, c, MODE \mid QP) + \lambda_{MODE} \cdot R(s, c, MODE \mid QP)$ (Equation 30)

Here, QP indicates the quantization parameter of the macroblock, and $\lambda_{MODE}$ indicates the Lagrange multiplier for mode selection.

The MODE candidates for selection are determined for each frame type as shown in Equations 31 to 33 below.

I frame: $MODE \in \{INTRA4{\times}4, INTRA16{\times}16\}$ (Equation 31)

P frame: $MODE \in \{\ldots\}$ (Equation 32)

B frame: $MODE \in \{\ldots\}$ (Equation 33)

Here, SKIP represents the 16 × 16 mode in which neither the motion vector residual nor the coefficient residual is sent. SSD is defined by Equation 34 below, where s represents the image signal of the current frame and c represents the image signal of the reference frame.

$SSD(s, c, MODE \mid QP) = \sum_{x=1, y=1}^{16, 16} (s_Y[x, y] - c_Y[x, y, MODE \mid QP])^2 + \sum_{x=1, y=1}^{8, 8} (s_U[x, y] - c_U[x, y, MODE \mid QP])^2 + \sum_{x=1, y=1}^{8, 8} (s_V[x, y] - c_V[x, y, MODE \mid QP])^2$ (Equation 34)

$R(s, c, MODE \mid QP)$ represents the amount of information generated for the macroblock when MODE and QP are selected. The amount of generated information includes that for all information, such as headers, motion vectors, and orthogonal transform coefficients. $c_Y[x, y, MODE \mid QP]$ and $s_Y[x, y]$ represent the luminance components of the reconstructed image and the original image, and $c_U$, $c_V$ and $s_U$, $s_V$ represent the color difference components.

The Lagrange multiplier $\lambda_{MODE}$ is given by Equations 35 and 36 below for I and P frames and for B frames, respectively.

I, P frame: $\lambda_{MODE,P} = 0.85 \cdot 2^{QP/3}$ (Equation 35)

B frame: $\lambda_{MODE,B} = 4 \cdot 0.85 \cdot 2^{QP/3}$ (Equation 36)

Here, QP indicates the quantization parameter.
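The mode decision of Equation 30 amounts to picking the candidate with the smallest Lagrangian cost. The Python sketch below illustrates this under simplifying assumptions: distortion and rate for each candidate mode are supplied by the caller as already-measured numbers, `ssd` handles a single sample plane (Equation 34 sums it over the Y, U, and V planes), and all function names are hypothetical.

```python
def lambda_mode(qp, b_frame=False):
    """Lagrange multiplier for mode selection (Equations 35 and 36)."""
    return (4 if b_frame else 1) * 0.85 * 2 ** (qp / 3)


def ssd(orig, recon):
    """Sum of squared differences between two equally sized sample planes."""
    return sum((s - c) ** 2
               for row_s, row_c in zip(orig, recon)
               for s, c in zip(row_s, row_c))


def best_mode(candidates, qp, b_frame=False):
    """Return the mode name minimising J = SSD + lambda_MODE * R.

    candidates: dict mapping a mode name to (distortion, bits).
    """
    lam = lambda_mode(qp, b_frame)
    return min(candidates,
               key=lambda mode: candidates[mode][0] + lam * candidates[mode][1])
```

Because the multiplier grows exponentially with QP, cheap modes such as SKIP win more often at coarse quantization, while at fine quantization the decision is dominated by distortion.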

The same selection processing as in the case of determining the mode of a macro block is performed when dividing an 8 × 8 block. A division mode that minimizes the value of Equation 37 below is selected.

$J(s, c, MODE \mid QP, \lambda_{MODE}) = SSD(s, c, MODE \mid QP) + \lambda_{MODE} \cdot R(s, c, MODE \mid QP)$ (Equation 37)

Here, QP represents the quantization parameter of the macroblock, and λ MODE represents the Lagrange multiplier used for mode selection.

The candidates for the selection mode represented by MODE are determined for the P frame and the B frame as shown in Equations 38 and 39 below, respectively.

P frame: $MODE \in \{INTRA4{\times}4, 8{\times}8, 8{\times}4, 4{\times}8, 4{\times}4\}$ (Equation 38)

B frame: $MODE \in \{INTRA4{\times}4, DIRECT, 8{\times}8, 8{\times}4, 4{\times}8, 4{\times}4\}$ (Equation 39)

When the conventional image information encoding device 100 shown in FIG. 1 is realized as hardware that operates in real time, parallel processing such as pipeline processing is essential as a speed-up technique. Also, depending on the motion search method used for speed-up, the motion vector of the skip mode or the spatial direct mode, calculated according to the rules defined by the standard, may not be included in the motion vector search range.

In such a case, in the skip mode or the spatial direct mode, it is necessary to perform a separate motion search process on those motion vectors in addition to the normal motion search process.

These mode decisions require the motion vector information of adjacent macroblocks. However, when the macroblocks are processed in parallel by pipeline processing, the motion vector information of these adjacent macroblocks cannot be obtained unless the processing of the macroblocks ends in the predetermined order, and this hinders the determination of the skip mode and the spatial direct mode.

Therefore, an object of the present invention is to realize high-speed encoding processing in an image information encoding device that outputs image compression information based on an image encoding method such as AVC, by generating pseudo information even when the necessary vector information of adjacent blocks cannot be obtained because of parallel processing such as pipelining.

A further object of the present invention is to provide, in an image information encoding device that outputs image compression information based on an image encoding method such as AVC, a means for performing effective mode setting while realizing high-speed processing by parallel processing, by calculating in a pseudo manner the motion vector information and reference index information used to determine the skip mode or the spatial direct mode.

Disclosure of the invention

According to a first aspect of the present invention, there is provided an image information encoding device that encodes image information using motion prediction and has a coding mode in which a block is coded with at least one of its motion vector information and coefficient information omitted, the omitted information being restorable on the decoding side according to a predetermined rule. The device includes a determination unit that determines whether a block can be coded in the coding mode using candidate information consisting of motion information of predetermined adjacent blocks adjacent to the block, and a pseudo-calculation unit that, when the motion information of at least one of the adjacent blocks is not available, generates pseudo motion information in place of the unavailable motion information and provides it as candidate information.

According to a second aspect of the present invention, there is provided an image information encoding method for encoding image information using motion prediction, the method having a coding mode in which a block is coded with at least one of its motion vector information and coefficient information omitted, the omitted information being restorable on the decoding side according to a predetermined rule. The method includes a determination step of determining whether a block can be coded in the coding mode using candidate information consisting of motion information of predetermined adjacent blocks adjacent to the block, and a pseudo-calculation step of, when the motion information of at least one of the adjacent blocks is not available, generating pseudo motion information in place of the unavailable motion information and providing it as candidate information.

According to a third aspect of the present invention, there is provided a program for causing a computer to execute an image information encoding method for encoding image information using motion prediction. The method has a coding mode in which a block is coded with at least one of its motion vector information and coefficient information omitted, the omitted information being restorable on the decoding side according to a predetermined rule. The method includes a determination step of determining whether a block can be coded in the coding mode using candidate information consisting of motion information of predetermined adjacent blocks adjacent to the block, and a pseudo-calculation step of, when the motion information of at least one of the adjacent blocks is not available, generating pseudo motion information in place of the unavailable motion information and providing it as candidate information.

According to the present invention, in an image information encoding device that outputs image compression information based on an image encoding method such as AVC, high-speed encoding can be realized by generating pseudo information even when necessary vector information of an adjacent block cannot be obtained because of parallel processing such as pipelining.

Further, according to the present invention, in an image information encoding device that outputs image compression information based on an image encoding method such as AVC, the motion vector information and the reference index information used for determination of the skip mode or the spatial direct mode are calculated in a pseudo manner. This provides a means for setting the mode effectively while realizing high-speed processing by parallel processing.

FIG. 1 is a block diagram showing a configuration of a conventional image information encoding device.

FIG. 2 is a block diagram showing a configuration of a conventional image information decoding device.

FIG. 3 is a schematic diagram showing reference to a plurality of frames in the motion prediction/compensation processing.

FIG. 4 is a schematic diagram showing a macroblock and a sub-macroblock.

FIG. 5 is a schematic diagram for explaining a motion compensation process with quarter-pixel accuracy.

FIG. 6 is a schematic diagram for explaining median prediction in the motion vector coding method.

FIG. 7A and FIG. 7B are schematic diagrams used to explain the skip mode and the spatial direct mode.

FIG. 8 is a schematic diagram for explaining the temporal direct mode.

FIGS. 9A and 9B are schematic diagrams used to explain the procedure of the motion compensation processing of a macroblock.

FIG. 10 is a block diagram showing a configuration of an image information encoding device according to the first embodiment of the present invention.

FIG. 11 is a schematic diagram used to explain pseudo calculation of candidate motion vector information according to the present invention.

FIG. 12 is a schematic diagram used to explain pseudo calculation of candidate motion vector information according to the present invention.

FIG. 13 is a flowchart showing a processing procedure of the image information encoding device according to the first embodiment of the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

Before describing the image information encoding device of the present invention, a specific example in which necessary vector information or the like of an adjacent block cannot be obtained because of high-speed processing such as pipeline processing will be explained with reference to FIG. 9. In FIG. 9A, if X is the macroblock currently being processed and A is an adjacent macroblock, the motion vector information for A has not necessarily been determined at the time when the motion search process is performed for X. This is because, as described above, the processing phases for the individual macroblocks are executed in parallel. Likewise, in FIG. 9B, if X is the macroblock currently being processed and B, C, and D are adjacent macroblocks, the motion vector information for B, C, and D has not necessarily been determined either.

According to the present invention, even when vector information necessary for adjacent blocks cannot be obtained because of high-speed processing such as pipelining, pseudo motion vector information is generated so that the subsequent processing is performed smoothly. As a result, high-speed encoding processing is realized.

In order to solve the above-described problems, an image information encoding device according to the present invention includes an A/D converter, a screen rearrangement buffer, an adder, an orthogonal transformer, a quantizer, a lossless encoder, a storage buffer, an inverse quantizer, an inverse orthogonal transformer, a deblocking filter, a frame memory, an intra prediction device, a motion prediction/compensation device, a candidate motion vector information calculation device, and a rate control device. By introducing a method of pseudo-calculating the motion vector information used as candidate motion vector information for the skip mode and the spatial direct mode, a means for performing high-speed processing by pipelining or the like is provided.

Furthermore, if the motion vector information and the reference index (reference frame) information calculated in this pseudo manner do not match the motion vector information and the reference index information calculated according to the rules of the AVC standard, such information is treated as that of a mode other than the skip mode or the spatial direct mode, so that further improvement in compression efficiency can be expected. Such motion vector information is for a 16 × 16 block in the skip mode, and for a 16 × 16 or 8 × 8 block in the spatial direct mode. Here, the motion vector information and the reference index information are collectively referred to as “motion information” as appropriate.

Here, an image information encoding device according to the first embodiment of the present invention will be described. FIG. 10 is a block diagram illustrating a configuration of the image information encoding device according to the first embodiment. The image information encoding device 10 includes an A/D conversion unit 11, a screen rearrangement buffer 12, an adder 13, an orthogonal transform unit 14, a quantization unit 15, a lossless encoding unit 16, a storage buffer 17, an inverse quantization unit 18, an inverse orthogonal transform unit 19, a deblocking filter 20, a frame memory 21, an intra prediction unit 22, a motion prediction/compensation unit 23, a pseudo calculation unit 24, a mode determination unit 25, and a rate control unit 26.

The A/D conversion unit 11 converts the input analog image signal into a digital image signal and sends the digital image signal to the screen rearrangement buffer 12. Upon receiving the digital image signal, the screen rearrangement buffer 12 rearranges the frames composed of the digital image signal according to the GOP structure of the image compression information to be output. When the input frame is inter-coded, the adder 13 generates the difference between the input frame and the reference frame.

The orthogonal transform unit 14 applies an orthogonal transform such as a discrete cosine transform or a Karhunen-Loève transform to the input frame or to the difference between the input frame and the reference frame, and the quantization unit 15 quantizes the resulting transform coefficients. The lossless encoding unit 16 receives the quantized transform coefficients from the quantization unit 15, applies lossless encoding such as variable-length encoding or arithmetic encoding to them, and sends the result to the storage buffer 17. The storage buffer 17 receives the losslessly encoded image compression information from the lossless encoding unit 16 and stores it.

The inverse quantization unit 18 receives the quantized transform coefficients from the quantization unit 15 and performs inverse quantization on them. The inverse orthogonal transform unit 19 performs an inverse orthogonal transform on the inverse-quantized orthogonal transform coefficients, and the deblocking filter 20 removes block distortion included in the decoded image. The decoded image is stored in the frame memory 21 for use in the motion prediction/compensation processing.

The motion prediction/compensation unit 23 receives the decoded image stored in the frame memory 21, searches for motion vector information, and performs motion compensation processing. The pseudo calculation unit 24 pseudo-calculates the motion vector information used for the determination of the skip mode or the spatial direct mode, for the purpose of increasing the speed by parallel processing. The intra prediction unit 22 receives the decoded image stored in the frame memory 21 and performs intra prediction processing. The mode determination unit 25 receives the outputs from the motion prediction/compensation unit 23 and the intra prediction unit 22 and determines the mode (skip mode, spatial direct mode).

Further, the rate control unit 26 controls the operation of the quantization unit 15 by feedback control based on the information from the storage buffer 17. The difference from the conventional image information encoding device 100 shown in FIG. 1 lies in the processing performed by the motion prediction/compensation unit 23, the pseudo calculation unit 24, and the mode determination unit 25. Hereinafter, the processing of the image information encoding device 10 will be described focusing on the processing of these components.

Here, the processing contents of the pseudo calculation unit 24 will be described. As described with reference to FIG. 7, when the motion prediction/compensation processing is performed on the macroblock X in FIG. 11, the motion vector information and reference index (refIdx) information of the macroblocks A, B, and C (or D instead of C when C does not exist, for example because X is at a frame boundary) must have been determined in order to determine the skip mode or the spatial direct mode of the macroblock X.

However, when image coding processing is performed by parallel processing, the processing phases for the individual macroblocks are executed in parallel. Consequently, at the time when motion prediction/compensation processing is performed for a certain macroblock, the information on other macroblocks necessary for this processing has not necessarily been obtained.

If the motion vector information and reference index information for macroblocks A, B, C, and D are not available, the motion vector information and reference index information for macroblocks A', B', C', D' and A'', B'', C'', D'', ... are calculated in a pseudo manner instead and used for mode determination. That is, this motion vector information is used as a candidate motion vector.

For example, if the motion vector information and reference index information for macroblocks B and C have been determined but those for macroblock A have not, the motion vector information and reference index information of block A' are used, as shown in the figure, to determine the mode of macroblock X. In the spatial direct mode, the reference index information on block A' is used.

Next, the processing of the mode determination unit 25 will be described. As described above, the motion vector information (and reference index information) calculated by the pseudo calculation unit 24 does not always match exactly the motion vector information for the given macroblock calculated according to the rules of the AVC standard. Similarly, the contents of the reference index information do not always match.

Thus, the mode determination unit 25 compares the macroblock motion vector information calculated according to the rules of the standard with the motion vector information pseudo-calculated by the pseudo-calculation unit 24. Further, in the case of the spatial direct mode, it is checked whether or not the reference index information matches for each of the reference frames List0 and List1.

When the contents of the motion vector information and the reference index information match, mode determination processing is performed using the candidate motion vector calculated by the pseudo calculation unit 24 as the candidate motion vector information of the skip mode or the spatial direct mode. At this time, the mode determination based on the RD optimization described above may be performed.

If the motion vector information does not match, the candidate motion vector information calculated by the pseudo calculation unit 24 is discarded, or is used as a candidate motion vector of a 16 × 16 block or an 8 × 8 block. Then, an arbitrary mode determination is performed. As described above, in the skip mode the information is used as the motion vector information of a 16 × 16 block, and in the spatial direct mode it is used as the motion vector information of a 16 × 16 or 8 × 8 block.

Next, the procedure of the mode determination process described above will be described with reference to the flowchart. FIG. 13 shows three dotted blocks A to C. The processing in dotted block A is performed by the motion prediction/compensation unit 23, the processing in dotted block B is performed by the intra prediction unit 22, and the processing in dotted block C is performed by the mode determination unit 25.

First, in step S1, the motion vector information (and reference index information) to be used in the determination of the skip mode or the spatial direct mode is calculated by the pseudo calculation unit 24. Here, this information is referred to as information X. As shown in FIG. 11, when the processing of the motion vector information of the macroblock A has not been completed, the pseudo calculation unit 24 obtains the motion vector information of the macroblock A'. If the processing of the motion vector information of the macroblock A' has not been completed either, the motion vector information of the macroblock A'' is obtained. If the motion vector information still cannot be obtained, control is performed so as to obtain the motion vector information of a macroblock still farther from the macroblock X, that is, one whose spatial distance from X is larger than that between A and X.

In the example shown in FIG. 11, A, A', A'', ... are selected according to a regular rule. That is, A' is the block in contact with the side of A opposite to the side of A with which X is in contact, and A'' is the block in contact with the side of A' opposite to the side of A' with which A is in contact.

The operation of the pseudo calculation unit 24 is the same for the macroblocks B, C, and D. In this example, when the processing of the motion vector information of the macroblock A has not been completed, the motion vector information of the macroblock A' is obtained instead; however, which macroblock, in which relative positional relationship, supplies the motion vector information can be determined as appropriate. Also, the motion vector information of a plurality of other macroblocks may be used instead of the motion vector information of the macroblock A.
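The substitution rule described above can be sketched as follows. This is a minimal illustration in Python and not the implementation of the pseudo calculation unit 24. The coordinate layout (A to the left of X, B above, C above right, D above left, as in the AVC standard), the names `NEIGHBOR_OFFSETS` and `pseudo_motion_info`, the `determined` dictionary, and the `max_steps` limit are all assumptions made for the example. For the diagonal neighbors C and D the sketch simply repeats the same offset, which is only one possible reading of the rule.

```python
from typing import Dict, Optional, Tuple

MotionInfo = Tuple[Tuple[int, int], int]  # ((mv_x, mv_y), ref_idx)
Pos = Tuple[int, int]                     # (column, row) in macroblock units

# Offsets from X to its neighbors (assumed layout: A left, B above,
# C above right, D above left).
NEIGHBOR_OFFSETS: Dict[str, Pos] = {
    "A": (-1, 0), "B": (0, -1), "C": (1, -1), "D": (-1, -1),
}


def pseudo_motion_info(x_pos: Pos, neighbor: str,
                       determined: Dict[Pos, MotionInfo],
                       max_steps: int = 3) -> Optional[MotionInfo]:
    """Return the motion info of A if determined, else A', else A'', ...

    Each step moves one macroblock farther away from X in the same
    direction. Returns None if nothing is determined within max_steps.
    """
    dx, dy = NEIGHBOR_OFFSETS[neighbor]
    for step in range(1, max_steps + 1):
        pos = (x_pos[0] + dx * step, x_pos[1] + dy * step)
        if pos in determined:
            return determined[pos]
    return None


# Example: X is at (5, 3). A at (4, 3) is still being processed,
# but A' at (3, 3) has already been determined. B at (5, 2) is determined.
determined = {(3, 3): ((2, -1), 0), (5, 2): ((0, 4), 1)}
info_a = pseudo_motion_info((5, 3), "A", determined)  # taken from A'
info_b = pseudo_motion_info((5, 3), "B", determined)  # taken from B itself
```

Walking in a fixed direction keeps the lookup cheap and deterministic; a different selection rule, or a combination of several blocks, would only change the body of the loop.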

When step S1 ends, in step S4, an evaluation index used for mode determination is calculated for information X. These indices are needed to actually quantize several macroblocks and estimate the required code amount. Here, for example, processing such as Hadamard transformation is performed.
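The text does not specify the evaluation index beyond mentioning processing such as a Hadamard transform. One commonly used index of this kind is the sum of absolute transformed differences (SATD). The sketch below assumes a 4 × 4 block and no normalization, and is given only as an illustration; the names `hadamard4` and `satd4x4` are hypothetical.

```python
from typing import List

Block = List[List[int]]

# 4x4 Hadamard matrix (symmetric, entries +1 / -1).
H4: Block = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def matmul4(a: Block, b: Block) -> Block:
    """Plain 4x4 integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def hadamard4(block: Block) -> Block:
    """2-D Hadamard transform H * D * H (H4 is symmetric, so H^T == H)."""
    return matmul4(matmul4(H4, block), H4)


def satd4x4(current: Block, predicted: Block) -> int:
    """Sum of absolute values of the Hadamard-transformed difference."""
    diff = [[current[i][j] - predicted[i][j] for j in range(4)]
            for i in range(4)]
    return sum(abs(v) for row in hadamard4(diff) for v in row)


# Example: a prediction that is off by exactly 1 at every pixel.
cur = [[10] * 4 for _ in range(4)]
pred = [[9] * 4 for _ in range(4)]
cost = satd4x4(cur, pred)
```

A constant offset concentrates all of its energy in the single DC coefficient, which is why a transformed-domain index can be a closer proxy for code amount than a plain sum of absolute differences.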

In addition, the motion prediction/compensation unit 23 searches for optimal motion vector information for each block size such as 16 × 16 or 16 × 8 (step S2), and an evaluation index used for mode determination is calculated for that motion vector information (step S3). Here, the motion vector search does not use the motion vector information of the peripheral blocks. Therefore, even if not all the vector information of the peripheral blocks has been calculated, the search can be carried out independently without waiting for those calculation results.

In the intra prediction unit 22, an evaluation index used for mode determination is calculated from information obtained from within the frame (step S5). The processing of step S3 and step S5 does not need to be performed simultaneously with step S4; it is sufficient that the processing is completed by the processing of step S10 described later.

Next, in step S6, the candidate motion vector information (and reference index information) in the skip mode or the spatial direct mode is calculated using the method of the above-described standard. This information is hereinafter referred to as information Y. If such information has already been calculated in step S3, the result may be reused. In step S7, information X and information Y are compared. When the information X is equal to the information Y, the process proceeds to step S9, and the information X is used as candidate motion vector information for determining the skip mode and the spatial direct mode.

On the other hand, if the information X is not equal to the information Y, the process proceeds to step S8, where the information X is discarded or is used as the candidate motion vector information of a 16 × 16 block or an 8 × 8 block. In the latter case, the compression efficiency may be improved by using information X as a candidate motion vector.

When the candidate motion vector information is determined by the above procedure, an arbitrary mode determination is performed in step S11 based on the evaluation index of each candidate calculated in each process.
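The comparison of information X and information Y and the subsequent mode determination can be summarized by the following sketch. It is a simplified illustration under stated assumptions and not the procedure of the mode determination unit 25 itself: the `Candidate` record, the function `decide_mode`, the mode names, and the use of a single scalar cost per candidate are hypothetical, and picking the smallest cost is only one possible criterion, since the text leaves the determination arbitrary.

```python
from dataclasses import dataclass
from typing import List, Tuple

MotionInfo = Tuple[Tuple[int, int], int]  # ((mv_x, mv_y), ref_idx)


@dataclass
class Candidate:
    mode: str     # e.g. "skip", "inter16x16", "intra" (illustrative names)
    cost: float   # evaluation index; smaller is assumed to be better


def decide_mode(info_x: MotionInfo, cost_x: float, info_y: MotionInfo,
                other_candidates: List[Candidate],
                reuse_as_16x16: bool = False) -> Candidate:
    """Compare X with Y (step S7), then choose among the candidates.

    other_candidates holds the results of the normal motion search and of
    intra prediction and must not be empty.
    """
    candidates = list(other_candidates)
    if info_x == info_y:
        # Step S9: the pseudo information agrees with the standard rule,
        # so it may serve as the skip / spatial direct candidate.
        candidates.append(Candidate("skip", cost_x))
    elif reuse_as_16x16:
        # Step S8, second option: keep X as an ordinary 16x16 candidate.
        candidates.append(Candidate("inter16x16_from_x", cost_x))
    # Step S8, first option: otherwise X is simply discarded.
    return min(candidates, key=lambda c: c.cost)


others = [Candidate("inter16x16", 25.0), Candidate("intra", 40.0)]
match = decide_mode(((1, 0), 0), 10.0, ((1, 0), 0), others)
mismatch = decide_mode(((1, 0), 0), 10.0, ((2, 0), 0), others)
reused = decide_mode(((1, 0), 0), 10.0, ((2, 0), 0), others,
                     reuse_as_16x16=True)
```

Only the equality test depends on information Y, so the costs of the other candidates can be computed in parallel before Y becomes available.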

Next, an image information encoding device according to a second embodiment of the present invention will be described. The components of the image information encoding device according to this embodiment are the same as those of the image information encoding device according to the first embodiment shown in FIG. 10, and therefore the block diagram is omitted. The difference lies in the processing content of the pseudo calculation unit. Therefore, the description here will focus on the processing of the pseudo calculation unit (hereinafter denoted by reference numeral 24').

The pseudo calculation unit 24' does not use the information of the determined peripheral blocks, but sets all information to a specific value, for example, 0. That is, in the skip mode, each component of the motion vector is set to 0, and in the spatial direct mode, the values of the reference index for List0 and List1 are set to 0 and the values of the motion vector for List0 and List1 are also set to 0. Other processes are the same as those of the first embodiment.
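A minimal sketch of the fixed values produced in this embodiment is given below. The function name `zero_pseudo_motion_info`, the mode strings, and the dictionary layout are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict


def zero_pseudo_motion_info(mode: str) -> Dict[str, Any]:
    """Return pseudo motion information with every value fixed to 0."""
    if mode == "skip":
        # Skip mode: only a motion vector, both components 0.
        return {"mv": (0, 0)}
    if mode == "spatial_direct":
        # Spatial direct mode: reference index and motion vector
        # for both List0 and List1, all 0.
        return {
            "ref_idx": {"List0": 0, "List1": 0},
            "mv": {"List0": (0, 0), "List1": (0, 0)},
        }
    raise ValueError(f"unsupported mode: {mode}")


skip_info = zero_pseudo_motion_info("skip")
direct_info = zero_pseudo_motion_info("spatial_direct")
```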

In the second embodiment, the calculation by which the pseudo calculation unit 24' would otherwise obtain the motion vector information used for the determination of the skip mode or the spatial direct mode can be omitted.

As described above, the image information encoding device is configured so as not to hinder the speed-up by the parallel processing.

The encoding can also be performed in software using a computer such as a PC (personal computer) (software encoding). For example, consider an implementation using a PC composed of a CPU (Central Processing Unit), a memory, a hard disk, a recording medium drive, a network interface, and a bus connecting these to each other.

Here, the CPU may include a coprocessor such as a DSP (Digital Signal Processor). The CPU executes the function of each unit, such as the A/D conversion unit 11 described above, based on the instructions of the program loaded into the memory. The memory is a high-speed accessible memory used for temporary storage of data as necessary, and serves as the buffers such as the screen rearrangement buffer 12 and the storage buffer 17 and as the frame memory 21.

A program that implements such a function is usually stored in an external storage device such as a hard disk, and is loaded into the memory when an encoding process is instructed by a user or the like. The program can be stored in a ROM (Read Only Memory) or a DVD (Digital Versatile Disk) ROM, and can be read into a hard disk or the like via a recording medium drive. In another embodiment, when the personal computer is connected to a network such as the Internet via a network interface, the program can be transferred from another computer or site via the network and recorded on the hard disk or the like.

In the above, the features of the present invention have been described by taking an image information encoding device that outputs AVC image compression information as an example, but the scope of application of the present invention is not limited to this. The present invention is applicable to an image information encoding device that outputs image compression information based on any image coding method that uses motion prediction and uses DPCM for motion vector coding, such as MPEG-1/2/4 and H.263.

Claims

1. An image information encoding method for encoding image information using motion prediction,

the encoding process having an encoding mode in which a block is encoded with at least one of motion vector information and coefficient information omitted and in which a decoding side can restore the omitted information according to a predetermined rule, the method comprising:

a determination step of determining whether or not the block can be encoded in the encoding mode, using candidate information including motion information of predetermined adjacent blocks adjacent to the block; and

a pseudo calculation step of generating, when the motion information of at least one of the adjacent blocks is not available, pseudo motion information in place of the unavailable motion information and providing the pseudo motion information as the candidate information.

2. In the image information encoding method according to claim 1,

The image information encoding method, wherein the pseudo motion information is available motion information of a neighboring block near the adjacent block having the unavailable motion information.

3. In the image information encoding method according to claim 1,

An image information encoding method, wherein the pseudo motion information is a specific value.

4. The image information encoding method according to claim 1,

The encoding mode includes a first mode in which the motion vector information and the coefficient information are omitted and encoded.

The image information encoding method, wherein the determination step and the pseudo-calculation step treat the motion vector information as the motion information with respect to the determination and the generation of the first mode.

5. The image information encoding method according to claim 1, wherein the encoding mode includes a second mode in which encoding is performed with the motion vector information omitted, and

the determination step and the pseudo calculation step treat the motion vector information and reference index information as the motion information with respect to the determination and the generation for the second mode.

6. In the image information encoding method according to claim 2,

The encoding is performed according to the MPEG4/AVC standard, and in the determination step, when the pseudo motion information does not match the motion information calculated according to the MPEG4/AVC standard, the pseudo motion information is not used as the candidate information.

7. The image information encoding method according to claim 2,

The encoding is performed in accordance with the MPEG4/AVC standard, and when the pseudo motion information does not match the motion information calculated according to the MPEG4/AVC standard, the pseudo motion information is used

as candidate motion vector information of a 16 × 16 block when the encoding mode is the first mode in which encoding is performed with the motion vector information and the coefficient information omitted, and

as candidate motion vector information of a 16 × 16 or 8 × 8 block when the encoding mode is the second mode in which encoding is performed with the motion vector information omitted.

8. In the image information encoding method according to claim 2,

The image information encoding method, wherein a block having a larger spatial distance from the block than the adjacent block having the unavailable motion information is selected as the neighboring block.

9. An image information encoding device that encodes image information using motion prediction,

the encoding process having an encoding mode in which a block is encoded with at least one of motion vector information and coefficient information omitted and in which a decoding side can restore the omitted information according to a predetermined rule, the device comprising:

a determination unit that determines whether or not the block can be encoded in the encoding mode, using candidate information including motion information of predetermined adjacent blocks adjacent to the block; and

a pseudo calculation unit that, when the motion information of at least one of the adjacent blocks is not available, generates pseudo motion information in place of the unavailable motion information and provides the pseudo motion information as the candidate information.

10. A program for causing a computer to execute an image information encoding method for encoding image information using motion prediction,

the image information encoding method having an encoding mode in which a block is encoded with at least one of motion vector information and coefficient information omitted and in which a decoding side can restore the omitted information according to a predetermined rule, the method comprising:

a determination step of determining whether or not the block can be encoded in the encoding mode, using candidate information including motion information of predetermined adjacent blocks adjacent to the block; and

a pseudo calculation step of generating, when the motion information of at least one of the adjacent blocks is not available, pseudo motion information in place of the unavailable motion information and providing the pseudo motion information as the candidate information.
