WO2013035452A1

WO2013035452A1 - Image encoding method, image decoding method, and apparatuses and programs thereof

Info

Publication number: WO2013035452A1
Application number: PCT/JP2012/068776
Authority: WO
Inventors: 貴也山本; 内海　端; 大津　誠
Original assignee: シャープ株式会社
Priority date: 2011-09-05
Filing date: 2012-07-25
Publication date: 2013-03-14

Abstract

The objective of the invention is to obtain excellent encoding efficiency in a skipped macroblock mode. A texture image (Tf) comprising the brightness value of each pixel is encoded using a skipped macroblock mode for each macroblock, which is a partial area constituting the texture image. In doing so, a single macroblock (Tb) is divided into a plurality of partitions (Tp1, Tp2) in accordance with the partitions (Db1, Db2) of a depth macroblock (Db), which is the area corresponding to the single macroblock (Tb) in a depth map comprising, for each pixel, a depth value indicating the distance from the viewpoint. A prediction vector is then generated for each of the partitions (Tp1, Tp2), and the macroblock is encoded using the skipped macroblock mode.

Description

Image encoding method, image decoding method, and apparatus and program thereof

The present invention relates to an image encoding method, an image decoding method, and an apparatus and a program thereof.

In video coding, various methods such as intra-frame prediction and motion-compensated inter-frame prediction have been proposed and are used in standards such as MPEG (Moving Picture Experts Group)-2, MPEG-4, and MPEG-4 AVC (Advanced Video Coding) / H.264.

As described in Non-Patent Document 1, one of the methods adopted in MPEG-4 AVC / H.264 is the skipped macroblock mode. In this mode, no information about the macroblock to be encoded is encoded at all. At decoding time, a prediction vector is generated based on the motion vectors of macroblocks adjacent to that macroblock, and the macroblock is decoded by performing motion compensation using that prediction vector.

Patent Document 1 discloses a technique for encoding multi-view video in which, instead of such a prediction vector, a global difference vector indicating the global difference between the encoding target image and an image of a different viewpoint is used to predict the motion vector of the current block, and the predicted motion vector is used as the prediction vector. Patent Document 1 also discloses that encoding and decoding are performed in a skipped macroblock mode using this prediction vector.

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2010-516158

However, since the skipped macroblock mode described in Non-Patent Document 1 and Patent Document 1 does not encode information indicating the shape of the encoding target block, only one shape, for example, 16 × 16, can be selected for the encoding target block. Therefore, when the encoding target block contains a plurality of regions with different motions, motion compensation cannot be performed appropriately in the skipped macroblock mode, and the encoding efficiency is lowered.

The present invention has been made in view of the above circumstances, and an object of the present invention is to provide an image encoding method and an image decoding method capable of obtaining excellent encoding efficiency in the skipped macroblock mode, as well as apparatuses and programs therefor.

In order to solve the above-mentioned problem, a first technical means of the present invention is an image encoding method in which an image encoding device encodes an image composed of luminance values for each pixel in a skipped macroblock mode for each macroblock, which is a partial area constituting the image. The method includes an image encoding step in which the image encoding device divides one macroblock into a plurality of partitions in accordance with the partitions of a depth macroblock, which is the area corresponding to the one macroblock in a depth map composed of depth values indicating the distance from a viewpoint for each pixel, generates a prediction vector for each partition, and performs encoding in the skipped macroblock mode.

According to a second technical means, in the first technical means, the image encoding step includes a dividing step of dividing the one macroblock into a plurality of partitions based on the partition size used when the depth macroblock was encoded, and a calculating step of calculating, for each partition, the prediction vector of the partition based on the motion vectors of adjacent macroblocks adjacent to the one macroblock and the prediction vectors already calculated for adjacent partitions adjacent to the partition.

According to a third technical means, in the second technical means, the calculating step calculates, for each partition, the median of the prediction vectors of the adjacent partitions or the motion vectors of the adjacent macroblocks adjacent to the left, top, and upper right of the partition, and uses the median as the prediction vector of the partition.

According to a fourth technical means, in the first technical means, the image encoding step includes a step of detecting an edge component from the block of the depth map, a dividing step of dividing the one macroblock into a plurality of partitions based on the detected edge component, and a calculating step of calculating, for each partition, the prediction vector of the partition based on the motion vectors of adjacent macroblocks adjacent to the one macroblock and the prediction vectors already calculated for adjacent partitions adjacent to the partition.

According to a fifth technical means, in the fourth technical means, the calculating step uses, for each partition, as the prediction vector of the partition, the prediction vector of the adjacent partition or the motion vector of the adjacent macroblock that is in contact with the partition over the widest area, among the adjacent partitions and adjacent macroblocks adjacent to the left, top, and upper right of the partition.

A sixth technical means is an image encoding device that encodes an image composed of luminance values for each pixel in a skipped macroblock mode for each macroblock, which is a partial area constituting the image. The device comprises image encoding means that divides one macroblock into a plurality of partitions in accordance with the partitions of a depth macroblock, which is the area corresponding to the one macroblock in a depth map composed of depth values indicating the distance from a viewpoint for each pixel, generates a prediction vector for each partition, and performs encoding in the skipped macroblock mode.

A seventh technical means is an image encoding program for causing a computer to execute image encoding processing that encodes an image composed of luminance values for each pixel in a skipped macroblock mode for each macroblock, which is a partial area constituting the image. The image encoding processing includes an image encoding step of dividing one macroblock into a plurality of partitions in accordance with the partitions of a depth macroblock, which is the area corresponding to the one macroblock in a depth map composed of depth values indicating the distance from a viewpoint for each pixel, generating a prediction vector for each partition, and performing encoding in the skipped macroblock mode.

An eighth technical means is an image decoding method including an image decoding step in which an image decoding device decodes a macroblock, which is a partial area constituting an image composed of luminance values for each pixel, that has been encoded in units of partitions in the skipped macroblock mode by the image encoding method of any one of the first to fifth technical means.

A ninth technical means is an image decoding device comprising image decoding means that decodes a macroblock, which is a partial area constituting an image composed of luminance values for each pixel, that has been encoded in units of partitions in the skipped macroblock mode by the image encoding method of any one of the first to fifth technical means.

A tenth technical means is an image decoding program for causing a computer to execute an image decoding step of decoding a macroblock, which is a partial area constituting an image composed of luminance values for each pixel, that has been encoded in units of partitions in the skipped macroblock mode by the image encoding method of any one of the first to fifth technical means.

According to the present invention, an image can be encoded or decoded with excellent coding efficiency in the skipped macroblock mode.

FIG. 1 is a schematic diagram illustrating a configuration example of a three-dimensional image capturing system including an image encoding device according to a first embodiment of the present invention. The subsequent figures are, in order:
A schematic diagram illustrating a configuration example of the image encoding device according to the first embodiment of the present invention.
A schematic diagram illustrating a configuration example of the skipped macroblock unit according to the first embodiment of the present invention.
A conceptual diagram illustrating an example of encoding in the skipped macroblock mode according to the first embodiment of the present invention.
A flowchart for explaining an example of the image encoding process performed by the image encoding device according to the first embodiment of the present invention.
A flowchart continuing the flowchart of the preceding figure.
A flowchart for explaining an example of the skipped macroblock predicted image macroblock generation process in the skipped macroblock unit according to the first embodiment of the present invention.
A schematic diagram illustrating a configuration example of the image decoding device according to the first embodiment of the present invention.
A flowchart for explaining an example of the image decoding process performed by the image decoding device according to the first embodiment of the present invention.
A flowchart continuing the flowchart of the preceding figure.
A schematic diagram illustrating a configuration example of the image encoding device according to a second embodiment of the present invention.
A schematic diagram illustrating a configuration example of the skipped macroblock unit according to the second embodiment of the present invention.
A flowchart for explaining an example of the image encoding process performed by the image encoding device according to the second embodiment of the present invention.
A flowchart for explaining an example of the skipped macroblock predicted image macroblock generation process in the skipped macroblock unit according to the second embodiment of the present invention.
A schematic diagram illustrating a configuration example of the image decoding device according to the second embodiment of the present invention.
A flowchart for explaining an example of the image decoding process performed by the image decoding device according to the second embodiment of the present invention.

In the present invention, an image composed of luminance values for each pixel is encoded or decoded in a skipped macroblock mode for each macroblock, which is a partial region constituting the image. Of course, as will be apparent from the following description, not every macroblock is necessarily encoded or decoded in the skipped macroblock mode. The main feature of the present invention lies in the generation of the prediction vector. More specifically, in the skipped macroblock mode according to the present invention, when a macroblock is predictively encoded, a prediction vector is obtained for each partition derived from the depth map of depth values.
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

(First embodiment)
FIG. 1 is a schematic diagram illustrating a configuration example of a three-dimensional image capturing system including an image encoding device according to the first embodiment of the present invention.
The image capturing system illustrated in FIG. 1 includes an imaging device 3a, an imaging device 3b, an image preprocessing unit 2, and an image encoding device 1.

The imaging device 3a and the imaging device 3b are installed at different positions (viewpoints), and take images of subjects included in the same field of view at predetermined time intervals (for example, once every 1/30 seconds). The imaging device 3a and the imaging device 3b each output an image signal indicating a captured image to the image preprocessing unit 2.

This image signal consists of signal values (luminance values) representing the color and shading of the subject and background included in the subject space, with one set of signal values for each pixel arranged on a two-dimensional plane. The signal values of each pixel represent a color space, for example, as an RGB signal. The RGB signal includes an R signal that represents the luminance value of the red component, a G signal that represents the luminance value of the green component, and a B signal that represents the luminance value of the blue component.

The image preprocessing unit 2 treats the image input from one of the imaging device 3a and the imaging device 3b, for example, the imaging device 3a, as a texture image (sometimes called a "texture map", "reference image", or "two-dimensional image"), and this image is described separately from the image input from the other imaging device (in this example, the imaging device 3b). The texture image is therefore also a general two-dimensional image signal: it consists of signal values (luminance values) representing the color and shading of the subject and background included in the subject space, one set for each pixel arranged on the two-dimensional plane. The image preprocessing unit 2 can also be simply called a "preprocessing unit".

The image preprocessing unit 2 calculates, for each pixel, the parallax between the texture image input from the imaging device 3a and the image input from the other imaging device 3b, and generates a depth map. A depth map (sometimes called a "depth image" or "distance image") is an image signal composed of signal values for each pixel arranged on a two-dimensional plane, where each signal value (sometimes called a "depth value" or simply "depth") corresponds to the distance from the viewpoint (such as an imaging device) to the subject or background included in the subject space. The pixels constituting the depth map correspond to the pixels constituting the texture image. The depth map is therefore a clue for representing the three-dimensional subject space using the texture image, which is a reference image signal obtained by projecting the subject space onto two dimensions. The image preprocessing unit 2 outputs the texture image and the generated depth map to the image encoding device 1.

The image encoding device 1 includes the following image encoding means. This image encoding means encodes an image composed of luminance values for each pixel in a skipped macroblock mode for each macroblock, which is a partial area constituting the image. In doing so, it divides one macroblock into a plurality of partitions according to the partitions of the depth macroblock (depth map macroblock), which is the area corresponding to the one macroblock in the depth map composed of depth values indicating the distance from the viewpoint for each pixel, generates a prediction vector for each partition, and performs encoding in the skipped macroblock mode.

In the present embodiment, the number of image capturing apparatuses included in the image capturing system is described as two. However, the number is not limited to this, and may be three or more. Further, the texture image and the depth map input to the image encoding device 1 do not have to be based on images captured by the imaging device 3a and the imaging device 3b, and may be images synthesized in advance, for example.

Next, a configuration example and functions of the image encoding device 1 according to the present embodiment will be described with reference to FIG. 2. FIG. 2 is a schematic diagram illustrating a configuration example of the image encoding device 1 according to the present embodiment.

The image encoding device 1 illustrated in FIG. 2 includes a texture image input unit 101, a depth map encoding unit 102, a partition size memory 103, a reference image memory 104, an intra prediction unit 105, an inter prediction unit 106, a skipped macroblock unit 107, an encoding mode determination unit 108, a motion vector memory 109, a prediction vector generation unit 110, a subtraction unit 111, a subtraction unit 112, a DCT transform / quantization unit 113, an inverse quantization / inverse DCT transform unit 114, an addition unit 115, and an information source encoding unit 116.

The texture image input unit 101 receives a texture image from the image preprocessing unit 2 for each frame, and extracts a macroblock (hereinafter referred to as a texture image macroblock) from the input texture image. The texture image input unit 101 outputs the extracted texture image macroblock to the encoding mode determination unit 108, the inter prediction unit 106, and the subtraction unit 111. The texture image macroblock is composed of a predetermined number of pixels, for example, 16 pixels in the horizontal direction × 16 pixels in the vertical direction.

The texture image input unit 101 shifts the position of the macroblock from which the texture image macroblock is extracted in raster scan order so that the macroblocks do not overlap. That is, the texture image input unit 101 sequentially moves the macroblock to be extracted from the upper left end of the frame to the right by the number of pixels in the horizontal direction of the macroblock. After the right end of the macroblock to be extracted reaches the right end of the frame, the texture image input unit 101 moves the macroblock down by the number of pixels in the vertical direction of the macroblock and back to the left end of the frame. The texture image input unit 101 moves the macroblock to be extracted in this way until it reaches the lower right of the frame. Hereinafter, the extracted texture image macroblock may be referred to as an encoding target macroblock or a target macroblock.
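The raster-scan extraction described above can be sketched as follows. This is a minimal illustration, not the device's implementation: the function names `macroblock_origins` and `extract_macroblock` are hypothetical, the frame is assumed to be a list of pixel rows, and the frame dimensions are assumed to be multiples of the macroblock size.

```python
def macroblock_origins(frame_width, frame_height, mb_size=16):
    """Yield the top-left (x, y) of each macroblock in raster-scan order.

    Assumption: frame_width and frame_height are multiples of mb_size,
    so the non-overlapping macroblocks cover the frame exactly.
    """
    for y in range(0, frame_height - mb_size + 1, mb_size):      # top to bottom
        for x in range(0, frame_width - mb_size + 1, mb_size):   # left to right
            yield (x, y)


def extract_macroblock(frame, x, y, mb_size=16):
    """Copy an mb_size x mb_size block out of a frame (list of pixel rows)."""
    return [row[x:x + mb_size] for row in frame[y:y + mb_size]]


# Example: a 48 x 32 frame contains 3 x 2 macroblocks of 16 x 16 pixels.
origins = list(macroblock_origins(48, 32))
```

Each origin moves right by one macroblock width until the right edge of the frame is reached, then drops by one macroblock height and returns to the left edge.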

The depth map encoding unit 102 receives the depth map from the image preprocessing unit 2 for each frame, and encodes the depth map for each macroblock constituting each frame using a known image encoding method, for example, an encoding method corresponding to the decoding method described in the ITU-T H.264 standard. The depth map encoding unit 102 outputs the depth map code generated by the encoding to the outside of the image encoding device 1.

The depth map encoding unit 102 stores, in the partition size memory 103, the partition size indicating the shape into which each macroblock of the depth map was divided when that macroblock was encoded. In this embodiment, the partition size is 16 pixels horizontally × 16 pixels vertically, 16 pixels × 8 pixels, 8 pixels × 16 pixels, 8 pixels × 8 pixels, 8 pixels × 4 pixels, 4 pixels × 8 pixels, or 4 pixels × 4 pixels, but the present invention is not limited to a specific partition size.
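As a rough illustration of how a stored partition size determines the shape of the partitions within a macroblock, the sketch below tiles a 16 × 16 macroblock uniformly with one partition size. The function name `partition_rects` is hypothetical, and the uniform tiling is a simplifying assumption; the text above does not state whether different sub-blocks of one macroblock may use different sizes.

```python
def partition_rects(part_w, part_h, mb_size=16):
    """Return (x, y, w, h) rectangles, in raster order, tiling one macroblock.

    Assumption: the whole macroblock uses a single partition size that
    divides the macroblock size evenly.
    """
    if mb_size % part_w or mb_size % part_h:
        raise ValueError("partition size must divide the macroblock size")
    return [(x, y, part_w, part_h)
            for y in range(0, mb_size, part_h)
            for x in range(0, mb_size, part_w)]


# A 16 x 8 partition size splits the macroblock into an upper and a lower half.
halves = partition_rects(16, 8)
```

Applying the rectangles read for a depth macroblock to the texture macroblock at the same coordinates gives the texture partitions without transmitting any shape information for the texture macroblock.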

The partition size memory 103 stores the partition sizes input from the depth map encoding unit 102 in the order of the encoded macroblocks.

The reference image memory 104 stores the reference image macroblock input from the addition unit 115 at the position of the encoding target macroblock in the corresponding frame. This reference image macroblock is an image macroblock obtained by encoding and then decoding a texture image macroblock input in the past. The image signal of a frame formed by arranging reference image macroblocks in this way is called a reference image. The reference image memory 104 sequentially stores reference images for a preset number of past frames, for example, 15 frames, and when the number of stored frames exceeds the preset number, for example, when the reference image of a 16th frame is input, all the previous frame reference images are deleted.

The intra prediction unit 105 reads, from the reference image memory 104, reference image blocks that are in the same frame as the target macroblock, are adjacent to the top and left of the target macroblock, and have already undergone the encoding-related processing. The size of the reference image block is, for example, 16 pixels in the horizontal direction × 16 pixels in the vertical direction, 8 pixels × 8 pixels, or 4 pixels × 4 pixels, but the present invention is not limited by the size of the reference macroblock (the macroblock of the reference image). The intra prediction unit 105 performs intra-frame prediction based on the read reference macroblock, and generates an intra predicted image macroblock. A specific intra-frame prediction method is, for example, the intra-frame prediction method described in the ITU-T H.264 standard. The intra prediction unit 105 outputs the generated intra predicted image macroblock to the encoding mode determination unit 108.

The inter prediction unit 106 divides the target macroblock input from the texture image input unit 101 into one or a plurality of partitions, and sequentially detects a motion vector for each partition. The division into one partition means that the division is not substantially performed.

Specifically, first, the inter prediction unit 106 reads a reference image block constituting a reference image from the reference image memory 104. The size (number of pixels) of the reference image block read by the inter prediction unit 106 is the same as the size of the partition to be processed (hereinafter referred to as the target partition) among the plurality of partitions. Here, the inter prediction unit 106 performs block matching between the target partition in the texture image macroblock input from the texture image input unit 101 and the reference image block. That is, the position at which the reference image block is read from the frame of the past reference image stored in the reference image memory 104 is moved pixel by pixel in the horizontal direction or the vertical direction within a preset range from the position of the target partition. The inter prediction unit 106 detects a difference between the coordinates of the target partition and the coordinates of the reference image block most similar to the target partition as a motion vector.

Here, the inter prediction unit 106 calculates an index value indicating the similarity between the signal value of each pixel included in the target partition and the signal value of each pixel included in the reference image block, for example, the SAD (Sum of Absolute Differences). The inter prediction unit 106 determines the reference image block that minimizes the index value to be the reference image block most similar to the target partition. The inter prediction unit 106 calculates a motion vector based on the coordinates of the target partition and the coordinates of the determined reference image block.
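The block matching described above can be sketched as an exhaustive search over a small window. This is only an illustration under stated assumptions: the names `sad`, `get_block`, and `search_motion_vector` are hypothetical, the search is at integer-pixel precision, candidate blocks falling outside the reference frame are skipped, and the motion vector is taken as the reference block position minus the target partition position (the sign convention is an assumption).

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, x, y, w, h):
    """Copy a w x h block whose top-left pixel is (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]


def search_motion_vector(target_frame, ref_frame, x, y, w, h, search_range=2):
    """Return ((dx, dy), sad) of the reference block most similar to the target partition."""
    target = get_block(target_frame, x, y, w, h)
    height, width = len(ref_frame), len(ref_frame[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            # Skip candidates that would read outside the reference frame.
            if rx < 0 or ry < 0 or rx + w > width or ry + h > height:
                continue
            cost = sad(target, get_block(ref_frame, rx, ry, w, h))
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1], best[0]
```

For example, if the current frame equals the reference frame shifted by two pixels horizontally and one pixel vertically, the search returns the vector (2, 1) with a SAD of zero for a partition whose search window stays inside the frame.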

Also, the inter prediction unit 106 selects the partition size that minimizes the coding cost from among the possible sizes of the partitions constituting the target macroblock. For example, the inter prediction unit 106 performs the following processing for every partition size. First, the inter prediction unit 106 receives a prediction vector corresponding to the target partition from the prediction vector generation unit 110. The inter prediction unit 106 calculates a difference vector between the calculated motion vector and the input prediction vector, and calculates, as a cost value, the sum of the absolute value of the horizontal component of the difference vector, the absolute value of its vertical component, and the index value between the target partition and the reference image block most similar to it. This processing is performed on all partitions constituting the target macroblock to calculate the cost values, and the sum of the cost values is obtained for each partition size. Then, the partition size with the smallest sum of cost values is selected.
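The cost calculation in the preceding paragraph can be written out as a short sketch. The names `partition_cost` and `select_partition_size` and the layout of `candidates` are hypothetical; the index value is assumed to be the SAD already computed for each partition.

```python
def partition_cost(motion_vector, prediction_vector, sad_value):
    """Cost of one partition: |dx| + |dy| of the difference vector plus the SAD."""
    dx = motion_vector[0] - prediction_vector[0]
    dy = motion_vector[1] - prediction_vector[1]
    return abs(dx) + abs(dy) + sad_value


def select_partition_size(candidates):
    """Pick the partition size whose partitions have the smallest total cost.

    candidates maps a partition size (w, h) to a list with one
    (motion_vector, prediction_vector, sad_value) tuple per partition.
    """
    totals = {size: sum(partition_cost(mv, pv, s) for mv, pv, s in parts)
              for size, parts in candidates.items()}
    return min(totals, key=totals.get), totals


candidates = {
    (16, 16): [((3, 1), (1, 1), 120)],
    (16, 8): [((3, 1), (2, 1), 40), ((-2, 0), (0, 0), 30)],
}
best_size, totals = select_partition_size(candidates)
```

In this made-up example the 16 × 8 split costs 73 in total against 122 for the undivided macroblock, so the split is selected.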

As another partition size selection method, for example, an RD cost is used. With the RD cost, the encoding cost J is calculated from the following equation.
J = D + λ × R

Here, D is a difference between the target macroblock and an image obtained by encoding and decoding the target macroblock, and R is an expected generated code amount. The coefficient λ may be a constant or a function of a quantization parameter that controls the roughness of quantization. The present invention is not limited to a specific encoding cost calculation method.
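A minimal sketch of selecting among candidates with the cost J = D + λ × R follows. The names `rd_cost` and `best_by_rd_cost` are hypothetical, and the distortion and rate figures are made-up numbers chosen only to show how λ shifts the decision.

```python
def rd_cost(distortion, rate_bits, lam):
    """Encoding cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def best_by_rd_cost(options, lam):
    """options maps a candidate name to (D, R); return the name with the smallest J."""
    return min(options, key=lambda name: rd_cost(*options[name], lam))


# Hypothetical candidates: a coarse split with few bits, a fine split with many bits.
options = {"16x16": (500.0, 20), "8x8": (300.0, 90)}
choice_low_lambda = best_by_rd_cost(options, 1.0)   # distortion dominates
choice_high_lambda = best_by_rd_cost(options, 5.0)  # rate dominates
```

With a small λ the lower-distortion candidate wins, while a larger λ penalizes the generated code amount more heavily and favors the cheaper candidate.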

Next, the inter prediction unit 106 performs motion compensation based on the size of the partition having the smallest cost value and the motion vector calculated for each partition of the size, and generates a predicted image macroblock. Specifically, first, the inter prediction unit 106 reads a reference image block at coordinates indicated by motion vectors for each partition from the reference image memory 104. The inter prediction unit 106 generates a predicted image macroblock by performing processing for matching the coordinates of the read reference image block with the coordinates of the partition for each partition.
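The motion compensation step can be sketched as copying, for each partition, the reference image block displaced by that partition's motion vector into the partition's position in the predicted macroblock. The function name `motion_compensate` and the data layout are hypothetical, and the sketch assumes integer-pixel vectors that keep every read inside the reference frame.

```python
def motion_compensate(ref_frame, mb_x, mb_y, partitions, mb_size=16):
    """Build a predicted macroblock from per-partition motion vectors.

    partitions is a list of ((x, y, w, h), (dx, dy)), where x and y are
    relative to the macroblock origin (mb_x, mb_y) and (dx, dy) is the
    partition's motion vector in whole pixels.
    """
    pred = [[0] * mb_size for _ in range(mb_size)]
    for (px, py, w, h), (dx, dy) in partitions:
        src_x = mb_x + px + dx   # where the reference block is read from
        src_y = mb_y + py + dy
        for j in range(h):
            for i in range(w):
                pred[py + j][px + i] = ref_frame[src_y + j][src_x + i]
    return pred
```

Each partition is filled independently, so two regions of one macroblock that move differently can each be predicted from a different place in the reference image.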

The inter prediction unit 106 stores the calculated motion vector in the motion vector memory 109 and outputs it to the subtraction unit 112, and outputs the generated predicted image macroblock to the encoding mode determination unit 108.

The skipped macroblock unit 107 reads the partition size of the reference depth macroblock having the same coordinates as the target macroblock from the partition size memory 103, the motion vectors of the partitions adjacent to the left, top, and upper right of the target macroblock from the motion vector memory 109, and the reference image blocks from the reference image memory 104. Here, an area corresponding to the reference macroblock in the depth map, that is, a macroblock having the same coordinates as the reference macroblock, is referred to as a reference depth macroblock. Similarly, a region corresponding to the target macroblock in the depth map, that is, a macroblock having the same coordinates as the target macroblock, is referred to as a target depth macroblock.

The skipped macroblock unit 107 calculates a prediction vector based on the read motion vector. The skipped macroblock unit 107 performs motion compensation based on the calculated prediction vector, thereby generating a predicted image macroblock (hereinafter referred to as a skipped macroblock predicted image macroblock) using the skipped macroblock. The skipped macroblock unit 107 outputs the skipped macroblock predicted image macroblock to the coding mode determination unit 108. Details of the processing will be described later.

The encoding mode determination unit 108 receives, as inputs, the target macroblock from the texture image input unit 101, the intra predicted image macroblock from the intra prediction unit 105, the inter predicted image macroblock from the inter prediction unit 106, and the skipped macroblock predicted image macroblock from the skipped macroblock unit 107.

Then, the coding mode determination unit 108 determines the coding mode with the highest coding efficiency based on the three input predicted image macroblocks. Specifically, for example, the encoding mode determination unit 108 calculates, for each predicted image macroblock, the squared error of the signal value of each pixel relative to the target macroblock, and uses the sum of the squared errors over the predicted image macroblock as the encoding cost. The coding mode determination unit 108 calculates the encoding costs for all coding modes, and sets the coding mode corresponding to the predicted image macroblock with the lowest encoding cost as the coding mode with the highest coding efficiency. Another determination method is, for example, a method using the RD cost. The present invention is not limited to a specific encoding cost calculation method.
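The sum-of-squared-errors decision described above can be sketched as follows. The names `sum_squared_error` and `decide_coding_mode` and the mode labels are hypothetical, and tiny 2 × 2 blocks stand in for 16 × 16 macroblocks to keep the example readable.

```python
def sum_squared_error(target, predicted):
    """Sum over all pixels of the squared difference between two blocks."""
    return sum((t - p) ** 2
               for row_t, row_p in zip(target, predicted)
               for t, p in zip(row_t, row_p))


def decide_coding_mode(target, predictions):
    """Return the mode whose predicted block has the lowest cost, plus all costs."""
    costs = {mode: sum_squared_error(target, pred)
             for mode, pred in predictions.items()}
    return min(costs, key=costs.get), costs


target = [[10, 10], [10, 10]]
predictions = {
    "intra": [[12, 10], [10, 10]],
    "inter": [[10, 11], [10, 10]],
    "skip": [[10, 10], [10, 13]],
}
mode, costs = decide_coding_mode(target, predictions)
```

Here the inter prediction differs from the target by a single pixel with an error of 1, so it has the lowest cost and is selected.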

The encoding mode determination unit 108 outputs an encoding mode signal indicating the encoding mode with the highest encoding efficiency to the information source encoding unit 116. When the coding mode with the highest coding efficiency is the skipped macroblock mode, the coding mode determination unit 108 outputs the skipped macroblock predicted image macroblock to the reference image memory 104. When the coding mode with the highest coding efficiency is intra prediction or inter prediction, the coding mode determination unit 108 outputs whichever of the intra predicted image macroblock and the inter predicted image macroblock corresponds to that coding mode to the subtraction unit 111 and the addition unit 115.

The motion vector memory 109 stores the motion vector input from the inter prediction unit 106 for each frame, each block, and each partition. The motion vector memory 109 outputs the stored motion vector to the prediction vector generation unit 110 and the skipped macroblock unit 107.

The prediction vector generation unit 110 reads, from the motion vector memory 109, the motion vectors of partitions that are adjacent to the target partition and have already undergone the encoding-related processing, for example, the partitions adjacent to the left, top, and upper right of the target partition. Based on the read motion vectors, the prediction vector generation unit 110 calculates a prediction vector by taking, for example, the median of each of the horizontal and vertical components of the three read motion vectors. Instead of the median, any other criterion such as the average value, the maximum value, or the minimum value may be used. The prediction vector generation unit 110 outputs the calculated prediction vector to the inter prediction unit 106 and the subtraction unit 112.
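The component-wise median described above can be sketched with the standard library. The function name `median_prediction_vector` is hypothetical, and motion vectors are assumed to be (horizontal, vertical) integer pairs.

```python
from statistics import median


def median_prediction_vector(left, top, upper_right):
    """Median of the horizontal and of the vertical components, taken separately."""
    xs = [left[0], top[0], upper_right[0]]
    ys = [left[1], top[1], upper_right[1]]
    return (median(xs), median(ys))


# Horizontal components 4, 1, 2 have median 2; vertical components -2, 3, 0 have median 0.
pv = median_prediction_vector((4, -2), (1, 3), (2, 0))
```

Because the two components are handled independently, the resulting prediction vector need not be equal to any one of the three neighboring vectors.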

The subtraction unit 111 calculates a residual signal macroblock by subtracting, from the signal value of each pixel included in the texture image macroblock input from the texture image input unit 101, the signal value of the corresponding pixel included in the intra predicted image macroblock or inter predicted image macroblock input from the coding mode determination unit 108. The subtraction unit 111 outputs the calculated residual signal macroblock to the DCT transform / quantization unit 113.

The subtraction unit 112 subtracts the prediction vector input from the prediction vector generation unit 110 from the motion vector input from the inter prediction unit 106 to calculate a difference vector. The subtraction unit 112 outputs the calculated difference vector to the information source encoding unit 116.

The DCT transform / quantization unit 113 performs a two-dimensional DCT (Discrete Cosine Transform) on the residual signal macroblock input from the subtraction unit 111 to calculate DCT coefficients as a frequency domain signal, and then quantizes the DCT coefficients to generate quantized DCT coefficients. Here, quantization is a process of converting an input DCT coefficient into one of a set of discrete representative values. The DCT transform / quantization unit 113 outputs the generated quantized DCT coefficients to the inverse quantization / inverse DCT transform unit 114 and the information source encoding unit 116.

The inverse quantization / inverse DCT transformation unit 114 performs inverse quantization (de-quantization) on the quantized DCT coefficients input from the DCT transformation / quantization unit 113 to generate inverse-quantized DCT coefficients, and then performs a two-dimensional inverse DCT (inverse Discrete Cosine Transform) on the inverse-quantized DCT coefficients to generate a residual signal macroblock decoded as a spatial domain signal. Hereinafter, this residual signal macroblock is referred to as a decoded residual signal macroblock. The inverse quantization / inverse DCT transform unit 114 outputs the generated decoded residual signal macroblock to the adder 115.
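The inverse path can be sketched in the same spirit. The example below assumes a uniform quantizer with step `qstep` and an orthonormal inverse DCT on a small square block; these choices and the function names are illustrative assumptions, not details taken from the specification.

```python
import math

def dequantize(levels, qstep):
    """Map quantization levels back to representative coefficient values."""
    return [[level * qstep for level in row] for row in levels]

def idct_2d(coeffs):
    """Orthonormal 2-D inverse DCT (DCT-III) of a square coefficient block."""
    n = len(coeffs)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    block = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            total = 0.0
            for u in range(n):
                for v in range(n):
                    total += (scale(u) * scale(v) * coeffs[u][v]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            block[y][x] = total
    return block

# Only the DC level is non-zero, so the decoded residual is flat.
levels = [[5, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
decoded_residual = idct_2d(dequantize(levels, qstep=8))
print(decoded_residual[0][0])  # 10.0
```

In general the decoded residual differs from the original residual by the quantization error, which is why the encoder reconstructs its reference images from the decoded residual rather than from the original one.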

The adder 115 adds the signal value of each pixel included in the intra-predicted image macroblock or inter-predicted image macroblock input from the coding mode determination unit 108 and the signal value of each pixel included in the decoded residual signal macroblock input from the inverse quantization / inverse DCT transform unit 114 to generate a reference image macroblock. The adder 115 stores the generated reference image macroblock in the reference image memory 104.
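The relationship between the subtraction performed by the unit 111 and the addition performed by the adder 115 can be sketched as follows. The function names and the 2 × 2 block are assumptions chosen to keep the example short, and the residual is passed through unchanged (no quantization loss) so that the round trip is exact.

```python
def subtract_blocks(target, prediction):
    """Pixel-wise residual: target minus prediction."""
    return [[t - p for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(target, prediction)]

def add_blocks(prediction, residual):
    """Pixel-wise reconstruction: prediction plus (decoded) residual."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]

target = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]

residual = subtract_blocks(target, prediction)
reference = add_blocks(prediction, residual)

print(residual)   # [[2, 5], [1, -1]]
print(reference)  # [[52, 55], [61, 59]]
```

With quantization in the loop, `residual` would be replaced by the decoded residual, and `reference` would only approximate `target`.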

The information source encoding unit 116 encodes the quantized DCT coefficients input from the DCT transform / quantization unit 113 and the difference vector input from the subtraction unit 112 to generate an encoded stream. As the encoding, the information source encoding unit 116 performs, for example, variable-length encoding, which generates an encoded stream having a smaller amount of information than the input information. Variable-length coding is a coding method that compresses the amount of information by assigning shorter codes to information that appears more frequently in the input and longer codes to information that appears less frequently.
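To make the idea of variable-length coding concrete, the sketch below uses the order-0 Exp-Golomb code for non-negative integers. The specification does not say which variable-length code the unit 116 uses; Exp-Golomb is chosen here purely as an illustration, under the assumption that small values are the most frequent ones.

```python
def exp_golomb_unsigned(value):
    """Order-0 Exp-Golomb codeword for a non-negative integer, as a bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]                # binary form of value + 1
    return "0" * (len(binary) - 1) + binary    # zero prefix signals the length

for v in range(5):
    print(v, exp_golomb_unsigned(v))
# 0 1
# 1 010
# 2 011
# 3 00100
# 4 00101
```

Smaller values receive shorter codewords, so a source dominated by small values is represented with fewer bits than a fixed-length code would need.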

However, when an encoding mode signal indicating the skipped macroblock mode is input from the encoding mode determination unit 108, the information source encoding unit 116 does not use the quantized DCT coefficients or the difference vector, and encodes only the input encoding mode signal to generate an encoded stream. The information source encoding unit 116 outputs the generated encoded stream to the outside of the image encoding device 1.

Next, a configuration example and functions of the skipped macroblock unit 107 according to the present embodiment will be described with reference to FIG. 3. FIG. 3 is a schematic diagram illustrating a configuration example of the skipped macroblock unit 107 according to the present embodiment.

The skipped macroblock unit 107 illustrated in FIG. 3 includes a skipped macroblock prediction vector generation unit 1071 and a prediction macroblock generation unit 1072.

Prior to the description of the units 1071 and 1072, an example of encoding in the skipped macroblock mode according to the present embodiment will be described with reference to FIG. 4. FIG. 4 is a conceptual diagram showing an example of encoding in the skipped macroblock mode according to the present embodiment.

FIG. 4A shows an example of a texture image Tf of a frame to be encoded, and FIG. 4B shows an example of a depth map Df corresponding to the texture image Tf. A square frame at the center of the texture image Tf indicates the target macroblock Tb. A square frame in the center of the depth map Df indicates the region corresponding to the target macroblock in the depth map Df, that is, the target depth macroblock Db, which is the macroblock having the same coordinates as the target macroblock Tb.

The three dotted rectangular frames around the target macroblock Tb are already encoded macroblocks TbA, TbB, and TbC adjacent to the target macroblock Tb. The arrows coming out of the macro blocks TbB and TbC indicate the motion vectors MvB and MvC used when the macro blocks TbB and TbC are encoded.

It is assumed that the target depth macroblock Db has already been encoded and is divided into two partitions Db1 and Db2, each having a size of 8 × 16. When the target macroblock Tb is encoded in the skipped macroblock mode, the target macroblock Tb is divided into partitions Tp1 and Tp2 having the same sizes as the partitions of the target depth macroblock Db. Arrows extending from the partitions Tp1 and Tp2 of the target macroblock indicate the prediction vectors Pv1 and Pv2 of the partitions.
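The division of the target macroblock according to the depth partition size can be sketched as follows. The sketch assumes a 16 × 16 macroblock, a partition size given as (width, height), and partitions that tile the macroblock uniformly in raster order; the function name `split_macroblock` and the rectangle layout `(x, y, width, height)` are hypothetical.

```python
MB_SIZE = 16  # assumed macroblock size in pixels

def split_macroblock(part_width, part_height, mb_size=MB_SIZE):
    """Return the partitions of a macroblock as (x, y, width, height)
    rectangles in raster order, for a given partition size."""
    if mb_size % part_width or mb_size % part_height:
        raise ValueError("partition size must divide the macroblock size")
    return [(x, y, part_width, part_height)
            for y in range(0, mb_size, part_height)
            for x in range(0, mb_size, part_width)]

# An 8 x 16 depth partition size gives two side-by-side texture partitions,
# corresponding to Tp1 (left) and Tp2 (right) in the example.
print(split_macroblock(8, 16))  # [(0, 0, 8, 16), (8, 0, 8, 16)]
```

Because the partition size is taken from the already encoded depth macroblock, the same division can be reproduced without transmitting any partition information for the texture macroblock.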

As an example, a method for generating the prediction vector Pv2 will be described. The prediction vector Pv2 is generated by taking the median, for example for each of the horizontal and vertical components, of the motion vectors MvB and MvC of the macroblocks TbB and TbC adjacent to the top and upper right of the partition Tp2 and the prediction vector Pv1 of the partition Tp1 adjacent to the left. Likewise, the prediction vector Pv1 is generated by taking the median, for example for each of the horizontal and vertical components, of the motion vectors MvB and MvC and the motion vector of the macroblock TbA adjacent to the left. Of course, the median need not be used to calculate the prediction vector, and the vectors used as the calculation source are not limited to these.

Also, when the motion vectors of the adjacent macroblock used for calculating the prediction vector exist in units of partitions, that is, when the macroblock serving as the calculation source was itself encoded in the skipped macroblock mode, the motion vector of the adjacent partition may be used for the calculation.

Thus, in the present embodiment, as a main feature of the present invention, one macroblock Tb is divided into a plurality of partitions (Tp1, Tp2 in this example) according to the partitions of the depth macroblock Db (Db1, Db2 in this example). A prediction vector is generated for each partition in the one macroblock Tb, and is encoded by the skipped macroblock mode.

Referring back to FIG. 3, this processing will be described. The skipped macroblock prediction vector generation unit 1071 reads the partition size of the reference depth macroblock having the same coordinates as the target macroblock from the partition size memory 103, and reads the motion vectors of the partitions adjacent to the left, top, and upper right of the target macroblock from the motion vector memory 109. The skipped macroblock prediction vector generation unit 1071 divides the target macroblock into one or a plurality of partitions based on the read partition size, and calculates a prediction vector for each partition.

Similar to the prediction vector generation unit 110, the skipped macroblock prediction vector generation unit 1071 calculates a prediction vector by taking the median of the horizontal components and the median of the vertical components of the motion vectors of the partitions adjacent to the left, top, and upper right of the target partition. At this time, as in the example in which the prediction vector Pv1 is used when calculating the prediction vector Pv2, if an adjacent partition is included in the same macroblock as the target partition, the prediction vector of that adjacent partition is used instead of a motion vector to calculate the prediction vector of the target partition. Then, the skipped macroblock prediction vector generation unit 1071 outputs the calculated prediction vector for each partition to the prediction macroblock generation unit 1072.
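The per-partition calculation, including the substitution of an already computed prediction vector for a neighbour inside the same macroblock, can be sketched as follows. The data layout is an assumption made for the example: each partition lists its three neighbours (left, top, upper right), and a neighbour is either a motion vector `(x, y)` or a marker `("pv", j)` referring to the prediction vector of an earlier partition `j` of the same macroblock. The numeric vectors are made up.

```python
from statistics import median

def skip_prediction_vectors(neighbours_per_partition):
    """Compute one prediction vector per partition, in processing order.

    A neighbour given as ("pv", j) is resolved to the prediction vector
    already computed for partition j of the same macroblock.
    """
    prediction_vectors = []
    for neighbours in neighbours_per_partition:
        resolved = [prediction_vectors[n[1]] if n[0] == "pv" else n
                    for n in neighbours]
        pv = (median(v[0] for v in resolved),   # horizontal component
              median(v[1] for v in resolved))   # vertical component
        prediction_vectors.append(pv)
    return prediction_vectors

neighbours = [
    [(0, 0), (8, -4), (12, 2)],      # first partition: three motion vectors
    [("pv", 0), (4, -6), (6, -8)],   # second partition: left neighbour is the first partition
]
print(skip_prediction_vectors(neighbours))  # [(8, 0), (6, -6)]
```

Processing the partitions in a fixed order matters here, since a later partition may depend on the prediction vector of an earlier one.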

The prediction macroblock generation unit 1072 reads the partition size of the reference depth macroblock having the same coordinates as the target macroblock from the partition size memory 103, and receives the prediction vector corresponding to each partition in the target macroblock from the skipped macroblock prediction vector generation unit 1071. The prediction macroblock generation unit 1072 performs motion compensation based on the read partition size and the prediction vector of each partition, and generates a skipped macroblock prediction image macroblock.

Specifically, first, the prediction macroblock generation unit 1072 reads the reference image block at the coordinates indicated by the prediction vector for each partition from the reference image memory 104. At this time, the size of the reference image block is the same as the size of the corresponding partition. The predicted macroblock generation unit 1072 generates a skipped macroblock predicted image macroblock by performing processing for matching the coordinates of the read reference image block with the coordinates of the partition. The prediction macroblock generation unit 1072 outputs the generated skipped macroblock prediction image macroblock to the encoding mode determination unit 108.
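The motion compensation step described above can be sketched as a per-partition block copy. The sketch assumes integer-pel prediction vectors, a reference image stored as a list of pixel rows, and source coordinates that stay inside the reference image; the function name and argument layout are hypothetical.

```python
def motion_compensate(reference, mb_x, mb_y, partitions, vectors, mb_size=16):
    """Build a predicted macroblock by copying, for each partition, the
    reference block displaced by that partition's prediction vector.

    partitions: list of (x, y, width, height) inside the macroblock
    vectors:    list of (vx, vy), one per partition (integer-pel, assumed)
    """
    predicted = [[0] * mb_size for _ in range(mb_size)]
    for (px, py, width, height), (vx, vy) in zip(partitions, vectors):
        for dy in range(height):
            for dx in range(width):
                src_x = mb_x + px + dx + vx
                src_y = mb_y + py + dy + vy
                predicted[py + dy][px + dx] = reference[src_y][src_x]
    return predicted

# Synthetic 32 x 32 reference whose pixel value encodes its own position.
reference = [[100 * y + x for x in range(32)] for y in range(32)]
partitions = [(0, 0, 8, 16), (8, 0, 8, 16)]
vectors = [(1, 0), (-2, 3)]

predicted = motion_compensate(reference, 8, 8, partitions, vectors)
print(predicted[0][0], predicted[0][8])  # 809 1114
```

Each partition is fetched with its own vector, so the two halves of the predicted macroblock can come from different places in the reference image.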

Next, image encoding processing performed by the image encoding device 1 according to the present embodiment will be described with reference to FIG. 5. FIG. 5 is a flowchart for explaining an example of the image encoding process performed by the image encoding device 1 according to the present embodiment.

First, in step S101, the texture image input unit 101 receives a reference image (texture image) from the image preprocessing unit 2 for each frame. Thereafter, the process proceeds to step S102. In step S102, the depth map encoding unit 102 receives the depth map from the image preprocessing unit 2 for each frame, and encodes it for each macroblock constituting each frame. The depth map encoding unit 102 outputs the depth map code generated by the encoding to the outside of the image encoding device 1. Thereafter, the process proceeds to step S103.

In step S103, the depth map encoding unit 102 stores the size of the partition selected at the time of encoding for each macroblock in the partition size memory 103. Thereafter, the process proceeds to step S104. In step S104, the image encoding device 1 performs the processing in steps S105 to S122 for each macroblock included in the input frame.

In step S105, the intra prediction unit 105 reads from the reference image memory 104 a reference image block that has been subjected to processing related to encoding adjacent to the top and left of the target macroblock in the same frame as the target macroblock. The intra prediction unit 105 performs intra-frame prediction based on the read reference macroblock, and generates an intra predicted image macroblock. The intra prediction unit 105 outputs the generated intra predicted image macroblock to the encoding mode determination unit 108. Thereafter, the process proceeds to step S106.

In step S106, the prediction vector generation unit 110 reads the motion vector of the reference image block adjacent to the target partition from the motion vector memory 109. The prediction vector generation unit 110 extracts the median value of the horizontal component and the median value of the vertical component from the read vector of the adjacent reference image block, and calculates the prediction vector from the extraction result. The prediction vector generation unit 110 calculates the prediction vectors of all partitions in the target macroblock using the above calculation method. The prediction vector generation unit 110 outputs the calculated prediction vector to the inter prediction unit 106 and the subtraction unit 112. Thereafter, the process proceeds to step S107.

In step S107, the inter prediction unit 106 receives the target macroblock from the texture image input unit 101 and the prediction vector from the prediction vector generation unit 110, and reads the reference image block constituting the reference image from the reference image memory 104. The inter prediction unit 106 divides the target macroblock into one or a plurality of partitions, and sequentially detects a motion vector for each partition. The inter prediction unit 106 selects the partition size that has the minimum coding cost among the sizes of the partitions constituting the target macroblock. The inter prediction unit 106 performs motion compensation based on the partition size with the minimum encoding cost and the motion vector of each partition, and generates an inter predicted image macroblock. The inter prediction unit 106 outputs the calculated motion vector to the motion vector memory 109 and the subtraction unit 112. The inter prediction unit 106 outputs the generated inter prediction image macroblock to the encoding mode determination unit 108. Thereafter, the process proceeds to step S108.

In step S108, the skipped macroblock unit 107 reads the motion vectors from the motion vector memory 109, the partition size of the depth macroblock having the same display time and the same coordinates as the target macroblock from the partition size memory 103, and the reference image blocks from the reference image memory 104. The skipped macroblock unit 107 calculates a prediction vector of each partition based on the read partition size and motion vectors. The skipped macroblock unit 107 performs motion compensation based on the calculated prediction vectors, and generates a skipped macroblock prediction image macroblock. The skipped macroblock unit 107 outputs the skipped macroblock predicted image macroblock to the coding mode determination unit 108. Thereafter, the process proceeds to step S109.

In step S109, the encoding mode determination unit 108 receives as inputs the target macroblock from the texture image input unit 101, the intra prediction image macroblock from the intra prediction unit 105, the inter prediction image macroblock from the inter prediction unit 106, and the skipped macroblock predicted image macroblock from the skipped macroblock unit 107. The encoding mode determination unit 108 determines the encoding mode with the highest encoding efficiency based on the three input predicted image macroblocks. Thereafter, the process proceeds to step S110.
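The selection among the three predicted image macroblocks can be sketched as picking the candidate with the lowest cost. This passage says only that the mode with the highest encoding efficiency is chosen and does not state how efficiency is measured; the sum of absolute differences (SAD) used below is a stand-in cost assumed for the example, and the function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def choose_mode(target, candidates):
    """Return the mode whose predicted block is closest to the target.

    candidates maps a mode name to its predicted block. SAD is an
    assumed cost measure, not the one defined by the specification.
    """
    return min(candidates, key=lambda mode: sad(target, candidates[mode]))

target = [[10, 12], [14, 16]]
candidates = {
    "intra": [[10, 10], [10, 10]],  # SAD 12
    "inter": [[10, 12], [14, 15]],  # SAD 1
    "skip":  [[11, 12], [14, 18]],  # SAD 3
}
print(choose_mode(target, candidates))  # inter
```

A practical encoder would normally also weigh the number of bits each mode costs, which favours the skipped macroblock mode because it transmits neither a residual nor a difference vector.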

In step S110, the encoding mode determination unit 108 generates an encoding mode signal corresponding to the selected encoding mode, and outputs it to the information source encoding unit 116. After that, in step S111, the encoding mode determination unit 108 checks the selected encoding mode. If the skipped macroblock mode is selected (YES in step S111), the process proceeds to step S112. If the intra prediction mode or the inter prediction mode is selected (NO in step S111), the process proceeds to step S114.

In step S112, the encoding mode determination unit 108 stores the skipped macroblock predicted image macroblock in the reference image memory 104 as a reference image block. Thereafter, the process proceeds to step S113.

In step S113, the information source encoding unit 116 receives an encoding mode signal indicating that the encoding mode is the skipped macroblock mode as an input, and encodes it using variable length encoding. Thereafter, the process proceeds to step S124.

In step S114, the subtraction unit 111 subtracts the signal value of each pixel included in the inter-predicted image macroblock or intra-predicted image macroblock input from the coding mode determination unit 108 from the signal value of each pixel included in the target macroblock input from the texture image input unit 101 to calculate a residual signal macroblock. The subtraction unit 111 outputs the calculated residual signal macroblock to the DCT transform / quantization unit 113. Thereafter, the process proceeds to step S115.

In step S115, the DCT transform / quantization unit 113 performs a two-dimensional DCT on the residual signal macroblock input from the subtraction unit 111 to calculate DCT coefficients, and quantizes the DCT coefficients to generate quantized DCT coefficients. The DCT transform / quantization unit 113 outputs the generated quantized DCT coefficients to the inverse quantization / inverse DCT transform unit 114 and the information source coding unit 116. Thereafter, the process proceeds to step S116.

In step S116, the inverse quantization / inverse DCT transformation unit 114 performs inverse quantization on the quantized DCT coefficients input from the DCT transformation / quantization unit 113 to generate inverse-quantized DCT coefficients, and then performs a two-dimensional inverse DCT on the inverse-quantized DCT coefficients to generate a decoded residual signal macroblock. The inverse quantization / inverse DCT transform unit 114 outputs the generated decoded residual signal macroblock to the adder 115. Thereafter, the process proceeds to step S117.

In step S117, the addition unit 115 adds the signal value of each pixel included in the inter prediction image macroblock or the intra prediction image macroblock input from the inter prediction unit 106 and the signal value of each pixel included in the decoded residual signal macroblock input from the inverse quantization / inverse DCT conversion unit 114 to generate a reference image macroblock. Thereafter, the process proceeds to step S118.

In step S118, the adding unit 115 stores the generated reference image macroblock in the reference image memory 104. Thereafter, in step S119, according to the determination result of the encoding mode by the encoding mode determination unit 108, when the encoding mode is inter prediction (YES in step S119), the process proceeds to step S120, and when the encoding mode is intra prediction (NO in step S119), the process proceeds to step S123.

In step S120, the motion vector memory 109 stores the motion vector of each partition of the target macroblock input from the inter prediction unit 106 for each frame, each block, and each partition. Thereafter, the process proceeds to step S121.

In step S121, the subtraction unit 112 calculates a difference vector by subtracting the prediction vector input from the prediction vector generation unit 110 from the motion vector input from the inter prediction unit 106. The subtraction unit 112 outputs the calculated difference vector to the information source encoding unit 116. Thereafter, the process proceeds to step S122.

In step S122, the information source encoding unit 116 receives an encoding mode signal indicating the inter prediction encoding mode input from the encoding mode determination unit 108, and a quantization input from the DCT transform / quantization unit 113. The DCT coefficient and the difference vector input from the subtracting unit 112 are encoded to generate an encoded stream. Thereafter, the process proceeds to step S124.

In step S123, the information source encoding unit 116 receives the encoding mode signal indicating the intra prediction encoding mode input from the encoding mode determination unit 108 and the quantization input from the DCT transform / quantization unit 113. The DCT coefficients are encoded and an encoded stream is generated. Thereafter, the process proceeds to step S124.

In step S124, the information source encoding unit 116 outputs the generated encoded stream to the outside of the image encoding device 1. Thereafter, the process proceeds to step S125.
In step S125, the image encoding device 1 sequentially changes the macroblock to be encoded in raster scan order, and returns to step S105. However, if the image encoding device 1 determines that no block remains unencoded in the frame of the input texture image, it ends the processing for that frame.
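The branching in steps S111 to S123 determines which items are passed to the information source encoding unit 116 for each mode. A minimal sketch of that selection is given below; the mode names `"skip"`, `"inter"`, and `"intra"`, the tuple representation, and the function name are assumptions made for the example.

```python
def symbols_to_encode(mode, quantized_coeffs=None, diff_vector=None):
    """List the items encoded for one macroblock, depending on its mode."""
    if mode == "skip":
        # Step S113: only the encoding mode signal is encoded.
        return [("mode", mode)]
    if mode == "inter":
        # Step S122: mode signal, quantized DCT coefficients, difference vector.
        return [("mode", mode), ("coeffs", quantized_coeffs),
                ("diff_vector", diff_vector)]
    if mode == "intra":
        # Step S123: mode signal and quantized DCT coefficients.
        return [("mode", mode), ("coeffs", quantized_coeffs)]
    raise ValueError("unknown encoding mode: " + str(mode))

print(symbols_to_encode("skip"))
# [('mode', 'skip')]
print(symbols_to_encode("inter", quantized_coeffs=[5, 0, 0], diff_vector=(3, -3)))
# [('mode', 'inter'), ('coeffs', [5, 0, 0]), ('diff_vector', (3, -3))]
```

The skipped macroblock mode is the cheapest of the three to signal, since neither coefficients nor a difference vector are written to the stream.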

Next, the skipped macroblock prediction image macroblock generation processing in the skipped macroblock unit 107 according to the present embodiment will be described with reference to FIG. 7. FIG. 7 is a flowchart for explaining an example of the skipped macroblock predicted image macroblock generation process in the skipped macroblock unit 107 according to the present embodiment. This process corresponds to step S108 described above.

First, in step S201, the skipped macroblock prediction vector generation unit 1071 and the prediction macroblock generation unit 1072 read the partition size of the reference depth macroblock having the same coordinates as the target macroblock from the partition size memory 103. Thereafter, the process proceeds to step S202.

In step S202, the skipped macroblock unit 107 performs the processing in steps S203 to S205 for each partition in the target macroblock (that is, as many as the number of partitions).

In step S203, the skipped macroblock prediction vector generation unit 1071 reads the motion vectors of the partitions adjacent to the left, top, and top right of the target macroblock. The skipped macroblock prediction vector generation unit 1071 calculates a prediction vector by taking the median of the horizontal components and the median of the vertical components of the motion vectors or prediction vectors of the partitions adjacent to the left, top, and top right of the target partition. Thereafter, the process proceeds to step S204.

In step S204, the prediction macroblock generation unit 1072 inputs the prediction vector of the target partition from the skipped macroblock prediction vector generation unit 1071. The prediction macroblock generation unit 1072 performs motion compensation based on the prediction vector of the target partition, and generates a skipped macroblock prediction image macroblock.

In step S205, the skipped macroblock unit 107 sequentially changes the target partition and returns to step S203. However, if the skipped macroblock unit 107 determines that no partition remains in the target macroblock for which the skipped macroblock predicted image macroblock generation has not yet been performed, it ends the processing for the target macroblock.

As described above, in the image encoding device 1 according to the present embodiment, one macroblock is divided into a plurality of partitions based on the size of the partitions used when the corresponding depth macroblock was encoded. Then, in this image encoding device 1, a prediction vector is calculated for each partition based on the motion vectors of the adjacent macroblocks adjacent to the one macroblock and the prediction vectors already calculated for the adjacent partitions adjacent to that partition. Here, as described above, it is preferable to calculate, for each partition, the median of the prediction vectors of the adjacent partitions or the motion vectors of the adjacent macroblocks located to the left, top, and upper right of the partition as the prediction vector of that partition.

Next, the image decoding apparatus according to this embodiment will be described. The image decoding apparatus according to the present embodiment includes image decoding means for decoding a macroblock, which is a partial region constituting an image composed of luminance values for each pixel and which has been encoded in units of partitions in the skipped macroblock mode by the image encoding method of the image encoding apparatus 1 described above. Hereinafter, a configuration example of such an image decoding device will be described with reference to FIG. 8. FIG. 8 is a schematic diagram illustrating a configuration example of the image decoding apparatus according to the present embodiment.

The image decoding device 4 illustrated in FIG. 8 includes an information source decoding unit 401, a depth map decoding unit 402, a partition size memory 403, an addition unit 404, a motion vector memory 405, a prediction vector generation unit 406, an inverse quantization / inverse DCT transform unit 407, an addition unit 408, a reference image memory 409, an intra prediction image generation unit 410, an inter prediction image generation unit 411, a skipped macroblock unit 412, a coding mode determination unit 413, and an image output unit 414.

The information source decoding unit 401 decodes an encoded stream for each block input from the outside of the image decoding device 4 to generate quantized DCT coefficients and a difference vector, as well as an encoding mode signal indicating whether the encoding mode is intra prediction, inter prediction, or the skipped macroblock mode. The decoding method performed by the information source decoding unit 401 is the reverse of the encoding method performed by the information source encoding unit 116. That is, when the information source encoding unit 116 performs variable length encoding, the information source decoding unit 401 performs variable length decoding. The information source decoding unit 401 outputs the generated difference vector to the addition unit 404. When the information source decoding unit 401 generates quantized DCT coefficients, it outputs the generated quantized DCT coefficients to the inverse quantization / inverse DCT transform unit 407. The information source decoding unit 401 outputs the generated encoding mode signal to the encoding mode determination unit 413.

When the information source decoding unit 401 generates a coding mode signal indicating the skipped macroblock mode, the information source decoding unit 401 outputs a quantized DCT coefficient indicating that the value is zero to the inverse quantization / inverse DCT transform unit 407. In addition, a difference vector indicating that the value is zero is output to the adding unit 404. In this case, the residual signal described later is also zero, and the image decoding device 4 generates a decoded image without using the residual signal. Therefore, in this case, the inverse quantization / inverse DCT conversion unit 407 may generate a residual signal in which the signal value of each pixel becomes zero, and output the generated residual signal to the addition unit 408.

The depth map decoding unit 402 receives a depth map code from the outside of the image decoding apparatus 4 for each macroblock, and decodes it using a known image decoding method, for example the decoding method described in the ITU-T H.264 standard. The depth map decoding unit 402 outputs the depth macroblock generated by the decoding to the outside of the image decoding device 4.

The depth map decoding unit 402 stores, in the partition size memory 403, the partition size indicating the shape obtained by dividing the macro block when each macro block of the depth map is encoded.

The partition size memory 403 stores the partition sizes input from the depth map decoding unit 402 in the order of decoded macroblocks.

The addition unit 404 adds the difference vector input from the information source decoding unit 401 and the prediction vector input from the prediction vector generation unit 406 to generate a motion vector. The adder 404 stores the generated motion vector in the motion vector memory 405.
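The addition in the unit 404 is the inverse of the subtraction performed by the subtraction unit 112 on the encoder side. The short sketch below illustrates the round trip; the function names and the `(x, y)` tuple representation are assumptions made for the example.

```python
def difference_vector(motion_vector, prediction_vector):
    """Encoder side: motion vector minus prediction vector."""
    return (motion_vector[0] - prediction_vector[0],
            motion_vector[1] - prediction_vector[1])

def reconstruct_motion_vector(diff_vector, prediction_vector):
    """Decoder side: difference vector plus prediction vector."""
    return (diff_vector[0] + prediction_vector[0],
            diff_vector[1] + prediction_vector[1])

mv = (7, -3)
pv = (4, 0)

diff = difference_vector(mv, pv)
print(diff)                                 # (3, -3)
print(reconstruct_motion_vector(diff, pv))  # (7, -3)
```

The round trip is exact only if the decoder derives the same prediction vector as the encoder, which is why both sides compute it from already processed neighbouring vectors.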

The motion vector memory 405 stores the motion vector input from the adding unit 404 for each frame and each block. The motion vector memory 405 outputs the stored motion vector to the inter predicted image generation unit 411 and the skipped macroblock unit 412.

The prediction vector generation unit 406 reads the motion vector of the block adjacent to the decoding target block and subjected to the processing related to decoding from the motion vector memory 405. The prediction vector generation unit 406 calculates a prediction vector based on the read motion vector. The process for the prediction vector generation unit 406 to calculate the prediction vector is the same as that of the prediction vector generation unit 110. The prediction vector generation unit 406 outputs the calculated prediction vector to the addition unit 404.

The inverse quantization / inverse DCT transform unit 407 performs inverse quantization on the quantized DCT coefficient input from the information source decoding unit 401 to generate an inverse quantized DCT coefficient, and then performs 2 for the inverse quantized DCT coefficient. Dimensional inverse DCT is performed to generate a decoded residual signal block as a spatial domain signal. The inverse quantization / inverse DCT transform unit 407 outputs the decoded residual signal block to the adder 408.

The adder 408 adds the signal value of each pixel included in the intra-predicted image macroblock, the inter-predicted image macroblock, or the skipped macroblock predicted image macroblock input from the coding mode determining unit 413 and the signal value of each pixel included in the decoded residual signal block input from the inverse quantization / inverse DCT transform unit 407 to generate a reference image macroblock. The adder 408 stores the generated reference image macroblock in the reference image memory 409 and outputs it to the image output unit 414.

The reference image memory 409 arranges the reference image macroblock input from the adder 408 at the position of the decoding target macroblock in the corresponding frame and stores it as a reference image. The reference image memory 409 stores the reference images of a preset number of past frames (for example, 15 frames); when the number of stored frames exceeds the preset number, for example when the reference image of the 16th frame is input, all the reference images of the past frames are deleted.

The intra-predicted image generation unit 410 reads, from the reference image memory 409, the already decoded reference image macroblocks adjacent to the left of and above the target macroblock in the same frame as the target macroblock, and performs intra-frame prediction based on the read reference image macroblocks to generate an intra predicted image macroblock. The intra predicted image generation unit 410 outputs the generated intra predicted image macroblock to the encoding mode determination unit 413.

The inter prediction image generation unit 411 reads a motion vector from the motion vector memory 405. The inter prediction image generation unit 411 reads a reference image block constituting a reference image from the reference image memory 409 for each partition constituting the target macroblock. The coordinates from which the inter prediction image generation unit 411 reads the reference image block are obtained by adding the motion vector input from the motion vector memory 405 to the coordinates of the decoding target partition, and the size of the reference image block is the same as the size of the corresponding partition. The inter prediction image generation unit 411 generates an inter prediction image macroblock by aligning the coordinates of the read reference image block with the coordinates of the partition. The inter prediction image generation unit 411 outputs the generated inter prediction image macroblock to the encoding mode determination unit 413.

The skipped macroblock unit 412 reads the partition size of the depth macroblock having the same coordinates as the target macroblock from the partition size memory 403, the motion vectors of the partitions adjacent to the left, top, and upper right of the target macroblock from the motion vector memory 405, and the reference image blocks from the reference image memory 409. The skipped macroblock unit 412 divides the target macroblock into a plurality of partitions based on the read partition size, calculates a prediction vector for each partition based on the read motion vectors, and performs motion compensation based on the calculated prediction vectors to generate a skipped macroblock prediction image macroblock. The processing by which the skipped macroblock unit 412 generates the skipped macroblock predicted image macroblock is the same as that of the skipped macroblock unit 107. The skipped macroblock unit 412 outputs the skipped macroblock predicted image macroblock to the encoding mode determination unit 413.

The encoding mode determination unit 413 receives an encoding mode signal from the information source decoding unit 401, an intra prediction image macroblock from the intra prediction image generation unit 410, an inter prediction image macroblock from the inter prediction image generation unit 411, and a skipped macroblock prediction image macroblock from the skipped macroblock unit 412. Based on the encoding mode indicated by the encoding mode signal, the encoding mode determination unit 413 outputs one of the intra prediction image macroblock, the inter prediction image macroblock, and the skipped macroblock prediction image macroblock to the addition unit 408.
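The selection made by the unit 413 and the subsequent addition in the unit 408 can be sketched together as follows. The mode names, the dictionary of candidate predictions, and the function name are assumptions made for the example; for the skipped macroblock mode the sketch uses an all-zero residual, in line with the zero-valued coefficients described for that mode.

```python
def decode_macroblock(mode, predictions, residual=None):
    """Pick the predicted block named by the mode signal and add the
    decoded residual; skipped macroblocks use a zero residual."""
    predicted = predictions[mode]
    if mode == "skip":
        residual = [[0] * len(row) for row in predicted]
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, residual)]

predictions = {
    "intra": [[10, 10], [10, 10]],
    "inter": [[10, 12], [14, 15]],
    "skip":  [[11, 12], [14, 18]],
}

print(decode_macroblock("skip", predictions))
# [[11, 12], [14, 18]]
print(decode_macroblock("inter", predictions, residual=[[0, 0], [0, 1]]))
# [[10, 12], [14, 16]]
```

In the skipped macroblock mode the decoded block is therefore the motion-compensated prediction itself.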

The image output unit 414 generates the reference image for each frame by arranging the reference image macroblock input from the adding unit 408 at the position of the decoding target macroblock in the corresponding frame. The image output unit 414 outputs the generated reference image as a decoded texture image for each frame to the outside of the image decoding device 4.

Next, image decoding processing performed by the image decoding device 4 according to the present embodiment will be described with reference to FIGS. 9 and 10. FIG. 9 is a flowchart for explaining an example of the image decoding process performed by the image decoding device 4 according to the present embodiment, and FIG. 10 is a flowchart following FIG. 9.

In step S301, the image decoding apparatus 4 performs the processing in steps S302 to S319 for each macroblock.

First, in step S302, the depth map decoding unit 402 inputs a depth map code from the outside of the image decoding device 4. The depth map decoding unit 402 decodes the input depth map code to generate a depth macroblock. The depth map decoding unit 402 outputs the generated depth macroblock to the outside of the image decoding device 4. Thereafter, the process proceeds to step S303.

In step S303, the depth map decoding unit 402 stores the partition size of the decoded depth macroblock in the partition size memory 403. Thereafter, the process proceeds to step S304.

In step S304, the information source decoding unit 401 inputs an encoded stream for each macroblock from the outside of the image decoding device 4. Thereafter, the process proceeds to step S305.
In step S305, the information source decoding unit 401 decodes the input encoded stream, and generates an encoding mode signal, a quantized DCT coefficient, and a difference vector. The information source decoding unit 401 outputs the generated encoding mode signal to the encoding mode determination unit 413. Thereafter, the process proceeds to step S306.

In step S306, the information source decoding unit 401 determines whether or not the generated encoding mode signal indicates a skipped macroblock mode. If the information source decoding unit 401 determines that the generated encoding mode signal indicates the skipped macroblock mode (YES in step S306), the information source decoding unit 401 proceeds to step S307. If the information source decoding unit 401 determines that the generated encoding mode signal does not indicate the skipped macroblock mode (NO in step S306), the information source decoding unit 401 proceeds to step S308.

In step S307, the information source decoding unit 401 outputs a quantized DCT coefficient indicating that the value is zero to the inverse quantization / inverse DCT transform unit 407. The information source decoding unit 401 outputs a difference vector indicating that the value is zero to the adding unit 404. Thereafter, the process proceeds to step S308.

In step S308, the information source decoding unit 401 determines whether or not the generated encoding mode signal indicates the inter prediction mode. If the information source decoding unit 401 determines that the generated encoding mode signal indicates the inter prediction mode (YES in step S308), the information source decoding unit 401 proceeds to step S310. If the information source decoding unit 401 determines that the generated encoding mode signal does not indicate the inter prediction mode (NO in step S308), the information source decoding unit 401 proceeds to step S309.

In step S309, the information source decoding unit 401 outputs a difference vector indicating that the value is zero to the adding unit 404. Thereafter, the process proceeds to step S310.
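As a non-limiting illustration, the branching of steps S306 to S309 can be sketched as follows. The mode labels "skip", "inter", and "intra", the function name normalize_symbols, and the 16 x 16 block size are assumptions made for this sketch.

```python
def normalize_symbols(mode, quantized_coeffs, difference_vector, block_size=16):
    """Return the coefficients and difference vector passed on for decoding.

    Skipped macroblock mode: both the quantized DCT coefficients and the
    difference vector are treated as zero (steps S307 and S309).
    Modes other than inter prediction: the difference vector is zero (S309).
    Inter prediction mode: both decoded values are used unchanged.
    """
    if mode == "skip":
        zero_coeffs = [[0] * block_size for _ in range(block_size)]
        return zero_coeffs, (0, 0)
    if mode != "inter":
        return quantized_coeffs, (0, 0)
    return quantized_coeffs, difference_vector
```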
In step S310, the inverse quantization / inverse DCT transform unit 407 performs inverse quantization on the quantized DCT coefficient input from the information source decoding unit 401 to generate an inverse-quantized DCT coefficient, and then performs a two-dimensional inverse DCT on the inverse-quantized DCT coefficient to generate a decoded residual signal block. The inverse quantization / inverse DCT transform unit 407 outputs the decoded residual signal block to the addition unit 408. Thereafter, the process proceeds to step S311.
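As a non-limiting illustration, step S310 can be sketched with a uniform quantization step and an orthonormal floating-point inverse DCT. Both are assumptions made for this sketch; the embodiment does not specify the quantizer, and H.264 itself uses an integer transform. The names dequantize and idct_2d are hypothetical.

```python
import math


def dequantize(levels, qstep):
    """Scale quantized levels back to DCT coefficients (uniform step assumed)."""
    return [[value * qstep for value in row] for row in levels]


def idct_2d(coeffs):
    """Orthonormal two-dimensional inverse DCT of a square coefficient block."""
    n = len(coeffs)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            total = 0.0
            for v in range(n):
                for u in range(n):
                    total += (scale(u) * scale(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[y][x] = total
    return out
```

With all-zero coefficients, as in the skipped macroblock mode, the decoded residual signal block is zero everywhere, so the reconstructed macroblock equals the predicted macroblock.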

In step S311, the prediction vector generation unit 406 reads from the motion vector memory 405 the motion vector of a reference block that is adjacent to the decoding target macroblock and has already been decoded. The prediction vector generation unit 406 calculates a prediction vector based on the read motion vector. The prediction vector generation unit 406 outputs the calculated prediction vector to the addition unit 404. Thereafter, the process proceeds to step S312.

In step S312, the addition unit 404 adds the difference vector input from the information source decoding unit 401 and the prediction vector input from the prediction vector generation unit 406 to generate a motion vector. The adder 404 stores the generated motion vector in the motion vector memory 405. Thereafter, the process proceeds to step S313.
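As a non-limiting illustration, steps S311 and S312 can be sketched as follows. The component-wise median of the left, top, and upper-right neighbors follows the usual H.264 convention and is an assumption here, since the steps above only state that the prediction vector is calculated from the read motion vectors. The function names are hypothetical.

```python
def median_prediction_vector(mv_left, mv_top, mv_top_right):
    """Component-wise median of three neighboring motion vectors (assumed rule)."""
    xs = sorted(v[0] for v in (mv_left, mv_top, mv_top_right))
    ys = sorted(v[1] for v in (mv_left, mv_top, mv_top_right))
    return (xs[1], ys[1])


def reconstruct_motion_vector(prediction, difference):
    """Add the decoded difference vector to the prediction vector."""
    return (prediction[0] + difference[0], prediction[1] + difference[1])
```

When the difference vector is zero, as in the skipped macroblock mode, the reconstructed motion vector is the prediction vector itself.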

In step S313, the inter prediction image generating unit 411 reads out motion vectors corresponding to all partitions in the target macroblock from the motion vector memory 405. The inter prediction image generation unit 411 generates an inter prediction image macroblock by performing motion compensation based on the motion vector read from the motion vector memory 405. The inter prediction image generation unit 411 outputs the generated inter prediction image macroblock to the encoding mode determination unit 413. Thereafter, the process proceeds to step S314.

In step S314, the intra predicted image generation unit 410 reads, from the reference image memory 409, a reference image block that has been subjected to processing related to decoding on the left and right of the target macroblock in the same frame as the target macroblock. The intra predicted image generation unit 410 performs intra-frame prediction based on the read reference image macroblock, and generates an intra predicted image macroblock. The intra predicted image generation unit 410 outputs the generated intra predicted image macroblock to the encoding mode determination unit 413. Thereafter, the process proceeds to step S315.

In step S315, the skipped macroblock unit 412 reads the motion vector from the motion vector memory 405, the partition size of the depth macroblock having the same display time and the same coordinates as the target block from the partition size memory 403, and the reference image block from the reference image memory 409. The skipped macroblock unit 412 performs the same process as the skipped macroblock unit 107 to generate a skipped macroblock predicted image macroblock. The skipped macroblock unit 412 outputs the skipped macroblock predicted image macroblock to the encoding mode determination unit 413. Thereafter, the process proceeds to step S316.

In step S316, the encoding mode determination unit 413 receives the encoding mode signal from the information source decoding unit 401, the intra prediction image macroblock from the intra prediction image generation unit 410, the inter prediction image macroblock from the inter prediction image generation unit 411, and the skipped macroblock predicted image macroblock from the skipped macroblock unit 412. Based on the encoding mode indicated by the encoding mode signal, the encoding mode determination unit 413 outputs one of the intra prediction image macroblock, the inter prediction image macroblock, and the skipped macroblock prediction image macroblock to the addition unit 408. Thereafter, the process proceeds to step S317.

In step S317, the addition unit 408 generates a reference image macroblock by adding the signal value of each pixel included in the intra prediction image macroblock, the inter prediction image macroblock, or the skipped macroblock prediction image macroblock input from the encoding mode determination unit 413 to the signal value of the corresponding pixel included in the decoded residual signal macroblock input from the inverse quantization / inverse DCT transform unit 407. Thereafter, the process proceeds to step S318.
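As a non-limiting illustration, the selection in step S316 and the addition in step S317 can be sketched together. The mode labels, the function name reconstruct_macroblock, and the clipping to an 8-bit range are assumptions made for this sketch.

```python
def reconstruct_macroblock(mode, predictions, residual, max_value=255):
    """Select the predicted macroblock for `mode` and add the residual.

    `predictions` maps a mode label ("intra", "inter", "skip") to a 2-D list
    of predicted pixel values; `residual` is the decoded residual block.
    """
    prediction = predictions[mode]
    reconstructed = []
    for pred_row, res_row in zip(prediction, residual):
        # Assumption: results are clipped to the valid 8-bit sample range.
        reconstructed.append(
            [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        )
    return reconstructed
```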

In step S318, the addition unit 408 stores the generated reference image macroblock in the reference image memory 409 and outputs the reference image macroblock to the image output unit 414. Thereafter, the process proceeds to step S319.

In step S319, the image decoding device 4 changes the decoding target macroblock in the order of raster scan, and returns to step S302. However, if the image decoding device 4 determines that there is no macroblock that has not been decoded in the frame of the input encoded stream, the image decoding device 4 ends the processing for that frame.

As described above, in the present embodiment, a motion vector indicating the amount of movement from a past texture image is predicted for each macroblock, which is a partial region constituting the texture image comprising a luminance value for each pixel. When the texture image is encoded in the skipped macroblock mode, a prediction vector is generated based on the motion vectors of the partitions adjacent to the one macroblock and on the size of the partitions set when encoding the depth macroblock, which is the region corresponding to the one macroblock in the depth map comprising a depth value indicating the distance from the viewpoint for each pixel.

Thus, in the present embodiment, when the target macroblock is encoded in the skipped macroblock mode, the target macroblock is divided based on the size of the partition of the depth map of the corresponding frame, and a prediction vector is generated for each partition. Therefore, since a prediction vector is set for each region in the macroblock, it is possible to improve coding efficiency.
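As a non-limiting illustration, dividing the target macroblock according to the partition size of the depth macroblock can be sketched as follows. The function name split_macroblock, the 16 x 16 macroblock size, and the restriction to partition sizes that tile the macroblock evenly are assumptions made for this sketch.

```python
def split_macroblock(partition_width, partition_height, mb_size=16):
    """Return (x, y, width, height) rectangles tiling the macroblock.

    The partition size is the one stored for the co-located depth macroblock.
    """
    if mb_size % partition_width or mb_size % partition_height:
        raise ValueError("partition size must tile the macroblock evenly")
    return [
        (x, y, partition_width, partition_height)
        for y in range(0, mb_size, partition_height)
        for x in range(0, mb_size, partition_width)
    ]
```

A prediction vector would then be generated for each returned rectangle instead of once for the whole macroblock.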

(Second Embodiment)
FIG. 11 is a schematic diagram illustrating a configuration example of an image encoding device according to the second embodiment of the present invention.

The image encoding device 5 illustrated in FIG. 11 includes a texture image input unit 101, a depth map encoding unit 502, a depth map memory 503, a reference image memory 104, an intra prediction unit 105, an inter prediction unit 106, a skipped macroblock unit 507, an encoding mode determination unit 108, a motion vector memory 109, a prediction vector generation unit 110, a subtraction unit 111, a subtraction unit 112, a DCT transformation / quantization unit 113, an inverse quantization / inverse DCT transformation unit 114, an addition unit 115, and an information source encoding unit 116. The image encoding device 5 differs from the image encoding device 1 shown in FIG. 2 in that the depth map encoding unit 102, the partition size memory 103, and the skipped macroblock unit 107 are replaced with the depth map encoding unit 502, the depth map memory 503, and the skipped macroblock unit 507, respectively; the other configurations are the same.

The depth map encoding unit 502 receives the depth map from the image preprocessing unit 2 for each frame and encodes the depth map for each macroblock constituting each frame using a known image encoding method, for example, an encoding method corresponding to the decoding method described in the ITU-T H.264 standard. The depth map encoding unit 502 outputs the depth map code generated by encoding to the outside of the image encoding device 5.

The depth map encoding unit 502 decodes the generated depth map code to generate an encoded depth map. The depth map encoding unit 502 stores the encoded depth map in the depth map memory 503.
The depth map memory 503 stores the encoded depth map input from the depth map encoding unit 502.

The skipped macroblock unit 507 reads a motion vector from the motion vector memory 109, reads the depth macroblock at the same coordinates as the target macroblock in the encoded depth map from the depth map memory 503, and reads the reference image block from the reference image memory 104. The skipped macroblock unit 507 then generates a skipped macroblock prediction image macroblock based on the read motion vector, the encoded depth map, and the reference image block. The skipped macroblock unit 507 outputs the generated skipped macroblock predicted image macroblock to the coding mode determination unit 108.

Next, a configuration example and functions of the skipped macroblock unit 507 according to the present embodiment will be described with reference to FIG. 12. FIG. 12 is a schematic diagram illustrating a configuration example of the skipped macroblock unit 507 according to the present embodiment.

The skipped macroblock unit 507 includes an edge image generation unit 5071, a skipped macroblock prediction vector generation unit 5072, and a prediction macroblock generation unit 5073.

The edge image generation unit 5071 performs edge detection processing for detecting an edge component on the target depth macroblock read from the depth map memory 503, and generates an edge image block indicating the region (pixels) in which the edge component is detected. The edge image generation unit 5071 outputs the generated edge image block to the skipped macroblock prediction vector generation unit 5072.

An edge is a region where the spatial change of the signal value is remarkable. The edge image block includes, for each pixel, a signal value indicating whether or not an edge is detected, for example, a signal value indicating whether or not the spatial change of the depth value is larger than a preset threshold value.

The edge image generation unit 5071 uses, for example, a 3-by-3 Sobel filter shown in Expression (1) in order to generate an edge image block. That is, the Sobel filter is a filter that gives a gradient of signal values between pixels adjacent in the horizontal direction and a gradient of signal values between pixels adjacent in the vertical direction. Of course, the filter is not limited to 3 rows and 3 columns.

Here, for each pixel included in the target depth macroblock, the edge image generation unit 5071 applies the Sobel filter to a sub-block of 3 pixels in the horizontal direction × 3 pixels in the vertical direction centered on that pixel (the center pixel), and calculates the gradient of the signal value for each pixel. However, if the center pixel is in the top row, bottom row, leftmost column, or rightmost column of the target depth macroblock, the signal values of the top row, bottom row, leftmost column, or rightmost column of the sub-block, respectively, are set equal to the signal value of the center pixel.

The edge image generation unit 5071 assigns a signal value indicating an edge (for example, 1) to each pixel for which the absolute value of the calculated gradient is larger than a preset threshold value. The edge image generation unit 5071 assigns a signal value indicating a non-edge (for example, 0) to each pixel for which the absolute value of the calculated gradient is equal to or smaller than the preset threshold value.

The edge image generation unit 5071 outputs a signal value indicating whether or not each pixel is an edge to the skipped macroblock prediction vector generation unit 5072 as an edge image block.
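As a non-limiting illustration, the edge image block generation described above can be sketched as follows. The kernels are the standard 3 x 3 Sobel kernels, which are assumed to correspond to Expression (1). Combining the horizontal and vertical gradients as the sum of their absolute values, and the name edge_image_block, are also assumptions made for this sketch.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))    # horizontal gradient
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))    # vertical gradient


def edge_image_block(depth_block, threshold):
    """Return a block of 1 (edge) / 0 (non-edge) flags for a depth macroblock."""
    rows, cols = len(depth_block), len(depth_block[0])
    edges = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            center = depth_block[y][x]
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    ny, nx = y + j - 1, x + i - 1
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    # Sub-block samples outside the macroblock take the
                    # value of the center pixel, as described above.
                    value = depth_block[ny][nx] if inside else center
                    gx += SOBEL_X[j][i] * value
                    gy += SOBEL_Y[j][i] * value
            # Assumption: gradient magnitude approximated by |gx| + |gy|.
            if abs(gx) + abs(gy) > threshold:
                edges[y][x] = 1
    return edges
```

For a depth block whose left half has depth value 10 and whose right half has depth value 50, the two columns on either side of the boundary are flagged as edge pixels when the threshold is 100.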

The skipped macroblock prediction vector generation unit 5072 receives the edge image block from the edge image generation unit 5071 and, from the motion vector memory 109, the motion vectors of the partitions adjacent to the left, top, and top right of the target macroblock or the motion vectors of the adjacent macroblocks. The skipped macroblock prediction vector generation unit 5072 divides the target macroblock into a plurality of partitions along the edge of the edge image block.

For each partition, the skipped macroblock prediction vector generation unit 5072 sets, as the prediction vector, one of the following, chosen as the candidate having the largest area in contact with the target partition: the motion vector of a partition adjacent to the target macroblock (since processing is performed for each partition, this can also be called a prediction vector), the prediction vector of a partition in the target macroblock for which a prediction vector has already been set, or the motion vector of an adjacent macroblock. The skipped macroblock prediction vector generation unit 5072 outputs the generated prediction vector of each partition to the prediction macroblock generation unit 5073.
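As a non-limiting illustration, selecting the candidate with the largest contact can be sketched as follows. The sketch assumes that the partitions are given as a label map, that contact is measured as the number of touching pixel sides, and that only the left and top adjacent macroblocks and already-processed partitions are candidates; the upper-right neighbor is omitted for brevity. The name choose_prediction_vector is hypothetical.

```python
def choose_prediction_vector(labels, target, left_mv, top_mv, assigned):
    """Pick the vector of the neighbor sharing the longest border with `target`.

    `labels` is a 2-D list of partition labels for the target macroblock and
    `assigned` maps labels of already-processed partitions to their vectors.
    """
    rows, cols = len(labels), len(labels[0])
    vectors = {"left": left_mv, "top": top_mv}
    vectors.update(assigned)
    contact = {}  # candidate -> number of touching pixel sides
    for y in range(rows):
        for x in range(cols):
            if labels[y][x] != target:
                continue
            if x == 0:
                contact["left"] = contact.get("left", 0) + 1
            if y == 0:
                contact["top"] = contact.get("top", 0) + 1
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < rows and 0 <= nx < cols:
                    other = labels[ny][nx]
                    if other != target and other in assigned:
                        contact[other] = contact.get(other, 0) + 1
    if not contact:
        return (0, 0)  # assumption: zero vector when nothing touches
    best = max(contact, key=contact.get)
    return vectors[best]
```

Ties are resolved here by the order in which candidates are first encountered, which is an arbitrary choice of this sketch.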

The prediction macroblock generation unit 5073 reads the edge image block from the edge image generation unit 5071 and receives the prediction vector corresponding to each partition in the target macroblock from the skipped macroblock prediction vector generation unit 5072. The prediction macroblock generation unit 5073 performs motion compensation according to the size of each partition, as determined from the input edge image block, and the prediction vector of each partition, and generates a skipped macroblock prediction image macroblock. The prediction macroblock generation unit 5073 outputs the generated skipped macroblock prediction image macroblock to the encoding mode determination unit 108.
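As a non-limiting illustration, motion compensation with one prediction vector per partition can be sketched as follows. The sketch assumes the partitions are given as a label map, integer-pel vectors, and clamping at the frame border; the name predict_skipped_macroblock is hypothetical.

```python
def predict_skipped_macroblock(reference, mb_x, mb_y, labels, vectors):
    """Build the predicted macroblock pixel by pixel.

    Each pixel is copied from `reference` at its own position displaced by
    the prediction vector of the partition that the pixel belongs to.
    """
    rows, cols = len(reference), len(reference[0])
    predicted = []
    for y, label_row in enumerate(labels):
        out_row = []
        for x, label in enumerate(label_row):
            mv_x, mv_y = vectors[label]
            # Assumption: coordinates outside the frame are clamped.
            src_y = min(max(mb_y + y + mv_y, 0), rows - 1)
            src_x = min(max(mb_x + x + mv_x, 0), cols - 1)
            out_row.append(reference[src_y][src_x])
        predicted.append(out_row)
    return predicted
```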

Next, an image encoding process performed by the image encoding device 5 according to the present embodiment will be described.
FIG. 13 is a flowchart for explaining an example of an image encoding process performed by the image encoding device 5 according to the present embodiment. The flowchart in FIG. 13 differs from the flowchart in FIG. 5 in that step S103 is replaced with step S403 described later and step S108 is replaced with step S408 described later; the other processes are the same. Also, I and II in the flowchart of FIG. 13 continue to I and II in the flowchart of FIG. 6, as in FIG. 5. Since the processes other than step S403 and step S408 are the same as those in the flowcharts of FIGS. 5 and 6, the description thereof is omitted.

In step S403, the depth map encoding unit 502 decodes the generated depth map code to generate an encoded depth map. The depth map encoding unit 502 stores the encoded depth map in the depth map memory 503. Thereafter, the process proceeds to step S104.

In step S408, the skipped macroblock unit 507 reads the motion vector from the motion vector memory 109, the depth macroblock having the same display time and the same coordinates as the target macroblock from the depth map memory 503, and the reference image block from the reference image memory 104. The skipped macroblock unit 507 calculates a prediction vector for each partition based on the read depth macroblock and the motion vector. The skipped macroblock unit 507 performs motion compensation based on the calculated prediction vector, and generates a skipped macroblock prediction image macroblock. The skipped macroblock unit 507 outputs the skipped macroblock predicted image macroblock to the coding mode determination unit 108. Thereafter, the process proceeds to step S109.

Next, the skipped macroblock prediction image macroblock generation processing in the skipped macroblock unit 507 according to the present embodiment will be described with reference to FIG. 14. FIG. 14 is a flowchart for explaining an example of a skipped macroblock predicted image macroblock generation process in the skipped macroblock unit 507 according to the present embodiment. This process corresponds to step S408 described above.

First, in step S501, the edge image generation unit 5071 reads out the target depth macroblock having the same coordinates as the target macroblock from the depth map memory 503. Thereafter, the process proceeds to step S502.

In step S502, the edge image generation unit 5071 performs edge detection processing on the target depth macroblock read from the depth map memory 503, and generates an edge image block indicating a region (pixel) where the edge is detected. Thereafter, the process proceeds to step S503.

In step S503, the skipped macroblock unit 507 divides the target macroblock into a plurality of partitions based on the edge image block, and performs the processing in steps S504 to S506 for each partition.

In step S504, the skipped macroblock prediction vector generation unit 5072 reads the motion vectors of the partitions adjacent to the left, top, and top right of the target macroblock, or the motion vectors of the adjacent macroblocks, from the motion vector memory 109. The skipped macroblock prediction vector generation unit 5072 sets, as the prediction vector, one of the following, chosen as the candidate having the largest area in contact with the target partition: the motion vector of a partition adjacent to the target macroblock (since processing is performed for each partition, this can also be called a prediction vector), the prediction vector of a partition in the target macroblock for which a prediction vector has already been set, or the motion vector of an adjacent macroblock. Thereafter, the process proceeds to step S505.

In step S505, the prediction macroblock generation unit 5073 reads the edge image block from the edge image generation unit 5071, and inputs the prediction vector of the target partition from the skipped macroblock prediction vector generation unit 5072. The prediction macroblock generation unit 5073 performs motion compensation based on the size of the target partition calculated based on the input edge image block and the prediction vector of the target partition, and generates a skipped macroblock prediction image macroblock. Thereafter, the process proceeds to step S506.

In step S506, the skipped macroblock unit 507 sequentially changes the partition to be encoded, and returns to step S504. However, if the skipped macroblock unit 507 determines that there is no partition in the target macroblock that has not yet been processed for generation of the skipped macroblock prediction image macroblock, the skipped macroblock unit 507 ends the processing for the target macroblock.

As described above, the image encoding device 5 according to the present embodiment detects an edge component from a depth map block, and divides the one macroblock into a plurality of partitions based on the detected edge component. In this image encoding device 5, a prediction vector is calculated for each partition based on the motion vector of the adjacent macroblock adjacent to the one macroblock and the prediction vector already calculated for the adjacent partition adjacent to the partition. Here, as described above, it is preferable that, for each partition, among the adjacent partitions and adjacent macroblocks adjacent to the left, top, and upper right of the partition, the prediction vector of the adjacent partition or the motion vector of the adjacent macroblock having the largest area in contact with the partition be used as the prediction vector of the partition.

Next, the image decoding apparatus according to this embodiment will be described. The image decoding apparatus according to the present embodiment includes image decoding means for decoding a macroblock, which is a partial region constituting an image composed of luminance values for each pixel, that has been encoded in units of partitions in the skipped macroblock mode by the image encoding method of the image encoding apparatus 5 described above. Hereinafter, a configuration example of such an image decoding device will be described with reference to FIG. 15. FIG. 15 is a schematic diagram illustrating a configuration example of an image decoding device according to the present embodiment.

The image decoding apparatus 6 illustrated in FIG. 15 includes an information source decoding unit 401, a depth map decoding unit 602, a depth map memory 603, an addition unit 404, a motion vector memory 405, a prediction vector generation unit 406, an inverse quantization / inverse DCT transform unit 407, an addition unit 408, a reference image memory 409, an intra prediction image generation unit 410, an inter prediction image generation unit 411, a skipped macroblock unit 612, and an image output unit 414.

The image decoding device 6 differs from the image decoding device 4 shown in FIG. 8 in that the depth map decoding unit 402, the partition size memory 403, and the skipped macroblock unit 412 are replaced with the depth map decoding unit 602, the depth map memory 603, and the skipped macroblock unit 612, respectively; the other configurations are the same. Further, the depth map memory 603 and the skipped macroblock unit 612 of the image decoding device 6 are the same as the depth map memory 503 and the skipped macroblock unit 507 of the image encoding device 5 shown in FIG. 11, respectively.

The depth map decoding unit 602 receives a depth map code for each block from the outside of the image decoding device 6 and decodes it using a known image decoding method, for example, the decoding method described in the ITU-T H.264 standard. The depth map decoding unit 602 stores the depth map generated by decoding in the depth map memory 603 and outputs the depth map to the outside of the image decoding device 6.

Also, the skipped macroblock unit 612 has the same configuration as the skipped macroblock unit 507 shown in FIG. 12, and performs the same processing as the skipped macroblock predicted image macroblock generation processing shown in FIG.

Next, image decoding processing performed by the image decoding device 6 according to the present embodiment will be described with reference to FIG. 16. FIG. 16 is a flowchart for explaining an example of image decoding processing performed by the image decoding device 6 according to the present embodiment. The flowchart in FIG. 16 differs from the flowchart in FIG. 9 in that step S303 is replaced with step S603; the other processes are the same. Also, III in the flowchart of FIG. 16 continues to III in the flowchart of FIG. 10, as in FIG. 9; only the processing content of step S315 differs, and the other processes are the same. Hereinafter, step S315 performed after the processing in FIG. 16 is referred to as “step S615” in order to distinguish it from the processing following FIG. 9.

Note that step S603 in FIG. 16 is the same as step S403 in the flowchart in FIG. 13, and among the processing contents in FIG. 10 following FIG. 16, the processing contents in step S615 are also the same as step S408 in the flowchart in FIG. Therefore, the description thereof is omitted. Furthermore, since the processes other than step S603 and step S615 are the same as the flowcharts of FIGS. 9 and 10, the description thereof is omitted.

Also, when the encoding target block or decoding target block is the rightmost block of the frame, there is no block on the upper right. Therefore, the motion vector prediction unit according to the present embodiment may read the motion vector of the upper left adjacent block from the motion vector memory.
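As a non-limiting illustration, the fallback from the upper-right to the upper-left adjacent block can be sketched as follows. The layout of the motion vector memory as a 2-D list indexed by block row and block column, and the name third_neighbour, are assumptions made for this sketch.

```python
def third_neighbour(mv_field, bx, by):
    """Return the upper-right motion vector, or the upper-left one as fallback.

    `mv_field[by][bx]` holds the motion vector of block (bx, by); None is
    returned when there is no row above the target block.
    """
    if by == 0:
        return None
    blocks_per_row = len(mv_field[0])
    if bx + 1 < blocks_per_row:
        return mv_field[by - 1][bx + 1]   # upper right exists
    if bx > 0:
        return mv_field[by - 1][bx - 1]   # rightmost block: use upper left
    return None
```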

As described above, in the present embodiment, an edge component is detected from a block of a depth map, and one macro block (target texture image macro block) is divided into a plurality of partitions based on the detected edge component. A prediction vector is generated.

That is, the present embodiment and the first embodiment have in common that, when encoding in the skipped macroblock mode for each macroblock that is a partial area constituting an image composed of luminance values for each pixel, one macroblock is divided into a plurality of partitions according to the partitions of the depth macroblock, which is the area corresponding to the one macroblock in the depth map comprising a depth value indicating the distance from the viewpoint for each pixel, and a prediction vector is generated for each partition. They differ in that the first embodiment divides the one macroblock into a plurality of partitions based on the size of the partitions used when the depth macroblock was encoded, whereas the present embodiment detects an edge component from a block of the depth map and divides the one macroblock into a plurality of partitions based on the detected edge component.

As a result, in this embodiment, as in the first embodiment, a prediction vector can be set for each region in the target texture image macroblock when encoding in the skipped macroblock mode. This improves the accuracy of the predicted macroblock generated by motion compensation, and hence the encoding efficiency.

(Other)
In all the embodiments, encoding and decoding of a single-viewpoint image signal captured by one camera have been described. However, the present invention may also be applied to a multi-viewpoint image signal captured by a plurality of cameras. For example, a three-viewpoint image signal is input, and the image signal of one viewpoint is encoded in the skipped macroblock mode using a depth map obtained from the parallax with the image signal of another viewpoint. By switching the viewpoint in this way, the image signals of the three viewpoints can be encoded.

In addition, a part of the image encoding device 1, the image decoding device 4, the image encoding device 5, and the image decoding device 6 in the above-described embodiment, for example, the intra prediction unit 105, the inter prediction unit 106, the skipped macroblock unit 107, the encoding mode determination unit 108, the prediction vector generation unit 110, the subtraction unit 111, the subtraction unit 112, the DCT transformation / quantization unit 113, the inverse quantization / inverse DCT transformation unit 114, the addition unit 115, the information source coding unit 116, the information source decoding unit 401, the addition unit 404, the prediction vector generation unit 406, the inverse quantization / inverse DCT conversion unit 407, the addition unit 408, the intra prediction image generation unit 410, the inter prediction image generation unit 411, the skipped macroblock unit 412, the coding mode determination unit 413, the skipped macroblock unit 507, and the skipped macroblock unit 612, may be realized by a computer.

In that case, a program (an image encoding program and / or an image decoding program) for realizing this control function may be recorded on a computer-readable recording medium, and the function may be realized by causing a computer system to read and execute the program recorded on the recording medium. Here, the “computer system” is a computer system built in the image encoding device 1, the image decoding device 4, the image encoding device 5, or the image decoding device 6, and includes an OS (Operating System) and hardware such as peripheral devices. The “computer-readable recording medium” refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk incorporated in a computer system. Furthermore, the “computer-readable recording medium” may include a medium that dynamically holds a program for a short time, such as a communication line used when transmitting a program via a network such as the Internet or a communication line such as a telephone line, and a medium that holds a program for a certain period of time, such as a volatile memory inside a computer system that serves as a server or a client in that case. The program may realize a part of the functions described above, or may realize the functions described above in combination with a program already recorded in the computer system. Further, this program is not limited to being distributed via a portable recording medium or a network, and can also be distributed via a broadcast wave.

The above-described image encoding program causes a computer to execute an image encoding process for encoding an image in a skipped macroblock mode for each macroblock, which is a partial area constituting the image composed of luminance values for each pixel. This image encoding process includes an image encoding step of dividing one macroblock into a plurality of partitions according to the partitions of a depth macroblock, which is the area corresponding to the one macroblock in a depth map comprising a depth value indicating the distance from a viewpoint for each pixel, generating a prediction vector for each partition, and encoding in the skipped macroblock mode. Other application examples are as described for the image encoding device.

In addition, the above-described image decoding program is a program for causing a computer to execute an image decoding step of decoding a macroblock, which is a partial area constituting an image composed of luminance values for each pixel, that has been encoded in units of partitions in the skipped macroblock mode by the image encoding method described later (or by the above-described image encoding device or image encoding program). Other application examples are as described for the image decoding device.

In addition, a part or all of the image encoding device 1, the image decoding device 4, the image encoding device 5, and the image decoding device 6 in the above-described embodiments may be realized as an integrated circuit such as an LSI (Large Scale Integration) or as an IC (Integrated Circuit) chip set. Each functional block of the image encoding device 1, the image decoding device 4, the image encoding device 5, and the image decoding device 6 may be individually made into a processor, or a part or all of them may be integrated into a processor. Further, the method of circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor. In addition, if an integrated circuit technology that replaces LSI emerges as semiconductor technology advances, an integrated circuit based on that technology may be used.

In addition, as exemplified by the processing flows of the image encoding device and the image decoding device, and as described for the processing of the image encoding program and the image decoding program, the present invention can also take the form of an image encoding method and an image decoding method.

In this image encoding method, an image encoding device encodes an image composed of luminance values for each pixel in a skipped macroblock mode, for each macroblock that is a partial area constituting the image. The method includes an image encoding step in which the image encoding device divides one macroblock into a plurality of partitions according to the partitions of a depth macroblock, which is the area corresponding to the one macroblock in a depth map composed of depth values each indicating the distance from the viewpoint for a pixel, generates a prediction vector for each partition, and encodes in the skipped macroblock mode. Other application examples are as described for the image encoding device.
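As an illustration of generating a prediction vector for each partition, the following Python sketch applies the median rule recited in the claims to three neighbouring vectors. The component-wise median (horizontal and vertical components taken separately) is an assumption borrowed from H.264 median prediction; the text itself says only "median". The name `median_prediction_vector` and the integer `(dx, dy)` vector representation are hypothetical.

```python
from typing import Tuple

Vector = Tuple[int, int]  # (horizontal, vertical) displacement


def median_prediction_vector(left: Vector, top: Vector, right: Vector) -> Vector:
    """Return the component-wise median of three neighbouring vectors.

    Each argument is either the motion vector of an adjacent macroblock or
    the prediction vector already calculated for an adjacent partition.
    """
    neighbours = (left, top, right)
    xs = sorted(v[0] for v in neighbours)
    ys = sorted(v[1] for v in neighbours)
    # With three values, the median is the middle element after sorting.
    return (xs[1], ys[1])
```

Because the result depends only on vectors that are already available for neighbouring areas, a decoder can repeat the same calculation without any vector being transmitted for the skipped macroblock.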

In addition, the above-described image decoding method includes an image decoding step in which an image decoding device decodes a macroblock, which is a partial area constituting an image composed of luminance values for each pixel, that has been encoded in units of partitions in the skipped macroblock mode by the above-described image encoding method. Other application examples are as described for the image decoding device.
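Decoding one skipped partition amounts to motion compensation with that partition's prediction vector. The Python sketch below is a simplified illustration under stated assumptions: integer-pixel vectors only, images held as lists of rows of luminance values, and source coordinates clamped at the image border. The function name `decode_skipped_partition` and its parameters are hypothetical, and sub-pixel interpolation as used in real codecs is omitted.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of luminance values


def decode_skipped_partition(ref: Image, out: Image, mb_x: int, mb_y: int,
                             rect: Tuple[int, int, int, int],
                             pv: Tuple[int, int]) -> None:
    """Fill one partition of `out` by copying from the reference image `ref`,
    displaced by the prediction vector `pv` = (dx, dy).

    (mb_x, mb_y) is the top-left pixel of the macroblock in the image and
    `rect` = (x, y, width, height) is the partition inside that macroblock.
    """
    x0, y0, w, h = rect
    dx, dy = pv
    height, width = len(ref), len(ref[0])
    for y in range(mb_y + y0, mb_y + y0 + h):
        for x in range(mb_x + x0, mb_x + x0 + w):
            # Clamp so that vectors pointing outside the image stay valid
            # (an assumption of this sketch, not stated in the text).
            sy = min(max(y + dy, 0), height - 1)
            sx = min(max(x + dx, 0), width - 1)
            out[y][x] = ref[sy][sx]
```

Calling this once per partition, each time with the prediction vector derived for that partition, reconstructs the whole skipped macroblock.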

DESCRIPTION OF SYMBOLS 1, 5 ... Image encoding device, 2 ... Image pre-processing unit, 3a, 3b ... Imaging device, 4, 6 ... Image decoding device, 101 ... Texture image input unit, 102 ... Depth map encoding unit, 103 ... Partition size memory, 104 ... Reference image memory, 105 ... Intra prediction unit, 106 ... Inter prediction unit, 107 ... Skipped macroblock unit, 108 ... Encoding mode determination unit, 109 ... Motion vector memory, 110 ... Prediction vector generation unit, 111, 112 ... Subtraction unit, 113 ... DCT transform / quantization unit, 114 ... Inverse quantization / inverse DCT transform unit, 115 ... Addition unit, 116 ... Information source encoding unit, 401 ... Information source decoding unit, 402 ... Depth map decoding unit, 403 ... Partition size memory, 404 ... Addition unit, 405 ... Motion vector memory, 406 ... Prediction vector generation unit, 407 ... Inverse quantization / inverse DCT transform unit, 408 ... Addition unit, 409 ... Reference image memory, 410 ... Intra prediction image generation unit, 411 ... Inter prediction image generation unit, 412 ... Skipped macroblock unit, 413 ... Encoding mode determination unit, 414 ... Image output unit, 502 ... Depth map encoding unit, 503 ... Depth map memory, 507 ... Skipped macroblock unit, 602 ... Depth map decoding unit, 603 ... Depth map memory, 612 ... Skipped macroblock unit, 1071 ... Skipped macroblock prediction vector generation unit, 1072 ... Prediction macroblock generation unit, 5071 ... Edge image generation unit, 5072 ... Skipped macroblock prediction vector generation unit, 5073 ...

Claims

An image encoding method in which an image encoding device encodes an image composed of luminance values for each pixel in a skipped macroblock mode, for each macroblock that is a partial area constituting the image,
the method comprising an image encoding step in which the image encoding device divides one macroblock into a plurality of partitions according to the partitions of a depth macroblock, which is the area corresponding to the one macroblock in a depth map composed of depth values each indicating the distance from a viewpoint for a pixel, generates a prediction vector for each partition, and encodes in the skipped macroblock mode.
The image encoding method according to claim 1, wherein the image encoding step includes:
a step of dividing the one macroblock into a plurality of partitions based on the partition size used when the depth macroblock was encoded; and
a calculation step of calculating, for each partition, a prediction vector of the partition based on the motion vectors of adjacent macroblocks adjacent to the one macroblock and the prediction vectors already calculated for adjacent partitions adjacent to the partition.
The image encoding method according to claim 2, wherein the calculation step calculates, for each partition, the median of the prediction vectors of the adjacent partitions or the motion vectors of the adjacent macroblocks adjacent to the left, top, and right of the partition as the prediction vector of the partition.
The image encoding method according to claim 1, wherein the image encoding step includes:
a step of detecting an edge component from the block of the depth map;
a step of dividing the one macroblock into a plurality of partitions based on the detected edge component; and
a calculation step of calculating, for each partition, a prediction vector of the partition based on the motion vectors of adjacent macroblocks adjacent to the one macroblock and the prediction vectors already calculated for adjacent partitions adjacent to the partition.
The image encoding method according to claim 4, wherein the calculation step uses, for each partition, as the prediction vector of the partition, the prediction vector of the adjacent partition or the motion vector of the adjacent macroblock that has the largest area in contact with the partition, among the adjacent partitions and adjacent macroblocks adjacent to the left, top, and right of the partition.
An image encoding device that encodes an image composed of luminance values for each pixel in a skipped macroblock mode, for each macroblock that is a partial area constituting the image,
the image encoding device comprising image encoding means for dividing one macroblock into a plurality of partitions according to the partitions of a depth macroblock, which is the area corresponding to the one macroblock in a depth map composed of depth values each indicating the distance from a viewpoint for a pixel, generating a prediction vector for each partition, and encoding in the skipped macroblock mode.
An image encoding program for causing a computer to execute an image encoding process that encodes an image composed of luminance values for each pixel in a skipped macroblock mode, for each macroblock that is a partial area constituting the image,
wherein the image encoding process includes an image encoding step of dividing one macroblock into a plurality of partitions according to the partitions of a depth macroblock, which is the area corresponding to the one macroblock in a depth map composed of depth values each indicating the distance from a viewpoint for a pixel, generating a prediction vector for each partition, and encoding in the skipped macroblock mode.
An image decoding method comprising an image decoding step in which an image decoding device decodes a macroblock, which is a partial area constituting an image composed of luminance values for each pixel, that has been encoded in units of partitions in the skipped macroblock mode by the image encoding method according to any one of claims 1 to 5.
An image decoding device comprising image decoding means for decoding a macroblock, which is a partial area constituting an image composed of luminance values for each pixel, that has been encoded in units of partitions in the skipped macroblock mode by the image encoding method according to any one of claims 1 to 5.
An image decoding program for causing a computer to execute an image decoding step of decoding a macroblock, which is a partial area constituting an image composed of luminance values for each pixel, that has been encoded in units of partitions in the skipped macroblock mode by the image encoding method according to any one of claims 1 to 5.