WO2012161318A1 - Image encoding device, image decoding device, image encoding method, image decoding method and program - Google Patents

Image encoding device, image decoding device, image encoding method, image decoding method and program Download PDF

Info

Publication number
WO2012161318A1
Authority
WO
WIPO (PCT)
Prior art keywords
prediction
unit
image
pixel
processing target
Prior art date
Application number
PCT/JP2012/063503
Other languages
French (fr)
Japanese (ja)
Inventor
Makoto Otsu (大津 誠)
Original Assignee
Sharp Corporation (シャープ株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sharp Corporation (シャープ株式会社)
Publication of WO2012161318A1 publication Critical patent/WO2012161318A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103 Selection of coding mode or of prediction mode
    • H04N 19/11 Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 Processing image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • the present invention relates to an image encoding device, an image decoding device, an image encoding method, an image decoding method, and a program.
  • In the H.264 scheme (Non-Patent Document 1), an image is divided into a plurality of blocks (macroblocks), and encoding proceeds by sequentially selecting, for each macroblock, a prediction method with high coding efficiency.
  • One such method is an intra-picture predictive coding method (intra prediction encoding method), which predicts and encodes the encoding target block using pixel information of blocks already encoded within the same picture.
  • Another is an inter-picture predictive coding method (inter prediction encoding method), which predicts and encodes the encoding target block with reference to a picture different from the picture being processed.
  • In the intra prediction encoding method, prediction is performed in units of macroblocks (16 pixels × 16 pixels), or in units of 4×4 or 8×8 pixel sub-blocks obtained by further dividing a macroblock (the 8×8 pixel unit was introduced in H.264 FRExt).
  • An optimal prediction method is selected based on the amount of code required for encoding, that is, the information identifying the prediction mode plus the code amount produced when the difference image (residual component) between the prediction image generated according to the prescribed prediction mode and the original image to be encoded is encoded (Non-Patent Document 1).
  • Four types of prediction modes (FIG. 6, described later) can be applied to a 16×16 pixel block: one prediction using the DC component (average value prediction) and three predictions using prediction angles (vertical prediction, horizontal prediction, and planar prediction).
  • Nine types of prediction modes (FIG. 4, described later) can be applied to a 4×4 or 8×8 pixel block: one prediction using the DC component (average value prediction) and eight predictions using prediction angles (non-uniform angles from 45° to 206.57°).
  • When the prediction mode (for example, an index value indicating the mode) is encoded, the mode of the processing target block is estimated from the prediction modes of the blocks above it and to its left. A 1-bit flag is prepared: if the estimate matches the actual prediction mode, the flag is set. If it does not match, the flag is not set, and 3 additional bits are added to identify the actual mode among the remaining eight prediction modes excluding the mismatched estimate. Thus, only 1 bit of information is required to encode the prediction mode when the estimate is correct, but 4 bits are required when it is not.
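  • As an illustration of this bit accounting, the following is a minimal Python sketch. It assumes the H.264 rule of taking the smaller of the two neighbouring modes as the estimate; the helper names are invented for illustration, not taken from the patent.

```python
# Hypothetical sketch of 1-bit / 4-bit prediction mode signalling.

def estimate_mode(mode_above: int, mode_left: int) -> int:
    """H.264 estimates the mode as the smaller of the two neighbours' modes."""
    return min(mode_above, mode_left)

def encode_mode(actual: int, mode_above: int, mode_left: int) -> str:
    """Return the bits spent signalling `actual` (one of modes 0..8)."""
    estimated = estimate_mode(mode_above, mode_left)
    if actual == estimated:
        return "1"                    # flag set: 1 bit in total
    # flag not set, plus 3 bits indexing the 8 remaining modes
    remaining = [m for m in range(9) if m != estimated]
    return "0" + format(remaining.index(actual), "03b")   # 4 bits in total

print(encode_mode(2, 2, 3))   # estimate 2 matches  -> '1'
print(encode_mode(7, 2, 3))   # estimate 2 misses   -> '0110'
```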
  • The invention described in Japanese Patent Application Laid-Open No. H10-228707 increases the number of prediction modes compared to the H.264 method so that prediction can be performed at an arbitrary prediction angle, for the purpose of improving the coding efficiency of the intra prediction encoding scheme. Specifically, a technique is disclosed in which a theoretical reference pixel position is obtained from the prediction angle and the position of the pixel to be processed, and a pixel value corresponding to that pixel position is generated by interpolating the surrounding reference pixels.
  • The invention described in Patent Document 2 discloses a method for accurately estimating the prediction mode based on the angles of the surrounding prediction modes.
  • The present invention has been made in view of such circumstances, and an object thereof is to provide an encoding device, a decoding device, an encoding method, a decoding method, and a program that improve the accuracy of a predicted image while suppressing an increase in the amount of code.
  • The present invention has been made to solve the above-described problems. According to one aspect of the present invention, there is provided an image encoding apparatus that, when encoding an input image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the apparatus comprising an intra-screen prediction unit that determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  • Another aspect of the present invention is the above-described image encoding apparatus, wherein the intra-screen prediction unit determines a predicted value for each pixel based on the distance between the processing target pixel and the peripheral pixel.
  • Another aspect of the present invention is the above-described image encoding apparatus, wherein the intra-screen prediction unit includes a subject boundary detection unit that detects a boundary of the subject using information indicating the distance to the subject of the input image, and, when no boundary is detected between the processing target pixel and a pixel adjacent to it in a predetermined direction, the predicted value of the processing target pixel is predicted using the pixel adjacent in the predetermined direction.
  • Another aspect of the present invention is an image decoding device that, when decoding an input image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the device comprising an intra-screen prediction unit that, when performing the intra prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  • Another aspect of the present invention is the above-described image decoding device, wherein the intra-screen prediction unit determines a predicted value for each pixel based on the distance between the processing target pixel and the peripheral pixel.
  • Another aspect of the present invention is the above-described image decoding device, wherein the intra-screen prediction unit includes a subject boundary detection unit that detects a boundary of the subject using information indicating the distance to the subject of the input image, and, when no boundary is detected between the processing target pixel and a pixel adjacent to it in a predetermined direction, the predicted value of the processing target pixel is predicted using the pixel adjacent in the predetermined direction.
  • Another aspect of the present invention is an image encoding method in which, when an input image is encoded, intra prediction is performed to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel, the method having a process of, when performing the intra prediction, determining a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  • Another aspect of the present invention is an image decoding method in which, when an input image is decoded, intra prediction is performed to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel, the method having a process of, when performing the intra prediction, determining a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  • Another aspect of the present invention is a program for causing an image encoding device that, when encoding an input image, performs intra prediction to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel, to function as an intra-screen prediction unit that, when performing the intra prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  • Another aspect of the present invention is a program for causing an image decoding device that, when decoding an input image, performs intra prediction to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel, to function as an intra-screen prediction unit that, when performing the intra prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  • FIG. 1 is a schematic block diagram showing a configuration of a moving image transmission system according to an embodiment of the present invention.
  • the moving image transmission system 10 in this embodiment includes an image encoding device 100, a communication network 500, an image decoding device 800, and a display device 600.
  • the image encoding device 100 encodes the image and the depth map from the image signal R of the image to be encoded and the depth map signal D of the depth map corresponding to that image, and generates and outputs encoded data E.
  • the communication network 500 transmits the encoded data E output from the image encoding device 100 to the image decoding device 800.
  • the image decoding apparatus 800 decodes the transmitted encoded data E, and generates an image signal R ′ of the decoded image.
  • the display device 600 includes an image display device such as a liquid crystal display or a plasma display, and displays an image indicated by the image signal R ′ generated by the image decoding device 800.
  • the image encoding device 100 is provided in a television broadcasting station, for example, and encodes a broadcast program.
  • the communication network 500 is a communication network that transmits using broadcast waves
  • the image decoding device 800 and the display device 600 are provided in a television receiver.
  • the Internet or a mobile phone network may be used as the communication network 500.
  • the image encoding apparatus 100 is provided in a content holder that edits contents stored and sold on a DVD (Digital Versatile Disc) or a BD (Blu-ray Disc), and encodes these contents.
  • the encoded data E is stored on a DVD, a BD, or the like, and is delivered by a distribution network instead of the communication network 500.
  • the image decoding device 800 is provided in a DVD player, a BD player, or the like.
  • FIG. 2 is a schematic block diagram illustrating a configuration of the image encoding device 100 according to the present embodiment.
  • the image coding apparatus 100 includes an image input unit 101, a subtraction unit 102, an orthogonal transformation unit 103, a quantization unit 104, an entropy coding unit 105, an inverse quantization unit 106, an inverse orthogonal transformation unit 107, an addition unit 108, a prediction method control unit 109, a selection unit 110, a deblocking filter unit 111, a frame memory 112, a motion compensation unit 113, a motion vector detection unit 114, a depth information use intra prediction unit 115, a depth map encoding unit 116, a depth map decoding unit 117, and a depth input unit 118.
  • the deblocking filter unit 111, the frame memory 112, the motion compensation unit 113, and the motion vector detection unit 114 constitute an inter prediction unit 120.
  • the depth information use intra prediction unit 115 and the depth map decoding unit 117 constitute an intra prediction unit 121.
  • the image input unit 101 acquires, from outside the image encoding device 100, an image signal R (input image signal) indicating the image to be encoded (input image), for example every 5 frames (the types of the 5 frames will be described later).
  • the image input unit 101 divides the input image frame represented by the acquired input image signal into blocks having a predetermined size (for example, 16 pixels in the vertical direction ⁇ 16 pixels in the horizontal direction).
  • the image input unit 101 outputs an image block signal B representing each of the divided blocks to the subtraction unit 102, the motion vector detection unit 114, and the depth information use intra prediction unit 115.
  • the image input unit 101 repeats this process while sequentially changing the block position until output for all blocks in the acquired image frame is completed, and does so for each image frame.
  • the input image to the image encoding device 100 includes at least a reference image (base view).
  • the reference image is an image of one predetermined viewpoint included in a multi-viewpoint (multi-view) moving image for stereoscopic display, and serves as the basis for calculating the depth map.
  • the depth map is distance information representing the depth, that is, the distance from the imaging device to the subject shown in the reference image, and consists of a quantized value given for each pixel of the reference image. Each quantized value is called a depth value and is, for example, a value quantized to 8 bits.
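  • The text states only that each depth value is an 8-bit quantized value; the mapping from metric distance to depth value is not specified. The sketch below uses one common convention (linear in inverse depth between assumed near and far distances) purely for illustration; the function name and sample values are invented.

```python
# Hypothetical 8-bit depth quantisation; not the patent's normative mapping.
import numpy as np

def quantize_depth(z: np.ndarray, z_near: float, z_far: float) -> np.ndarray:
    """Map metric distances z to 8-bit depth values (255 = nearest)."""
    inv = (1.0 / z - 1.0 / z_far) / (1.0 / z_near - 1.0 / z_far)
    return np.clip(np.round(255.0 * inv), 0, 255).astype(np.uint8)

distances = np.array([[0.5, 1.0], [2.0, 10.0]])   # metres (sample values)
print(quantize_depth(distances, z_near=0.5, z_far=10.0))
```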
  • the image signal R input to the image input unit 101 every five frames includes, for example, the image signals of an I picture (I0), a B picture (B3), a B picture (B2), a B picture (B4), and a P picture (P1), input to the image encoding device 100 in this order (hereinafter, the input order). The leading letter (I, etc.) indicates the picture type, and the number (0, etc.) indicates the order of encoding (hereinafter, the encoding order); the input order and the encoding order therefore differ.
  • An I picture is an intra-frame picture (Intra Frame Picture), which can be decoded using only a code obtained by encoding the picture.
  • a P picture is an inter-frame forward predictive picture (Predictive Picture), which can be decoded using the code obtained by encoding the picture itself together with the code obtained by encoding the image signal of a past frame.
  • a B picture is a bi-directional predictive picture (Bi-directional Predictive Picture), which can be decoded using the code obtained by encoding the picture itself together with codes obtained by encoding the image signals of a plurality of past or future frames.
  • the subtraction unit 102 subtracts the prediction image block signal output from the selection unit 110 from the image block signal output from the image input unit 101 to generate a difference image block signal.
  • the subtraction unit 102 outputs the generated difference image block signal to the orthogonal transformation unit 103.
  • the orthogonal transform unit 103 performs an orthogonal transform on the difference image block signal output from the subtraction unit 102 to generate a signal indicating the strength of each frequency characteristic. The orthogonal transform is, for example, a DCT (discrete cosine transform), which converts the difference image block signal into a frequency domain signal (for example, DCT coefficients); other methods (for example, an FFT (Fast Fourier Transform)) may also be used.
  • the orthogonal transform unit 103 outputs the coefficient values included in the generated frequency domain signal to the quantization unit 104.
  • the quantization unit 104 quantizes the coefficient values indicating the strength of each frequency characteristic output from the orthogonal transform unit 103, and outputs the generated quantized signal ED (difference image block code) to the entropy encoding unit 105 and the inverse quantization unit 106.
  • the inverse quantization unit 106 performs inverse quantization on the quantized signal ED output from the quantization unit 104 to generate a decoded frequency domain signal, and outputs the decoded frequency domain signal to the inverse orthogonal transform unit 107.
  • the inverse orthogonal transform unit 107 performs, for example, an inverse DCT on the input decoded frequency domain signal to generate a decoded difference image block signal, which is a spatial domain signal. The transform is not limited to the inverse DCT; other methods (for example, an IFFT (Inverse Fast Fourier Transform)) may be used.
  • the inverse orthogonal transform unit 107 outputs the generated decoded difference image block signal to the addition unit 108.
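  • The residual path just described (orthogonal transform, quantization, then the inverse operations that reproduce the decoder-side decoded difference image block signal) can be summarised in a few lines. This is a minimal sketch: the 8×8 block size and the step size Q are assumptions of the example, not values mandated by the text.

```python
# Sketch of the transform/quantisation round trip on a residual block.
import numpy as np

N = 8
# Orthonormal DCT-II basis matrix: C @ x @ C.T is the 2-D forward transform.
C = np.array([[np.sqrt((1 if k == 0 else 2) / N) *
               np.cos((2 * n + 1) * k * np.pi / (2 * N))
               for n in range(N)] for k in range(N)])

def forward(residual, Q=16):
    coeffs = C @ residual @ C.T                 # orthogonal transform
    return np.round(coeffs / Q).astype(int)     # quantisation -> signal ED

def inverse(levels, Q=16):
    coeffs = levels * Q                         # inverse quantisation
    return C.T @ coeffs @ C                     # inverse orthogonal transform

rng = np.random.default_rng(0)
residual = rng.integers(-20, 20, size=(N, N)).astype(float)
decoded = inverse(forward(residual))
print(np.abs(decoded - residual).max())         # small; set by step size Q
```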
  • the addition unit 108 acquires the predicted image block signal from the selection unit 110 and acquires the decoded difference image block signal from the inverse orthogonal transform unit 107.
  • the addition unit 108 adds the decoded difference image block signal to the predicted image block signal to generate a reference image block signal RB, i.e. the result of internally encoding and decoding the input image. The reference image block signal RB is output to the inter prediction unit 120 and the intra prediction unit 121.
  • the inter prediction unit 120 acquires the reference image block signal RB from the addition unit 108 and acquires the image block signal from the image input unit 101.
  • the inter prediction unit 120 performs inter prediction using these signals, and generates an inter prediction image block signal.
  • the inter prediction unit 120 outputs the generated inter prediction image block signal to the prediction method control unit 109 and the selection unit 110.
  • the inter prediction unit 120 outputs the generated inter prediction coding information IPE to the prediction scheme control unit 109.
  • the inter prediction unit 120 will be described later.
  • the intra prediction unit 121 acquires the reference image block signal RB from the addition unit 108, acquires the image block signal from the image input unit 101, and acquires depth map encoded data from the depth map encoding unit 116.
  • the intra prediction unit 121 performs intra prediction using these signals and data, and generates an intra predicted image block signal.
  • the intra prediction unit 121 outputs the generated intra prediction image block signal to the prediction scheme control unit 109 and the selection unit 110.
  • the intra prediction unit 121 outputs the generated intra prediction encoding information TPE to the prediction scheme control unit 109.
  • the intra prediction unit 121 will be described later.
  • the depth input unit 118 acquires the depth map signal D of the depth map corresponding to the input image input to the image input unit 101 from the outside of the image encoding device 100.
  • the depth input unit 118 divides the acquired depth map into blocks (depth block signals) having the same positions and the same block size as the input image blocks divided by the image input unit 101, and outputs them to the depth map encoding unit 116.
  • the depth map encoding unit 116 encodes the depth block signal output from the depth input unit 118 using, for example, variable length encoding (entropy encoding), and generates depth map encoded data E2 whose data amount is further compressed.
  • the depth map encoding unit 116 outputs the generated depth map encoded data E2 to the intra prediction unit 121 and the outside of the image encoding device 100 (for example, the image decoding device 800 via the communication network 500).
  • the inter prediction unit 120 includes a deblocking filter unit 111, a frame memory 112, a motion compensation unit 113, and a motion vector detection unit 114.
  • the deblocking filter unit 111 acquires the reference image block signal RB from the addition unit 108 and applies a filter that reduces the block distortion arising when an image is encoded, for example the FIR (Finite Impulse Response) filter processing used in a known encoding method (for example, H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008).
  • the deblocking filter unit 111 outputs the processing result (correction block signal) to the frame memory 112.
  • the frame memory 112 holds the correction block signal output from the deblocking filter unit 111 as a part of the image of the frame number together with information for identifying the frame number.
  • the motion vector detection unit 114 searches the image stored in the frame memory 112 for a block similar to the image block signal input from the image input unit 101 (block matching), and generates vector information (a motion vector) representing the found block.
  • in the block matching, the motion vector detection unit 114 calculates an index value between the divided block and each candidate region, and searches for the region where the calculated index value is smallest. It finds the block in the reference image region having the smallest index value and, when two vectors are required, also the block having the next smallest index value.
  • as the index, the motion vector detection unit 114 uses, for example, the sum of absolute differences (SAD) between the luminance values of the pixels in the divided block and the luminance values in a region of the reference image. The SAD between a block divided from the input image signal (for example, of size N×N pixels) and a block of the reference image signal is expressed by the following equation (1):

$\mathrm{SAD}(p,q)=\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\bigl|I_{\mathrm{in}}(i_0+i,\ j_0+j)-I_{\mathrm{ref}}(i_0+i+p,\ j_0+j+q)\bigr| \qquad (1)$

  • here, Iin(i0+i, j0+j) is the luminance value at the coordinates (i0+i, j0+j) of the input image, (i0, j0) are the pixel coordinates of the upper left corner of the divided block, Iref(i0+i+p, j0+j+q) is the luminance value at the coordinates (i0+i+p, j0+j+q) of the reference image, and (p, q) is the shift amount (motion vector) relative to the coordinates of the upper left corner of the divided block.
  • the motion vector detection unit 114 calculates SAD (p, q) for each (p, q) in block matching, and finds (p, q) that minimizes SAD (p, q).
  • the found (p, q) represents the vector (motion vector) from the divided block in the input image to the position of the reference region in the reference image.
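  • Equation (1) translates directly into an exhaustive search. The sketch below is an illustrative implementation with an assumed search window of ±4 pixels; the function name and test data are invented for the example.

```python
# Exhaustive SAD block search implementing equation (1).
import numpy as np

def motion_search(block, ref, i0, j0, search=4):
    N = block.shape[0]
    best, best_pq = float("inf"), (0, 0)
    for p in range(-search, search + 1):
        for q in range(-search, search + 1):
            if i0 + p < 0 or j0 + q < 0:
                continue                      # skip vectors leaving the frame
            cand = ref[i0 + p: i0 + p + N, j0 + q: j0 + q + N]
            if cand.shape != block.shape:
                continue
            sad = np.abs(block.astype(int) - cand.astype(int)).sum()
            if sad < best:
                best, best_pq = sad, (p, q)
    return best_pq, best

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
shifted = np.roll(frame, shift=(2, -3), axis=(0, 1))      # fake motion
print(motion_search(shifted[16:32, 16:32], frame, 16, 16))  # -> ((-2, 3), 0)
```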
  • the motion compensation unit 113 acquires a motion vector from the motion vector detection unit 114, and outputs the corresponding reference block to the prediction scheme control unit 109 and the selection unit 110 as an inter prediction image block signal.
  • the motion compensation unit 113 outputs the corresponding image block when the motion vector detection unit 114 outputs one motion vector; when it outputs two motion vectors, the two corresponding image blocks are averaged and output.
  • the motion compensation unit 113 outputs the information necessary for prediction (hereinafter, inter prediction coding information IPE), for example the motion vector, to the prediction scheme control unit 109.
  • the intra prediction unit 121 includes a depth map decoding unit 117 and a depth information use intra prediction unit 115.
  • the depth map decoding unit 117 decodes the depth map encoded data output from the depth map encoding unit 116 using, for example, variable length decoding, restoring the depth block signal with its original (larger) amount of information.
  • the depth map decoding unit 117 outputs the decoded depth map D ′ (depth block decoded signal) to the depth information use intra prediction unit 115.
  • FIG. 3 is a schematic block diagram illustrating a configuration of the depth information use intra prediction unit 115 according to the present embodiment.
  • the processing of the depth information use intra prediction unit 115 will be described with reference to FIG.
  • the depth information use intra prediction unit 115 includes a first prediction mode execution unit 200-1 through an n-th prediction mode execution unit 200-n (n is a natural number of 1 or more, for example 6), a depth use prediction mode execution unit 201, and a prediction mode selection unit 202.
  • the first prediction mode execution unit 200-1 to the n-th prediction mode execution unit 200-n generate first to n-th predicted image block signals from the reference image block signal RB output from the addition unit 108, according to the processing in each prediction mode (predicted image block generation method).
  • the first prediction mode execution unit 200-1 to the n-th prediction mode execution unit 200-n output the generated first to n-th prediction image block signals to the prediction mode selection unit 202.
  • Each of the first prediction mode execution unit 200-1 to the n-th prediction mode execution unit 200-n performs in-screen prediction (intra prediction) using, for example, a conventional intra-screen prediction mode (for example, H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008).
  • In H.264, there are nine types of intra prediction applied to the 4×4 pixel sub-blocks obtained by further dividing a macroblock, and four types applied in macroblock units. (Intra-screen prediction using 8×8 pixel sub-blocks was formulated in H.264 FRExt, and the same intra-screen prediction method as for 4×4 pixels is applied.)
  • the first prediction mode execution unit 200-1 performs intra prediction (intra-screen prediction) using, for example, 4 ⁇ 4 sub-blocks.
  • the second prediction mode execution unit 200-2 performs intra prediction using, for example, 8 ⁇ 8 sub-blocks.
  • the third prediction mode execution unit 200-3 to the sixth prediction mode execution unit 200-6 perform four types of prediction methods, for example, in units of 16 ⁇ 16 macroblocks.
  • the first prediction mode execution unit 200-1 further divides the reference image block signal output from the addition unit 108 into 4×4 pixel sub-blocks and executes the prediction methods in units of 4×4 pixels in a prescribed order. That is, the 16×16 pixel block is divided into four 8×8 pixel blocks, which are processed in the order upper left, upper right, lower left, lower right; each 8×8 pixel block is in turn divided into four 4×4 pixel sub-blocks, and within each 8×8 pixel block intra prediction is performed on the sub-blocks in the order upper left, upper right, lower left, lower right.
  • the first prediction mode execution unit 200-1 calculates, between the 4×4 pixel predicted image block generated by each of the nine prediction methods and the corresponding sub-block of the image block signal B output from the image input unit 101, an index indicating their degree of correlation, and selects a prediction method for each sub-block based on this index. For example, it calculates the sum of absolute differences (SAD) of the luminance values as the index, selects the prediction method with the smallest SAD value as the prediction method of the corresponding 4×4 pixel sub-block, and generates the first predicted image block signal at the corresponding position; the selected prediction method is also retained. The first prediction mode execution unit 200-1 repeats this process until the prediction methods for the whole 16×16 pixels and the first predicted image block signal have been generated.
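  • A condensed sketch of this per-sub-block decision follows. Only three of the nine H.264 candidate modes are shown (DC, vertical, horizontal), and the helper names are invented for illustration; the point is the SAD-minimising selection loop.

```python
# Sketch: pick the prediction mode with the smallest SAD for one sub-block.
import numpy as np

def candidate_predictions(top_row, left_col):
    n = len(top_row)
    return {
        "DC":         np.full((n, n), (top_row.mean() + left_col.mean()) / 2),
        "vertical":   np.tile(top_row, (n, 1)),
        "horizontal": np.tile(left_col[:, None], (1, n)),
    }

def select_mode(src, top_row, left_col):
    preds = candidate_predictions(top_row, left_col)
    sads = {m: np.abs(src - p).sum() for m, p in preds.items()}
    best = min(sads, key=sads.get)
    return best, preds[best]

src = np.array([[12, 22, 31, 39]] * 4, dtype=float)  # rows resemble top row
top = np.array([10.0, 20.0, 30.0, 40.0])
left = np.array([12.0, 11.0, 13.0, 12.0])
print(select_mode(src, top, left)[0])                # -> 'vertical'
```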
  • the second prediction mode execution unit 200-2 further divides the reference image block signal RB output from the addition unit 108 into four 8×8 pixel sub-blocks, applies to each of them the same nine prediction methods (prediction mode 0 to prediction mode 8) used in the first prediction mode execution unit 200-1, and generates a predicted image; the prediction method is retained at the same time.
  • the second prediction mode execution unit 200-2 repeats this processing, sequentially determining the prediction method in units of 8×8 pixel sub-blocks, and generates the prediction methods for the entire 16×16 pixel block and the predicted image block signal based on them.
  • the third prediction mode execution unit 200-3 to the sixth prediction mode execution unit 200-6 perform 16×16 pixel intra prediction (intra-screen prediction), generating from the reference image block signal output from the addition unit 108 the predicted image block signals corresponding to prediction modes 0 to 3 in FIG. 6.
  • the depth use prediction mode execution unit 201 obtains the reference image block signal from the addition unit 108 and the depth block decoded signal from the depth map decoding unit 117, and performs prediction that uses the depth map to suppress prediction across subject boundaries. Details of the depth use prediction mode execution unit 201 will be described later.
  • the depth use prediction mode execution unit 201 outputs the generated predicted image block signal and the prediction method to the prediction mode selection unit 202.
  • the prediction mode selection unit 202 acquires the predicted image block signals generated by the first prediction mode execution unit 200-1 through the n-th prediction mode execution unit 200-n and by the depth use prediction mode execution unit 201, together with the information necessary for prediction.
  • the information necessary for prediction is, for example, the information indicating the prediction mode applied to each sub-block by the first prediction mode execution unit 200-1 and the second prediction mode execution unit 200-2 (which process the 16×16 pixels by further dividing them into sub-blocks), and the information indicating the prediction mode, i.e. the prediction direction, of the depth use prediction mode execution unit 201.
  • the prediction mode selection unit 202 selects, from the obtained predicted image block signals (including the predicted image block signal output by the depth use prediction mode execution unit 201), the one predicted image block signal having the smallest index value.
  • as the index, the prediction mode selection unit 202 uses, for example, the SAD between the luminance values Iin(i0+i, j0+j) of the corresponding image block included in the input image input from the image input unit 101 and the luminance values Ip,m(i0+i, j0+j) of a candidate predicted image block, as shown in the following equation:

$\mathrm{SAD}_m=\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\bigl|I_{\mathrm{in}}(i_0+i,\ j_0+j)-I_{p,m}(i_0+i,\ j_0+j)\bigr|$

  • here, m is an index identifying the prediction mode of each prediction mode execution unit; accordingly, Ip,m(x, y) is the luminance value at the coordinates (x, y) of the predicted image in prediction mode m. Further, i0 and j0 are the coordinates of the upper left vertex of the block, and N is the size of the block (the number of pixels on one side).
  • the index value is not limited to the SAD; any variable that represents the effectiveness of the processing of each prediction mode can be used, such as the correlation between the image block included in the input image and the candidate predicted image block, their similarity, or the amount of information after encoding.
  • the prediction mode selection unit 202 generates prediction mode information including an index representing the selected prediction mode. When the selected prediction mode has additional information necessary for prediction (specifically, the prediction modes of the first prediction mode execution unit 200-1, the second prediction mode execution unit 200-2, and the depth use prediction mode execution unit 201), the index and this information are combined to generate the prediction mode information.
  • the prediction mode selection unit 202 outputs the selected predicted image block signal (hereinafter, the intra prediction image block signal) to the selection unit 110 and the prediction scheme control unit 109, and outputs the prediction mode information (hereinafter, the intra prediction coding information TPE) to the prediction scheme control unit 109.
  • the prediction scheme control unit 109 receives the picture type of the input image, the inter prediction image block signal and its inter prediction coding information IPE from the inter prediction unit 120, and the intra prediction image block signal and its intra prediction coding information TPE from the intra prediction unit 121; it determines a prediction scheme based on these, and outputs information on the prediction scheme to the selection unit 110 and the entropy encoding unit 105.
  • the prediction method control unit 109 monitors the picture type of the input image, and selects an intra prediction method when the input image is an I picture.
  • the prediction scheme control unit 109 calculates the Lagrange cost from the number of bits generated by the encoding performed by the entropy encoding unit 105 and the residual between the original image and the prediction at the subtraction unit 102, using, for example, a conventional technique (for example, H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008), and selects either the inter prediction method or the intra prediction method.
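  • A toy illustration of this Lagrange-cost decision between the two schemes follows; the λ value and the distortion/bit numbers are made-up examples, not values from the patent.

```python
# Sketch: rate-distortion mode decision, J = D + lambda * R.
def lagrange_cost(distortion: float, bits: int, lam: float = 85.0) -> float:
    return distortion + lam * bits

candidates = {
    "inter": {"distortion": 1200.0, "bits": 96},   # sample numbers
    "intra": {"distortion": 1500.0, "bits": 40},
}
best = min(candidates, key=lambda m: lagrange_cost(**candidates[m]))
print(best)  # 'intra' here: the bit saving outweighs the extra distortion
```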
  • the prediction method control unit 109 adds information that can specify the selected prediction method to the coding information corresponding to that method (the inter prediction coding information IPE or the intra prediction coding information TPE), and outputs the result to the entropy encoding unit 105 as prediction coding information.
  • the selection unit 110 selects the inter prediction image block signal input from the inter prediction unit 120 or the intra prediction image block signal input from the intra prediction unit 121 according to the prediction method information input from the prediction method control unit 109. Then, the predicted image block signal is output to the subtracting unit 102 and the adding unit 108.
  • when the prediction method input from the prediction method control unit 109 is inter prediction, the selection unit 110 selects and outputs the inter prediction image block signal input from the inter prediction unit 120; when the input prediction method is intra prediction, it selects and outputs the intra prediction image block signal input from the intra prediction unit 121.
  • the entropy encoding unit 105 packs the difference image code input from the quantization unit 104 and the prediction coding information input from the prediction scheme control unit 109, and compresses the amount of information using, for example, variable length encoding (entropy encoding) to generate encoded data E1.
  • the entropy encoding unit 105 outputs the generated encoded data E1 to the outside of the image encoding device 100 (for example, to the image decoding device 800 via the communication network 500).
  • In-screen prediction is performed by predicting pixels of a processing target block using surrounding pixels as described above.
  • a predicted image block signal is created by sequentially copying neighboring pixels that have been processed in the prediction direction.
  • when the pixels of the processing target block can be accurately predicted by this intra prediction, the difference (residual) between the pixels of the processing target block and the pixels of the prediction block becomes small; as a result, the code amount can be reduced (or the error after decoding can be reduced).
  • the prediction directions of the depth-based intra prediction performed by the depth use prediction mode execution unit 201 in the present embodiment are the vertical prediction (prediction mode 0) and the horizontal prediction (prediction mode 1) shown in FIG. 6. However, other prediction directions can also be used; the processing described below is applicable to them as well (except for prediction mode 2 in FIG. 6).
  • the processing described below can also be applied to the sub-block-based prediction methods of FIG. 4 (except for prediction mode 2). That is, as in the present embodiment, a new prediction mode may be added while leaving the conventional prediction modes, or the prediction method performed by the depth use prediction mode execution unit 201 may replace a conventional method so that the number of modes is not increased.
  • the following describes an example in which a depth usage prediction mode is newly added.
  • FIGS. 7 and 8 are diagrams for explaining the processing concept of the depth use prediction mode execution unit 201.
  • the graphic indicated by a circle indicates a pixel for which processing has been completed, and can be referred to when a predicted pixel block is generated.
  • a graphic indicated by a square indicates a pixel to be processed, and is a target that is predicted using pixels that can be referred to in the vicinity.
  • the arrow indicates the direction of prediction; referenceable pixels are sequentially predicted (specifically, simply copied) in the direction of the arrow. That is, in the prediction mode of FIG. 7 the pixel value is copied in the vertical direction, and in the prediction mode of FIG. 8 in the horizontal direction. In FIGS. 7 and 8, a thick broken line indicates the boundary of the subject.
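  • For reference, the two conventional copy predictions that the depth-based mode builds on can be written in a few lines; the helper names below are hypothetical and the block size is an example.

```python
# Sketch: conventional directional copy prediction (vertical / horizontal).
import numpy as np

def predict_vertical(top_row: np.ndarray, n: int) -> np.ndarray:
    """Copy the reconstructed row above the block down every column."""
    return np.tile(top_row, (n, 1))

def predict_horizontal(left_col: np.ndarray, n: int) -> np.ndarray:
    """Copy the reconstructed column left of the block across every row."""
    return np.tile(left_col[:, None], (1, n))

top = np.array([10, 20, 30, 40])
print(predict_vertical(top, 4))   # each row repeats the pixels above
```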
  • FIG. 9 is a schematic block diagram illustrating a configuration of the depth use prediction mode execution unit 201 according to the present embodiment.
  • the depth use prediction mode execution unit 201 includes a boundary control prediction image generation unit 300, a boundary prediction control unit 301, and a subject boundary detection unit 302.
  • the subject boundary detection unit 302 acquires a depth block signal representing a depth value of a pixel corresponding to the image block signal B to be processed from the depth map decoding unit 117, and detects a depth edge.
  • Depth edge detection is performed by thresholding the difference between adjacent pixels in the depth map. Whether a depth edge exists in the horizontal direction is determined by whether the absolute value of the difference between vertically adjacent pixels exceeds the threshold TV, as shown in Expression (3); similarly, whether a depth edge exists in the vertical direction is determined by whether the absolute value of the difference between horizontally adjacent pixels exceeds the threshold TH, as in Expression (4):

$|D(i,\ j)-D(i,\ j-1)|>T_V \qquad (3)$

$|D(i,\ j)-D(i-1,\ j)|>T_H \qquad (4)$

  • here, D(i, j) is the depth map value at the pixel position (i, j), with i and j the horizontal and vertical pixel coordinates, so that (i, j) and (i, j−1) are vertically adjacent. TV and TH are the thresholds used to determine whether an edge exists in the horizontal and vertical directions, respectively; each threshold is, for example, 10.
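  • A small sketch of this thresholding follows, under the assumption that the depth map is stored as a 2-D array indexed D[row, column]; the function name and sample values are invented for illustration.

```python
# Sketch: depth edge detection by thresholding adjacent-pixel differences.
import numpy as np

def detect_depth_edges(D: np.ndarray, TV: int = 10, TH: int = 10):
    d = D.astype(int)
    horiz_edge = np.abs(d[1:, :] - d[:-1, :]) > TV  # vertically adjacent pair
    vert_edge = np.abs(d[:, 1:] - d[:, :-1]) > TH   # horizontally adjacent pair
    return horiz_edge, vert_edge

depth = np.array([[50, 50, 200],
                  [50, 50, 200],
                  [50, 50, 200]], dtype=np.uint8)
h, v = detect_depth_edges(depth)
print(v)    # vertical subject boundary between columns 1 and 2
```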
  • As an example of a depth edge detection result obtained by the above method, consider the case where a depth edge is detected as shown by the thick dotted lines in FIGS. 7 and 8 (in both figures the position of the depth edge is the same). In the conventional prediction, the prediction accuracy drops markedly at the pixel straddling this edge and at the subsequent pixels in the prediction direction.
  • the boundary prediction control unit 301 controls the prediction performed by the boundary control predicted image generation unit 300 using the subject boundary information (depth edges) in the horizontal and vertical directions input from the subject boundary detection unit 302. Specifically, when there is a depth edge perpendicular to the prediction direction, the boundary prediction control unit 301 performs control to suppress copying from the pixel adjacent in the prediction direction.
  • the control for suppressing copying of pixels in the prediction direction is realized, for example, by controlling the processing in the boundary control predicted image generation unit 300 as follows.
  • the boundary control predicted image generation unit 300 acquires the reference image block signal RB from the addition unit 108 and generates a predicted image block signal as follows.
  • the prediction modes of the boundary control predicted image generation unit 300 include a mode whose prediction direction is vertical, as shown in FIG. 7, and a mode whose prediction direction is horizontal, as shown in FIG. 8 (two types of predicted image block signals are generated).
  • when there is no depth edge between the processing target pixel and the preceding pixel in the prediction direction, the boundary prediction control unit 301 causes the boundary control predicted image generation unit 300 to process the pixel in the same manner as in the conventional prediction method; that is, it controls the boundary control predicted image generation unit 300 so as to copy the pixel value immediately preceding the processing target pixel in the prediction direction.
  • for example, the boundary prediction control unit 301 controls the boundary control predicted image generation unit 300 so as to copy the pixel value of the pixel Pv1 to the pixel value of the pixel Qv1.
  • when there is a depth edge perpendicular to the prediction direction, the boundary prediction control unit 301 controls the boundary control predicted image generation unit 300 to perform the following processing: the boundary control predicted image generation unit 300 generates the prediction pixels according to the following expressions.
  • Expression (5) is the expression for generating a prediction pixel when a depth edge exists in the horizontal direction in the vertical prediction mode, and Expression (6) is the expression for generating a prediction pixel when a depth edge exists in the vertical direction in the horizontal prediction mode. Since the basic processing is the same in the horizontal and vertical directions, Expression (5) will be described below.
  • G[x] on the left side is the predicted pixel value of the pixel x. The evaluation formula on the right side is formed from two terms.
  • the first term evaluates the difference between the depth value of the processing target pixel and the depth value of each candidate reference pixel; it is controlled so as to select a pixel whose depth value is close, so that a pixel showing what is considered to be the same subject as the subject appearing at the processing target pixel can be referred to as much as possible. Like the control by the boundary prediction control unit 301, this term also suppresses the use of a pixel that has a subject boundary between itself and the processing target pixel.
  • the second term (Dis(Qvi, pre)) represents the distance between the position of the processing target pixel and the pixel position of each pixel in pre, and serves to select pixels as close as possible to the processing target pixel.
  • α and β, which multiply the first and second terms respectively, are constants for changing the weighting between the two terms; specifically, for example, α is 0.1 and β is 1.0.
  • in the present embodiment the sum of the first term and the second term is used as the evaluation formula, but a ratio may be used, or only the first term may be used.
  • in the present embodiment, the above expressions (5) and (6) are used only when there is a depth edge between the processing target pixel and the preceding pixel in the prediction direction, but expressions (5) and (6) may also be used at all times.
  • in other words, when there is a depth edge (subject boundary) between the processing target pixel and the preceding (adjacent) pixel in the prediction direction (the predetermined direction), the boundary prediction control unit 301 has the boundary control predicted image generation unit 300 use expressions (5) and (6), thereby suppressing the use of the pixel value of the preceding pixel in the prediction direction.
  • because the first term evaluates the difference in depth value between the processing target pixel and each pixel (peripheral pixel) in the preceding column (or row) in the prediction direction, the use of peripheral pixels that have a subject boundary between themselves and the processing target pixel, and hence a large difference in depth value, is suppressed.
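  • The right-hand sides of expressions (5) and (6) are not reproduced above; the sketch below infers them from the description of the two terms: among the referenceable pixels pre, the pixel minimising α·|depth difference| + β·spatial distance is copied. The function names, array layout, and the sequential depth update are illustrative assumptions of this sketch, not the patent's normative definition.

```python
# Sketch of boundary-controlled vertical prediction for one column.
import numpy as np

def predict_column(ref_row, ref_depth, tgt_depth_col, col, TV=10,
                   alpha=0.1, beta=1.0):
    """Predict one column of the block, top to bottom (vertical mode)."""
    pred = np.empty(len(tgt_depth_col))
    prev_val, prev_depth = ref_row[col], ref_depth[col]
    for y, d in enumerate(tgt_depth_col):
        if abs(int(d) - int(prev_depth)) <= TV:      # no subject boundary:
            pred[y] = prev_val                       # plain directional copy
        else:                                        # boundary crossed:
            cost = [alpha * abs(int(d) - int(ref_depth[x]))
                    + beta * abs(x - col) for x in range(len(ref_row))]
            pred[y] = ref_row[int(np.argmin(cost))]  # best depth/distance mix
        prev_val, prev_depth = pred[y], d
    return pred

ref_row = np.array([100, 100, 100, 30])     # reconstructed pixels above
ref_depth = np.array([80, 80, 80, 200])     # their depth values
tgt_depth = np.array([200, 200, 200, 200])  # column belongs to a near object
print(predict_column(ref_row, ref_depth, tgt_depth, col=0))  # -> [30 30 30 30]
```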
  • the boundary control predicted image generation unit 300 generates a predicted image block predicted in the horizontal direction and the vertical direction.
  • the boundary control predicted image generation unit 300 determines the correlation between the image block input from the image input unit 101 and the predicted image blocks predicted in each of the two prediction modes, for example using the SAD value. Based on this determination, it selects the predicted image block with the higher correlation (greater similarity) and outputs it to the prediction mode selection unit 202, together with prediction coding information indicating the prediction mode of the selected predicted image block.
  • In this way, since prediction is suppressed from continuing across a boundary (subject boundary) in the depth map, which indicates the distance to the subject, the prediction accuracy can be improved.
  • FIG. 10 is a flowchart showing an image encoding process performed by the image encoding apparatus 100 according to the present embodiment.
  • Step S201 The image encoding apparatus 100 acquires an image for each frame and a depth map corresponding to the image from the outside. Thereafter, the process proceeds to step S202.
  • Step S202 The image input unit 101 divides an input image signal for each frame acquired from the outside of the image encoding device 100 into blocks of a predetermined size (for example, 16 pixels in the vertical direction ⁇ 16 pixels in the horizontal direction).
  • the depth input unit 118 divides the depth map synchronized with the image input to the image input unit 101 in the same manner as the image division performed by the image input unit 101, and sends the depth map to the depth map encoding unit 116. Output.
  • the image coding apparatus 100 repeats the processing from step S203 to step S211 for each image block in the frame.
  • Step S203 The depth map encoding unit 116 encodes the depth map input from the depth input unit 118, and outputs the depth map encoded data, whose data amount is further compressed, to the intra prediction unit 121 and to the outside of the image encoding device 100 (for example, the image decoding device 800). Thereafter, the process of step S204 and the process of step S205 are performed in parallel.
  • Step S204 The inter prediction unit 120 acquires the image block signal from the image input unit 101 and the reference image block signal decoded by the addition unit 108, and performs the inter prediction using these acquired signals. The inter prediction unit 120 outputs the inter prediction image block signal generated by the inter prediction to the prediction method control unit 109 and the selection unit 110, and outputs the inter prediction coding information IPE to the prediction method control unit 109. If no inter prediction is performed, a reset image block (an image block signal in which all pixel values are 0) is output instead. Thereafter, the process proceeds to step S206.
  • Step S205 The intra prediction unit 121 acquires the image block signal from the image input unit 101, the depth map encoded data from the depth map encoding unit 116, and the reference image block signal decoded by the addition unit 108, and performs the intra prediction using these acquired signals. The intra prediction unit 121 outputs the intra prediction image block signal generated by the intra prediction to the prediction scheme control unit 109 and the selection unit 110, and outputs the intra prediction coding information TPE to the prediction scheme control unit 109. If no intra prediction is performed, a reset image block (an image block in which all pixel values are 0) is output instead. When the processing of the intra prediction unit 121 is completed, the process proceeds to step S206.
  • Step S206 The prediction scheme control unit 109 receives the inter prediction image block signal and the inter prediction coding information IPE from the inter prediction unit 120, and the intra prediction image block signal and the intra prediction coding information TPE from the intra prediction unit 121. The prediction scheme control unit 109 selects the prediction mode with the better coding efficiency based on the Lagrange cost, outputs information indicating the selected prediction mode to the selection unit 110, and outputs the prediction coding information corresponding to the selected prediction mode to the entropy encoding unit 105. The selection unit 110 selects the inter prediction image block signal input from the inter prediction unit 120 or the intra prediction image block signal input from the intra prediction unit 121 according to the prediction mode information input from the prediction method control unit 109, and outputs it to the subtraction unit 102 and the addition unit 108. Thereafter, the process proceeds to step S207.
  • Step S207 The subtraction unit 102 subtracts the predicted image block signal output from the selection unit 110 from the image block signal output from the image input unit 101 to generate a difference image block signal.
  • the subtraction unit 102 outputs the difference image block signal to the orthogonal transformation unit 103. Thereafter, the process proceeds to step S208.
  • Step S208 The orthogonal transform unit 103 acquires the difference image block signal from the subtraction unit 102, performs the orthogonal transform described above, and outputs the transformed signal to the quantization unit 104. The quantization unit 104 performs the quantization process described above on the signal input from the orthogonal transform unit 103 to generate the difference image code, and outputs it to the entropy encoding unit 105 and the inverse quantization unit 106. The entropy encoding unit 105 packs the difference image code input from the quantization unit 104 and the prediction coding information input from the prediction scheme control unit 109, performs variable length encoding (entropy encoding) to generate encoded data E1 in which the amount of information is further compressed, and outputs the encoded data E1 to the outside of the image encoding device 100 (for example, the image decoding device 800). Thereafter, the process proceeds to step S209.
  • Step S209 The inverse quantization unit 106 acquires the difference image code ED from the quantization unit 104, and performs the inverse process of the quantization performed by the quantization unit 104.
  • the inverse quantization unit 106 outputs the signal generated by this processing to the inverse orthogonal transform unit 107.
  • the inverse orthogonal transform unit 107 acquires the inversely quantized signal from the inverse quantization unit 106, performs the inverse orthogonal transform process of the orthogonal transform process performed by the orthogonal transform unit 103, and obtains a difference image (decoded difference image block signal). ).
  • the inverse orthogonal transform unit 107 outputs the decoded difference image block signal to the addition unit 108. Thereafter, the process proceeds to step S210.
  • Step S210 The addition unit 108 adds the predicted image block signal output from the selection unit 110 to the decoded difference image block signal output from the inverse orthogonal transform unit 107, thereby decoding the input image and generating the reference image block signal. The addition unit 108 outputs the reference image block signal to the inter prediction unit 120 and the intra prediction unit 121. Thereafter, the process proceeds to step S211.
  • Step S211 If the image coding apparatus 100 has not completed the processes of steps S203 to S210 for all blocks in the frame, it changes the block to be processed and returns to step S203; when all blocks have been processed, the process ends.
  • FIG. 11 is a flowchart for explaining the processing of the inter prediction unit 120.
  • Step S301 The deblocking filter unit 111 acquires the reference image block signal from the addition unit 108, which is outside the inter prediction unit 120, and performs the FIR filter processing described above.
  • the deblocking filter unit 111 outputs the corrected block signal after the filtering process to the frame memory 112. Thereafter, the process proceeds to step S302.
  • Step S302 The frame memory 112 acquires the correction block signal of the deblocking filter unit 111, and holds the correction block signal as a part of the image together with information that can identify the frame number. Thereafter, the process proceeds to step S303.
  • Step S303 Upon receiving the image block signal from the image input unit 101, the motion vector detection unit 114 searches the image stored in the frame memory 112 for a block similar to the image block output by the image input unit 101 (block). Matching) and generating vector information (motion vector) representing the found block. The motion vector detection unit 114 outputs information necessary for encoding including the detected vector information to the motion compensation unit 113. Thereafter, the process proceeds to step S304.
  • Step S304 The motion compensation unit 113 acquires the information necessary for encoding from the motion vector detection unit 114, and extracts the corresponding prediction block from the frame memory 112.
  • the motion compensation unit 113 outputs the prediction image block signal extracted from the frame memory to the prediction method control unit 109 and the selection unit 110 as an inter prediction image block signal.
  • the motion compensation unit 113 outputs information necessary for prediction acquired from the motion vector detection unit 114 to the prediction method control unit 109. Thereafter, the inter prediction is terminated.
  • FIG. 12 is a flowchart for explaining processing of the intra prediction unit 121.
• Step S401: The depth map decoding unit 117 acquires the depth map encoded data E2 from the depth map encoding unit 116, and decodes it back into a depth map, which has a larger amount of information, by using, for example, variable-length decoding.
  • the depth map decoding unit 117 outputs the decoded depth map (depth block decoded signal) to the depth information use intra prediction unit 115. Thereafter, the process proceeds to step S402.
• Step S402: The first prediction mode execution unit 200-1 to the n-th prediction mode execution unit 200-n each generate the first to n-th predicted image block signals by performing processing in their respective prediction modes (predicted image block generation methods) on the reference image block signal acquired from the addition unit 108.
  • the first prediction mode execution unit 200-1 to the n-th prediction mode execution unit 200-n output the generated first to n-th prediction image block signals to the prediction mode selection unit 202.
• the depth use prediction mode execution unit 201 generates a prediction image block signal using depth from the reference image block signal acquired from the addition unit 108 and the depth block decoded signal acquired from the depth map decoding unit 117, and outputs it to the prediction mode selection unit 202. Thereafter, the process proceeds to step S403. Note that the predicted image generation processing performed by the depth use prediction mode execution unit 201 is as described above.
• Step S403: The prediction mode selection unit 202 receives the prediction image block signals, and the information necessary for prediction, from the first prediction mode execution unit 200-1 to the n-th prediction mode execution unit 200-n and from the depth use prediction mode execution unit 201. The prediction mode selection unit 202 selects, by the method described above, a prediction mode with high coding efficiency from among the input prediction image block signals (including the prediction image block signal input from the depth use prediction mode execution unit), and generates the corresponding prediction mode information. The prediction mode selection unit 202 outputs the selected prediction image block signal (hereinafter, the intra prediction image block signal) to the selection unit 110 and the prediction scheme control unit 109, and outputs the prediction mode information (hereinafter, the intra prediction encoding information TPE) to the prediction scheme control unit 109. Then, the intra prediction ends.
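• Selecting "a prediction mode with high coding efficiency" can be pictured as a cost comparison over the candidate modes. The sketch below uses a simplified rate-distortion style cost (residual SAD plus a weighted mode-signaling bit count); the lambda weight and the bit counts are illustrative assumptions, not values taken from the patent:

```python
import numpy as np

def select_prediction_mode(input_block, candidate_predictions, mode_bits, lam=10.0):
    """candidate_predictions: dict mode_index -> predicted block.
    mode_bits: dict mode_index -> bits needed to signal the mode.
    Returns the mode index with the lowest distortion + lam * rate cost."""
    best_mode, best_cost = None, np.inf
    for mode, pred in candidate_predictions.items():
        distortion = np.abs(input_block.astype(np.int64) - pred.astype(np.int64)).sum()
        cost = distortion + lam * mode_bits[mode]
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost
```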
  • FIG. 13 is a schematic block diagram showing the configuration of the image decoding device 800 according to this embodiment.
• the image decoding apparatus 800 includes an encoded data input unit 813, a depth map encoded data input unit 814, an entropy decoding unit 801, an inverse quantization unit 802, an inverse orthogonal transform unit 803, an addition unit 804, a prediction scheme control unit 805, a selection unit 806, a deblocking filter unit 807, a frame memory 808, a motion compensation unit 809, a depth information use intra prediction unit 810, a depth map decoding unit 811, and an image output unit 812. The deblocking filter unit 807, the frame memory 808, and the motion compensation unit 809 constitute an inter processing unit 820. The depth information use intra prediction unit 810 and the depth map decoding unit 811 constitute an intra processing unit 821.
  • the encoded data input unit 813 divides the encoded data E1 acquired from the outside (for example, the image encoding device 100) into processing block units and outputs the result to the entropy decoding unit 801.
• the encoded data input unit 813 changes the target block sequentially and repeats this output until all the blocks in the frame, and thus all of the acquired encoded data, have been processed.
• the entropy decoding unit 801 performs, on the encoded data divided into processing units acquired from the encoded data input unit 813, entropy decoding (for example, variable-length decoding) that is the inverse of the encoding method (for example, variable-length encoding) performed by the entropy encoding unit 105, and generates a difference image block code and prediction encoding information PE.
  • the entropy decoding unit 801 outputs the difference image block code to the inverse quantization unit 802 and the prediction coding information PE to the prediction scheme control unit 805.
  • the inverse quantization unit 802 performs inverse quantization on the difference image block code input from the entropy decoding unit 801 to generate a decoded frequency domain signal, and outputs the decoded frequency domain signal to the inverse orthogonal transform unit 803.
• the inverse orthogonal transform unit 803 generates a decoded difference image block signal, which is a spatial domain signal, by, for example, applying an inverse DCT to the decoded frequency domain signal output from the inverse quantization unit 802. As long as the inverse orthogonal transform unit 803 can generate a spatial domain signal from the decoded frequency domain signal, it is not limited to the inverse DCT; another method (for example, an IFFT (Inverse Fast Fourier Transform)) may be used.
  • the inverse orthogonal transform unit 803 outputs the generated decoded difference image block signal to the addition unit 804.
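• Since transforms such as the DCT are separable, the 2-D inverse transform mentioned here can be computed as two passes of a 1-D inverse transform. A minimal sketch using SciPy's orthonormal inverse DCT (the exact H.264 transform is an integer approximation, so this is only illustrative):

```python
from scipy.fftpack import idct

def inverse_orthogonal_transform(decoded_freq_block):
    """2-D inverse DCT applied as two separable 1-D passes, turning a
    decoded frequency-domain block into a spatial-domain block."""
    rows = idct(decoded_freq_block, axis=1, norm='ortho')  # inverse transform each row
    return idct(rows, axis=0, norm='ortho')                # then each column
```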
  • the prediction method control unit 805 extracts the prediction method PM in units of macroblocks adopted by the image coding device 100 from the prediction coding information PE input from the entropy decoding unit 801.
  • the prediction method PM is inter prediction or intra prediction.
  • the prediction method control unit 805 outputs information regarding the extracted prediction method PM to the selection unit 806.
• the prediction scheme control unit 805 extracts, from the prediction encoding information PE output from the entropy decoding unit 801, the prediction encoding information corresponding to the extracted prediction scheme PM, and outputs that prediction encoding information to the processing unit corresponding to the prediction scheme PM.
  • the prediction method control unit 805 outputs the inter prediction coding information IPE to the inter processing unit 820 when the prediction method PM is inter prediction.
  • the prediction method control unit 805 outputs the intra prediction encoding information TPE to the intra processing unit 821 when the prediction method PM is intra prediction.
• When the prediction scheme PM is inter prediction, the selection unit 806 selects the inter prediction image block signal; when it is intra prediction, the selection unit 806 selects the intra prediction image block signal. The selection unit 806 outputs the selected predicted image block signal to the addition unit 804.
  • the addition unit 804 adds the predicted image block signal output from the selection unit 806 to the decoded difference image block signal output from the inverse orthogonal transform unit 803 to generate a decoded image block signal DB.
• the addition unit 804 outputs the decoded image block signal DB to the inter processing unit 820, the intra processing unit 821, and the image output unit 812.
  • the inter processing unit 820 includes a deblocking filter unit 807, a frame memory 808, and a motion compensation unit 809.
  • the deblocking filter unit 807 performs the same processing as the FIR filter performed by the deblocking filter unit 111 on the decoded image block signal DB input from the addition unit 804, and the processing result (correction block signal) is framed. Output to the memory 808.
  • the frame memory 808 acquires the correction block signal from the deblocking filter unit 807, and holds the correction block signal as a part of the image together with information that can identify the frame number.
  • the motion compensation unit 809 acquires inter prediction coding information IPE from the prediction method control unit 805, and extracts reference image information and prediction vector information (motion vector) from the inter prediction coding information IPE.
  • the motion compensation unit 809 extracts a target image block signal (predicted image block signal) from the images stored in the frame memory 808 based on the extracted reference image information and predicted vector information.
• When there is one prediction vector (motion vector), the motion compensation unit 809 extracts the single corresponding image block from the frame memory 808 and outputs it to the selection unit 806. When there are two prediction vectors (motion vectors), the motion compensation unit 809 takes the two corresponding image blocks out of the frame memory 808, averages them, and outputs the result to the selection unit 806. The signal output from the inter processing unit 820 (motion compensation unit 809) to the selection unit 806 is the inter prediction image block signal.
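• For the two-vector case, the averaging described above can be pictured as equal-weight bi-prediction. A minimal sketch, assuming integer-pel motion vectors and block positions that stay inside the frame:

```python
import numpy as np

def motion_compensate(frame, mv, top, left, h, w):
    """Fetch the block pointed to by an integer-pel motion vector (dy, dx)."""
    dy, dx = mv
    return frame[top + dy:top + dy + h, left + dx:left + dx + w]

def bi_prediction(frame0, frame1, mv0, mv1, top, left, h, w):
    """Average the two motion-compensated blocks (equal-weight bi-prediction)."""
    b0 = motion_compensate(frame0, mv0, top, left, h, w).astype(np.float64)
    b1 = motion_compensate(frame1, mv1, top, left, h, w).astype(np.float64)
    return np.rint((b0 + b1) / 2.0)
```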
  • the intra processing unit 821 includes a depth information use intra prediction unit 810 and a depth map decoding unit 811.
  • the depth map encoded data input unit 814 divides the depth map encoded data E2 input from the outside (for example, the image encoding device 100) into processing blocks, and outputs them to the intra processing unit 821.
• the depth map decoding unit 811 performs, on the block-unit depth map encoded data output from the depth map encoded data input unit 814, entropy decoding (for example, variable-length decoding) that is the inverse of the encoding method (for example, variable-length encoding) performed by the depth map encoding unit 116, thereby generating the depth block decoded signal. The depth map decoding unit 811 outputs the depth block decoded signal to the depth information use intra prediction unit 810.
  • FIG. 14 is a schematic block diagram illustrating a configuration of the depth information use intra prediction unit 810.
• the depth information use intra prediction unit 810 includes a first prediction mode execution unit 900-1, a second prediction mode execution unit 900-2, an n-th prediction mode execution unit 900-n, a depth use prediction mode execution unit 901, and a prediction mode selection unit 902.
• the prediction mode selection unit 902 extracts, from the intra prediction encoding information TPE output by the prediction scheme control unit 805, the index (prediction mode) indicating the prediction mode created by the prediction mode selection unit 202 of the image encoding device 100, together with any information necessary for prediction. The information necessary for prediction is extracted when the prediction mode indicated by the index is one for which such information exists (specifically, the first and second prediction modes, in which a prediction image is generated in sub-block units, and the depth use prediction mode). When the prediction mode selection unit 902 extracts information necessary for prediction, it outputs that information to the corresponding prediction mode execution unit among the units 900-1 to 900-n and 901. The prediction mode selection unit 902 then selects, from the prediction image block signals generated by the prediction mode execution units, the prediction image block signal of the prediction mode indicated by the index, and outputs it to the selection unit 806 as the intra prediction image block signal.
• the first prediction mode execution unit 900-1, the second prediction mode execution unit 900-2, and the n-th prediction mode execution unit 900-n perform the same processing as the first prediction mode execution unit 200-1, the second prediction mode execution unit 200-2, and the n-th prediction mode execution unit 200-n provided in the depth information use intra prediction unit 115 of the image encoding device 100. For the execution units that operate in sub-block units, the prediction mode of each sub-block (the information necessary for prediction) is input from the prediction mode selection unit 902, and the corresponding prediction mode is executed in sub-block units. The prediction modes themselves are those shown in the figures described above.
  • the depth use prediction mode execution unit 901 acquires information necessary for prediction (specifically, information indicating the direction of prediction) from the prediction mode selection unit 902, and acquires a depth block decoded signal from the depth map decoding unit 811.
  • the depth use prediction mode execution unit 901 uses the acquired information and signal to generate a predicted image block signal as performed by the depth use prediction mode execution unit 201 of the image encoding device 100.
  • the information necessary for prediction is information regarding the direction of prediction selected by the depth use prediction mode execution unit 201.
  • the configuration of the depth usage prediction mode execution unit 901 is basically the same as the configuration of the depth usage prediction mode execution unit 201.
• However, whereas the boundary control predicted image generation unit 300 of the image encoding device 100 selects, as its final process, between the horizontal-direction prediction block and the vertical-direction prediction block based on their correlation with the input image, the boundary control predicted image generation unit 300 of the depth use prediction mode execution unit 901 differs in that it performs this selection using the information necessary for prediction. As a result, the depth use prediction mode execution unit 901 generates the same predicted image block signal as the depth use prediction mode execution unit 201 generated at the time of encoding.
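• Conceptually, the depth use prediction mode propagates a reference pixel along the signaled direction only while no depth boundary is crossed. The sketch below loosely illustrates this for horizontal prediction; the depth-difference threshold and the DC-style fallback for pixels beyond a boundary are assumptions for illustration, since the concrete boundary test belongs to the subject boundary detection unit 302 and is not detailed in this excerpt:

```python
import numpy as np

def depth_gated_horizontal_prediction(left_ref_col, left_ref_depth, block_depth, threshold=8):
    """Horizontal prediction gated by the depth map: each row copies its
    left-neighbor reference pixel only up to the first depth jump (assumed
    to be a subject boundary); pixels beyond the boundary fall back to the
    mean of the reference column instead of crossing the boundary."""
    h, w = block_depth.shape
    fallback = float(np.mean(left_ref_col))      # DC-style fallback predictor
    pred = np.empty((h, w), dtype=np.float64)
    for y in range(h):
        crossed = False
        prev_depth = float(left_ref_depth[y])
        for x in range(w):
            if abs(float(block_depth[y, x]) - prev_depth) > threshold:
                crossed = True                   # boundary between x-1 and x
            prev_depth = float(block_depth[y, x])
            pred[y, x] = fallback if crossed else float(left_ref_col[y])
    return pred
```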
  • FIG. 15 is a flowchart showing an image decoding process performed by the image decoding apparatus 800 according to this embodiment.
• Step S601: The image decoding apparatus 800 acquires the encoded data E, including the encoded data E1 of the image and the encoded data E2 of the depth map, from the image encoding apparatus 100 via the communication network 500. Thereafter, the process proceeds to step S602.
• Step S602: The encoded data input unit 813 divides the acquired encoded data E1 of the image into processing blocks of a predetermined size (for example, 16 pixels vertically × 16 pixels horizontally) and outputs them to the entropy decoding unit 801. The depth map encoded data input unit 814 receives, from outside the image decoding apparatus 800, depth map encoded data synchronized with the encoded data input to the encoded data input unit 813, divides it into the same processing units as the division performed by the encoded data input unit 813, and outputs them to the intra processing unit 821.
  • the image decoding apparatus 800 repeats the processing in steps S603 to S608 for each image block in the frame.
• Step S603: The entropy decoding unit 801 performs entropy decoding on the encoded data output from the encoded data input unit 813, and generates a difference image block code and prediction encoding information.
  • the entropy decoding unit 801 outputs the difference image block code to the inverse quantization unit 802, and outputs the prediction coding information to the prediction scheme control unit 805.
  • the prediction scheme control unit 805 acquires the prediction coding information from the entropy decoding unit 801, and extracts information regarding the prediction scheme PM and the prediction coding information corresponding to the prediction scheme PM.
• Steps S604 and S605 below may be processed in parallel, or only one of them may be performed in accordance with the prediction scheme PM.
  • Step S604 The inter processing unit 820 acquires the inter prediction coding information IPE output from the prediction scheme control unit 805 and the decoded image block signal DB output from the adding unit 804, and performs inter processing.
• the inter processing unit 820 outputs the generated inter prediction image block signal to the selection unit 806. The contents of the inter processing will be described later. In the first pass, when the processing of the addition unit 804 has not yet been completed, a reset image block signal (an image block signal in which all pixel values are 0) is input. When the processing of the inter processing unit 820 is completed, the process proceeds to step S606.
  • Step S605 The intra processing unit 821 acquires the intra prediction encoded information TPE output from the prediction scheme control unit 805 and the decoded image block signal DB output from the adding unit 804, and performs intra prediction.
  • the intra processing unit 821 outputs the generated intra predicted image block signal to the selection unit 806.
• The intra prediction process will be described later. In the first pass, when the processing of the addition unit 804 has not yet been completed, a reset image block signal (an image block signal in which all pixel values are 0) is input. When the processing of the intra processing unit 821 is completed, the process proceeds to step S606.
• Step S606: The selection unit 806 acquires the information on the prediction scheme PM output from the prediction scheme control unit 805, selects either the inter prediction image block signal output from the inter processing unit 820 or the intra prediction image block signal output from the intra processing unit 821, and outputs the selected signal to the addition unit 804. Thereafter, the process proceeds to step S607.
  • Step S607 The inverse quantization unit 802 performs the inverse process of the quantization performed by the quantization unit 104 of the image coding device 100 on the difference image block code input from the entropy decoding unit 801.
  • the inverse quantization unit 802 outputs the generated decoded frequency domain signal to the inverse orthogonal transform unit 803.
  • the inverse orthogonal transform unit 803 obtains the inversely quantized decoded frequency domain signal from the inverse quantization unit 802, and performs the inverse orthogonal transform process of the orthogonal transform process performed by the orthogonal transform unit 103 of the image encoding device 100. Then, the difference image (decoded difference image block signal) is decoded.
• the inverse orthogonal transform unit 803 outputs the decoded difference image block signal to the addition unit 804.
  • the adding unit 804 adds the predicted image block signal output from the selection unit 806 to the decoded difference image block signal output from the inverse orthogonal transform unit 803 to generate a decoded image block signal DB.
• the addition unit 804 outputs the decoded image block signal DB to the image output unit 812, the inter processing unit 820, and the intra processing unit 821. Thereafter, the process proceeds to step S608.
  • Step S608 The image output unit 812 generates the output image signal R ′ by arranging the decoded image block signal DB output by the adding unit 804 at a corresponding position in the image. If the processes in steps S603 to S607 have not been completed for all the blocks in the frame, the block to be processed is changed and the process returns to step S602.
• When outputting the generated output image signal R′ to the outside of the image decoding device 800 (the display device 600), the image output unit 812 outputs it in units of, for example, 5 frames (an I picture (I0), a B picture (B3), a B picture (B2), a B picture (B4), and a P picture (P1)).
  • FIG. 16 is a flowchart for explaining the inter processing in step S604.
• Step S701: The deblocking filter unit 807 acquires the decoded image block signal DB from the addition unit 804, which is outside the inter processing unit 820, and performs the same FIR filter processing as performed at the time of encoding. The deblocking filter unit 807 outputs the corrected block signal to the frame memory 808. Thereafter, the process proceeds to step S702.
• Step S702: The frame memory 808 holds the corrected block signal output from the deblocking filter unit 807 as a part of the image, together with information that can identify the frame number. Thereafter, the process proceeds to step S703.
  • Step S703: The motion compensation unit 809 acquires the inter prediction encoding information IPE from the prediction scheme control unit 805, and extracts the corresponding prediction block signal from the frame memory 808.
  • the motion compensation unit 809 outputs the prediction image block signal extracted from the frame memory to the selection unit 806 as an inter prediction image block signal. Thereafter, the inter processing is terminated.
  • FIG. 17 is a flowchart illustrating the intra processing in step S605.
• Step S801: The depth map decoding unit 811 acquires the depth map encoded data divided into processing units from the depth map encoded data input unit 814, and decodes it back into a depth map, which has a larger amount of information, using, for example, variable-length decoding.
  • the depth map decoding unit 811 outputs the decoded depth map (depth block decoded signal) to the depth information use intra prediction unit 810. Thereafter, the process proceeds to step S802.
  • Step S802 The first prediction mode execution unit 900-1 to the n-th prediction mode execution unit 900-n generate a prediction image block signal using the decoded image block signal DB output from the addition unit 804.
• The prediction mode execution units that perform processing in sub-block units (specifically, the first prediction mode execution unit 900-1 and the second prediction mode execution unit 900-2) acquire, from the prediction mode selection unit 902, information indicating the prediction mode of each sub-block employed by the image coding apparatus 100, and generate a prediction image block signal.
  • First prediction mode execution unit 900-1 to n-th prediction mode execution unit 900-n output the generated first to n-th prediction image block signals to prediction mode selection unit 902.
• the depth use prediction mode execution unit 901 uses the decoded image block signal DB output from the addition unit 804, the depth block decoded signal output from the depth map decoding unit 811, and the information necessary for prediction output from the prediction mode selection unit 902 (specifically, information indicating the direction of prediction) to perform the same processing as the depth use prediction mode execution unit 201 in FIG. 3, and generates a depth use prediction image.
  • the depth use prediction mode execution unit 901 outputs the generated prediction image signal to the prediction mode selection unit 902. Thereafter, the process proceeds to step S803.
• Step S803: The prediction mode selection unit 902 extracts the information indicating the prediction mode employed by the image encoding device 100 from the intra prediction encoding information TPE input from the prediction scheme control unit 805, and outputs the prediction image block signal of the corresponding prediction mode to the selection unit 806 as the intra prediction image block signal. When the prediction mode operates in sub-block units, the prediction mode selection unit 902 further extracts the prediction mode of each sub-block and outputs that information to the corresponding prediction mode execution unit. When the prediction mode is the depth use prediction mode, the prediction mode selection unit 902 extracts the information regarding the prediction direction and outputs it to the depth use prediction mode execution unit 901. Then, the intra processing ends.
• The image encoding device 100 described above includes the depth input unit 118 and the depth map encoding unit 116, and the image decoding device 800 includes the depth map encoded data input unit 814 and the depth map decoding unit 811, but the configuration is not limited to this. The information regarding the depth map corresponding to the input image may instead be made available to the image decoding apparatus 800 by separate means. For example, the image encoding device 100 and the image decoding device 800 may receive the depth map via a communication line from an externally installed server device that stores depth maps in correspondence with video information, or obtain it offline. In that case, a video title indicating the video information can be searched for via the communication line, and when the video information is selected, the corresponding depth map can be received.
• The image encoding device 100 may also include a depth map generation unit that acquires an image of a viewpoint different from the input image and generates a depth map whose pixel values represent the parallax between the pixels included in the input image and the pixels included in the image of the different viewpoint. In that case, the depth map generation unit outputs the generated depth map to the depth input unit 118. Similarly, the image decoding apparatus 800 may generate a second output image of a viewpoint different from the output image, based on the output image and the depth map of the same frame as the output image, and output the second output image to the outside.
• In the present embodiment, the image encoding apparatus 100 takes in the input image signal in units of 5 frames, but it is not limited to this and may use any number of frames. Likewise, the image decoding apparatus 800 outputs the output image signal in units of 5 frames, but it is not limited to this and may output in units of any number of frames.
  • the image to be encoded is a moving image, but it may be a still image.
• When the image to be encoded is a multi-viewpoint image, the depth use prediction mode may be used only for viewpoint images that have a corresponding depth map, while the conventional prediction modes are used for viewpoint images without a corresponding depth map.
• As described above, the present embodiment provides two prediction modes that, when performing intra prediction, control and suppress continuous pixel prediction across boundaries in the depth map, which indicates the distance to the subject. Since only two prediction modes are added compared with the prior art, the accuracy of the predicted image can be improved while suppressing the increase in code amount caused by increasing the number of prediction modes. Because the residual between the predicted image and the input image is thereby reduced, highly efficient image encoding and decoding can be realized. If the depth use prediction mode is used in place of a conventional prediction mode, the number of prediction modes does not increase at all, so the increase in code amount can be suppressed even further.
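• Detecting the subject boundary from the depth map can be as simple as thresholding the depth difference between neighboring pixels; the threshold below is a hypothetical parameter, and the embodiment leaves the concrete rule to the subject boundary detection unit 302:

```python
import numpy as np

def detect_depth_boundaries(depth_block, threshold=8):
    """Return boolean maps marking, for each pixel, whether a subject
    boundary lies between it and its left / upper neighbor."""
    d = depth_block.astype(np.int64)
    left_boundary = np.zeros(d.shape, dtype=bool)
    up_boundary = np.zeros(d.shape, dtype=bool)
    left_boundary[:, 1:] = np.abs(d[:, 1:] - d[:, :-1]) > threshold
    up_boundary[1:, :] = np.abs(d[1:, :] - d[:-1, :]) > threshold
    return left_boundary, up_boundary
```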
• A part of the image coding apparatus 100 and the image decoding apparatus 800 in the above-described embodiment (for example, the subtraction unit 102, the orthogonal transform unit 103, the quantization unit 104, the entropy coding unit 105, the inverse quantization unit 106, the inverse orthogonal transform unit 107, the addition unit 108, the prediction scheme control unit 109, the selection unit 110, the deblocking filter unit 111, the motion compensation unit 113, the motion vector detection unit 114, the depth information use intra prediction unit 115, the depth map encoding unit 116, the depth map decoding unit 117, the entropy decoding unit 801, the inverse quantization unit 802, the inverse orthogonal transform unit 803, the addition unit 804, the prediction scheme control unit 805, the selection unit 806, the deblocking filter unit 807, the motion compensation unit 809, the depth information use intra prediction unit 810, and the depth map decoding unit 811) may be realized by a computer.
  • the program for realizing the control function may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into a computer system and executed.
  • the “computer system” here is a computer system built in the image encoding device 100 or the image decoding device 800, and includes an OS and hardware such as peripheral devices.
• The “computer-readable recording medium” refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, and a storage device such as a hard disk built into the computer system.
• Furthermore, the “computer-readable recording medium” may include a medium that dynamically holds the program for a short time, such as a communication line used when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as a volatile memory inside a computer system serving as a server or a client in that case.
• The program may be one for realizing a part of the functions described above, or one that realizes the functions described above in combination with a program already recorded in the computer system.
  • part or all of the image encoding device 100 and the image decoding device 800 in the above-described embodiment may be realized as an integrated circuit such as an LSI (Large Scale Integration).
  • Each functional block of the image encoding device 100 and the image decoding device 800 may be individually made into a processor, or a part or all of them may be integrated into a processor.
  • the method of circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor. Further, in the case where an integrated circuit technology that replaces LSI appears due to progress in semiconductor technology, an integrated circuit based on the technology may be used.
• One aspect of the present invention is an image encoding device that, when encoding an input image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the image encoding device comprising an intra-screen prediction unit that, when performing the intra prediction, suppresses the use of peripheral pixels that have a boundary of a subject represented by the input image between themselves and the processing target pixel.
• Another aspect of the present invention is the image encoding device described above, wherein the intra-screen prediction unit includes a boundary detection unit that detects the boundary of the subject using information indicating the distance to the subject of the input image.
• Another aspect of the present invention is the above-described image encoding device, wherein the intra-screen prediction unit includes a predicted image generation unit that, when there is no boundary of the subject between the processing target pixel and the pixel adjacent to it in a predetermined direction among the peripheral pixels, predicts the pixel value of the processing target pixel using the pixel adjacent in the predetermined direction, and that, when the boundary of the subject lies between the pixel adjacent in the predetermined direction and the processing target pixel, suppresses predicting the pixel value of the processing target pixel using the pixel adjacent in the predetermined direction.
• Another aspect of the present invention is the above-described image encoding device, wherein the intra-screen prediction unit includes a predicted image generation unit that determines the peripheral pixels used when predicting the pixel value of the processing target pixel at least based on the difference between the information indicating the distance to the subject represented by the peripheral pixels and the information indicating the distance to the subject represented by the processing target pixel.
• Another aspect of the present invention is the above-described image encoding device, wherein the intra-screen prediction unit includes a predicted image generation unit that determines the peripheral pixels used when predicting the pixel value of the processing target pixel at least based on the distance between the peripheral pixels and the processing target pixel.
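• The last two aspects determine the contributing peripheral pixels from the depth difference and from the spatial distance to the processing target pixel. One way to read this combination is a weighted prediction whose weights decay with both quantities; the exponential weighting and the alpha/beta parameters below are purely illustrative assumptions:

```python
import numpy as np

def weighted_intra_prediction(ref_values, ref_positions, ref_depths,
                              target_pos, target_depth,
                              alpha=0.1, beta=0.05):
    """Predict one pixel as a weighted mean of reference pixels, where each
    weight decays with the depth difference (alpha) and with the spatial
    distance to the target pixel (beta). alpha/beta are illustrative."""
    ty, tx = target_pos
    weights, values = [], []
    for v, (ry, rx), rd in zip(ref_values, ref_positions, ref_depths):
        depth_diff = abs(float(rd) - float(target_depth))
        spatial_dist = np.hypot(ty - ry, tx - rx)
        weights.append(np.exp(-alpha * depth_diff - beta * spatial_dist))
        values.append(float(v))
    weights = np.asarray(weights)
    return float(np.dot(weights, values) / weights.sum())
```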
• Another aspect of the present invention is an image decoding device that, when decoding an encoded image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the image decoding device comprising an intra-screen prediction unit that, when performing the intra prediction, suppresses the use of peripheral pixels that have a boundary of a subject represented by the encoded image between themselves and the processing target pixel.
• Another aspect of the present invention is the above-described image decoding device, wherein the intra-screen prediction unit includes a boundary detection unit that detects the boundary of the subject using information indicating the distance to the subject of the encoded image.
• Another aspect of the present invention is the above-described image decoding device, wherein the intra-screen prediction unit includes a predicted image generation unit that, when there is no boundary of the subject between the processing target pixel and the pixel adjacent to it in a predetermined direction among the peripheral pixels, predicts the pixel value of the processing target pixel using the pixel adjacent in the predetermined direction, and that, when the boundary of the subject lies between the pixel adjacent in the predetermined direction and the processing target pixel, suppresses predicting the pixel value of the processing target pixel using the pixel adjacent in the predetermined direction.
• Another aspect of the present invention is the above-described image decoding device, wherein the intra-screen prediction unit includes a predicted image generation unit that determines the peripheral pixels used when predicting the pixel value of the processing target pixel at least based on the difference between the information indicating the distance to the subject represented by the peripheral pixels and the information indicating the distance to the subject represented by the processing target pixel.
• Another aspect of the present invention is the above-described image decoding device, wherein the intra-screen prediction unit includes a predicted image generation unit that determines the peripheral pixels used when predicting the pixel value of the processing target pixel at least based on the distance between the peripheral pixels and the processing target pixel.
• Another aspect of the present invention is an image encoding method in which, when an input image is encoded, intra prediction is performed to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel, the method having a step of suppressing, when performing the intra prediction, the use of peripheral pixels that have a boundary of a subject represented by the input image between themselves and the processing target pixel.
• Another aspect of the present invention is an image decoding method in which, when an encoded image is decoded, intra prediction is performed to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel, the method having a step of suppressing, when performing the intra prediction, the use of peripheral pixels that have a boundary of a subject represented by the encoded image between themselves and the processing target pixel.
• Another aspect of the present invention is a program for causing a computer of an image encoding device, which performs intra prediction to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel when encoding an input image, to function as an intra-screen prediction unit that, when performing the intra prediction, suppresses the use of peripheral pixels having a boundary of a subject represented by the input image between themselves and the processing target pixel.
• Another aspect of the present invention is a program for causing a computer of an image decoding device, which performs intra prediction to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel when decoding an encoded image, to function as an intra-screen prediction unit that, when performing the intra prediction, suppresses the use of peripheral pixels having a boundary of a subject represented by the encoded image between themselves and the processing target pixel.
• DESCRIPTION OF SYMBOLS: 10 ... moving image transmission system; 100 ... image encoding device; 101 ... image input unit; 102 ... subtraction unit; 103 ... orthogonal transform unit; 104 ... quantization unit; 105 ... entropy encoding unit; 106 ... inverse quantization unit; 107 ... inverse orthogonal transform unit; 108 ... addition unit; 109 ... prediction scheme control unit; 110 ... selection unit; 111 ... deblocking filter unit; 112 ... frame memory; 113 ... motion compensation unit; 114 ... motion vector detection unit; 115 ... depth information use intra prediction unit; 116 ... depth map encoding unit; 117 ... depth map decoding unit; 118 ... depth input unit; 120 ... inter prediction unit; 121 ... intra prediction unit; 200-1 ... first prediction mode execution unit; 200-2 ... second prediction mode execution unit; 200-n ... n-th prediction mode execution unit; 201 ... depth use prediction mode execution unit; 202 ... prediction mode selection unit; 300 ... boundary control predicted image generation unit; 301 ... boundary prediction control unit; 302 ... subject boundary detection unit; 500 ... communication network; 600 ... display device; 800 ... image decoding device; 801 ... entropy decoding unit; 802 ... inverse quantization unit; 803 ... inverse orthogonal transform unit; 804 ... addition unit; 805 ... prediction scheme control unit; 806 ... selection unit; 807 ... deblocking filter unit; 808 ... frame memory; 809 ... motion compensation unit; 810 ... depth information use intra prediction unit; 811 ... depth map decoding unit; 812 ... image output unit; 813 ... encoded data input unit; 814 ... depth map encoded data input unit; 820 ... inter processing unit; 821 ... intra processing unit; 900-1 ... first prediction mode execution unit; 900-2 ... second prediction mode execution unit; 900-n ... n-th prediction mode execution unit; 901 ... depth use prediction mode execution unit; 902 ... prediction mode selection unit

Abstract

The present invention is an image encoding device which, when encoding an input image, uses pixel values of surrounding pixels in a periphery of a pixel to be processed in order to perform intra-screen prediction for predicting the pixel value of the pixel to be processed, and is characterized in being provided with an intra-screen prediction unit which determines the predictive value of each pixel on the basis of information indicating the distance to a subject for the pixel to be processed and information indicating the distance to the subject for the surrounding pixels.

Description

Image encoding device, image decoding device, image encoding method, image decoding method, and program
The present invention relates to an image encoding device, an image decoding device, an image encoding method, an image decoding method, and a program.

This application claims priority based on Japanese Patent Application No. 2011-117425 filed in Japan on May 25, 2011, the contents of which are incorporated herein by reference.
In recent years, with the spread of communication infrastructure such as broadband, of video acquisition devices such as mobile phones and video cameras that make it easy to shoot video, and of hard disk and BD recorders that can record television broadcasts onto an HDD (Hard Disk Drive) or a BD (Blu-ray Disc), opportunities for general consumers to handle large volumes of video are increasing. In addition, as the resolution of video display systems increases, the volume of video to be handled grows ever larger, so high-performance video compression technology is required.
Against this background, the international standard video compression standard H.264 (Non-patent Reference 1) was standardized in 2003 with the aim of improving image quality and coding efficiency. H.264 increases coding efficiency by dividing an image into a plurality of blocks (hereinafter, macroblocks), making full use of a plurality of prediction methods, and sequentially selecting, for each macroblock, the prediction method with the highest coding efficiency.
The encoding methods used in H.264 include an intra-picture predictive coding method (intra prediction coding method), which predicts and encodes the target block using the pixel information of blocks whose encoding has already been completed within the same picture, and an inter-picture predictive coding method (inter prediction coding method), which predicts and encodes the target block by referring to a picture different from the one being processed.
In the intra prediction coding method, in units of a macroblock (16 × 16 pixels) or of 4 × 4 pixel or 8 × 8 pixel sub-blocks obtained by further dividing the macroblock (the 8 × 8 pixel unit was defined in H.264 FRExt), the optimal prediction method is selected based on the code amount required to encode the difference image (residual component) between the predicted image generated according to a prescribed prediction mode and the original image to be encoded, and on the code amount required to encode the information identifying the prediction mode (Non-patent Document 1).
For a 16 × 16 pixel block, four prediction modes (FIG. 6, described later) are applicable: one prediction based on the DC component (mean value prediction) and three predictions using prediction directions (vertical prediction, horizontal prediction, and planar prediction). For a 4 × 4 pixel or 8 × 8 pixel block, nine prediction modes (FIG. 4, described later) are applicable: one prediction based on the DC component (mean value prediction) and eight predictions using prediction angles (non-uniform angles from 45° to 206.57°).
Regarding the encoding of the information identifying the prediction mode (for example, an index value indicating the mode), a prediction is made using the prediction modes of the blocks above and to the left of the processing target block. A 1-bit flag is prepared, and if the actual mode matches the prediction, the flag is set to indicate the match. If it does not match, the flag is left unset, and 3 bits of additional information are appended and encoded to identify which of the remaining eight prediction modes (excluding the non-matching predicted mode) is used. Thus, if the prediction is correct, only 1 bit of information is needed to encode the prediction mode, but if it is not, 4 bits are required.
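Under this scheme the mode index costs 1 bit when it matches the mode predicted from the upper and left neighbors, and 1 + 3 = 4 bits otherwise. A minimal sketch of the bit accounting (the derivation of the predicted mode as the minimum of the two neighbor indices follows H.264's rule, stated here as an assumption):

```python
def mode_signaling_bits(current_mode, upper_mode, left_mode):
    """Bits needed to signal a 4x4 intra prediction mode:
    1 flag bit if it matches the most probable mode, else 1 + 3 bits."""
    most_probable = min(upper_mode, left_mode)  # H.264-style derivation
    return 1 if current_mode == most_probable else 1 + 3

# Example: neighbors predict mode 2; signaling mode 2 costs 1 bit, mode 5 costs 4.
assert mode_signaling_bits(2, 2, 4) == 1
assert mode_signaling_bits(5, 2, 4) == 4
```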
For example, the invention described in Patent Document 1 increases the number of prediction modes compared with the H.264 scheme so that prediction can be performed at an arbitrary prediction angle, with the aim of improving the coding efficiency of the intra prediction coding scheme. It discloses a technique in which a theoretical reference pixel position is obtained from the prediction angle and the position of the pixel to be processed, and the pixel value corresponding to that pixel position is generated by interpolating the surrounding reference pixels. The invention described in Patent Document 2 discloses a method for accurately predicting the prediction mode based on the angles of the surrounding prediction modes.
Patent Document 1: JP 2009-284275 A. Patent Document 2: JP 2010-056701 A.
However, in the intra-picture prediction method described in Patent Document 1, the prediction angle must be made finer in order to improve the prediction accuracy. Making the prediction angle finer increases the number of prediction modes, which correspond one-to-one with the angles. As a result, the number of bits needed to represent the prediction mode increases, so there is a problem that the code amount at the time of encoding increases. Moreover, even if the intra-picture prediction method described in Patent Document 2 can predict the prediction mode more accurately than before, it cannot improve the accuracy of the predicted image itself.
The present invention has been made in view of such circumstances, and an object thereof is to provide an encoding device, a decoding device, an encoding method, a decoding method, and a program that improve the accuracy of a predicted image while suppressing an increase in the code amount.
(1) The present invention has been made to solve the above-described problems. One aspect of the present invention is an image encoding device that, when encoding an input image, performs intra-screen prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the image encoding device comprising an intra-screen prediction unit that determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
(2) Another aspect of the present invention is the above-described image encoding device, wherein the intra-screen prediction unit determines the predicted value for each pixel based on the distance between the processing target pixel and the peripheral pixels.
(3) Another aspect of the present invention is the above-described image encoding device, wherein the intra-screen prediction unit includes a subject boundary detection unit that detects the boundary of the subject using information indicating the distance to the subject of the input image, and, when the boundary is not detected between the processing target pixel and a pixel adjacent to it in a predetermined direction, predicts the predicted value of the processing target pixel using the pixel adjacent in the predetermined direction.
(4) Another aspect of the present invention is an image decoding device that, when decoding an encoded image, performs intra-screen prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the image decoding device comprising an intra-screen prediction unit that, when performing the intra-screen prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
(5) Another aspect of the present invention is the above-described image decoding device, wherein the intra-screen prediction unit determines the predicted value for each pixel based on the distance between the processing target pixel and the peripheral pixels.
(6) Another aspect of the present invention is the above-described image decoding device, wherein the intra-screen prediction unit includes a subject boundary detection unit that detects the boundary of the subject using information indicating the distance to the subject of the input image, and, when the boundary is not detected between the processing target pixel and a pixel adjacent to it in a predetermined direction, predicts the predicted value of the processing target pixel using the pixel adjacent in the predetermined direction.
(7) Another aspect of the present invention is an image encoding method in which, when an input image is encoded, intra-screen prediction is performed to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel, the method comprising a step in which an intra-screen prediction unit, when performing the intra-screen prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
(8) Another aspect of the present invention is an image decoding method in which, when an input image is decoded, intra-screen prediction is performed to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel, the method comprising a step in which an intra-screen prediction unit, when performing the intra-screen prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
(9) Another aspect of the present invention is a program for causing an image encoding device, which performs intra-screen prediction to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel when encoding an input image, to function as an intra-screen prediction unit that, when performing the intra-screen prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
(10) Another aspect of the present invention is a program for causing an image decoding device, which performs intra-screen prediction to predict the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel when decoding an input image, to function as an intra-screen prediction unit that, when performing the intra-screen prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
According to the present invention, the accuracy of the predicted image in intra-screen predictive coding can be improved while suppressing an increase in the code amount.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram showing the configuration of an image transmission system according to an embodiment of the present invention.
FIG. 2 is a schematic block diagram showing the configuration of the image encoding device in the embodiment.
FIG. 3 is a schematic block diagram showing the configuration of the depth information use intra prediction unit in the embodiment.
FIG. 4 is a diagram showing the prediction modes of intra-screen prediction in 4 × 4 pixel sub-block units in the embodiment.
FIG. 5 is a diagram showing the encoding order of intra-screen prediction in 4 × 4 pixel sub-block units in the embodiment.
FIG. 6 is a diagram showing the prediction modes of intra-screen prediction in 16 × 16 pixel units in the embodiment.
FIG. 7 is a diagram (vertical direction) explaining the processing concept of the depth use prediction mode execution unit in the embodiment.
FIG. 8 is a diagram (horizontal direction) explaining the processing concept of the depth use prediction mode execution unit in the embodiment.
FIG. 9 is a schematic block diagram showing the configuration of the depth use prediction mode execution unit in the embodiment.
FIG. 10 is a flowchart showing the image encoding process performed by the image encoding device in the embodiment.
FIG. 11 is a flowchart showing the inter prediction process executed by the image encoding device in the embodiment.
FIG. 12 is a flowchart showing the intra prediction process executed by the image encoding device in the embodiment.
FIG. 13 is a schematic block diagram showing the configuration of the image decoding device in the embodiment.
FIG. 14 is a schematic block diagram showing the configuration of the depth information use intra prediction unit in the embodiment.
FIG. 15 is a flowchart showing the image decoding process executed by the image decoding device in the embodiment.
FIG. 16 is a flowchart showing the inter processing executed by the image decoding device in the embodiment.
FIG. 17 is a flowchart showing the intra processing executed by the image decoding device in the embodiment.
 Embodiments of the present invention will now be described with reference to the drawings. FIG. 1 is a schematic block diagram showing the configuration of a moving image transmission system according to an embodiment of the present invention. As shown in FIG. 1, the moving image transmission system 10 of this embodiment includes an image encoding device 100, a communication network 500, an image decoding device 800, and a display device 600. From the image signal R of the image to be encoded and the depth map signal D of the depth map corresponding to that image, the image encoding device 100 encodes the image and the depth map, and generates and outputs encoded data E representing them. The communication network 500 transmits the encoded data E output by the image encoding device 100 to the image decoding device 800. The image decoding device 800 decodes the transmitted encoded data E and generates an image signal R' of the decoded image. The display device 600 has an image display device such as a liquid crystal display or a plasma display, and displays the image indicated by the image signal R' generated by the image decoding device 800.
 The image encoding device 100 is provided, for example, in a television broadcasting station and encodes broadcast programs. In this case, the communication network 500 is a network that transmits using broadcast waves, and the image decoding device 800 and the display device 600 are provided in a television receiver. Alternatively, the Internet, a mobile telephone network, or the like may be used as the communication network 500. The image encoding device 100 may also be provided in a content holder that edits content stored and sold on a DVD (Digital Versatile Disc) or a BD (Blu-ray Disc), and encodes such content. In this case, the encoded data E is stored on a DVD, a BD, or the like, and is delivered by a distribution network instead of the communication network 500, and the image decoding device 800 is provided in a DVD player, a BD player, or the like.
 FIG. 2 is a schematic block diagram showing the configuration of the image encoding device 100 according to this embodiment. The image encoding device 100 includes an image input unit 101, a subtraction unit 102, an orthogonal transform unit 103, a quantization unit 104, an entropy coding unit 105, an inverse quantization unit 106, an inverse orthogonal transform unit 107, an addition unit 108, a prediction scheme control unit 109, a selection unit 110, a deblocking filter unit 111, a frame memory 112, a motion compensation unit 113, a motion vector detection unit 114, a depth information use intra prediction unit 115, a depth map encoding unit 116, a depth map decoding unit 117, and a depth input unit 118. The deblocking filter unit 111, the frame memory 112, the motion compensation unit 113, and the motion vector detection unit 114 constitute an inter prediction unit 120. The depth information use intra prediction unit 115 and the depth map decoding unit 117 constitute an intra prediction unit 121.
 The image input unit 101 acquires, from outside the image encoding device 100, an image signal R (input image signal) representing the image to be encoded (input image), as one example in units of five frames (the types of the five frames will be described later). The image input unit 101 divides the input image frame represented by the acquired input image signal into blocks of a predetermined size (for example, 16 pixels vertically × 16 pixels horizontally). The image input unit 101 outputs an image block signal B representing each of the divided blocks to the subtraction unit 102, the motion vector detection unit 114, and the depth information use intra prediction unit 115. While sequentially changing the block position, the image input unit 101 repeats this process for each image frame until output of all the blocks in the image frame is complete and the acquired images are exhausted.
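 As an illustration only (this code does not appear in the patent, and the function name is hypothetical), the block division described above can be sketched in Python as follows, assuming the frame is a NumPy array whose dimensions are multiples of the block size:

import numpy as np

def split_into_blocks(frame: np.ndarray, block: int = 16):
    """Yield (top, left, block_pixels) for each 16x16 block in raster order."""
    h, w = frame.shape[:2]
    for top in range(0, h - block + 1, block):
        for left in range(0, w - block + 1, block):
            yield top, left, frame[top:top + block, left:left + block]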
 In this embodiment, the input image to the image encoding device 100 includes at least a base image (base view). The base image is an image of one predetermined viewpoint included in a multi-view moving image for stereoscopic display, and serves as the basis for calculating the depth map. The depth map is distance information representing the depth or distance from the imaging device of the subject shown in the base image, and consists of a quantized value given for each pixel of the base image. Each of these quantized values is called a depth value and is, for example, a value quantized with 8 bits.
 The image signal R input to the image input unit 101 in units of five frames includes, for example, the image signals of an I picture (I0), a B picture (B3), a B picture (B2), a B picture (B4), and a P picture (P1). The image signal R input to the image encoding device 100 is input, for example, in this order (hereinafter, the input order). In a symbol such as I0, the leading letter (I, etc.) indicates the picture type, and the digit (0, etc.) indicates the order in which the picture is encoded (hereinafter, the coding order); the input order and the coding order therefore differ. An I picture is an intra-frame picture, which can be decoded using only the code obtained by encoding it. A P picture is an inter-frame forward predictive picture, which can be decoded using the code obtained by encoding it together with the code obtained by encoding the image signal of a past frame. A B picture is a bi-directional predictive picture, which can be decoded using the code obtained by encoding it together with the codes obtained by encoding the image signals of a plurality of past or future frames.
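 Purely as a worked illustration of the naming convention above (the snippet and its names are not part of the disclosure), the coding order can be recovered from the five-frame input order by sorting on the digit:

display_order = ["I0", "B3", "B2", "B4", "P1"]            # input (display) order
coding_order = sorted(display_order, key=lambda s: int(s[1:]))
print(coding_order)                                        # ['I0', 'P1', 'B2', 'B3', 'B4']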
 The subtraction unit 102 subtracts the predicted image block signal output by the selection unit 110 from the image block signal output by the image input unit 101 to generate a difference image block signal, and outputs the generated difference image block signal to the orthogonal transform unit 103.
 The orthogonal transform unit 103 orthogonally transforms the difference image block signal output by the subtraction unit 102 to generate a signal indicating the strengths of various frequency components.
 When orthogonally transforming the difference image block signal, the orthogonal transform unit 103 applies, for example, a DCT (Discrete Cosine Transform) to the difference image block signal to generate a frequency domain signal (DCT coefficients, when the DCT is used). As long as it can generate a frequency domain signal from the difference image block signal, the orthogonal transform unit 103 is not limited to the DCT and may use another method (for example, the FFT (Fast Fourier Transform)). The orthogonal transform unit 103 outputs the coefficient values included in the generated frequency domain signal to the quantization unit 104.
 The quantization unit 104 quantizes the coefficient values indicating the strength of each frequency component output by the orthogonal transform unit 103, and outputs the generated quantized signal ED (difference image block code) to the entropy coding unit 105 and the inverse quantization unit 106.
 The inverse quantization unit 106 inversely quantizes the quantized signal ED output by the quantization unit 104 to generate a decoded frequency domain signal, and outputs it to the inverse orthogonal transform unit 107.
 The inverse orthogonal transform unit 107 applies, for example, an inverse DCT to the input decoded frequency domain signal to generate a decoded difference image block signal, which is a spatial domain signal. As long as it can generate a spatial domain signal from the decoded frequency domain signal, the inverse orthogonal transform unit 107 is not limited to the inverse DCT and may use another method (for example, the IFFT (Inverse Fast Fourier Transform)).
 The inverse orthogonal transform unit 107 outputs the generated decoded difference image block signal to the addition unit 108.
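 A minimal sketch of the transform/quantization round trip through units 103, 104, 106, and 107, assuming an orthonormal 2-D DCT from SciPy and a flat quantization step q (the flat quantizer is an assumption for illustration, not the quantizer of any particular standard):

import numpy as np
from scipy.fft import dctn, idctn

def encode_decode_residual(diff_block: np.ndarray, q: float = 8.0) -> np.ndarray:
    coeffs = dctn(diff_block, norm="ortho")    # orthogonal transform unit 103 (DCT)
    quantized = np.round(coeffs / q)           # quantization unit 104
    dequantized = quantized * q                # inverse quantization unit 106
    return idctn(dequantized, norm="ortho")    # inverse orthogonal transform unit 107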
 The addition unit 108 acquires the predicted image block signal from the selection unit 110 and the decoded difference image block signal from the inverse orthogonal transform unit 107. The addition unit 108 adds the decoded difference image block signal to the predicted image block signal to generate a reference image block signal RB, i.e., the input image after encoding and decoding (internal decoding). The reference image block signal RB is output to the inter prediction unit 120 and the intra prediction unit 121.
 The inter prediction unit 120 acquires the reference image block signal RB from the addition unit 108 and the image block signal from the image input unit 101. The inter prediction unit 120 performs inter prediction using these signals and generates an inter predicted image block signal. The inter prediction unit 120 outputs the generated inter predicted image block signal to the prediction scheme control unit 109 and the selection unit 110, and at the same time outputs the generated inter prediction coding information IPE to the prediction scheme control unit 109. The inter prediction unit 120 will be described later.
 The intra prediction unit 121 acquires the reference image block signal RB from the addition unit 108, the image block signal from the image input unit 101, and the depth map encoded data from the depth map encoding unit 116. The intra prediction unit 121 performs intra prediction using these signals and data and generates an intra predicted image block signal. The intra prediction unit 121 outputs the generated intra predicted image block signal to the prediction scheme control unit 109 and the selection unit 110, and at the same time outputs the generated intra prediction coding information TPE to the prediction scheme control unit 109. The intra prediction unit 121 will be described later.
 The depth input unit 118 acquires, from outside the image encoding device 100, the depth map signal D of the depth map corresponding to the input image input to the image input unit 101. The depth input unit 118 divides the acquired depth map (into depth block signals) so that the blocks are at the same positions and of the same block size as the input image blocks divided by the image input unit 101, and outputs them to the depth map encoding unit 116. The depth map encoding unit 116 encodes the depth block signals output by the depth input unit 118 using, for example, variable-length coding (entropy coding), and generates depth map encoded data E2 with a compressed data amount. The depth map encoding unit 116 outputs the generated depth map encoded data E2 to the intra prediction unit 121 and to the outside of the image encoding device 100 (for example, to the image decoding device 800 via the communication network 500).
 Next, the inter prediction unit 120 will be described. The inter prediction unit 120 comprises the deblocking filter unit 111, the frame memory 112, the motion compensation unit 113, and the motion vector detection unit 114.
 The deblocking filter unit 111 acquires the reference image block signal RB from the addition unit 108 and applies, in order to reduce the block distortion that arises when an image is encoded, for example the FIR (Finite Impulse Response) filtering used in a known coding method (for example, H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008). The deblocking filter unit 111 outputs the processing result (corrected block signal) to the frame memory 112.
 The frame memory 112 holds the corrected block signal output by the deblocking filter unit 111, together with information identifying the frame number, as part of the image of that frame number.
 The motion vector detection unit 114 searches the images stored in the frame memory 112 for a block similar to the image block signal input from the image input unit 101 (block matching), and generates vector information (a motion vector) indicating the found block. When performing block matching, the motion vector detection unit 114 calculates, for each candidate region, an index value between that region and the divided block, and finds the region for which the calculated index value is minimum. When the input image signal is a B picture, the motion vector detection unit 114 finds two regions: the block in the reference image region with the smallest index value, and the block in the reference image region with the next smallest index value.
 The index value may be anything that indicates the correlation or similarity between image signals. The motion vector detection unit 114 uses, for example, the sum of absolute differences (SAD) between the luminance values of the pixels in the divided block and the luminance values in a region of the reference image. The SAD between a block divided from the input image signal (for example, of size N×N pixels) and a block of the reference image signal is expressed by the following equation (1).
\mathrm{SAD}(p,q) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| I_{\mathrm{in}}(i_0+i,\ j_0+j) - I_{\mathrm{ref}}(i_0+i+p,\ j_0+j+q) \right| \quad \cdots (1)
 In equation (1), Iin(i0+i, j0+j) is the luminance value at coordinates (i0+i, j0+j) of the input image, and (i0, j0) is the pixel coordinate of the upper-left corner of the divided block. Iref(i0+i+p, j0+j+q) is the luminance value at coordinates (i0+i+p, j0+j+q) of the reference image, and (p, q) is the shift amount (motion vector) relative to the coordinates of the upper-left corner of the divided block.
 That is, in block matching, the motion vector detection unit 114 calculates SAD(p, q) for each (p, q) and finds the (p, q) that minimizes SAD(p, q). (p, q) represents the vector (motion vector) from the divided block in the input image to the position of the reference region in the reference image.
 The motion compensation unit 113 acquires the motion vector from the motion vector detection unit 114 and outputs the corresponding reference block to the prediction scheme control unit 109 and the selection unit 110 as the inter predicted image block signal. When the motion vector detection unit 114 outputs one motion vector, the motion compensation unit 113 outputs the corresponding image block; when the motion vector detection unit 114 outputs two motion vectors, the motion compensation unit 113 averages the two corresponding image blocks and outputs the result. The motion compensation unit 113 outputs the information necessary for prediction (hereinafter, inter prediction coding information IPE), for example the motion vector, to the prediction scheme control unit 109.
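 The full search implied by equation (1) might be sketched as follows (the ±8-pixel search range and the function name are illustrative assumptions, not taken from the patent):

import numpy as np

def find_motion_vector(cur: np.ndarray, ref: np.ndarray,
                       i0: int, j0: int, n: int = 16, search: int = 8):
    """Return the (p, q) minimizing SAD(p, q) between the current block and ref."""
    block = cur[i0:i0 + n, j0:j0 + n].astype(np.int64)
    best, best_pq = None, (0, 0)
    for p in range(-search, search + 1):
        for q in range(-search, search + 1):
            top, left = i0 + p, j0 + q
            if top < 0 or left < 0 or top + n > ref.shape[0] or left + n > ref.shape[1]:
                continue  # candidate block would fall outside the reference frame
            sad = np.abs(block - ref[top:top + n, left:left + n].astype(np.int64)).sum()
            if best is None or sad < best:
                best, best_pq = sad, (p, q)
    return best_pq, best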
 Next, the intra prediction unit 121 will be described. The intra prediction unit 121 comprises the depth map decoding unit 117 and the depth information use intra prediction unit 115.
 The depth map decoding unit 117 decodes the depth map encoded data output by the depth map encoding unit 116, using for example variable-length decoding, back into depth block signals carrying the larger amount of information. The depth map decoding unit 117 outputs the decoded depth map D' (depth block decoded signal) to the depth information use intra prediction unit 115.
 FIG. 3 is a schematic block diagram showing the configuration of the depth information use intra prediction unit 115 according to this embodiment. The processing of the depth information use intra prediction unit 115 will be described with reference to FIG. 3. Specifically, the depth information use intra prediction unit 115 includes a first prediction mode execution unit 200-1 through an n-th prediction mode execution unit 200-n (n is a natural number of 1 or more, for example 6), a depth use prediction mode execution unit 201, and a prediction mode selection unit 202.
 The first prediction mode execution unit 200-1 through the n-th prediction mode execution unit 200-n generate first through n-th predicted image block signals, respectively, from the reference image block signal RB output by the addition unit 108, following the processing of their respective prediction modes (methods of generating a predicted image block). The first prediction mode execution unit 200-1 through the n-th prediction mode execution unit 200-n output the generated first through n-th predicted image block signals to the prediction mode selection unit 202.
 Each of the first prediction mode execution unit 200-1 through the n-th prediction mode execution unit 200-n performs intra-frame prediction (intra prediction) using, for example, one of the conventional intra-frame prediction modes (for example, H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008). In H.264, there are nine kinds of intra-frame prediction applied to the 4×4-pixel sub-blocks obtained by further dividing a macroblock, and four kinds of intra-frame prediction methods applied in macroblock units (intra-frame prediction using 8×8-pixel sub-blocks is specified in H.264 FRExt, and the same intra-frame prediction methods as for 4×4 pixels apply to it).
 Specifically, the first prediction mode execution unit 200-1 performs intra prediction (intra-frame prediction) using, for example, 4×4 sub-blocks. The second prediction mode execution unit 200-2 performs intra prediction using, for example, 8×8 sub-blocks. The third prediction mode execution unit 200-3 through the sixth prediction mode execution unit 200-6 perform, for example, the four kinds of prediction methods in units of 16×16 macroblocks.
 The first prediction mode execution unit 200-1 further divides the reference image block signal output by the addition unit 108 into 4×4 sub-block sizes and executes the 4×4-pixel prediction methods in the order shown in FIG. 5. That is, the 16×16-pixel block is divided into four 8×8-pixel blocks, which are processed in the order upper-left, upper-right, lower-left, lower-right. Each of these 8×8-pixel blocks is in turn divided into four 4×4-pixel sub-blocks, and within each 8×8-pixel block intra prediction is performed in the order upper-left, upper-right, lower-left, lower-right.
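 For illustration, the sub-block visiting order just described can be generated as follows (a sketch, with offsets given as (row, column) within the 16×16 block; the function name is hypothetical):

def subblock_order():
    """Return the 16 (row, col) offsets of 4x4 sub-blocks in coding order."""
    order = []
    for r8 in (0, 8):            # 8x8 blocks: upper row first
        for c8 in (0, 8):        # left before right
            for r4 in (0, 4):    # 4x4 sub-blocks inside each 8x8 block
                for c4 in (0, 4):
                    order.append((r8 + r4, c8 + c4))
    return order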
 As shown in FIG. 4, there are nine prediction methods for a 4×4-pixel sub-block, from prediction mode 0 to prediction mode 8. The first prediction mode execution unit 200-1 calculates an index indicating the degree of correlation between the 4×4-pixel predicted image block generated by each of the nine prediction methods and the corresponding sub-block of the image block signal B output by the image input unit 101, and selects a prediction method for each sub-block based on that index. As the index, the first prediction mode execution unit 200-1 calculates, for example, the sum of absolute differences (SAD) of luminance values, selects the prediction method with the smallest SAD value as the prediction method for the 4×4-pixel sub-block in question, generates the first predicted image block signal at the corresponding position, and retains the selected prediction method.
 The first prediction mode execution unit 200-1 repeats the above processing until the prediction methods and the first predicted image block signal for all 16×16 pixels have been generated.
 The second prediction mode execution unit 200-2 further divides the reference image block signal RB output by the addition unit 108 into four 8×8-pixel sub-blocks, applies to each of them the same nine prediction methods, prediction mode 0 through prediction mode 8, used by the first prediction mode execution unit 200-1, and generates predicted images, retaining the selected prediction methods at the same time.
 The second prediction mode execution unit 200-2 repeats the above processing, sequentially fixing the prediction method for each 8×8-pixel sub-block, and generates all the prediction methods for the 16×16-pixel block and the predicted image block signal based on those prediction methods.
 The third prediction mode execution unit 200-3 through the sixth prediction mode execution unit 200-6 perform intra prediction (intra-frame prediction) in units of 16×16 pixels, using the reference image block signal output by the addition unit 108 to generate the predicted image block signals corresponding to prediction mode 0 through prediction mode 3 of FIG. 6, respectively.
 The depth use prediction mode execution unit 201 acquires the reference image block signal from the addition unit 108 and the depth block decoded signal from the depth map decoding unit 117, and uses the depth map to perform intra-frame prediction in which prediction across subject boundaries is suppressed. Details of the depth use prediction mode execution unit 201 will be described later. The depth use prediction mode execution unit 201 outputs the predicted image block signal and the prediction method to the prediction mode selection unit 202.
 The prediction mode selection unit 202 acquires the predicted image block signals generated by the first prediction mode execution unit 200-1 through the n-th prediction mode execution unit 200-n and by the depth use prediction mode execution unit 201, together with the information necessary for prediction. The information necessary for prediction is, for example, the information indicating the prediction mode applied to each sub-block by the first prediction mode execution unit 200-1 and the second prediction mode execution unit 200-2, which process the 16×16 pixels further divided into sub-blocks, and the information indicating the prediction mode, i.e., the prediction direction, of the depth use prediction mode execution unit 201.
 The prediction mode selection unit 202 selects, from among the acquired predicted image block signals (including the predicted image block signal output by the depth use prediction mode execution unit 201), the one predicted image block signal with the smallest index value. As the index value, the prediction mode selection unit 202 uses, for example, the SAD between the luminance values Iin(i0+i, j0+j) of the corresponding image block included in the input image from the image input unit 101 and the luminance values Ip,m(i0+i, j0+j) of a candidate predicted image block, as shown in the following equation (2).
\mathrm{SAD}_m = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| I_{\mathrm{in}}(i_0+i,\ j_0+j) - I_{p,m}(i_0+i,\ j_0+j) \right| \quad \cdots (2)
 In equation (2), m is an index indicating which prediction mode of which prediction mode execution unit is used; Ip,m(x, y) is therefore the luminance value at coordinates (x, y) of the predicted image under prediction mode m. Also, i0 and j0 are the coordinates of the upper-left vertex of the block, and N is the block size (the number of pixels on one side). In this embodiment, besides the SAD, any variable that expresses the effectiveness of the processing of each prediction mode can be used as the index value, such as the correlation or similarity between the image block included in the input image and the candidate predicted image block, or the amount of information after encoding.
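 A minimal sketch of the selection of equation (2): given the input block and one candidate predicted block per mode, the mode with the smallest SAD is chosen (all names are illustrative, not from the patent):

import numpy as np

def select_prediction_mode(input_block: np.ndarray, candidates: dict):
    """candidates maps a mode index m to its predicted block I_{p,m}."""
    sads = {m: np.abs(input_block.astype(np.int64) - pred.astype(np.int64)).sum()
            for m, pred in candidates.items()}
    best_mode = min(sads, key=sads.get)
    return best_mode, sads[best_mode]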
 The prediction mode selection unit 202 generates prediction mode information including an index representing this prediction mode. Alternatively, when the prediction mode selection unit 202 selects a prediction mode for which information necessary for prediction exists (specifically, the prediction modes of the first prediction mode execution unit 200-1, the second prediction mode execution unit 200-2, and the depth use prediction mode execution unit 201), it combines the index and the information necessary for that prediction to generate the prediction mode information.
 The prediction mode selection unit 202 outputs the selected predicted image block signal (hereinafter, intra predicted image block signal) to the selection unit 110 and the prediction scheme control unit 109, and outputs the prediction mode information (hereinafter, intra prediction coding information TPE) to the prediction scheme control unit 109.
 Returning to FIG. 2, the prediction scheme control unit 109 decides the prediction scheme based on the picture type of the input image, the inter predicted image block signal and its inter prediction coding information IPE input from the inter prediction unit 120, and the intra predicted image block signal and its intra coding information input from the intra prediction unit 121, and outputs information on that prediction scheme to the selection unit 110 and the entropy coding unit 105. The prediction scheme control unit 109 monitors the picture type of the input image and selects the intra prediction scheme when the input image is an I picture. For a P picture or a B picture, the prediction scheme control unit 109 calculates a Lagrangian cost from the number of bits generated by the coding performed by the entropy coding unit 105 and the residual between the prediction and the original image in the subtraction unit 102, using for example a conventional method (for example, H.264 Reference Software JM ver. 13.2 Encoder, http://iphome.hhi.de/suehring/tml/, 2008), and selects either the inter prediction scheme or the intra prediction scheme.
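 The Lagrangian decision mentioned above has the familiar form J = D + λR; the following sketch assumes the distortion and rate values are supplied by the caller, and is not the reference-software implementation:

def lagrange_cost(distortion: float, bits: int, lam: float) -> float:
    return distortion + lam * bits

def choose_inter_or_intra(d_inter, r_inter, d_intra, r_intra, lam):
    j_inter = lagrange_cost(d_inter, r_inter, lam)
    j_intra = lagrange_cost(d_intra, r_intra, lam)
    return "inter" if j_inter <= j_intra else "intra"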
 The prediction scheme control unit 109 adds, to whichever of the inter prediction coding information IPE or the intra prediction coding information TPE corresponds to the selected prediction scheme, information that identifies the prediction scheme, and outputs the result to the entropy coding unit 105 as prediction coding information.
 The selection unit 110 selects, in accordance with the prediction scheme information input from the prediction scheme control unit 109, either the inter predicted image block signal input from the inter prediction unit 120 or the intra predicted image block signal input from the intra prediction unit 121, and outputs the predicted image block signal to the subtraction unit 102 and the addition unit 108. When the prediction scheme input from the prediction scheme control unit 109 is inter prediction, the selection unit 110 selects and outputs the inter predicted image block signal input from the inter prediction unit 120; when the prediction scheme input from the prediction scheme control unit 109 is intra prediction, it selects and outputs the intra predicted image block signal input from the intra prediction unit 121.
 The entropy coding unit 105 packs the difference image code input from the quantization unit 104 and the prediction coding information input from the prediction scheme control unit 109, encodes them using for example variable-length coding (entropy coding), and generates encoded data E1 with a compressed amount of information. The entropy coding unit 105 outputs the generated encoded data E1 to the outside of the image encoding device 100 (for example, to the image decoding device 800 via the communication network 500).
 <Depth Use Prediction Mode Execution Unit: Processing Overview>
 Next, the method by which the depth use prediction mode execution unit 201 generates a predicted image block will be described.
 As described above, intra-frame prediction predicts the pixels of the processing target block using surrounding pixels. Specifically, intra-frame prediction creates the predicted image block signal by sequentially copying already-processed neighboring pixels in the prediction direction.
 Accordingly, in a flat region where the texture characteristics do not change much, the pixels of the processing target block can be predicted accurately by this intra-frame prediction, and the difference (residual) between the pixels of the processing target block and the pixels of the predicted block can be made small; as a result, the code amount can be reduced (or the error at decoding can be reduced).
 On the other hand, different subjects generally have greatly differing texture characteristics. Because intra-frame prediction is applied uniformly even in regions that straddle different subjects, however, the prediction accuracy drops there and the code amount increases.
 Here, different subjects usually have different depth values, except when they are in contact with each other at the same distance. In other words, by using the difference in depth values (for example, by detecting edges in the depth values), it is possible to separate different subjects (to detect the boundaries between different subjects).
 By using this information to control the pixel prediction of the intra-frame prediction described above, the prediction accuracy can be improved. Specifically, this is done as follows.
 The prediction directions of the depth-based intra-frame prediction performed by the depth use prediction mode execution unit 201 in this embodiment are the vertical prediction (prediction mode 0) and the horizontal prediction (prediction mode 1) shown in FIG. 6. However, where the processing described below can be applied (excluding prediction mode 2 of FIG. 6), other prediction directions can also be used. The processing described below can likewise be applied to the sub-block-based prediction methods of FIG. 4 (excluding prediction mode 2). That is, as in this embodiment, the depth-based mode may be added as a new prediction mode while keeping the conventional prediction modes, or the prediction method of the depth use prediction mode execution unit 201 may be used in place of a conventional method so that the number of modes does not increase. The following describes the example in which the depth use prediction mode is newly added.
 FIGS. 7 and 8 are diagrams for explaining the processing concept of the depth use prediction mode execution unit 201. In FIGS. 7 and 8, the figures drawn as circles indicate pixels for which processing is complete, which can be referenced when generating the predicted pixel block. The figures drawn as squares indicate processing target pixels, which are to be predicted using the referenceable surrounding pixels. The arrows indicate the prediction direction: the referenceable pixels are predicted sequentially (specifically, simply copied) in the direction of the arrows. That is, in the prediction mode of FIG. 7 the pixel values are copied in the vertical direction, and in the prediction mode of FIG. 8 they are copied in the horizontal direction. In FIGS. 7 and 8, the thick broken line indicates a subject boundary.
 FIG. 9 is a schematic block diagram showing the configuration of the depth use prediction mode execution unit 201 according to this embodiment. As shown in FIG. 9, the depth use prediction mode execution unit 201 includes a boundary control predicted image generation unit 300, a boundary prediction control unit 301, and a subject boundary detection unit 302.
 The subject boundary detection unit 302 acquires, from the depth map decoding unit 117, the depth block signal representing the depth values of the pixels corresponding to the processing target image block signal B, and detects depth edges. Depth edges are detected by thresholding the differences between adjacent pixels of the depth map. Whether a depth edge exists in the horizontal direction is determined, as shown in equation (3), by whether the absolute value of the difference between vertically adjacent pixels is larger than a threshold TV.
 Similarly, whether a depth edge exists in the vertical direction is determined, as shown in equation (4), by whether the absolute value of the difference between horizontally adjacent pixels is larger than a threshold TH.
\left| D(i,j) - D(i,j-1) \right| > T_V \quad \cdots (3)
\left| D(i,j) - D(i-1,j) \right| > T_H \quad \cdots (4)
 Here, D(i, j) represents the depth map value at pixel position (i, j); in equations (3) and (4), (i, j−1) denotes the pixel vertically adjacent to (i, j), and (i−1, j) the pixel horizontally adjacent to it. TV and TH are the thresholds used to determine whether an edge exists in the horizontal direction and the vertical direction, respectively. Specifically, the thresholds are, for example, 10.
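 A minimal sketch of the threshold tests of equations (3) and (4) on a depth block, with the example threshold of 10 from the text (the array orientation, with rows as the vertical axis, is an assumption for illustration):

import numpy as np

def depth_edges(depth: np.ndarray, tv: int = 10, th: int = 10):
    d = depth.astype(np.int64)
    # horizontal edge: vertically adjacent depth values differ strongly (eq. 3)
    horizontal = np.abs(d[1:, :] - d[:-1, :]) > tv
    # vertical edge: horizontally adjacent depth values differ strongly (eq. 4)
    vertical = np.abs(d[:, 1:] - d[:, :-1]) > th
    return horizontal, vertical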
 As an example of the depth edge detection result obtained by the above method, the case in which a depth edge is detected as shown by the thick broken lines in FIGS. 7 and 8 will be described. The position of the depth edge is the same in the vertical prediction mode of FIG. 7 and the horizontal prediction mode of FIG. 8. In the examples of FIGS. 7 and 8, the depth edge runs so as to divide the prediction target block into left and right parts. In this case, it is highly likely that different subjects appear on the left and right sides of the processing target block. With conventional vertical or horizontal prediction, the prediction accuracy drops markedly at the pixels where this edge is crossed and at the pixels beyond it in the prediction direction.
 The boundary prediction control unit 301 controls the prediction performed by the boundary control predicted image generation unit 300, using the horizontal and vertical subject boundary information (depth edges) input from the subject boundary detection unit 302. Specifically, when a depth edge perpendicular to the prediction direction exists, the boundary prediction control unit 301 performs control that suppresses copying from the pixel adjacent in the prediction direction. Control that suppresses copying pixels in the prediction direction means, for example, controlling the processing in the boundary control predicted image generation unit 300 as follows.
 The boundary control predicted image generation unit 300 acquires the reference image block signal RB from the addition unit 108 and generates the predicted image block signal as follows. In this embodiment, the prediction modes of the boundary control predicted image generation unit 300 are a prediction mode whose prediction direction is vertical, as shown in FIG. 7, and a prediction mode whose prediction direction is horizontal, as shown in FIG. 8 (two kinds of predicted image block signals are generated). When no subject boundary exists between the processing target pixel and the pixel immediately preceding it in the prediction direction, that is, when no depth edge runs perpendicular to the prediction direction there, the boundary prediction control unit 301 causes the boundary control predicted image generation unit 300 to process in the same way as the conventional prediction method. That is, the boundary prediction control unit 301 controls the boundary control predicted image generation unit 300 so as to copy the value of the pixel immediately preceding the processing target pixel in the prediction direction.
 For example, in FIG. 7, when the processing target pixel is Qv1, no depth edge exists between it and Pv1, the immediately preceding pixel in the prediction direction. The boundary prediction control unit 301 therefore controls the boundary control predicted image generation unit 300 so as to copy the pixel value of pixel Pv1 as the pixel value of pixel Qv1. The same applies in the horizontal direction (for example, the processing from pixel Ph2 to pixel Qh2 in FIG. 8). On the other hand, when an edge exists perpendicular to the prediction direction, the boundary prediction control unit 301 controls the boundary control predicted image generation unit 300 to carry out the following processing.
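 The boundary-controlled copy for the vertical mode might be sketched as follows (an illustration, not the patent's reference code; the fallback callable stands for the equation (5) selection described below, as sketched after the explanation of the evaluation terms):

import numpy as np

def vertical_mode_with_boundary_control(ref_row, horiz_edge, fallback):
    """Return a 16x16 predicted block for the vertical mode.

    ref_row    : the 16 reconstructed pixels immediately above the block
    horiz_edge : 16x16 bool array, True where a horizontal depth edge lies
                 between a pixel and the pixel above it
    fallback   : callable (r, c) -> value, e.g. the equation (5) selection
    """
    pred = np.empty((16, 16), dtype=ref_row.dtype)
    for c in range(16):
        prev = ref_row[c]
        for r in range(16):
            if horiz_edge[r, c]:
                prev = fallback(r, c)  # suppress the copy across the boundary
            pred[r, c] = prev          # otherwise copy along the prediction direction
    return pred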
 When a depth edge perpendicular to the prediction direction exists, the boundary control predicted image generation unit 300 generates the predicted pixel by the following equations.
G[Q_{vi}] = I\!\left( \operatorname*{arg\,min}_{pre \in \{P_{v1}, \ldots, P_{v16}\}} \left( \alpha \left| D(Q_{vi}) - D(pre) \right| + \beta\, \mathrm{Dis}(Q_{vi}, pre) \right) \right) \quad \cdots (5)
G[Q_{hi}] = I\!\left( \operatorname*{arg\,min}_{pre \in \{P_{h1}, \ldots, P_{h16}\}} \left( \alpha \left| D(Q_{hi}) - D(pre) \right| + \beta\, \mathrm{Dis}(Q_{hi}, pre) \right) \right) \quad \cdots (6)
 Equation (5) is the equation for generating the predicted pixel when a depth edge exists in the horizontal direction in the vertical prediction mode. Equation (6) is the equation for generating the predicted pixel when a depth edge exists in the vertical direction in the horizontal prediction mode. Since the basic processing is the same in the horizontal and vertical directions, equation (5), for the vertical mode, is described below.
 In equation (5), G[x] on the left side is the predicted pixel value of pixel x, and I(x) denotes the pixel value of pixel x. The argmin{} with pre = {Pv1, ..., Pv16} on the right side is a function that returns the pre, taken from among the pixels Pv1 to Pv16, that minimizes the evaluation expression inside argmin{}. Accordingly, the pixel whose value minimizes the evaluation expression is selected from pre = {Pv1, ..., Pv16} (the pixels in the line immediately preceding the pixel being processed), and its pixel value is copied as the pixel value of the processing target pixel on the left side.
 The evaluation expression consists of two terms. The first term (|D(Qvi) − D(pre)|) represents the absolute value of the difference between the depth value of the processing target pixel and the depth value corresponding to each pixel in pre. The second term (Dis(Qvi, pre)) represents the distance between the position of the processing target pixel and the pixel position of each pixel in pre. As for the meaning of each term, the first term acts to select a pixel with a close depth value, so that a pixel showing what is considered to be the same subject as the one shown at the processing target pixel is referenced as far as possible. That is, like the control by the boundary prediction control unit 301, this term also suppresses the use of pixels separated from the processing target pixel by a subject boundary. The second term acts to select a pixel as close as possible to the processing target pixel. The coefficients α and β multiplying the two terms are constants for changing the weighting between the first and second terms; specifically, for example, α is 0.1 and β is 1.0.
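 A minimal sketch of the equation (5) selection for one target pixel in the vertical mode, with the distance term Dis simplified to the horizontal offset between pixel positions (that simplification, and all names, are assumptions for illustration):

import numpy as np

def predict_pixel(target_col: int, target_depth: float,
                  ref_row: np.ndarray, ref_depth_row: np.ndarray,
                  alpha: float = 0.1, beta: float = 1.0) -> float:
    """Pick the reference-row pixel minimizing alpha*|depth diff| + beta*distance."""
    cols = np.arange(len(ref_row))
    cost = (alpha * np.abs(target_depth - ref_depth_row.astype(np.float64))
            + beta * np.abs(cols - target_col))
    return float(ref_row[int(np.argmin(cost))])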
 In equations (5) and (6) above, the sum of the first and second terms is used as the evaluation expression, but a ratio may be used instead. Alternatively, only the first term may be used.
 Also, in this embodiment, equations (5) and (6) above are used only when there is a depth edge between the processing target pixel and the immediately preceding pixel in the prediction direction, but equations (5) and (6) may instead be used at all times.
 In this way, when a depth edge (subject boundary) lies between the processing target pixel and the immediately preceding (adjacent) pixel in the prediction direction (the predetermined direction), the boundary prediction control unit 301 has the boundary control predicted image generation unit 300 use equations (5) and (6) above, thereby suppressing the use of the pixel value of the immediately preceding pixel in the prediction direction.
 Furthermore, since the first term of equations (5) and (6) is the difference between the depth value of the processing target pixel and that of each pixel (peripheral pixel) in the preceding column (or row) in the prediction direction, the use of peripheral pixels that are separated from the processing target pixel by a subject boundary, and whose depth difference is therefore large, can be suppressed.
 The boundary control predicted image generation unit 300 generates predicted image blocks predicted in the horizontal and vertical directions. The boundary control predicted image generation unit 300 determines the correlation between the image block input from the image input unit 101 and the predicted image block predicted in each of the two prediction modes, using for example the SAD value. As a result of this determination, the boundary control predicted image generation unit 300 selects the more highly correlated (more similar) predicted image block and outputs it to the prediction mode selection unit 202.
 The boundary control predicted image generation unit 300 also outputs, to the prediction mode selection unit 202, prediction coding information indicating the prediction mode of the selected predicted image block.
 In this way, control is performed to suppress intra-frame pixel prediction that continues across a boundary of the depth map indicating the distance to the subject (a subject boundary), so the prediction accuracy can be improved.
 <Image Encoding Device 100: Flowchart>
 Next, the image encoding process performed by the image encoding device 100 according to this embodiment will be described. FIG. 10 is a flowchart showing the image encoding process performed by the image encoding device 100 according to this embodiment.
 (Step S201) The image encoding device 100 acquires, from outside, the image of each frame and the depth map corresponding to it. The process then proceeds to step S202.
 (Step S202) The image input unit 101 divides the input image signal of each frame acquired from outside the image encoding device 100 into blocks of a predetermined size (for example, 16 pixels vertically × 16 pixels horizontally) and outputs them to the subtraction unit 102, the inter prediction unit 120, and the intra prediction unit 121. The depth input unit 118 divides the depth map synchronized with the image input to the image input unit 101 in the same way as the image division performed by the image input unit 101, and outputs the result to the depth map encoding unit 116.
 画像符号化装置100は、ステップS203-ステップS211の処理をフレーム内の画像ブロック毎に繰り返す。
(ステップS203)デプスマップ符号化部116は、デプス入力部118から入力されるデプスマップを符号化して、データ量がより圧縮されたデプスマップ符号化データを、イントラ予測部121と画像符号化装置100の外部に(例えば、画像復号装置800)に出力する。その後、ステップS204の処理とステップS205の処理とを並行して行う。
The image coding apparatus 100 repeats the processing from step S203 to step S211 for each image block in the frame.
(Step S203) The depth map encoding unit 116 encodes the depth map input from the depth input unit 118, and outputs the depth map encoded data, whose data amount is thereby compressed, to the intra prediction unit 121 and to the outside of the image encoding device 100 (for example, to the image decoding device 800). Thereafter, the process of step S204 and the process of step S205 are performed in parallel.
(Step S204) The inter prediction unit 120 acquires the image block signal from the image input unit 101 and the reference image block signal decoded by the addition unit 108. The inter prediction unit 120 performs inter prediction using these acquired signals. The inter prediction unit 120 outputs the inter predicted image block signal generated by the inter prediction to the prediction scheme control unit 109 and the selection unit 110, and outputs the inter prediction encoding information IPE to the prediction scheme control unit 109.
In the first iteration, when the processing of the addition unit 108 has not yet been completed, a reset image block (an image block signal in which all pixel values are 0) is acquired from the addition unit 108. When the processing of the inter prediction unit 120 is completed, the process proceeds to step S206.
(Step S205) The intra prediction unit 121 acquires the image block signal from the image input unit 101, the depth map encoded data from the depth map encoding unit 116, and the reference image block signal decoded by the addition unit 108. The intra prediction unit 121 performs intra prediction using these acquired signals. The intra prediction unit 121 outputs the intra predicted image block signal generated by the intra prediction to the prediction scheme control unit 109 and the selection unit 110, and outputs the intra prediction encoding information TPE to the prediction scheme control unit 109. In the first iteration, when the processing of the addition unit 108 has not yet been completed, a reset image block (an image block in which all pixel values are 0) is acquired. When the processing of the intra prediction unit 121 is completed, the process proceeds to step S206.
(Step S206) The prediction scheme control unit 109 receives the inter predicted image block signal and the inter prediction encoding information IPE from the inter prediction unit 120, and the intra predicted image block signal and the intra prediction encoding information TPE from the intra prediction unit 121. The prediction scheme control unit 109 selects the prediction mode with the better coding efficiency based on the Lagrangian cost described above. The prediction scheme control unit 109 outputs information indicating the selected prediction mode to the selection unit 110, and outputs the prediction encoding information corresponding to the selected prediction mode to the entropy encoding unit 105.
The selection unit 110 selects the inter predicted image block signal input from the inter prediction unit 120 or the intra predicted image block signal input from the intra prediction unit 121 according to the prediction mode information input from the prediction scheme control unit 109, and outputs it to the subtraction unit 102 and the addition unit 108. Thereafter, the process proceeds to step S207.
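The Lagrangian cost referred to in step S206 is not restated in this passage; in the customary rate-distortion formulation it is D + λR. The sketch below assumes that form, with hypothetical candidate tuples, purely for illustration.

```python
def choose_prediction_method(candidates, lam):
    """candidates: iterable of (name, distortion, rate_bits) tuples, where
    distortion might be the SSD between the input block and the predicted
    block, and rate_bits the bits needed for the mode and residual.
    Returns the name of the candidate minimising D + lam * R."""
    best_name, _ = min(((name, d + lam * r) for name, d, r in candidates),
                       key=lambda item: item[1])
    return best_name
```

For example, `choose_prediction_method([("inter", 1200, 96), ("intra", 1500, 40)], lam=20)` returns `"intra"`, since 1500 + 20 × 40 = 2300 is smaller than 1200 + 20 × 96 = 3120.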
(Step S207) The subtraction unit 102 subtracts the predicted image block signal output from the selection unit 110 from the image block signal output from the image input unit 101 to generate a difference image block signal. The subtraction unit 102 outputs the difference image block signal to the orthogonal transform unit 103. Thereafter, the process proceeds to step S208.
(Step S208) The orthogonal transform unit 103 acquires the difference image block signal from the subtraction unit 102, and performs the orthogonal transform. The orthogonal transform unit 103 outputs the signal after the orthogonal transform to the quantization unit 104. The quantization unit 104 performs the above quantization process on the signal input from the orthogonal transform unit 103 to generate a difference image code. The quantization unit 104 outputs the difference image code to the entropy coding unit 105 and the inverse quantization unit 106.
The entropy encoding unit 105 packs the difference image code input from the quantization unit 104 together with the prediction encoding information input from the prediction scheme control unit 109, and performs variable-length encoding (entropy encoding) to generate encoded data E1 in which the amount of information is further compressed. The entropy encoding unit 105 outputs the encoded data E1 to the outside of the image encoding device 100 (for example, to the image decoding device 800). Thereafter, the process proceeds to step S209.
(Step S209) The inverse quantization unit 106 acquires the difference image code ED from the quantization unit 104, and performs the inverse of the quantization performed by the quantization unit 104. The inverse quantization unit 106 outputs the signal generated by this processing to the inverse orthogonal transform unit 107. The inverse orthogonal transform unit 107 acquires the inversely quantized signal from the inverse quantization unit 106, performs the inverse of the orthogonal transform processing performed by the orthogonal transform unit 103, and decodes the difference image (decoded difference image block signal). The inverse orthogonal transform unit 107 outputs the decoded difference image block signal to the addition unit 108. Thereafter, the process proceeds to step S210.
(Step S210) The addition unit 108 adds the predicted image block signal output from the selection unit 110 to the decoded difference image block signal output from the inverse orthogonal transform unit 107, thereby decoding the input image (reference image block signal). The addition unit 108 outputs the reference image block signal to the inter prediction unit 120 and the intra prediction unit 121. Thereafter, the process proceeds to step S211.
(Step S211) When the image coding apparatus 100 has not completed the processes of Steps S203 to S210 for all the blocks in the frame, the block to be processed is changed and the process returns to Step S202. When all the processes are completed, the process ends.
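Steps S203 through S210 form the classic closed-loop structure in which the encoder locally reconstructs exactly what the decoder will see. The sketch below restates that loop for one block; every callable is a stand-in (none of these function names come from the document), so this is a structural outline rather than an implementation.

```python
def encode_block(block, predict, transform, quantize, entropy_code,
                 dequantize, inverse_transform):
    """One block of the loop: predict, code the residual, then locally
    decode so the encoder's reference matches the decoder's output."""
    pred = predict(block)                          # S204-S206: best prediction
    residual = block - pred                        # S207: subtraction unit 102
    coeffs = quantize(transform(residual))         # S208: units 103 and 104
    bits = entropy_code(coeffs)                    # S208: entropy encoding unit 105
    recon_residual = inverse_transform(dequantize(coeffs))  # S209: units 106 and 107
    reference = pred + recon_residual              # S210: addition unit 108
    return bits, reference
```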
FIG. 11 is a flowchart for explaining the processing of the inter prediction unit 120.
(Step S301) The deblocking filter unit 111 acquires the reference image block signal from the addition unit 108, which is outside the inter prediction unit 120, and performs the FIR filter processing described above. The deblocking filter unit 111 outputs the filtered corrected block signal to the frame memory 112. Thereafter, the process proceeds to step S302.
(Step S302) The frame memory 112 acquires the corrected block signal from the deblocking filter unit 111, and holds it as a part of the image, together with information that can identify the frame number. Thereafter, the process proceeds to step S303.
(Step S303) Upon receiving the image block signal from the image input unit 101, the motion vector detection unit 114 searches the images stored in the frame memory 112 for a block similar to the image block output by the image input unit 101 (block matching), and generates vector information (a motion vector) representing the found block. The motion vector detection unit 114 outputs the information necessary for encoding, including the detected vector information, to the motion compensation unit 113. Thereafter, the process proceeds to step S304.
(Step S304) The motion compensation unit 113 acquires the information necessary for encoding from the motion vector detection unit 114, and extracts the corresponding prediction block from the frame memory. The motion compensation unit 113 outputs the predicted image block signal extracted from the frame memory to the prediction scheme control unit 109 and the selection unit 110 as the inter predicted image block signal. At the same time, the motion compensation unit 113 outputs the information necessary for prediction acquired from the motion vector detection unit 114 to the prediction scheme control unit 109. Thereafter, the inter prediction ends.
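The block matching of step S303 can be pictured as an exhaustive SAD search over a window in the reference image. The following sketch assumes numpy arrays, a square search range, and full-sample accuracy; the actual search strategy and range are not specified in this passage.

```python
import numpy as np

def block_match(block, reference, cy, cx, search_range=16):
    """Exhaustive search around (cy, cx); returns the motion vector
    (dy, dx) of the reference block with the smallest SAD."""
    h, w = block.shape
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = cy + dy, cx + dx
            if y < 0 or x < 0 or y + h > reference.shape[0] or x + w > reference.shape[1]:
                continue  # candidate block would fall outside the frame
            cand = reference[y:y + h, x:x + w]
            sad = int(np.abs(block.astype(np.int64) - cand.astype(np.int64)).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv
```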
FIG. 12 is a flowchart for explaining processing of the intra prediction unit 121.
(Step S401) The depth map decoding unit 117 acquires the depth map encoded data E2 from the depth map encoding unit 116, and decodes it, using for example variable-length decoding, into the depth map, whose amount of information is larger. The depth map decoding unit 117 outputs the decoded depth map (depth block decoded signal) to the depth information use intra prediction unit 115. Thereafter, the process proceeds to step S402.
(Step S402) The first prediction mode execution unit 200-1 through the n-th prediction mode execution unit 200-n each generate the first through n-th predicted image block signals from the reference image block signal acquired from the addition unit 108, according to the processing of their respective prediction modes (methods of generating a predicted image block). The first prediction mode execution unit 200-1 through the n-th prediction mode execution unit 200-n output the generated first through n-th predicted image block signals to the prediction mode selection unit 202.
The depth use prediction mode execution unit 201 generates a predicted image block signal using depth, from the reference image block signal acquired from the addition unit 108 and the depth block decoded signal acquired from the depth map decoding unit 117, and outputs it to the prediction mode selection unit 202. Thereafter, the process proceeds to step S403. The predicted image generation processing performed by the depth use prediction mode execution unit 201 is as described above.
(Step S403) The prediction mode selection unit 202 receives the predicted image block signals and the information necessary for prediction from the first prediction mode execution unit 200-1 through the n-th prediction mode execution unit 200-n and from the depth use prediction mode execution unit 201.
From among the input predicted image block signals (including the predicted image block signal input from the depth use prediction mode execution unit), the prediction mode selection unit 202 selects the prediction mode with the better coding efficiency by the method described above, and generates the corresponding prediction mode information.
The prediction mode selection unit 202 outputs the selected predicted image block signal (hereinafter, the intra predicted image block signal) to the selection unit 110 and the prediction scheme control unit 109, and outputs the prediction mode information (hereinafter, the intra prediction encoding information TPE) to the prediction scheme control unit 109. Thereafter, the intra prediction ends.
Next, the image decoding device 800 according to this embodiment will be described. FIG. 13 is a schematic block diagram showing the configuration of the image decoding device 800 according to this embodiment. The image decoding device 800 includes an encoded data input unit 813, an entropy decoding unit 801, an inverse quantization unit 802, an inverse orthogonal transform unit 803, an addition unit 804, a prediction scheme control unit 805, a selection unit 806, a deblocking filter unit 807, a frame memory 808, a motion compensation unit 809, a depth information use intra prediction unit 810, a depth map decoding unit 811, an image output unit 812, and a depth map encoded data input unit 814. The deblocking filter unit 807, the frame memory 808, and the motion compensation unit 809 constitute an inter processing unit 820. The depth information use intra prediction unit 810 and the depth map decoding unit 811 constitute an intra processing unit 821.
The encoded data input unit 813 divides the encoded data E1 acquired from the outside (for example, from the image encoding device 100) into processing block units and outputs them to the entropy decoding unit 801. The encoded data input unit 813 repeats this output while changing the block position sequentially, until all blocks in the frame have been processed and the acquired encoded data is exhausted.
The entropy decoding unit 801 applies entropy decoding (for example, variable-length decoding), the inverse of the encoding method (for example, variable-length encoding) performed by the entropy encoding unit 105, to the encoded data divided into processing units acquired from the encoded data input unit 813, and generates a difference image block code and prediction encoding information PE. The entropy decoding unit 801 outputs the difference image block code to the inverse quantization unit 802 and the prediction encoding information PE to the prediction scheme control unit 805.
The inverse quantization unit 802 performs inverse quantization on the difference image block code input from the entropy decoding unit 801 to generate a decoded frequency domain signal, and outputs the decoded frequency domain signal to the inverse orthogonal transform unit 803.
The inverse orthogonal transform unit 803 generates a decoded difference image block signal, which is a spatial domain signal, by applying, for example, an inverse DCT to the decoded frequency domain signal output from the inverse quantization unit 802. As long as it can generate a spatial domain signal from the decoded frequency domain signal, the inverse orthogonal transform unit 803 is not limited to the inverse DCT and may use another method (for example, an IFFT (Inverse Fast Fourier Transform)).
The inverse orthogonal transform unit 803 outputs the generated decoded difference image block signal to the addition unit 804.
The prediction scheme control unit 805 extracts, from the prediction encoding information PE input from the entropy decoding unit 801, the prediction scheme PM in units of macroblocks adopted by the image encoding device 100. Here, the prediction scheme PM is inter prediction or intra prediction. The prediction scheme control unit 805 outputs information on the extracted prediction scheme PM to the selection unit 806. The prediction scheme control unit 805 also takes out, from the prediction encoding information PE output by the entropy decoding unit 801, the prediction encoding information corresponding to the extracted prediction scheme PM, and outputs it to the processing unit corresponding to that prediction scheme PM. When the prediction scheme PM is inter prediction, the prediction scheme control unit 805 outputs the inter prediction encoding information IPE to the inter processing unit 820. When the prediction scheme PM is intra prediction, the prediction scheme control unit 805 outputs the intra prediction encoding information TPE to the intra processing unit 821.
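Viewed as control flow, the prediction scheme control unit is a dispatcher keyed on PM. A toy sketch, with an assumed dict layout for the decoded prediction encoding information PE:

```python
def route_prediction_encoding_info(pe):
    """pe is assumed to be a dict like {'pm': 'inter' or 'intra',
    'inter_info': ..., 'intra_info': ...} recovered by entropy decoding.
    Returns the destination unit and the information to forward."""
    if pe["pm"] == "inter":
        return "inter_processing_unit_820", pe["inter_info"]   # IPE
    return "intra_processing_unit_821", pe["intra_info"]       # TPE
```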
The selection unit 806 selects, according to the information on the prediction scheme PM input from the prediction scheme control unit 805, either the inter predicted image block signal output by the inter processing unit 820 described later or the intra predicted image block signal output by the intra processing unit 821 described later. When the prediction scheme PM is inter prediction, the inter predicted image block signal is selected. When the prediction scheme PM is intra prediction, the intra predicted image block signal is selected. The selection unit 806 outputs the selected predicted image block signal to the addition unit 804.
The addition unit 804 adds the predicted image block signal output from the selection unit 806 to the decoded difference image block signal output from the inverse orthogonal transform unit 803 to generate a decoded image block signal DB. The addition unit 804 outputs the decoded image block signal DB to the inter processing unit 820, the intra processing unit 821, and the image output unit 812.
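The reconstruction in the addition unit 804 is a per-sample addition of the residual and the prediction. The sketch below also clips to the valid sample range, a detail the text leaves implicit but which any 8-bit implementation would need; the function name and bit depth are assumptions.

```python
import numpy as np

def reconstruct_block(decoded_residual, predicted_block, bit_depth=8):
    """Decoded image block = decoded residual + predicted block, clipped
    to the representable sample range."""
    recon = decoded_residual.astype(np.int32) + predicted_block.astype(np.int32)
    return np.clip(recon, 0, (1 << bit_depth) - 1).astype(np.uint8)
```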
Next, the inter processing unit 820 will be described. The inter processing unit 820 includes a deblocking filter unit 807, a frame memory 808, and a motion compensation unit 809.
The deblocking filter unit 807 performs, on the decoded image block signal DB input from the addition unit 804, the same processing as the FIR filter performed by the deblocking filter unit 111, and outputs the processing result (corrected block signal) to the frame memory 808.
The frame memory 808 acquires the corrected block signal from the deblocking filter unit 807, and holds it as a part of the image, together with information that can identify the frame number.
The motion compensation unit 809 acquires the inter prediction encoding information IPE from the prediction scheme control unit 805, and takes out the reference image information and the prediction vector information (motion vector) from it. Based on the extracted reference image information and prediction vector information, the motion compensation unit 809 extracts the target image block signal (predicted image block signal) from the images stored in the frame memory 808. When there is one prediction vector (motion vector), the motion compensation unit 809 takes out the one corresponding image block from the frame memory 808 and outputs it to the selection unit 806. When there are two prediction vectors (motion vectors), it takes out the two corresponding image blocks from the frame memory 808, averages them, and outputs the result to the selection unit 806. The signal output from the inter processing unit 820 (the motion compensation unit 809) to the selection unit 806 is the inter predicted image block signal.
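The one- or two-vector behaviour of the motion compensation unit 809 can be sketched as follows. The (frame_id, dy, dx) layout of a motion vector and the rounding offset in the average are assumptions made for this example; the document only states that two fetched blocks are averaged.

```python
import numpy as np

def motion_compensate(frame_memory, motion_vectors, cy, cx, h=16, w=16):
    """Fetch one reference block per motion vector and, when two vectors
    are present, output their (rounded) average."""
    blocks = []
    for frame_id, dy, dx in motion_vectors:        # one or two vectors
        ref = frame_memory[frame_id]
        blocks.append(ref[cy + dy:cy + dy + h, cx + dx:cx + dx + w].astype(np.int32))
    if len(blocks) == 2:
        return ((blocks[0] + blocks[1] + 1) >> 1).astype(np.uint8)  # average with rounding
    return blocks[0].astype(np.uint8)
```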
Next, the intra processing unit 821 will be described. The intra processing unit 821 includes a depth information use intra prediction unit 810 and a depth map decoding unit 811.
The depth map encoded data input unit 814 divides the depth map encoded data E2 input from the outside (for example, the image encoding device 100) into processing blocks, and outputs them to the intra processing unit 821.
The depth map decoding unit 811 applies entropy decoding (for example, variable-length decoding), the inverse of the encoding method (for example, variable-length encoding) performed by the depth map encoding unit 116, to the block-unit depth map encoded data output from the depth map encoded data input unit 814, and generates a depth block decoded signal. The depth map decoding unit 811 outputs the depth block decoded signal to the depth information use intra prediction unit 810.
FIG. 14 is a schematic block diagram illustrating a configuration of the depth information use intra prediction unit 810.
The depth information use intra prediction unit 810 includes a first prediction mode execution unit 900-1, a second prediction mode execution unit 900-2, an n-th prediction mode execution unit 900-n, a depth use prediction mode execution unit 901, and a prediction mode selection unit 902.
The prediction mode selection unit 902 takes out, from the intra prediction encoding information TPE output by the prediction scheme control unit 805, the index indicating the prediction mode created by the prediction mode selection unit 202 of the image encoding device 100 (the prediction mode) and the information necessary for prediction. The information necessary for prediction is taken out when the prediction mode indicated by the index is a prediction mode for which such information exists (specifically, the modes of the first and second prediction modes that generate a predicted image in sub-block units, and the depth use prediction mode). When the prediction mode selection unit 902 has taken out information necessary for prediction, it outputs that information to the corresponding prediction mode execution unit 900-1 to 900-n or 901. The prediction mode selection unit 902 selects, from among the predicted image block signals generated by the prediction mode execution units, the predicted image block signal of the prediction mode indicated by the index (prediction mode), and outputs it to the selection unit 806 as the intra predicted image block signal.
The first prediction mode execution unit 900-1, the second prediction mode execution unit 900-2, and the n-th prediction mode execution unit 900-n perform the same processing as the first prediction mode execution unit 200-1, the second prediction mode execution unit 200-2, and the n-th prediction mode execution unit 200-n provided in the depth information use intra prediction unit 115 of the image encoding device 100. However, the first prediction mode execution unit 900-1 and the second prediction mode execution unit 900-2, which perform prediction in sub-block units obtained by further dividing the 16 × 16 pixels, receive the prediction mode of each sub-block (the information necessary for prediction) from the prediction mode selection unit 902 and execute the corresponding prediction mode for each sub-block. The prediction modes are as shown in FIG. 4.
The depth use prediction mode execution unit 901 acquires the information necessary for prediction (specifically, information indicating the direction of prediction) from the prediction mode selection unit 902, and acquires the depth block decoded signal from the depth map decoding unit 811. Using the acquired information and signal, the depth use prediction mode execution unit 901 generates a predicted image block signal in the same way as the depth use prediction mode execution unit 201 of the image encoding device 100. The information necessary for prediction is the information on the direction of prediction selected by the depth use prediction mode execution unit 201. The configuration of the depth use prediction mode execution unit 901 is basically the same as that of the depth use prediction mode execution unit 201. The difference is the final step: whereas the boundary control predicted image generation unit 300 of the image encoding device 100 chooses between the horizontal-direction prediction block and the vertical-direction prediction block based on their correlation with the input image, the boundary control predicted image generation unit 300 of the depth use prediction mode execution unit 901 makes this selection using the information necessary for prediction. Through the above processing, the depth use prediction mode execution unit 901 generates the same predicted image block signal as the depth use prediction mode execution unit 201 did at the time of encoding.
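The encoder/decoder asymmetry described here is easy to state in code: the encoder compares both directional predictions against the input image, while the decoder, which has no input image, simply obeys the signalled direction. A minimal sketch with hypothetical names:

```python
def decoder_select_prediction(pred_horizontal, pred_vertical, signalled_direction):
    """Decoder-side counterpart of the encoder's SAD comparison: the
    direction carried in the prediction information decides the block."""
    if signalled_direction == "horizontal":
        return pred_horizontal
    return pred_vertical
```

Because the direction is signalled rather than re-derived, the decoder reproduces exactly the block the encoder chose, which keeps the two prediction loops in lockstep.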
Next, an image decoding process performed by the image decoding apparatus 800 according to the present embodiment will be described. FIG. 15 is a flowchart showing an image decoding process performed by the image decoding apparatus 800 according to this embodiment.
(Step S601) The image decoding apparatus 800 acquires the encoded data E including the encoded data E1 of the image and the encoded data E2 of the depth map from the image encoding apparatus 100 via the communication network 500. Thereafter, the process proceeds to step S602.
(Step S602) The encoded data input unit 813 divides the acquired encoded data E1 of the image into processing blocks corresponding to a predetermined size (for example, 16 pixels vertically × 16 pixels horizontally) and outputs them to the entropy decoding unit 801. The depth map encoded data input unit 814 receives, from outside the image decoding device 800, the depth map encoded data synchronized with the encoded data input to the encoded data input unit 813, divides it into the same processing units as the division performed by the encoded data input unit 813, and outputs them to the intra processing unit 821.
The image decoding apparatus 800 repeats the processing in steps S603 to S608 for each image block in the frame.
(Step S603) The entropy decoding unit 801 entropy-decodes the encoded data output from the encoded data input unit 813, and generates a difference image block code and prediction encoding information. The entropy decoding unit 801 outputs the difference image block code to the inverse quantization unit 802, and outputs the prediction encoding information to the prediction scheme control unit 805. The prediction scheme control unit 805 acquires the prediction encoding information from the entropy decoding unit 801, and takes out the information on the prediction scheme PM and the prediction encoding information corresponding to that prediction scheme PM. When the prediction scheme PM is inter prediction, the prediction encoding information is output to the inter processing unit 820 as the inter prediction encoding information IPE. When the prediction scheme PM is intra prediction, the prediction encoding information is output to the intra processing unit 821 as the intra prediction encoding information TPE. Thereafter, the process proceeds to steps S604 and S605. Steps S604 and S605 may be performed in parallel for each block, or only one of them may be performed according to the prediction scheme PM.
(Step S604) The inter processing unit 820 acquires the inter prediction encoding information IPE output from the prediction scheme control unit 805 and the decoded image block signal DB output from the addition unit 804, and performs inter processing. The inter processing unit 820 outputs the generated inter predicted image block signal to the selection unit 806. The contents of the inter processing will be described later. In the first iteration, when the processing of the addition unit 804 has not yet been completed, a reset image block signal (an image block signal in which all pixel values are 0) is input. When the processing of the inter processing unit is completed, the process proceeds to step S606.
(Step S605) The intra processing unit 821 acquires the intra prediction encoding information TPE output from the prediction scheme control unit 805 and the decoded image block signal DB output from the addition unit 804, and performs intra prediction. The intra processing unit 821 outputs the generated intra predicted image block signal to the selection unit 806. The intra prediction processing will be described later. In the first iteration, when the processing of the addition unit 804 has not yet been completed, a reset image block signal (an image block signal in which all pixel values are 0) is input. When the processing of the intra processing unit 821 is completed, the process proceeds to step S606.
(Step S606) The selection unit 806 acquires the information on the prediction scheme PM output from the prediction scheme control unit 805, selects the inter predicted image block signal output from the inter processing unit 820 or the intra predicted image block signal output from the intra processing unit 821, and outputs it to the addition unit 804. Thereafter, the process proceeds to step S607.
(Step S607) The inverse quantization unit 802 performs, on the difference image block code input from the entropy decoding unit 801, the inverse of the quantization performed by the quantization unit 104 of the image encoding device 100. The inverse quantization unit 802 outputs the generated decoded frequency domain signal to the inverse orthogonal transform unit 803. The inverse orthogonal transform unit 803 acquires the inversely quantized decoded frequency domain signal from the inverse quantization unit 802, performs the inverse of the orthogonal transform processing performed by the orthogonal transform unit 103 of the image encoding device 100, and decodes the difference image (decoded difference image block signal). The inverse orthogonal transform unit 803 outputs the decoded difference image block signal to the addition unit 804. The addition unit 804 adds the predicted image block signal output from the selection unit 806 to the decoded difference image block signal output from the inverse orthogonal transform unit 803 to generate a decoded image block signal DB. The addition unit 804 outputs the decoded image block signal DB to the image output unit 812, the inter processing unit 820, and the intra processing unit 821. Thereafter, the process proceeds to step S608.
(Step S608) The image output unit 812 generates the output image signal R' by arranging the decoded image block signal DB output by the addition unit 804 at the corresponding position in the image. If the processes in steps S603 to S607 have not been completed for all the blocks in the frame, the block to be processed is changed and the process returns to step S602.
When outputting the generated output image signal R' to the outside of the image decoding device 800 (the display device 600), the image output unit 812 outputs, for example, five frames at a time in the input order described above (I picture (I0), B picture (B3), B picture (B2), B picture (B4), and P picture (P1)).
FIG. 16 is a flowchart for explaining the inter processing in step S604.
(Step S701) The deblocking filter unit 807 acquires the decoded image block signal DB from the addition unit 804, which is outside the inter processing unit 820, and performs the same FIR filter processing as performed at the time of encoding. The deblocking filter unit 807 outputs the filtered corrected block signal to the frame memory 808. Thereafter, the process proceeds to step S702.
(Step S702) The frame memory 808 holds the corrected block signal output from the deblocking filter unit 807 as a part of the image, together with information that can identify the frame number. Thereafter, the process proceeds to step S703.
(Step S703) The motion compensation unit 809 acquires the inter prediction coding information IPE from the prediction scheme control unit 805, and extracts a corresponding prediction block signal from the frame memory. The motion compensation unit 809 outputs the prediction image block signal extracted from the frame memory to the selection unit 806 as an inter prediction image block signal. Thereafter, the inter processing is terminated.
FIG. 17 is a flowchart illustrating the intra processing in step S605.
(Step S801) The depth map decoding unit 811 acquires the depth map encoded data divided into processing units from the depth map encoded data input unit 814, and decodes it, using for example variable-length decoding, into the depth map, whose amount of information is larger. The depth map decoding unit 811 outputs the decoded depth map (depth block decoded signal) to the depth information use intra prediction unit 810. Thereafter, the process proceeds to step S802.
(Step S802) The first prediction mode execution unit 900-1 through the n-th prediction mode execution unit 900-n generate predicted image block signals using the decoded image block signal DB output from the addition unit 804. The prediction mode execution units that operate in sub-block units, specifically the first prediction mode execution unit 900-1 and the second prediction mode execution unit 900-2, acquire from the prediction mode selection unit 902 the information indicating the prediction mode adopted by the image encoding device 100 for each sub-block, and generate the predicted image block signals. The first prediction mode execution unit 900-1 through the n-th prediction mode execution unit 900-n output the generated first through n-th predicted image block signals to the prediction mode selection unit 902.
The depth use prediction mode execution unit 901 performs the same processing as the depth use prediction mode execution unit 201 of FIG. 3, using the decoded image block signal DB output from the addition unit 804, the depth block decoded signal output from the depth map decoding unit 811, and the information necessary for prediction output from the prediction mode selection unit 902 (specifically, information indicating the direction of prediction), and generates a depth use predicted image. The depth use prediction mode execution unit 901 outputs the generated predicted image signal to the prediction mode selection unit 902. Thereafter, the process proceeds to step S803.
(Step S803) The prediction mode selection unit 902 takes out, from the intra prediction encoding information TPE input from the prediction scheme control unit 805, the information indicating the prediction mode adopted by the image encoding device 100, and outputs the predicted image block signal of that prediction mode to the selection unit 806 as the intra predicted image block signal. When the extracted prediction mode is a prediction mode executed in sub-block units, the prediction mode selection unit 902 further takes out the prediction mode of each sub-block and outputs that information to the corresponding prediction mode execution unit. When the extracted prediction mode is the depth use prediction mode, the prediction mode selection unit 902 takes out the information on the direction of prediction and outputs it to the depth use prediction mode execution unit 901. Thereafter, the intra prediction ends.
The image encoding device 100 described above includes the depth input unit 118 and the depth map encoding unit 116, and the image decoding device 800 includes the depth map encoded data input unit 814 and the depth map decoding unit 811; however, the configuration is not limited to this. For example, information on the depth map corresponding to the input image may be made available to the image decoding device 800 by separate means. For example, the image encoding device 100 and the image decoding device 800 may receive the depth map offline, or via a communication line from an externally installed server device that stores depth maps in association with video information. To that end, the title of a video representing the video information is made searchable through the communication line, and when that video information is selected, the corresponding depth map can be received.
The image encoding device 100 according to this embodiment may also include a depth map generation unit that acquires an image from a viewpoint different from the input image and generates a depth map whose pixel values represent the disparity between pixels included in the input image and pixels included in the image from the different viewpoint. In that case, the depth map generation unit outputs the generated depth map to the depth input unit 118.
The image decoding device 800 according to this embodiment may also generate a second output image from a viewpoint different from the output image, based on the output image and the depth map of the same frame as the output image, and output it to the outside.
In the example described above, the image encoding device 100 inputs the input image signal every five frames; however, this embodiment is not limited to this, and the signal may be input every arbitrary number of frames.
Likewise, in the example described above, the image decoding device 800 outputs the output image signal every five frames; however, this embodiment is not limited to this, and the signal may be output every arbitrary number of frames.
In this embodiment, the image to be encoded is a moving image, but it may be a still image.
The image to be encoded may also be a multi-view image, in which case the depth use prediction mode is used only for viewpoint images that have a corresponding depth map, while the conventional prediction modes are used for viewpoint images without a corresponding depth map.
As described above, this embodiment has two prediction modes that, when performing intra prediction, control the prediction so as to suppress pixel prediction continuing across a boundary in the depth map indicating the distance to the subject. Since only two prediction modes are added compared with the conventional scheme, the accuracy of the predicted image can be improved while suppressing the increase in code amount caused by the increase in the number of prediction modes. And because the accuracy of the predicted image is improved while that increase in code amount is suppressed, the residual between the predicted image and the input image is minimized, and highly efficient image encoding and decoding can be realized.
If the depth-based prediction mode is used instead of the conventional prediction mode, the number of prediction modes does not increase, and therefore the increase in code amount can be further suppressed.
A part of the image encoding device 100 and the image decoding device 800 in the above-described embodiment, for example the subtraction unit 102, the orthogonal transform unit 103, the quantization unit 104, the entropy encoding unit 105, the inverse quantization unit 106, the inverse orthogonal transform unit 107, the addition unit 108, the prediction scheme control unit 109, the selection unit 110, the deblocking filter unit 111, the motion compensation unit 113, the motion vector detection unit 114, the depth information use intra prediction unit 115, the depth map encoding unit 116, and the depth map decoding unit 117, as well as the entropy decoding unit 801, the inverse quantization unit 802, the inverse orthogonal transform unit 803, the addition unit 804, the prediction scheme control unit 805, the selection unit 806, the deblocking filter unit 807, the motion compensation unit 809, the depth information use intra prediction unit 810, and the depth map decoding unit 811, may be realized by a computer.
In that case, this may be realized by recording a program for realizing the control functions on a computer-readable recording medium, and causing a computer system to read and execute the program recorded on the recording medium. The "computer system" here is a computer system built into the image encoding device 100 or the image decoding device 800, and includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to portable media such as a flexible disk, a magneto-optical disk, a ROM, and a CD-ROM, and storage devices such as a hard disk built into a computer system. The "computer-readable recording medium" may further include media that hold the program dynamically for a short time, such as a communication line used when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and media that hold the program for a certain period of time, such as the volatile memory inside a computer system serving as the server or client in that case. The program may be one for realizing a part of the functions described above, and may also be one that realizes the functions described above in combination with a program already recorded in the computer system.
A part or all of the image encoding device 100 and the image decoding device 800 in the above-described embodiment may also be realized as an integrated circuit such as an LSI (Large Scale Integration). Each functional block of the image encoding device 100 and the image decoding device 800 may be individually made into a processor, or some or all of them may be integrated into a processor. The method of circuit integration is not limited to LSI and may be realized by a dedicated circuit or a general-purpose processor. If an integrated circuit technology replacing LSI emerges from advances in semiconductor technology, an integrated circuit based on that technology may be used.
As described above, one aspect of the present invention is an image encoding device that, when encoding an input image, performs intra prediction that predicts the pixel value of a processing target pixel using the pixel values of peripheral pixels around the processing target pixel, the device comprising an intra prediction unit that, when performing the intra prediction, suppresses the use of a peripheral pixel when a boundary of a subject represented by the input image lies between that peripheral pixel and the processing target pixel.
Another aspect of the present invention is the above image encoding device, wherein the intra prediction unit comprises a subject boundary detection unit that detects the boundary of the subject using information indicating the distance to the subject of the input image.
Another aspect of the present invention is the above image encoding device, wherein the intra prediction unit comprises a predicted image generation unit that, when there is no boundary of the subject between the processing target pixel and the pixel among the peripheral pixels adjacent to the processing target pixel in a predetermined direction, predicts the pixel value of the processing target pixel using the pixel adjacent in the predetermined direction, and, when there is a boundary of the subject between the processing target pixel and the pixel adjacent in the predetermined direction, suppresses predicting the pixel value of the processing target pixel using the pixel adjacent in the predetermined direction.
Another aspect of the present invention is the above image encoding device, wherein the intra prediction unit comprises a predicted image generation unit that determines the peripheral pixels to be used in predicting the pixel value of the processing target pixel based at least on the difference between information indicating the distance to the subject represented by the peripheral pixel and information indicating the distance to the subject represented by the processing target pixel.
Another aspect of the present invention is the above image encoding device, wherein the intra prediction unit comprises a predicted image generation unit that determines the peripheral pixels to be used in predicting the pixel value of the processing target pixel based at least on the distance between the peripheral pixel and the processing target pixel.
 Another aspect of the present invention is an image decoding device that, when decoding an encoded image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the device comprising an intra prediction unit that, when performing the intra prediction, suppresses the use of any peripheral pixel that is separated from the processing target pixel by a boundary of a subject represented by the encoded image.
 Another aspect of the present invention is the image decoding device described above, wherein the intra prediction unit comprises a subject boundary detection unit that detects the boundary of the subject using information indicating the distance to the subject in the encoded image.
 Another aspect of the present invention is the image decoding device described above, wherein the intra prediction unit comprises a predicted image generation unit that, when there is no subject boundary between the processing target pixel and a pixel adjacent to it in a predetermined direction among the peripheral pixels, predicts the pixel value of the processing target pixel using the pixel adjacent in the predetermined direction, and, when there is a subject boundary between the processing target pixel and the pixel adjacent in the predetermined direction, suppresses the use of that pixel for predicting the pixel value of the processing target pixel.
 Another aspect of the present invention is the image decoding device described above, wherein the intra prediction unit comprises a predicted image generation unit that determines the peripheral pixels to be used for predicting the pixel value of the processing target pixel based at least on the difference between information indicating the distance to the subject represented by each peripheral pixel and information indicating the distance to the subject represented by the processing target pixel.
 Another aspect of the present invention is the image decoding device described above, wherein the intra prediction unit comprises a predicted image generation unit that determines the peripheral pixels to be used for predicting the pixel value of the processing target pixel based at least on the distance between each peripheral pixel and the processing target pixel.
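 Since intra prediction must be reproduced bit-exactly on the decode side, the decoder applies the same boundary-suppressed predictor to the decoded depth map. Below is a minimal sketch of decode-side block reconstruction, reusing the hypothetical predict_pixel from the encoder sketch above; block size and processing order are assumptions, not details fixed by the embodiments.

```python
import numpy as np

def reconstruct_block(residual, depth, recon, y0, x0, size=4, threshold=8):
    """Decode-side reconstruction of a size x size block: add the decoded
    residual to the same boundary-suppressed prediction the encoder used
    (predict_pixel from the encoder sketch above). Raster order ensures
    the left and top neighbors are reconstructed before being referenced."""
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            pred = predict_pixel(recon, depth, y, x, threshold)
            recon[y, x] = int(np.clip(pred + int(residual[y - y0, x - x0]), 0, 255))
    return recon
```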
 Another aspect of the present invention is an image encoding method that, when encoding an input image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the method including a step of suppressing, when performing the intra prediction, the use of any peripheral pixel that is separated from the processing target pixel by a boundary of a subject represented by the input image.
 Another aspect of the present invention is an image decoding method that, when decoding an encoded image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the method including a step of suppressing, when performing the intra prediction, the use of any peripheral pixel that is separated from the processing target pixel by a boundary of a subject represented by the encoded image.
 Another aspect of the present invention is a program for causing a computer of an image encoding device that, when encoding an input image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, to function as an intra prediction unit that, when performing the intra prediction, suppresses the use of any peripheral pixel that is separated from the processing target pixel by a boundary of a subject represented by the input image.
 Another aspect of the present invention is a program for causing a computer of an image decoding device that, when decoding an encoded image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, to function as an intra prediction unit that, when performing the intra prediction, suppresses the use of any peripheral pixel that is separated from the processing target pixel by a boundary of a subject represented by the encoded image.
 Although embodiments of the present invention have been described in detail above with reference to the drawings, the specific configuration is not limited to these embodiments, and design changes and the like within a scope that does not depart from the gist of the present invention are also included.
 DESCRIPTION OF SYMBOLS
 10…Moving image transmission system
 100…Image encoding device
 101…Image input unit
 102…Subtraction unit
 103…Orthogonal transform unit
 104…Quantization unit
 105…Entropy encoding unit
 106…Inverse quantization unit
 107…Inverse orthogonal transform unit
 108…Addition unit
 109…Prediction method control unit
 110…Selection unit
 111…Deblocking filter unit
 112…Frame memory unit
 113…Motion compensation unit
 114…Motion vector detection unit
 115…Depth-information-based intra prediction unit
 116…Depth map encoding unit
 117…Depth map decoding unit
 118…Depth input unit
 120…Inter prediction unit
 121…Intra prediction unit
 200-1…First prediction mode execution unit
 200-2…Second prediction mode execution unit
 200-n…n-th prediction mode execution unit
 201…Depth-based prediction mode execution unit
 202…Prediction mode selection unit
 300…Boundary-controlled predicted image generation unit
 301…Boundary prediction control unit
 302…Subject boundary detection unit
 500…Communication network
 600…Display device
 800…Image decoding device
 801…Entropy decoding unit
 802…Inverse quantization unit
 803…Inverse orthogonal transform unit
 804…Addition unit
 805…Prediction method control unit
 806…Selection unit
 807…Deblocking filter unit
 808…Frame memory
 809…Motion compensation unit
 810…Depth-information-based intra prediction unit
 811…Depth map decoding unit
 812…Image output unit
 813…Encoded data input unit
 814…Depth map encoded data input unit
 820…Inter processing unit
 821…Intra processing unit
 900-1…First prediction mode execution unit
 900-2…Second prediction mode execution unit
 900-n…n-th prediction mode execution unit
 901…Depth-based prediction mode execution unit
 902…Prediction mode selection unit

Claims (10)

  1.  An image encoding device that, when encoding an input image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the device comprising:
     an intra prediction unit that determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  2.  The image encoding device according to claim 1, wherein the intra prediction unit determines the predicted value for each pixel based on the inter-pixel distance between the processing target pixel and the peripheral pixels.
  3.  The image encoding device according to claim 1, wherein the intra prediction unit comprises a subject boundary detection unit that detects the boundary of the subject using information indicating the distance to the subject in the input image, and, when the boundary is not detected between the processing target pixel and a pixel adjacent to it in a predetermined direction, derives the predicted value of the processing target pixel using the pixel adjacent in the predetermined direction.
  4.  An image decoding device that, when decoding an encoded image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the device comprising:
     an intra prediction unit that, when performing the intra prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  5.  The image decoding device according to claim 4, wherein the intra prediction unit determines the predicted value for each pixel based on the inter-pixel distance between the processing target pixel and the peripheral pixels.
  6.  The image decoding device according to claim 4, wherein the intra prediction unit comprises a subject boundary detection unit that detects the boundary of the subject using information indicating the distance to the subject in the encoded image, and, when the boundary is not detected between the processing target pixel and a pixel adjacent to it in a predetermined direction, derives the predicted value of the processing target pixel using the pixel adjacent in the predetermined direction.
  7.  An image encoding method that, when encoding an input image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the method comprising a step in which, when performing the intra prediction, an intra prediction unit determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  8.  An image decoding method that, when decoding an input image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, the method comprising a step in which, when performing the intra prediction, an intra prediction unit determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  9.  A program for causing an image encoding device that, when encoding an input image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, to function as an intra prediction unit that, when performing the intra prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
  10.  A program for causing an image decoding device that, when decoding an input image, performs intra prediction in which the pixel value of a processing target pixel is predicted using the pixel values of peripheral pixels around the processing target pixel, to function as an intra prediction unit that, when performing the intra prediction, determines a predicted value for each pixel based on information indicating the distance to the subject at the processing target pixel and information indicating the distance to the subject at the peripheral pixels.
PCT/JP2012/063503 2011-05-25 2012-05-25 Image encoding device, image decoding device, image encoding method, image decoding method and program WO2012161318A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2011-117425 2011-05-25
JP2011117425A JP2014150297A (en) 2011-05-25 2011-05-25 Image encoding device, image decoding device, image encoding method, image decoding method, and program

Publications (1)

Publication Number Publication Date
WO2012161318A1 true WO2012161318A1 (en) 2012-11-29

Family

ID=47217383

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2012/063503 WO2012161318A1 (en) 2011-05-25 2012-05-25 Image encoding device, image decoding device, image encoding method, image decoding method and program

Country Status (2)

Country Link
JP (1) JP2014150297A (en)
WO (1) WO2012161318A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116320395A (en) * 2022-12-27 2023-06-23 维沃移动通信有限公司 Image processing method, device, electronic equipment and readable storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5711636B2 (en) * 2011-09-26 2015-05-07 日本電信電話株式会社 Image encoding method, image decoding method, image encoding device, image decoding device, image encoding program, image decoding program
JP5729825B2 (en) * 2011-09-26 2015-06-03 日本電信電話株式会社 Image encoding method, image decoding method, image encoding device, image decoding device, image encoding program, and image decoding program
JP5759357B2 (en) * 2011-12-13 2015-08-05 日本電信電話株式会社 Video encoding method, video decoding method, video encoding device, video decoding device, video encoding program, and video decoding program

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009090884A1 (en) * 2008-01-18 2009-07-23 Panasonic Corporation Image encoding method and image decoding method
JP2010056701A (en) * 2008-08-27 2010-03-11 Nippon Telegr & Teleph Corp <Ntt> Intra-image predictive encoding method, intra-image predictive decoding method, devices thereof, programs therefor, and recording medium recording the programs

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FENG ZOU ET AL.: "EDGE-BASED ADAPTIVE DIRECTIONAL INTRA PREDICTION", 28TH PICTURE CODING SYMPOSIUM, IEEE, 8 December 2010 (2010-12-08), pages 366 - 369 *
MIN-KOO KANG ET AL.: "Geometry-based Block Partitioning for Efficient Intra Prediction in Depth Video Coding", VISUAL INFORMATION PROCESSING AND COMMUNICATION, PROC. OF THE SPIE-IS&T ELECTRIC IMAGING, SPIE, vol. 7543, 17 January 2010 (2010-01-17), pages 75430A-1 - 75430A-11 *

Also Published As

Publication number Publication date
JP2014150297A (en) 2014-08-21

Similar Documents

Publication Publication Date Title
KR101773693B1 (en) Disparity vector derivation in 3d video coding for skip and direct modes
US9020030B2 (en) Smoothing overlapped regions resulting from geometric motion partitioning
EP3672249A1 (en) Inter frame prediction method and device and codec for video images
US8204127B2 (en) Method and apparatus for encoding and decoding image by using multiple reference-based motion prediction
WO2018026887A1 (en) Geometry transformation-based adaptive loop filtering
EP3453173A1 (en) Control-point based intra direction representation for intra coding
CN111971960B (en) Method for processing image based on inter prediction mode and apparatus therefor
JP6039178B2 (en) Image encoding apparatus, image decoding apparatus, method and program thereof
US8204120B2 (en) Method for intra prediction coding of image data
WO2011130186A2 (en) Fixed point implementation for geometric motion partitioning
KR20170093833A (en) Coding of intra modes
KR20120118463A (en) Image processing device, method, and program
WO2012161318A1 (en) Image encoding device, image decoding device, image encoding method, image decoding method and program
CN113940078A (en) Simplification of bidirectional optical flow computation in video coding and decoding
AU2020347025B2 (en) Image encoding/decoding method and device for performing bdof, and method for transmitting bitstream
JP6232117B2 (en) Image encoding method, image decoding method, and recording medium
WO2013077304A1 (en) Image coding device, image decoding device, and methods and programs thereof
KR20230149297A (en) Intra prediction method and device based on intra prediction mode derivation
CN116647683A (en) Quantization processing method and device
JP2013179554A (en) Image encoding device, image decoding device, image encoding method, image decoding method, and program
WO2013035452A1 (en) Image encoding method, image decoding method, and apparatuses and programs thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12788713

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12788713

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP