WO2012060179A1 - Encoder apparatus, decoder apparatus, encoding method, decoding method, program, recording medium, and data structure of encoded data - Google Patents

Encoder apparatus, decoder apparatus, encoding method, decoding method, program, recording medium, and data structure of encoded data

Info

Publication number
WO2012060179A1
WO2012060179A1 (PCT/JP2011/073134, JP2011073134W)
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
value
representative
image
encoding
Prior art date
Application number
PCT/JP2011/073134
Other languages
French (fr)
Japanese (ja)
Inventor
純生 佐藤
Original Assignee
シャープ株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by シャープ株式会社
Publication of WO2012060179A1 publication Critical patent/WO2012060179A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/20 Image signal generators
    • H04N 13/204 Image signal generators using stereoscopic image cameras
    • H04N 13/239 Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 2013/0074 Stereoscopic image analysis
    • H04N 2013/0081 Depth or disparity estimation from stereoscopic image signals

Definitions

  • the present invention mainly relates to an encoding device that encodes a distance image (Depth Image) and a decoding device that decodes the distance image encoded by such an encoding device.
  • A texture image is a general two-dimensional image that represents the subject space by the color of each subject and the background, whereas a distance image represents the subject space by the distance from the viewpoint to each subject and the background.
  • A distance image is an image that expresses, for each pixel, a distance value (depth value) from the viewpoint to the corresponding point in the subject space.
  • Such a distance image can be acquired by a distance measuring device such as a depth camera installed near the camera that records the texture image.
  • Alternatively, a distance image can be acquired by analyzing a plurality of texture images captured by a multi-viewpoint camera, and many such analysis methods have been proposed.
  • In MPEG-C Part 3, a standard for distance images established by the Moving Picture Experts Group (MPEG), a working group of ISO/IEC, distance values are expressed in 256 levels (that is, as 8-bit luminance values). In other words, a standard distance image is an 8-bit grayscale image.
  • In such a distance image, a subject located in front is expressed as white and a subject located in the back as black.
  • Since the distance from the viewpoint of each pixel constituting the subject drawn in the texture image is known from the distance image, the subject can be restored as a three-dimensional shape whose depth is expressed in 256 levels. Furthermore, by geometrically projecting this three-dimensional shape onto a two-dimensional plane, the original texture image can be converted into a texture image of the subject space as if the subject were photographed from another angle within a certain range of the original angle. In other words, because a set of a texture image and a distance image allows the three-dimensional shape to be restored as viewed from an arbitrary angle within a certain range, using multiple such sets makes it possible to represent a free-viewpoint image of the three-dimensional shape with a small amount of data.
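  The restoration described above can be sketched as follows. This is an illustrative example, not taken from the patent: it assumes a pinhole camera model with a hypothetical focal length `f`, an assumed linear mapping of 8-bit values to metric depth between assumed `z_near` and `z_far` planes, and the MPEG-C Part 3 convention that white (255) is nearest.

```python
# Sketch (illustrative assumptions, not the patent's method): back-projecting
# an 8-bit distance image into 3D points under a pinhole camera model.
def depth_to_points(depth, f=525.0, z_near=0.5, z_far=5.0):
    """depth: 2D list of 8-bit values (255 = nearest). Returns (x, y, z) tuples."""
    h, w = len(depth), len(depth[0])
    cx, cy = w / 2.0, h / 2.0
    points = []
    for v in range(h):
        for u in range(w):
            d = depth[v][u]
            # Map 255 -> z_near (white = front), 0 -> z_far (black = back).
            z = z_far - (d / 255.0) * (z_far - z_near)
            points.append(((u - cx) * z / f, (v - cy) * z / f, z))
    return points

pts = depth_to_points([[255, 0], [128, 64]])
```

  Projecting the resulting points back through a virtual camera at a slightly different pose would yield the "other angle" texture image mentioned above.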
  • Non-Patent Document 1 discloses a technique capable of compressing and encoding video by efficiently eliminating temporal or spatial redundancy in the video.
  • A texture video is a video having a texture image as each frame.
  • A distance video is a video having a distance image as each frame.
  • The present inventor has found the following two characteristics shared by a texture image and its distance image. (1) The edges between subject and background in the distance image and the edges between subject and background in the texture image are common. (2) In the distance image, the depth value is relatively flat inside the edges of the subject and the background (that is, within the region surrounded by an edge).
  • First, characteristic (1) will be described. As long as the texture image contains information that allows the subject to be distinguished from the background, the boundary (edge) between subject and background is common to the texture video and the distance video. That is, edge information indicating the contour of the subject is one major element of the correlation between the texture image and the distance image. Next, characteristic (2) will be described.
  • The distance image tends to have lower spatial frequency components than the texture image. For example, even if a person wearing elaborately patterned clothes is drawn in the texture image, the depth value of the clothes portion tends to be constant in the distance image. In other words, in the distance image there is a strong tendency for a single depth value to occupy a wider area than in the texture image.
  • Accordingly, if the texture image is divided into pixel groups of similar color, the depth value is substantially constant within each such range (within each divided pixel group).
  • That is, if the entire region of the texture image is divided into a plurality of regions such that, within each region, the difference between the maximum and minimum pixel values of the included pixel group is at most a predetermined threshold, and the distance image is then divided with the same pattern as the texture image division pattern, the depth value becomes substantially constant in each region of the distance image.
  • In the following, a pixel group divided so that the depth value becomes substantially constant (each region formed by dividing the entire region of the texture image and the distance image) is referred to as a segment.
  • The distance image can then be handled in units of segments rather than pixels. Further, since the distance image is divided based on the corresponding texture image (the texture image at the same time as the distance image), the depth values of adjacent segments in the distance image are likely to be identical or close. Therefore, further compression is possible by exploiting this characteristic to eliminate the spatial redundancy between segments in the distance image.
  • In the technique of Non-Patent Document 1, a texture image is divided into blocks, and spatial redundancy between blocks is eliminated by intra prediction encoding. Specifically, the pixels of the texture image are first grouped into blocks of 4 × 4, 8 × 8, or 16 × 16 pixels. Next, the blocks are encoded in order from the upper-left block of the image to the lower-right block. In encoding each block, the value of each pixel in the encoding target block is predicted with reference to the pixels or pixel columns that are adjacent to the target block and belong to the blocks to its left, top, and top right, all of which are encoded before the target block.
  • Then, the difference obtained by subtracting the predicted value from the actual value of each pixel in the target block is orthogonally transformed and encoded. If the prediction is accurate, the differences can be expected to be smaller than the actual values themselves, and consequently the number of bits required for encoding can be reduced.
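  The block-based intra prediction described above can be illustrated with a minimal sketch. This is a simplification for illustration only: it uses DC prediction (the mean of the already-encoded neighbors) and omits the orthogonal transform step; the block size and reference values are made-up examples.

```python
# Illustrative sketch of block-based intra prediction: predict every pixel
# of a block from the neighbors already encoded (row above, column to the
# left), then return the residual that would be transformed and encoded.
def dc_predict_residual(block, top_row, left_col):
    refs = list(top_row) + list(left_col)
    pred = round(sum(refs) / len(refs))   # DC prediction: mean of references
    return [[px - pred for px in row] for row in block]

block = [[82, 80], [79, 81]]
residual = dc_predict_residual(block, top_row=[80, 80], left_col=[80, 80])
```

  When the prediction is good, the residual values cluster near zero, which is what makes the subsequent transform and entropy coding cheap.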
  • However, the technique of Non-Patent Document 1 is optimized for texture images, and it cannot be applied as-is to a distance image divided into the segment units described above.
  • Non-Patent Document 1 a texture image is divided into blocks each having a square shape.
  • In contrast, in the range image dividing method proposed by the present inventors, the range image is divided into segments of arbitrary shape. This is because, with this division method, coding efficiency improves as the number of segments decreases; it is therefore desirable that each segment can take a flexible shape without any restriction on its form.
  • When the unit of division is a square block, the blocks adjacent to the left, top, and top right of the encoding target block can be uniquely determined. Furthermore, since a block containing the pixels that the target block references for prediction is guaranteed to be encoded before the target block, the decoding side can reproduce the predicted value.
  • With arbitrarily shaped segments, however, the segments adjacent to the encoding target segment cannot be uniquely determined, nor can it be determined which of the adjacent segments has already been encoded. Therefore, even if the technique of Non-Patent Document 1 is applied as-is, the spatial redundancy of a distance image divided into segment units cannot be removed.
  • The present invention has been made in view of the above problems. Its main object is to realize an encoding apparatus that performs encoding while eliminating the spatial redundancy between segments of an image divided into segments of arbitrary shape, and a decoding device that decodes a distance image supplied from such an encoding apparatus.
  • In order to solve the above problems, an encoding apparatus according to the present invention is an encoding apparatus that encodes an image, comprising: dividing means for dividing the entire area of the image into a plurality of regions; representative value determining means for determining, for each region divided by the dividing means, a representative value from the pixel values of the pixels included in the region; number assigning means for assigning numbers to the plurality of regions in raster scan order; prediction value calculating means that takes each region as the encoding target region in the order of the numbers assigned by the number assigning means, takes the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel, takes as prediction reference pixels those pixels that are adjacent to a pixel of the encoding target region on the same scan line as the representative pixel and precede the representative pixel in raster scan order, and calculates a prediction value of the encoding target region based on at least one of the representative values of the regions containing the prediction reference pixels; difference value calculating means for calculating, for each encoding target region, a difference value by subtracting the prediction value calculated by the prediction value calculating means from the representative value determined by the representative value determining means; and encoding means for arranging and encoding the difference values calculated by the difference value calculating means in the order of the numbers assigned by the number assigning means, thereby generating encoded data of the image.
  • the number assigning unit assigns numbers in the raster scan order to the plurality of regions into which the dividing unit has divided the image.
  • The prediction value calculation means takes each region as the encoding target region in the order of the numbers assigned by the number assigning means, and takes the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel.
  • A pixel that is adjacent to a pixel of the encoding target region on the same scan line as the representative pixel, and that precedes the representative pixel in raster scan order, is taken as a prediction reference pixel.
  • The prediction value calculation means then calculates the prediction value of the encoding target region based on at least one of the representative values of the regions containing the prediction reference pixels.
  • the difference value calculation means subtracts the prediction value calculated by the prediction value calculation means from the representative value determined by the representative value determination means for each encoding target region to calculate a difference value. Then, the encoding unit arranges and encodes the difference values calculated by the difference value calculation unit in the order given by the number assigning unit, and generates encoded data of the image.
  • the order of the areas can be uniquely specified.
  • Moreover, the representative pixel used when calculating the prediction value of each region's representative value, and the prediction reference pixels based on that representative pixel, can be uniquely specified. Therefore, the prediction value of the encoding target region, determined from the representative values of the regions adjacent to it, can be uniquely calculated.
  • Note that the prediction reference pixels for a given region need to be the same at encoding time and at decoding time. Therefore, a region containing a prediction reference pixel for a given region needs to be decoded before that region, that is, needs to be encoded first.
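  One plausible reading of the prediction-reference-pixel rule can be sketched as follows. The function name is illustrative, and for brevity the sketch restricts candidates to the representative pixel's own neighbors (the left neighbor and the three neighbors on the row above, all of which precede it in raster scan order); the claim wording also admits neighbors of other in-region pixels on the representative pixel's scan line.

```python
# Sketch (one plausible reading, illustrative names): the neighbors of a
# segment's representative pixel that precede it in raster scan order.
# Because such a pixel comes earlier in the raster scan, the segment that
# contains it has a smaller number and is already encoded/decoded, which
# satisfies the encode-before-use constraint stated above.
def prediction_reference_pixels(rep, height, width):
    r, c = rep
    candidates = [(r, c - 1), (r - 1, c - 1), (r - 1, c), (r - 1, c + 1)]
    return [(i, j) for (i, j) in candidates
            if 0 <= i < height and 0 <= j < width]

refs = prediction_reference_pixels((1, 1), height=4, width=4)
```

  A representative pixel at the very start of the image has no preceding neighbors, so its segment needs a default prediction.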
  • In order to solve the above problems, an encoding method according to the present invention is an encoding method of an encoding device that encodes an image, including: a dividing step of dividing the entire area of the image into a plurality of regions; a representative value determining step of determining, for each divided region, a representative value from the pixel values of the pixels included in the region; a number assigning step of assigning numbers to the plurality of regions in raster scan order; a prediction value calculating step of taking each region as the encoding target region in the order of the numbers assigned in the number assigning step, taking the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel, taking as prediction reference pixels those pixels that are adjacent to a pixel of the encoding target region on the same scan line as the representative pixel and precede the representative pixel in raster scan order, and calculating a prediction value of the encoding target region based on at least one representative value of the regions containing the prediction reference pixels; a difference value calculating step of calculating a difference value by subtracting the prediction value calculated in the prediction value calculating step from the representative value determined in the representative value determining step; and an encoding step of arranging and encoding the difference values calculated in the difference value calculating step in the order of the numbers assigned in the number assigning step, thereby generating encoded data of the image.
  • the encoding method according to the present invention has the same effects as the encoding apparatus according to the present invention.
  • In order to solve the above problems, a decoding device according to the present invention decodes encoded data of an image, the encoded data including, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value between a representative value of the pixel values of the pixels included in the region and a prediction value of that representative value, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each prediction value being calculated by taking each region as the encoding target region in the order of the numbers, taking the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel, and taking as prediction reference pixels those pixels that are adjacent to a pixel of the encoding target region on the same scan line as the representative pixel and precede the representative pixel in raster scan order. The decoding device comprises: dividing means for dividing the entire area of the image into the plurality of regions based on region information defining the plurality of regions; decoding means for decoding the encoded data and generating the difference values arranged in order; number assigning means for assigning numbers to the plurality of regions divided by the dividing means in raster scan order; allocating means for allocating the difference values, in order from the top, to the plurality of regions in the order of the numbers assigned by the number assigning means; prediction value calculating means that takes each region as the decoding target region in the order of the numbers assigned by the number assigning means, takes the first pixel in raster scan order among the pixels included in the decoding target region as the representative pixel, takes as prediction reference pixels those pixels that are adjacent to a pixel of the decoding target region on the same scan line as the representative pixel and precede the representative pixel in raster scan order, and calculates a prediction value of the decoding target region based on the pixel value of at least one of the prediction reference pixels; and pixel value setting means for calculating, for each decoding target region, the pixel value of the decoding target region by adding the difference value allocated by the allocating means to the prediction value calculated by the prediction value calculating means, and setting the pixel values of all pixels included in the decoding target region to the calculated value. The prediction value calculating means and the pixel value setting means repeatedly execute this processing for each decoding target region in the order of the numbers, thereby restoring the pixel values of the image.
  • the decoding unit decodes the encoded data and generates difference values arranged in order.
  • The allocating means allocates the difference values, in order from the top, to the plurality of regions obtained by dividing the image based on the region information defining them, in the order of the numbers assigned by the number assigning means in raster scan order.
  • The prediction value calculation means takes each region as the decoding target region in the order of the numbers assigned by the number assigning means, and takes the first pixel in raster scan order among the pixels included in the decoding target region as the representative pixel.
  • A pixel that is adjacent to a pixel of the decoding target region on the same scan line as the representative pixel, and that precedes the representative pixel in raster scan order, is set as a prediction reference pixel.
  • the predicted value calculation means calculates a predicted value of the decoding target region based on the pixel value of at least one pixel among the predicted reference pixels.
  • the pixel value setting means calculates the pixel value of the decoding target area by adding the difference value assigned by the assigning means to the prediction value calculated by the prediction value calculating means for each decoding target area, The pixel values of all the pixels included in the decoding target area are set to the calculated pixel values.
  • the prediction value calculation means and the pixel value setting means repeatedly execute the above processing for each decoding target area in the order of the numbers given by the number assignment means, and restore the pixel values of the image.
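  The decoder-side loop described above can be sketched as follows. This is a hedged simplification: the segment map representation, the choice of predictor (the representative value of the already-decoded segment owning the left or upper neighbor of the segment's first pixel), and the default prediction of 128 for the first segment are illustrative assumptions, not details stated in this excerpt.

```python
# Sketch of the decoding loop (illustrative assumptions noted above):
# difference values are assigned to segments in number order, each
# prediction comes from an already-decoded neighboring segment, and every
# pixel of a segment is set to prediction + difference.
def decode_segments(seg_map, diffs, first_pred=128):
    """seg_map: 2D list of segment numbers (numbered in raster-scan order of
    first appearance); diffs: one difference value per segment, in order."""
    h, w = len(seg_map), len(seg_map[0])
    rep_val = {}                      # segment number -> restored value
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            s = seg_map[r][c]
            if s not in rep_val:      # first pixel of segment s in raster order
                if c > 0:
                    pred = rep_val[seg_map[r][c - 1]]   # left neighbor's segment
                elif r > 0:
                    pred = rep_val[seg_map[r - 1][c]]   # upper neighbor's segment
                else:
                    pred = first_pred                   # assumed default
                rep_val[s] = pred + diffs[s]
            out[r][c] = rep_val[s]
    return out

img = decode_segments([[0, 0, 1], [0, 1, 1]], diffs=[-8, 5])
```

  Because a segment's first pixel is always preceded in raster order by pixels of lower-numbered segments, every prediction refers only to values that are already restored.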
  • the decoding target area is the same as the plurality of areas into which the image indicated by the encoded data is divided.
  • Moreover, the representative pixel used when calculating the prediction value of each decoding target region's representative value, and the prediction reference pixels based on it, can be uniquely specified, and they can be made the same pixels as the representative pixel and prediction reference pixels of the corresponding encoding target region. Therefore, the image indicated by the encoded data can be accurately restored.
  • Similarly, a decoding method according to the present invention decodes encoded data of an image, the encoded data including, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value between a representative value of the pixel values of the pixels included in the region and a prediction value of that representative value, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each prediction value being calculated by taking each region as the encoding target region in the order of the numbers and taking the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel. The method includes: a dividing step of dividing the entire area of the image into the plurality of regions; a decoding step of decoding the encoded data and generating the difference values arranged in order; a number assigning step of assigning numbers to the regions divided in the dividing step in raster scan order; an allocating step of allocating the difference values, in order from the top, to the regions in the order of the numbers assigned in the number assigning step; and a prediction value calculating step and a pixel value setting step that, taking each region as the decoding target region in the order of the numbers, take the first pixel in raster scan order as the representative pixel, calculate the prediction value from the prediction reference pixels, and set the pixel values of all pixels included in the decoding target region. The prediction value calculating step and the pixel value setting step are repeatedly executed for each decoding target region to restore the pixel values of the image.
  • the decoding method according to the present invention has the same operational effects as the decoding device according to the present invention.
  • As described above, the encoding apparatus according to the present invention can generate encoded data that can be uniquely decoded while eliminating the spatial redundancy between regions, even when the plurality of regions into which the image is divided have arbitrary shapes.
  • the decoding device has an effect that the image indicated by the encoded data can be accurately restored.
  • FIG. 4 is a diagram showing the distribution of each segment defined by the moving image encoding apparatus of FIG. 1 from the texture image of FIG. 3.
  • FIG. 6 is a diagram illustrating a segment boundary portion defined by the image division processing unit of the moving image encoding device of FIG. 1.
  • FIG. 7 shows 12 pixels, 3 vertical by 4 horizontal, that constitute a partial area of the texture image.
  • FIGS. 7(a) and 7(b) show cases where two pixels are adjacent vertically or horizontally.
  • FIG. 7(c) shows a case where two pixels are in contact at only one point. It is a diagram showing the order in which the texture image is scanned to determine the segment numbers that the moving image encoder of FIG. 1 assigns.
  • It is a diagram schematically showing data in which the representative value of the distance values in the corresponding segment of the distance image is associated with the segment number assigned in raster scan order by the moving image encoder of FIG. 1. It is a flowchart showing an example of the predictive encoding process performed by the predictive encoding unit of the moving image encoder of FIG. 1. FIGS. 12(a) to 12(e) show 12 pixels, 3 vertical by 4 horizontal, constituting a partial region of the distance image, and show specific examples of the representative pixel used by the predictive encoding unit to predict the representative value of a segment and of the prediction reference pixels based on that representative pixel.
  • Generally speaking, the moving picture coding apparatus is a device that generates encoded data for each frame constituting a three-dimensional moving picture by encoding the texture image and the distance image constituting the frame.
  • The moving picture encoding apparatus uses the encoding technique adopted in the H.264/MPEG-4 AVC standard for encoding texture images, while using an encoding technique peculiar to the present invention for encoding distance images.
  • the above encoding technique unique to the present invention is an encoding technique developed by paying attention to the fact that there is a correlation between a texture image and a distance image.
  • That is, when a certain area in the texture image consists of a pixel group of similar colors, the values of all or almost all of the pixels in the corresponding area of the distance image are the same.
  • the values of the pixels constituting the texture image and the distance image are referred to as pixel values.
  • the pixel value in the texture image indicates information regarding the luminance and color of each pixel.
  • The pixel value in the distance image indicates information related to the depth of each pixel.
  • the pixel value of the texture image is referred to as a color value
  • the pixel value of the distance image is referred to as a distance value.
  • FIG. 1 is a block diagram illustrating the configuration of the main part of the moving image encoding device 1.
  • the moving image encoding apparatus 1 includes an image encoding unit 11, an image decoding unit (decoding unit) 12, a distance image encoding unit 20, and a packaging unit (transmission unit) 28.
  • the distance image encoding unit 20 includes an image division processing unit 21, a distance image division processing unit (dividing unit) 22, a distance value correcting unit (representative value determining unit) 23, a number assigning unit (number assigning unit) 24, and A prediction encoding unit (prediction value calculation means, difference value calculation means, encoding means) 25 is provided.
  • The image encoding unit 11 encodes the texture image #1 by AVC (Advanced Video Coding) as defined in the H.264/MPEG-4 AVC standard.
  • the image decoding unit 12 decodes the texture image # 1 'from the encoded data # 11 of the texture image # 1.
  • the image division processing unit 21 divides the entire area of the texture image # 1 into a plurality of segments (areas). Then, the image division processing unit 21 outputs segment information # 21 including position information of each segment.
  • the segment position information is information indicating the position of the segment in the texture image # 1.
  • When the distance image #2 and the segment information #21 are input, the distance image division processing unit 22 extracts, for each segment in the texture image #1′, a distance value set consisting of the distance values of the pixels included in the corresponding segment (region) of the distance image #2. Then, the distance image division processing unit 22 generates, from the segment information #21, segment information #22 in which the distance value set and the position information are associated with each segment.
  • For each segment of the distance image #2, the distance value correction unit 23 calculates the mode of the segment's distance value set included in the segment information #22 as the representative value #23a. That is, when segment i of the distance image #2 includes N pixels, the distance value correcting unit 23 calculates the mode of the N distance values.
  • Instead of the mode, the distance value correcting unit 23 may calculate the average of the N distance values, their median, or the like as the representative value #23a.
  • Further, when the calculated average value, median value, or the like turns out to be a non-integer, the distance value correcting unit 23 may round it to an integer by rounding down, rounding up, or rounding off.
  • the distance value correcting unit 23 replaces the distance value set of each segment included in the segment information # 22 with the representative value # 23a of the corresponding segment, and outputs it to the number assigning unit 24 as the segment information # 23.
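  The representative-value computation described above can be sketched as follows. The function name is illustrative; note that Python's built-in `round` uses banker's rounding, whereas the text allows rounding down, up, or off, so the rounding choice here is one of the permitted options, not the only one.

```python
# Sketch of the representative value #23a: the mode of a segment's distance
# values, with mean and median as the alternatives the text mentions
# (non-integer results rounded to integers).
from collections import Counter
from statistics import median

def representative_value(distances, method="mode"):
    if method == "mode":
        return Counter(distances).most_common(1)[0][0]
    if method == "mean":
        return round(sum(distances) / len(distances))
    if method == "median":
        return round(median(distances))
    raise ValueError(method)

rep = representative_value([10, 10, 10, 12, 200])
```

  The mode is robust to a few outlier depth values inside a segment, which fits the observation that depth is nearly constant within a segment.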
  • The number assigning unit 24 scans the pixels of the distance image in raster scan order, assigns a segment number #24 to each segment (each region defined by the segment information #23) in the order in which the segments are first encountered, and associates each number with the corresponding representative value #23a included in the segment information #23.
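  The numbering step above can be sketched as follows; the per-pixel label map used to represent the segmentation is an assumed data structure, not one specified in the text.

```python
# Sketch of segment numbering: scan pixels in raster order and give each
# segment a number in the order its pixels are first encountered.
def number_segments(label_map):
    """label_map: 2D list of arbitrary per-pixel segment labels.
    Returns {original label: raster-scan segment number}."""
    numbers = {}
    for row in label_map:               # rows top to bottom
        for label in row:               # pixels left to right
            if label not in numbers:
                numbers[label] = len(numbers)
    return numbers

labels = [["b", "b", "a"],
          ["b", "a", "a"]]
numbers = number_segments(labels)
```

  Because the numbering is derived purely from the division pattern and the raster scan, the decoder can reproduce exactly the same numbers from the same region information.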
  • The predictive encoding unit 25 performs predictive encoding processing based on the input M sets of representative values #23a and segment numbers #24, and outputs the obtained encoded data #25 to the packaging unit 28. Specifically, the predictive encoding unit 25 calculates, for each segment in segment number #24 order, the segment's prediction value, subtracts the prediction value from the representative value to obtain the difference value, and encodes the difference value. Then the predictive encoding unit 25 arranges the encoded difference values in segment number #24 order to obtain the encoded data #25, which it outputs to the packaging unit 28.
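  The difference-value computation above can be sketched as follows. This is a deliberate simplification: it predicts each segment from the previous segment in number order with an assumed default of 128 for the first segment, whereas the apparatus derives the prediction from the segments owning the prediction reference pixels; the function name is illustrative.

```python
# Simplified sketch of predictive encoding of representative values:
# visit segments in number order and encode representative - prediction.
def encode_differences(rep_values, first_pred=128):
    """rep_values: representative values listed in segment-number order."""
    diffs = []
    pred = first_pred
    for v in rep_values:
        diffs.append(v - pred)
        pred = v            # next segment predicted from this one (simplified)
    return diffs

diffs = encode_differences([120, 125, 125, 90])
```

  Since adjacent segments tend to have identical or close depth values, the differences cluster near zero and encode in fewer bits than the raw representative values.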
  • the packaging unit 28 associates the encoded data # 11 of the texture image # 1 and the encoded data # 25 of the distance image # 2 and outputs them as encoded data # 28 to the outside.
  • FIG. 2 is a flowchart showing the operation of the moving image encoding apparatus 1.
  • The operation of the moving image encoding apparatus 1 described here is the operation of encoding the texture image and the distance image of the t-th frame from the head of a moving image consisting of many frames. That is, to encode the entire moving image, the moving image encoding apparatus 1 repeats the operation described below as many times as there are frames.
  • each data # 1 to # 28 is interpreted as data of the t-th frame.
  • the image encoding unit 11 and the distance image division processing unit 22 respectively receive the texture image # 1 and the distance image # 2 from the outside of the moving image encoding device 1 (step S1).
  • The pair of the texture image #1 and the distance image #2 received from the outside are correlated in image content, as can be seen, for example, by comparing the texture image of FIG. 3 with the corresponding distance image.
  • The image encoding unit 11 encodes the texture image #1 by the AVC encoding method stipulated in the H.264/MPEG-4 AVC standard, and outputs the obtained encoded data #11 of the texture image to the packaging unit 28 and the image decoding unit 12 (step S2).
  • When the texture image #1 is a B picture or a P picture, in step S2 the image encoding unit 11 encodes the prediction residual between the texture image #1 and the predicted image, and outputs the encoded prediction residual as the encoded data #11.
  • the image decoding unit 12 decodes the texture image # 1 'from the encoded data # 11 and outputs it to the image division processing unit 21 (step S3).
  • the texture image # 1 'to be decoded is not completely the same as the texture image # 1 encoded by the image encoding unit 11. This is because the image encoding unit 11 performs the DCT conversion process and the quantization process during the encoding process, but a quantization error occurs when the DCT coefficient obtained by the DCT conversion is quantized.
  • the timing at which the image decoding unit 12 decodes the texture image differs depending on whether or not the texture image # 1 is a B picture. This will be described in detail.
  • When the texture image #1 is an I picture, the image decoding unit 12 decodes the texture image #1′ without performing inter prediction (inter-screen prediction).
  • When the texture image #1 is a P picture, the image decoding unit 12 decodes the prediction residual from the encoded data #11. Then, the image decoding unit 12 decodes the texture image #1′ by adding the prediction residual to a predicted image generated using the encoded data #11 of one or more frames before the t-th frame as reference pictures.
  • When the texture image #1 is a B picture, the image decoding unit 12 decodes the prediction residual from the encoded data #11. Then, the image decoding unit 12 decodes the texture image #1′ by adding the prediction residual to a predicted image generated using, as reference pictures, the encoded data #11 of one or more frames before the t-th frame and the encoded data #11 of one or more frames after the t-th frame.
  • When the texture image #1 is not a B picture, the timing at which the image decoding unit 12 decodes the texture image #1′ of the t-th frame is immediately after the encoded data #11 of the t-th frame is generated.
  • When the texture image #1 is a B picture, the timing at which the image decoding unit 12 decodes the texture image #1′ is after the time when the encoding process for the texture image #1 of the T-th frame (T > t; the last frame among the reference pictures) is completed.
  • the image division processing unit 21 defines a plurality of segments from the input texture image # 1 '(step S4).
  • Each segment defined by the image division processing unit 21 is a closed (connected) group of pixels of similar colors, that is, pixels for which the difference between the maximum and minimum pixel values (the difference between the maximum and minimum color values) is equal to or less than a predetermined threshold.
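As an illustrative sketch only (not part of the disclosed embodiment), the threshold criterion above can be expressed as a simple predicate. The function name, the use of scalar pixel values, and the threshold parameter are assumptions; the embodiment compares color values.

```python
def is_one_segment(pixel_values, threshold):
    # A pixel group may form one segment only if the difference between
    # its maximum and minimum pixel value is at most the threshold.
    return max(pixel_values) - min(pixel_values) <= threshold
```

For example, a group of near-equal values passes, while a group spanning a large range does not.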
  • FIG. 5 is a diagram showing the distribution of the segments defined by the image division processing unit 21 from the texture image #1′ of FIG. 3.
  • the closed region drawn by the same pattern indicates one segment.
  • For example, the hair on the left and right sides of the girl's head is drawn in two colors, brown and light brown.
  • The image division processing unit 21 defines such a closed region made up of pixels of similar colors, such as brown and light brown, as one segment.
  • Meanwhile, the skin portion of the girl's face is drawn in two colors: the skin color and the pink of the cheek portions.
  • Each pink region is defined as a separate segment. This is because the skin color and the pink are not similar (that is, the difference between the skin color value and the pink color value exceeds the predetermined threshold).
  • After the process of step S4, the image division processing unit 21 generates segment information #21 including the position information of each segment and outputs it to the distance image division processing unit 22 (step S5).
  • As the position information of a segment, for example, the coordinate values of all the pixels included in the segment can be used. That is, when each segment is defined from the texture image #1′ of FIG. 3, each closed region in FIG. 6 is defined as one segment, and the position information of a segment consists of the coordinate values of all the pixels constituting the closed region corresponding to that segment.
  • Parts (a) to (c) of FIG. 7 each show a partial region of the texture image consisting of 3 × 4 = 12 pixels.
  • In each partial region, the color of the pixel labeled "A" and the color of the pixel labeled "B" are the same or similar, while the colors of the other ten pixels are completely different from the colors of the pixel A and the pixel B.
  • As described above, each segment is a closed region (a group of connected pixels) made up of pixels of similar colors.
  • The definition of this connection will be described with reference to FIG. 7.
  • The pixel A and the pixel B are considered connected when the positional relationship between the two pixels is as shown in (a) or (b) of FIG. 7, that is, when they are in contact in the vertical or horizontal direction; in other words, when they share a side. In this case, the pixel A and the pixel B form the same segment.
  • In the case of (c) of FIG. 7, the pixel A and the pixel B are not connected. That is, when the pixel A and the pixel B are in contact only in an oblique direction, in other words, only at a single point, they are considered not connected. In this case, the pixel A and the pixel B have the same or similar colors but belong to different segments. Needless to say, when the pixel A and the pixel B are not in contact at all, they also belong to separate segments.
  • Two pixels being adjacent to each other is strictly synonymous with the Manhattan distance between the coordinates of the two pixels being "1", and two pixels being non-adjacent is synonymous with that Manhattan distance being "2 or more".
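The adjacency rule above can be sketched directly on pixel coordinates. This is an illustrative fragment; the (row, column) coordinate convention and the function names are assumptions.

```python
def manhattan(p, q):
    # Manhattan (city-block) distance between two pixel coordinates.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def connected(p, q):
    # Two pixels are connected (cases (a)/(b) of FIG. 7) iff they share
    # a side, i.e. their Manhattan distance is exactly 1.
    return manhattan(p, q) == 1
```

Diagonally touching pixels have Manhattan distance 2 and are therefore not connected, matching case (c).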
  • In this specification, each region obtained by dividing the entire texture image and distance image so that the distance (depth) value is substantially constant, that is, a connected pixel group obtained by such division, is referred to as a segment.
  • When the pixel A and the pixel B are in the positional relationship shown in FIG. 7(a) or 7(b), the pixel A is also referred to as being adjacent to the pixel B.
  • When the pixel A and the pixel B are in any of the positional relationships shown in FIGS. 7(a) to 7(c), the pixel A is also referred to as being close to the pixel B.
  • When a pixel included in a segment is adjacent to a pixel included in another segment, the segment is referred to as being adjacent to that other segment.
  • Likewise, when a pixel included in a segment is close to a pixel included in another segment, the segment is referred to as being close to that other segment.
  • Next, the distance image division processing unit 22 divides the input distance image #2 into a plurality of segments. Specifically, the distance image division processing unit 22 refers to the input segment information #21, specifies the position of each segment in the texture image #1′, and divides the distance image #2 into a plurality of segments with the same division pattern as that of the texture image #1′ (in the following description, the number of segments is assumed to be M).
  • Then, the distance image division processing unit 22 extracts, for each segment of the distance image #2, the distance values of the pixels included in the segment as a distance value set. Furthermore, the distance image division processing unit 22 associates the distance value set extracted from each segment with the position information of that segment included in the segment information #21, and outputs the result as the segment information #22 to the distance value correction unit 23 (step S6).
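As a hedged sketch, the per-segment extraction of distance value sets might look as follows, assuming (as suggested above) that the position information is a list of pixel coordinates per segment and that images are indexed as (row, column); these representations are illustrative assumptions.

```python
def extract_distance_sets(distance_image, segment_positions):
    # distance_image: 2D list of distance values.
    # segment_positions: one list of (row, col) coordinates per segment,
    # i.e. the segment position information of segment information #21.
    # Returns one distance-value set (list) per segment.
    return [[distance_image[r][c] for (r, c) in pixels]
            for pixels in segment_positions]
```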
  • The distance value correction unit 23 calculates, for each segment of the distance image #2, the mode value of the segment's distance value set included in the segment information #22 as the representative value #23a. Then, the distance value correction unit 23 replaces each of the M distance value sets included in the segment information #22 with the representative value #23a of the corresponding segment, and outputs the result as the segment information #23 to the number assigning unit 24 (step S7).
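The mode computation can be sketched as follows. The tie-breaking rule (smaller value wins) is an assumption, since the source does not specify what happens when several distance values are equally frequent.

```python
from collections import Counter

def representative_value(distance_set):
    # Mode (most frequent distance value) of a segment's distance-value set.
    # Tie-breaking is not specified in the source; the smaller value wins here.
    counts = Counter(distance_set)
    max_count = max(counts.values())
    return min(v for v, c in counts.items() if c == max_count)
```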
  • The number assigning unit 24 associates, for each of the M pairs of position information and representative value #23a included in the segment information #23, the representative value #23a with the segment number #24 corresponding to the position information, and outputs the M pairs of representative value #23a and segment number #24 to the predictive encoding unit 25 (step S8). Specifically, based on the segment information #23, the number assigning unit 24 associates the segment number "i − 1" with the representative value #23a of the i-th segment in raster scan order, for each i from 1 to M (M: the number of segments).
  • Here, the "i-th segment in raster scan order" is the i-th distinct segment reached when the distance image or the texture image is scanned in the raster scan order shown in FIG. 8.
  • FIG. 9 is a diagram schematically showing the position of each segment of the distance image input to the moving image encoding device 1, together with the texture image. In FIG. 9, one closed region indicates one segment.
  • The segment number "0" is assigned to the segment R0 located first in raster scan order, and the segment number "1" is assigned to the segment R1 located second. Similarly, the segment numbers "2" and "3" are assigned to the third and fourth segments R2 and R3 in raster scan order, respectively.
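The numbering rule above can be sketched as follows. The label-map representation (a 2D array holding an arbitrary per-pixel segment label) and the function name are illustrative assumptions, not from the source.

```python
def assign_segment_numbers(label_map):
    # label_map: 2D list; each entry is the (arbitrary) label of the segment
    # the pixel belongs to. Numbers 0, 1, 2, ... are handed out in the order
    # in which each segment is first reached during a raster scan.
    numbers = {}
    for row in label_map:          # rows top to bottom
        for label in row:          # pixels left to right
            if label not in numbers:
                numbers[label] = len(numbers)
    return numbers
```

Thus the first segment reached receives number 0, the second distinct segment number 1, and so on, matching the R0 to R3 example.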
  • Then, the number assigning unit 24 outputs the M pairs of representative value #23a and segment number #24, a specific example of which is shown in FIG. 10, to the predictive encoding unit 25.
  • The predictive encoding unit 25 performs a predictive encoding process based on the input M pairs of representative value #23a and segment number #24, and outputs the obtained encoded data #25 to the packaging unit 28 (step S9). Specifically, the predictive encoding unit 25 calculates a predicted value for each segment in order of segment number #24, subtracts the predicted value from the representative value to obtain a difference value, and encodes the difference value. Then, the predictive encoding unit 25 arranges the encoded difference values in order of segment number #24 to obtain the encoded data #25.
  • FIG. 11 is a flowchart illustrating an example of the prediction encoding process performed by the prediction encoding unit 25.
  • First, "i", the segment number #24, is set to "0" (step S101). Then, the segment whose segment number #24 is "i" is set as the encoding target segment (encoding target region) (step S102). That is, the segment with the first segment number "0" is set as the encoding target segment first.
  • Next, the representative pixel of the encoding target segment, which is used for calculating the predicted value, is specified from among the pixels included in the encoding target segment (step S103). Specifically, the pixel included in the encoding target segment that is scanned first in the raster scan order of step S8 (the first pixel in raster scan order) is set as the representative pixel.
  • Although segments in the present invention can have various shapes as described above, the pixel scanned first in raster scan order is uniquely determined regardless of the shape of the segment.
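The "first pixel in raster scan order" rule can be expressed directly on pixel coordinates; the (row, column) convention and the function name are assumptions made for this sketch.

```python
def representative_pixel(segment_pixels):
    # The representative pixel is the segment's first pixel in raster scan
    # order: smallest row index, then smallest column within that row.
    # (row, col) tuples compare lexicographically, which matches that order.
    return min(segment_pixels)
```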
  • Next, the prediction reference pixels are specified based on the representative pixel (step S104). Specifically, a prediction reference pixel is a pixel that is close to a pixel of the encoding target segment on the same scan line as the representative pixel and that precedes the representative pixel in raster scan order.
  • For example, a prediction reference pixel may be the pixel immediately preceding the representative pixel in raster scan order, that is, the pixel adjacent to it on the left.
  • The prediction reference pixels may also include a pixel on the scan line immediately preceding the representative pixel in raster scan order, and the pixel on that preceding scan line that is adjacent to the pixel immediately following, in raster scan order, the last pixel of the encoding target segment on the same scan line as the representative pixel.
  • That is, the prediction reference pixels may be three pixels: the pixel immediately preceding the representative pixel in raster scan order, a pixel on the scan line immediately preceding the representative pixel, and the pixel on that preceding scan line adjacent to the pixel immediately following the last pixel of the encoding target segment on the same scan line as the representative pixel.
  • The pixels denoted by "A" in FIGS. 12(a) to 12(e) (referred to as pixels A) are pixels constituting the same segment RA.
  • The pixels labeled "B", "C", or "D" (referred to as pixels B, C, and D, respectively) and the blank pixels constitute segments different from the segment RA containing the pixels A.
  • The segments to which the pixels other than the pixels A belong may all be the same segment, or may all be different segments. In FIGS. 12(a) to 12(e), the representative pixel in each case and the pixels on the same scan line as the representative pixel (scan line: a pixel row in the present embodiment) are hatched.
  • In the following, a pixel on the same scanning line as a certain pixel means a pixel in the same row as that pixel.
  • A pixel immediately preceding a certain pixel in raster scan order means the pixel one to the left of that pixel.
  • A pixel immediately following a certain pixel in raster scan order means the pixel one to the right of that pixel.
  • A pixel that is adjacent to a certain pixel and lies on the scan line preceding that pixel in raster scan order means the pixel one above that pixel.
  • In the examples shown in FIGS. 12(a) to 12(e), the encoding target segment is assumed to be the segment RA containing the pixels A, and the identification of the representative pixel and the prediction reference pixels of the segment RA will be described.
  • In the example of FIG. 12(a), the representative pixel is the topmost, hatched pixel A. The pixel B one pixel to the left of the representative pixel, the pixel C one pixel above the representative pixel, and the pixel D diagonally above and to the right of the representative pixel are used as the prediction reference pixels.
  • In the example of FIG. 12(b), the representative pixel is the left one of the hatched pixels A. The pixel B one pixel to the left of the representative pixel, the pixel C one pixel above the representative pixel, and the pixel D diagonally above and to the right of the pixel A one pixel to the right of the representative pixel are used as the prediction reference pixels.
  • In the example of FIG. 12(c), the representative pixel is the leftmost of the hatched pixels A. The pixel B one pixel to the left of the representative pixel, the pixel C one pixel above the representative pixel, and the pixel D diagonally above and to the right of the pixel A located at the rightmost position in the same row as the representative pixel are used as the prediction reference pixels.
  • In the example of FIG. 12(d), the representative pixel is the left one of the hatched pixels A. The pixel B one pixel to the left of the representative pixel, the pixel C one pixel above the representative pixel, and the pixel C diagonally above and to the right of the representative pixel are used as the prediction reference pixels.
  • In the example of FIG. 12(e), the representative pixel is the leftmost of the hatched pixels A. The pixel B one pixel to the left of the representative pixel, the three pixels C located one pixel above the respective pixels A in the same row as the representative pixel, and the pixel D diagonally above and to the right of the rightmost pixel A in the same row as the representative pixel are used as the prediction reference pixels.
  • As described above, the pixels are scanned in the order shown in FIG. 8 and the segment numbers #24 are assigned accordingly. Therefore, when the segments are encoded in the order indicated by the segment numbers #24, it is guaranteed that the segments containing the pixel B, the pixel C, and the pixel D illustrated in FIG. 12 are encoded before the encoding target segment (the segment containing the pixels A).
  • Next, the predicted value of the representative value of the encoding target segment is calculated based on the representative values of the segments having the prediction reference pixels (step S105). For example, when the prediction reference pixels are the pixel B, the pixel C, and the pixel D as in the example illustrated in FIG. 12(a), the predicted value Z′_A of the representative value Z_A of the segment RA is calculated based on the representative value Z_B of the segment RB containing the pixel B, the representative value Z_C of the segment RC containing the pixel C, and the representative value Z_D of the segment RD containing the pixel D.
  • Z′_A may be a median value of Z_B, Z_C, and Z_D.
  • Z′_A may be an average value of Z_B, Z_C, and Z_D.
  • Z′_A may be any value of Z_B, Z_C, and Z_D.
  • Next, the difference value ΔZ_A is calculated by subtracting the predicted value Z′_A from the representative value Z_A of the encoding target segment (step S106).
  • The calculated difference value ΔZ_A represents the distance values of the pixels included in the encoding target segment. As described above, a distance value has 256 levels, taking a value from 0 to 255; therefore, ΔZ_A can take a value from −255 to +255.
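The prediction and difference steps above can be sketched as follows. The median is used here as the predictor, which is only one of the candidate predictors named above (median, average, or any one of Z_B, Z_C, Z_D); the function name is an assumption.

```python
import statistics

def difference_value(z_a, z_b, z_c, z_d):
    # Predicted value Z'_A from the three reference-segment representative
    # values; the median is one of the candidate predictors in the source.
    z_pred = statistics.median([z_b, z_c, z_d])
    return z_a - z_pred  # lies in [-255, +255] for 8-bit distance values
```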
  • Next, the calculated difference value is encoded by a variable-length encoding method in which the codeword is shorter the closer the value is to 0 (step S107).
  • In the present embodiment, the difference value is encoded using the exponential Golomb encoding method, which is one of the variable-length encoding methods.
  • FIG. 13 shows the correspondence between the difference value and the code word in the exponential Golomb encoding method.
  • In FIG. 13, the difference value is shown in the right column, and the codeword obtained when exponential Golomb coding is applied to the difference value is shown in the left column.
  • As shown in FIG. 13, the assigned codeword becomes shorter as the difference value approaches 0, that is, as the predicted value approaches the representative value approximating the actual distance value. Therefore, the distance image can be transmitted while reducing the amount of information to be transmitted.
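A sketch of signed exponential Golomb coding follows. The exact difference-to-codeword table of FIG. 13 is not reproduced in this text, so this fragment assumes the signed mapping used by H.264's se(v) syntax element (0→0, 1→1, −1→2, 2→3, −2→4, ...); it still illustrates the key property that values near 0 receive shorter codewords.

```python
def signed_exp_golomb(v):
    # Map a signed difference value to a code number (H.264 se(v) mapping).
    code_num = 2 * v - 1 if v > 0 else -2 * v
    # Unsigned exp-Golomb codeword: M leading zeros followed by the
    # (M + 1)-bit binary representation of code_num + 1.
    bits = bin(code_num + 1)[2:]
    return "0" * (len(bits) - 1) + bits
```

For the example difference values of FIG. 14, this yields "00110" for 3, "0001001" for −4, "011" for −1, and "1" for 0 under the assumed mapping.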
  • After step S107, it is confirmed whether all (M) segments have been encoded. If not all segments have been encoded yet, the processes of steps S102 to S107 are executed for the next segment in order of segment number #24. If difference values have been calculated and encoded for all segments, the process proceeds to step S110.
  • In step S110, the encoded difference values are arranged in order of segment number #24 to generate the encoded data #25.
  • a specific example of the encoded data # 25 is shown in FIG. FIG. 14 shows an example of encoded data # 25 in which difference values “3”, “ ⁇ 4”, “ ⁇ 1”, and “0” are encoded in order.
  • the predictive encoding unit 25 compresses the input data to generate encoded data # 25, and outputs the generated encoded data # 25 to the packaging unit 28 (step S9).
  • the packaging unit 28 integrates the encoded data # 11 output from the image encoding unit 11 in step S2 and the encoded data # 25 output from the predictive encoding unit 25 in step S9. Then, the obtained encoded data # 28 is transmitted to a moving picture decoding apparatus to be described later (step S10).
  • The packaging unit 28 integrates the texture image encoded data #11 and the distance image encoded data #25 in a form conforming to the H.264/MPEG-4 AVC standard. More specifically, the integration of the encoded data #11 and the encoded data #25 is performed as follows.
  • FIG. 15 is a diagram schematically showing the configuration of a NAL unit. As shown in FIG. 15, the NAL unit is composed of three parts: a NAL header part, an RBSP part, and an RBSP trailing bit part.
  • First, the packaging unit 28 stores a prescribed numerical value I in the nal_unit_type field (an identifier indicating the type of the NAL unit) of the NAL header part of the NAL unit corresponding to each slice (main slice) of the main picture.
  • The prescribed numerical value I is a value indicating that the data is encoded data generated by the encoding method according to the present embodiment (that is, the method of encoding the distance image #2 by calculating a difference value for each segment).
  • As the numerical value I, for example, a value defined as "undefined" or "reserved for future extension" in the H.264/MPEG-4 AVC standard can be used.
  • Next, the packaging unit 28 stores the encoded data #11 and the encoded data #25 in the RBSP part of the NAL unit corresponding to the main slice, and stores the RBSP trailing bits in the RBSP trailing bit part.
  • the packaging unit 28 transmits the NAL unit thus obtained to the video decoding device as encoded data # 28.
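As a sketch of the NAL header manipulation described above, the following fragment packs the one-byte H.264 NAL unit header (forbidden_zero_bit, nal_ref_idc, nal_unit_type, per the standard's bit layout). The actual value chosen for the prescribed number I is left open in the source, so none is assumed here.

```python
def nal_header_byte(nal_ref_idc, nal_unit_type):
    # H.264 NAL unit header (one byte):
    # forbidden_zero_bit (1 bit, always 0) | nal_ref_idc (2) | nal_unit_type (5)
    assert 0 <= nal_ref_idc <= 3 and 0 <= nal_unit_type <= 31
    return (nal_ref_idc << 5) | nal_unit_type
```

For instance, a reference IDR slice uses nal_unit_type 5; a value from the standard's reserved range could carry the prescribed number I.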
  • In the above description, the image division processing unit 21 defines, from the input texture image #1′, a plurality of segments each composed of a pixel group for which the difference between the maximum and minimum pixel values is equal to or less than a predetermined threshold. However, the method of defining the segments is not limited to this configuration.
  • For example, the image division processing unit 21 may define, from the input texture image #1′, a plurality of segments such that, for each segment, the difference between the average value calculated from the pixel values of the pixel group included in that segment and the average value calculated from the pixel values of the pixel group included in each segment adjacent to it is equal to or greater than a predetermined threshold.
  • FIG. 21 is a flowchart showing an operation in which the video encoding device 1 defines a plurality of segments based on the above algorithm.
  • FIG. 22 is a flowchart showing a subroutine of segment combination processing in the flowchart of FIG.
  • In the initialization step shown in the figure, the image division processing unit 21 sets, for each of the pixels included in the texture image (which may have been subjected to the smoothing process described in (Appendix 2)), one independent provisional segment, and sets the pixel value of the corresponding pixel itself as the average value (average color) of all pixel values in each provisional segment (step S41).
  • Next, a segment combination processing step of combining provisional segments having similar colors is performed (step S42). This segment combination process will be described in detail below with reference to FIG. 22; the combination process is repeated until no further combination is performed.
  • the image division processing unit 21 performs the following processing (steps S51 to S55) for all provisional segments.
  • the image division processing unit 21 determines whether or not the height and width of the temporary segment of interest are both equal to or less than a threshold value (step S51). If it is determined that both are equal to or less than the threshold value (YES in S51), the process proceeds to step S52. On the other hand, when it is determined that any one is larger than the threshold value (NO in S51), the process of step S51 is performed for the temporary segment to be focused next. Note that the temporary segment to be noted next may be, for example, a temporary segment positioned next to the temporary segment of interest in the raster scan order.
  • Next, the image division processing unit 21 selects, from among the provisional segments adjacent to the provisional segment of interest, the provisional segment whose average color is closest to the average color of the provisional segment of interest (step S52).
  • As an index for judging the closeness of colors, for example, the Euclidean distance between vectors obtained by regarding the three RGB values of a pixel value as a three-dimensional vector can be used.
  • As the pixel value of each segment, the average value of all pixel values included in that segment is used.
  • Next, the image division processing unit 21 determines whether or not the color distance between the provisional segment of interest and the provisional segment determined to have the closest color is equal to or less than a certain threshold value (step S53). If it is determined that the distance is larger than the threshold value (NO in step S53), the process of step S51 is performed for the provisional segment to be focused on next. On the other hand, if it is determined that the distance is equal to or less than the threshold value (YES in step S53), the process proceeds to step S54.
  • Next, the image division processing unit 21 combines the two provisional segments (the provisional segment of interest and the provisional segment determined to be closest to it in color) into one provisional segment (step S54). The number of provisional segments is reduced by one by the process of step S54.
  • After step S54, the average value of the pixel values of all the pixels included in the combined segment is calculated (step S55). If there is a provisional segment that has not yet been subjected to the processes of steps S51 to S55, the process of step S51 is performed for the provisional segment to be focused on next.
  • After completing the processes of steps S51 to S55 for all the provisional segments, the process proceeds to step S43.
  • the image division processing unit 21 compares the number of provisional segments before the process of step S42 and the number of provisional segments after the process of step S42 (step S43).
  • If the number of provisional segments has decreased (YES in step S43), the process returns to step S42. On the other hand, if the number of provisional segments has not changed (NO in step S43), the image division processing unit 21 defines each current provisional segment as one segment.
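The combining pass (steps S52 to S55) can be sketched as follows, with two stated simplifications: the height/width limit of step S51 is omitted, and the merged color is an unweighted average of the two segment colors rather than a mean over all member pixels. The data representation (per-segment average colors and an adjacency map) is an assumption.

```python
import math

def merge_pass(colors, adjacency, threshold):
    # colors: {segment_id: (r, g, b) average color}
    # adjacency: {segment_id: set of adjacent segment ids}
    # One pass: each surviving provisional segment absorbs its
    # closest-colored neighbour if the RGB Euclidean distance (the
    # similarity index named in the text) is within the threshold.
    merged = False
    for seg in list(colors):
        if seg not in colors:
            continue  # already absorbed earlier in this pass
        neighbours = [n for n in adjacency[seg] if n in colors]
        if not neighbours:
            continue
        nearest = min(neighbours,
                      key=lambda n: math.dist(colors[seg], colors[n]))
        if math.dist(colors[seg], colors[nearest]) > threshold:
            continue  # NO in step S53: look at the next segment
        # Step S54: absorb `nearest` into `seg` and update adjacency.
        colors[seg] = tuple((a + b) / 2
                            for a, b in zip(colors[seg], colors[nearest]))
        adjacency[seg] = (adjacency[seg] | adjacency.pop(nearest)) - {seg, nearest}
        for n in adjacency[seg]:
            adjacency[n].discard(nearest)
            adjacency[n].add(seg)
        del colors[nearest]
        merged = True
    return merged  # the caller repeats passes until this returns False
```

Repeating `merge_pass` until it returns False mirrors the loop of steps S42/S43.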
  • If the input texture image is an image of 1024 × 768 dots, it can be divided into several thousand segments (for example, 3000 to 5000 segments) in this way.
  • The process of step S51 is not essential, but it is desirable to prevent segments from becoming too large by limiting the segment size as in step S51.
  • When the image division processing unit 21 defines, from the input texture image #1′, segments each composed of a pixel group for which the difference between the maximum and minimum pixel values is equal to or less than a predetermined threshold, an upper limit may be set on the number of pixels included in each segment. An upper limit may also be placed on the width or height of each segment, either together with the upper limit on the number of pixels or instead of it.
  • the moving image decoding apparatus 2 can decode a distance image that more faithfully reproduces the original distance image # 2.
  • the image division processing unit 21 may perform a smoothing process on the input texture image # 1 ′.
  • Specifically, the image division processing unit 21 may repeatedly smooth the texture image #1′ to such an extent that edge information is not lost, using the technique described in the non-patent document "C. Lawrence Zitnick, Sing Bing Kang, Matthew Uyttendaele, Simon Winder and Richard Szeliski, "High-quality video view interpolation using a layered representation," ACM Trans. on Graphics, 23(3), 600-608, (2004)".
  • Then, the image division processing unit 21 may divide the smoothed texture image into a plurality of segments each composed of a pixel group in which the difference between the maximum and minimum pixel values is equal to or less than a predetermined threshold value.
  • The smoothing process can prevent the segments from becoming too small. That is, by performing the smoothing process, the code amount of the encoded data #25 can be reduced compared with the case where the smoothing process is not performed.
  • The image division processing unit 21 may be arranged before the image encoding unit 11 instead of between the image decoding unit 12 and the distance image division processing unit 22. That is, the image division processing unit 21 may output the input texture image #1 as it is to the subsequent image encoding unit 11, divide the texture image #1 into a plurality of segments each composed of a pixel group for which the difference between the maximum and minimum pixel values is equal to or less than a predetermined threshold, and output the segment information #21 to the distance image division processing unit 22 in the subsequent stage.
  • Alternatively, the number assigning unit 24 may receive, from the distance image division processing unit 22, the segment information #22 in which a distance value set and position information are associated with each segment. Then, the number assigning unit 24 scans the pixels of the distance image in raster scan order, assigns a segment number #24, in the scanned order, to each segment (each region delimited by the position information of the segment information #22), and associates it with the distance value set of each segment included in the segment information #22.
  • the distance value correcting unit 23 receives information in which the segment number # 24 and the distance value set are associated with each other from the number assigning unit 24. Then, the distance value correcting unit 23 calculates the mode value as the representative value # 23a from the distance value set of each segment. Then, the distance value correction unit 23 associates the segment number # 24 with the segment representative value # 23a and outputs the segment number # 24 to the prediction encoding unit 25.
  • In the above description, the number assigning unit 24 receives the segment information #23 including the position information and the representative value #23a of each segment, and the predictive encoding unit 25 receives the representative value #23a and the segment number #24 of each segment. However, the number assigning unit 24 may output the segment position information to the predictive encoding unit 25 in addition to the representative value #23a and the segment number #24 of each segment.
  • In this case, the predictive encoding unit 25 adds the segment position information to the encoded data #25 obtained by encoding the difference values, and outputs the result to the packaging unit 28.
  • the packaging unit 28 may add the position information of the segment to the encoded data # 28 instead of the encoded image # 11 of the texture image output from the image encoding unit 11. That is, in this case, the packaging unit 28 transmits the encoded data # 28 including the encoded data # 25 obtained by encoding the difference value and the segment position information to the video decoding device.
  • the moving picture decoding apparatus decodes the encoded data # 25 based on the position information of the segment.
  • the moving image decoding apparatus only needs to be able to divide a segment with the same division pattern as that of the moving image encoding apparatus 1, and thus can restore a distance image based on segment position information indicating the position of the segment. That is, even when there is no encoded image # 11 of the texture image, the distance image divided into segments based on the texture image can be restored. Therefore, it is sufficient for the packaging unit 28 to transmit the segment defining information (region information) defining the segment and the encoded data # 25 to the video decoding device.
  • the segment defining information is the texture image encoded data # 11 or the segment position information.
  • When the predictive encoding unit 25 specifies prediction reference pixels based on the representative pixel, in the example illustrated in FIG. 12C, the pixel C one line above the representative pixel is used as a prediction reference pixel in addition to the pixel B and the pixel D; however, the present invention is not limited to this.
  • For example, at least one of the pixels one line above the pixels A on the same scan line as the representative pixel may be used as a prediction reference pixel.
  • Alternatively, the pixel one line above the center pixel of the shaded pixels A (the pixel one to the right of the representative pixel) may be used as the prediction reference pixel.
  • In the above description, the predictive encoding unit 25 calculates the predicted value of the encoding target segment based on the representative value of the segment containing the prediction reference pixel, but is not limited to this. For example, when the pixel values of the pixels included in each segment are the same value within the same segment (when the values can be regarded as constant), the predicted value of the encoding target segment may be calculated based on the pixel value of the prediction reference pixel instead of the representative value of the segment containing the prediction reference pixel.
  • the prediction encoding unit 25 may encode information indicating a calculation method of the prediction value and add it to the encoded data # 25.
  • the packaging unit 28 transmits encoded data # 28 including information indicating the calculation method of the predicted value to the video decoding device.
  • For example, the predictive encoding unit 25 may select from the four prediction value calculation methods (1) "the predicted value Z′_A is Z_B", (2) "the predicted value Z′_A is Z_C", (3) "the predicted value Z′_A is Z_D", and (4) "the predicted value Z′_A is the average of Z_B, Z_C, and Z_D". When calculating predicted values by selecting among these four methods, information representing the selected one of the four calculation methods may be associated with the difference value of the encoding target segment to generate the encoded data # 25. Further, for example, the predictive encoding unit 25 may add (5) "the predicted value Z′_A is the median of Z_B, Z_C, and Z_D" to the above four methods and use information representing these five calculation methods.
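  These candidate calculation methods can be sketched as follows (a minimal illustration; the function name and the numeric method selector are assumptions of this sketch, not part of the patent):

```python
from statistics import median

# Hypothetical helper: compute the predicted value Z'_A of the encoding
# target segment from the representative values Z_B, Z_C, Z_D of the
# segments containing the prediction reference pixels B, C, D.
def predicted_value(method, z_b, z_c, z_d):
    if method == 1:            # (1) Z'_A = Z_B
        return z_b
    if method == 2:            # (2) Z'_A = Z_C
        return z_c
    if method == 3:            # (3) Z'_A = Z_D
        return z_d
    if method == 4:            # (4) Z'_A = average of Z_B, Z_C, Z_D
        return (z_b + z_c + z_d) / 3
    if method == 5:            # (5) Z'_A = median of Z_B, Z_C, Z_D
        return median([z_b, z_c, z_d])
    raise ValueError("unknown calculation method")
```

  An encoder following this scheme would store the selected method number alongside each difference value, and a decoder would dispatch on it in the same way.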
  • When there is no prediction reference pixel, the predictive encoding unit 25 sets the representative value of the segment that would contain the prediction reference pixel and the pixel value of the prediction reference pixel to 0. That is, when specifying prediction reference pixels based on the representative pixel, if a prediction reference pixel does not exist, the predictive encoding unit 25 treats the representative value of the segment containing that prediction reference pixel and the pixel value of that prediction reference pixel as 0.
  • FIG. 12 shows cases where the number of pixels on the same scan line as the representative pixel (including the representative pixel itself) is 1 to 3, but the number of pixels is naturally not limited to this; cases with 4 or more pixels also exist. In those cases, processing can be performed in the same manner as in the three examples described.
  • In the above description, the predictive encoding unit 25 encodes the difference values by the exponential Golomb coding method, but the coding method is not limited to this.
  • The exponential Golomb coding method makes codewords for values far from 0 very long in exchange for making codewords for values near 0 very short. For this reason, when prediction accuracy is poor, using ordinary Golomb coding instead of exponential Golomb coding can compress the amount of information relatively better. That is, it is desirable to select a coding method based on the prediction accuracy (the distribution of differences between representative values and predicted values).
  • For the first segment, the difference value is the representative value of that segment multiplied by −1, so the value is far from 0. Therefore, for the first segment, a codeword obtained by encoding the representative value of the segment itself with a fixed-length coding method (for example, 8 bits), instead of the difference from the predicted value, may be used. In this case, the amount of information can be further compressed.
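  As an illustration, an order-0 exponential Golomb coder for signed difference values can be sketched as follows (the zigzag mapping of signed values to non-negative integers is one common convention, assumed here; the patent's exact bit layout is the one shown in its figures):

```python
def signed_to_unsigned(v):
    # Common zigzag mapping: 0, 1, -1, 2, -2, ... -> 0, 1, 2, 3, 4, ...
    return 2 * v - 1 if v > 0 else -2 * v

def exp_golomb_encode(v):
    # Order-0 exponential Golomb codeword for a signed value v,
    # returned as a bit string: (len-1) zeros, then (u+1) in binary.
    bits = bin(signed_to_unsigned(v) + 1)[2:]
    return "0" * (len(bits) - 1) + bits

def exp_golomb_decode(bitstream, pos=0):
    # Decode one codeword starting at index `pos`; return (value, next_pos).
    zeros = 0
    while bitstream[pos + zeros] == "0":
        zeros += 1
    u = int(bitstream[pos + zeros : pos + 2 * zeros + 1], 2) - 1
    v = (u + 1) // 2 if u % 2 else -(u // 2)
    return v, pos + 2 * zeros + 1
```

  Note how the codeword length grows roughly with 2·log2 of the magnitude, which is why values far from 0 become expensive, motivating the fixed-length exception for the first segment.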
  • In the above description, the moving image encoding apparatus 1 encodes the texture image # 1 using AVC encoding defined in the H.264/MPEG-4 AVC standard, but the present invention is not limited to this. That is, the image encoding unit 11 of the moving image encoding apparatus 1 may encode the texture image # 1 using another encoding method such as MPEG-2 or MPEG-4.
  • the texture image # 1 may be encoded using an encoding method established as the H.265 / HVC standard.
  • As described above, the image division processing unit 21 defines a plurality of segments obtained by dividing the entire region of the texture image # 2 such that, in each region, the difference between the maximum pixel value and the minimum pixel value of the pixel group included in that region is equal to or less than a predetermined threshold value.
  • the distance image division processing unit 22 defines a plurality of segments obtained by dividing the entire area of the distance image # 2 with the same division pattern as the plurality of segment division patterns defined by the image division processing unit 21. Further, for each segment defined by the distance image division processing unit 22, the distance value correction unit 23 calculates a representative value # 23a from the distance value of each pixel included in the segment.
  • the distance image encoding unit 20 generates encoded data # 25 including a plurality of representative values # 23a calculated by the distance value correcting unit 23.
  • That is, as the encoded data # 25 of the distance image # 2 transmitted to the moving image decoding apparatus, the moving image encoding apparatus 1 transmits at most as many representative values # 23a as there are segments.
  • In contrast, when the distance image # 2 is AVC-encoded as in the conventional art, the code amount of the encoded data of the distance image is clearly larger than the code amount of the encoded data # 25.
  • When the image division processing unit 21 defines a plurality of segments by the method described above (Appendix 1) and the texture image is an image of 1024 × 768 dots, the number of pixels included in each segment is about 3000 to 5000.
  • Even if the code amount per segment of the distance image under the encoding method of this embodiment is larger than the code amount per block of the distance image when AVC encoding is used, each segment covers many more pixels than a block.
  • Therefore, compared to a conventional moving image encoding apparatus that AVC-encodes the distance image # 2 and transmits it to the moving image decoding apparatus, the moving image encoding apparatus 1 can reduce the code amount of the encoded data of the distance image # 2 transmitted to the moving image decoding apparatus.
  • the distance image division processing unit 22 divides the distance image # 2 into segments, and the distance value correction unit 23 approximates the distance values of the pixels included in the segments to determine representative values.
  • the number assigning unit 24 assigns numbers to the segments in the raster scan order.
  • Then, for each segment, the predictive encoding unit 25 calculates a predicted value of the representative value of the segment based on pixels that are close to the segment and precede the pixels included in the segment in raster scan order, subtracts the predicted value from the representative value to calculate a difference value, and arranges the difference values in number order and encodes them to generate the encoded data # 25.
  • With the above configuration, the moving image encoding apparatus 1 can generate encoded data in which the spatial redundancy between segments is compressed, even when the distance image # 2 transmitted to the moving image decoding apparatus is divided into segments of arbitrary shape. Therefore, the moving image encoding apparatus 1 has the effect that the code amount of the encoded data of the distance image # 2 transmitted to the moving image decoding apparatus can be further reduced.
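  The segment-level flow summarized above can be sketched end to end as follows (a simplified sketch: segments are reduced to their representative values in number order, the prediction rule is pluggable, and the sign convention of the difference is an assumption of this illustration):

```python
def encode_representatives(reps, predict):
    # reps: representative values in segment-number (raster scan) order.
    # predict(i, prior): predicted value for segment i from the
    # representatives of earlier segments (0 when none exist).
    diffs = []
    for i, rep in enumerate(reps):
        diffs.append(rep - predict(i, reps[:i]))
    return diffs

def decode_representatives(diffs, predict):
    # Mirror of the encoder: rebuild each representative by adding the
    # difference value to the same prediction.
    reps = []
    for i, d in enumerate(diffs):
        reps.append(predict(i, reps) + d)
    return reps

# Assumed example rule: predict from the previous segment's representative.
predict_prev = lambda i, prior: prior[i - 1] if i > 0 else 0
```

  Because the decoder recomputes exactly the same predictions, the round trip is lossless with respect to the representative values; the compression gain comes from the difference values clustering near 0.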
  • The moving picture decoding apparatus according to the present embodiment decodes, for each frame constituting the moving picture to be decoded, the texture image # 1 ′ and the distance image # 2 ′ from the encoded data # 28 transmitted from the moving picture encoding apparatus 1 described above.
  • FIG. 16 is a block diagram illustrating a main configuration of the video decoding device.
  • the moving image decoding apparatus 2 includes an image decoding unit 12, an image division processing unit (dividing unit) 21 ′, a number assigning unit (numbering unit, assigning unit) 24 ′, an unpackaging unit (receiving unit) 31, and a predictive decoding unit (predicted value calculating means, pixel value setting means) 32.
  • the unpackaging unit 31 extracts the encoded data # 11 of the texture image # 1 and the encoded data # 25 of the distance image # 2 from the received encoded data # 28.
  • the image decoding unit 12 decodes the texture image # 1 'from the encoded data # 11.
  • the image decoding unit 12 is the same as the image decoding unit 12 included in the moving image encoding apparatus 1. That is, as long as no noise is mixed into the encoded data # 28 transmitted from the moving image encoding apparatus 1 to the moving image decoding apparatus 2, the image decoding unit 12 decodes a texture image # 1 ′ having the same content as the texture image decoded by the image decoding unit 12 of the moving image encoding apparatus 1.
  • the image division processing unit 21 ′ divides the entire area of the texture image # 1 ′ into a plurality of segments (areas) using the same algorithm as the image division processing unit 21 of the video encoding device 1. Then, the image division processing unit 21 ′ generates segment information # 21 ′ including the position information of each segment, and outputs it to the number assigning unit 24 ′.
  • the number assigning unit 24 ′ assigns a number to each segment defined by the segment information # 21 ′, in raster scan order, using the same algorithm as the number assigning unit 24 of the moving image encoding apparatus 1.
  • the number assigning unit 24 ′ generates a segment identification image # 24 ′ in which each assigned number is associated with the segment position information, and outputs the generated image to the predictive decoding unit 32.
  • the segment identification image # 24 ' is information in which a number is associated with segment position information indicating the position of each segment.
  • Based on the segment position information, the predictive decoding unit 32 can specify the arrangement of each segment in the entire image and the number of pixels included in each segment, and can also specify the number of pixels in the entire image. Therefore, the predictive decoding unit 32 can restore, based on the segment position information, an image that is divided into segments but has no information indicating the pixel values of the pixels that form the image.
  • For example, the segment identification image # 24 ′ may be an image obtained as follows: the image division processing unit 21 ′ divides the texture image # 1 ′ into segments, the number assigning unit 24 ′ assigns the segment number "i−1" to the i-th segment in raster scan order, and the pixel value of each pixel included in the i-th segment of the texture image # 1 ′ is replaced with "i−1".
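  The relabeling described above can be sketched as follows (a sketch; representing a segment as a list of (row, column) positions is an assumption of this illustration):

```python
def segment_id_image(height, width, segments):
    # segments: per-segment pixel-position lists, already ordered so that
    # the i-th segment (1-based) receives the value i-1.
    img = [[0] * width for _ in range(height)]
    for i, seg in enumerate(segments):  # enumerate yields the 0-based "i-1"
        for (r, c) in seg:
            img[r][c] = i
    return img
```

  The resulting array carries exactly the segment-number-to-position association the predictive decoding unit needs, with no dependence on the original texture pixel values.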
  • the predictive decoding unit 32 performs a predictive decoding process based on the input encoded data # 25 and the segment identification image # 24 ′ to restore the distance image # 2 ′. Specifically, the predictive decoding unit 32 decodes the encoded data # 25 to generate the difference values arranged in order, and assigns the generated difference values, in the order of the numbers given by the number assigning unit 24 ′, to the segments defined by the segment identification image # 24 ′, thereby restoring the representative value of each segment.
  • the predictive decoding unit 32 then sets the restored representative value of each segment as the pixel value (distance value) of all the pixels included in that segment, and restores the distance image # 2 ′.
  • the prediction decoding unit 32 outputs the restored distance image # 2 ′ to a stereoscopic video display device (not shown) outside the moving image decoding device 2.
  • FIG. 17 is a flowchart showing the operation of the video decoding device 2.
  • the operation of the moving image decoding apparatus 2 described here is an operation of decoding a texture image and a distance image of the t-th frame from the top in a three-dimensional moving image including a large number of frames. That is, the moving image decoding apparatus 2 repeats the operation described below as many times as the number of frames of the moving image in order to decode the entire moving image. Further, in the following description, unless otherwise specified, each data # 1 to # 28 is interpreted as data of the t-th frame.
  • the unpackaging unit 31 extracts the encoded data # 11 of the texture image and the encoded data # 25 of the distance image from the encoded data # 28 received from the moving image encoding device 1. Then, the unpackaging unit 31 outputs the encoded data # 11 to the image decoding unit 12, and outputs the encoded data # 25 to the predictive decoding unit 32 (Step S21).
  • the image decoding unit 12 decodes the texture image # 1 ′ from the input encoded data # 11, and sends it to the image division processing unit 21 ′ and a stereoscopic video display device (not shown) outside the moving image decoding device 2. Output (step S22).
  • the image division processing unit 21 ′ defines a plurality of segments with the same algorithm as the image division processing unit 21 of the moving image encoding device 1. Then, the image division processing unit 21 'generates segment information # 21' composed of the position information of each segment, and outputs it to the number assigning unit 24 '(step S23).
  • the number assigning unit 24 'assigns a number to each segment divided based on the segment information # 21' in the raster scan order by the same algorithm as the number assigning unit 24 of the video encoding device 1.
  • the number assigning unit 24 ′ generates a segment identification image # 24 ′ in which each assigned number is associated with the segment position information, and outputs the segment identification image # 24 ′ to the predictive decoding unit 32 (step S24).
  • the predictive decoding unit 32 performs a predictive decoding process based on the input encoded data # 25 and the segment identification image # 24 ′ to restore the distance image # 2 ′ (step S25). Specifically, the predictive decoding unit 32 decodes the encoded data # 25 to generate the difference values arranged in order, and assigns the generated difference values, in the order of the numbers given by the number assigning unit 24 ′, to the segments defined by the segment identification image # 24 ′.
  • the predictive decoding unit 32 outputs the restored distance image # 2 'to a stereoscopic video display device (not shown) outside the video decoding device 2. As described above, the texture image # 1 'and the distance image # 2' can be restored.
  • FIG. 18 is a flowchart illustrating an example of the predictive decoding process executed by the predictive decoding unit 32.
  • the predictive decoding unit 32 decodes the encoded data # 25 input from the unpackaging unit 31 using the same coding method as the one used by the predictive encoding unit 25 of the moving image encoding apparatus 1 when generating the encoded data # 25, and generates the difference values arranged in order (step S201). That is, in this embodiment, the predictive decoding unit 32 decodes the encoded data # 25 illustrated in FIG. 14 using the exponential Golomb coding method illustrated in FIG.
  • Next, the predictive decoding unit 32 assigns the difference values, in order from the top, to the segments defined by the segment information # 21 ′ of the segment identification image # 24 ′, according to the order of the numbers given by the number assigning unit 24 ′ (step S202).
  • Next, "i", the number assigned by the number assigning unit 24 ′, is set to "0" (step S203). Then, the segment with the number "i" assigned by the number assigning unit 24 ′ is set as the decoding target segment (decoding target region) (step S204). That is, the segment with the first number assigned by the number assigning unit 24 ′ is set as the decoding target segment.
  • Next, the representative pixel of the decoding target segment, which is used for calculating the predicted value, is specified from among the pixels included in the decoding target segment (step S205). Specifically, the pixel included in the decoding target segment that is scanned first in the raster scan order of step S24 is set as the representative pixel.
  • After identifying the representative pixel, the predictive decoding unit 32 specifies prediction reference pixels based on the identified representative pixel, using the same method as the method for specifying prediction reference pixels used by the predictive encoding unit 25 of the moving image encoding apparatus 1 (step S206). Specifically, a pixel that is adjacent to a pixel in the decoding target segment on the same scan line as the representative pixel and that precedes the representative pixel in raster scan order is set as a prediction reference pixel.
  • Alternatively, the prediction reference pixels may be a group of three pixels: (1) a pixel that is adjacent to a pixel in the decoding target segment on the same scan line as the representative pixel and is one pixel before the representative pixel in raster scan order; (2) a pixel on the scan line immediately preceding the representative pixel in raster scan order; and (3) a pixel that is adjacent to the pixel immediately following, in raster scan order, the last pixel on the same scan line as the representative pixel included in the decoding target segment, and that lies on the scan line immediately preceding the representative pixel in raster scan order.
  • Next, the predictive decoding unit 32 calculates a predicted value of the representative value of the decoding target segment from the specified prediction reference pixels, using the same method as the prediction value calculation method used by the predictive encoding unit 25 of the moving image encoding apparatus 1 (step S207).
  • For example, the predicted value may be the median of the pixel values of the prediction reference pixels.
  • Alternatively, the predicted value may be the average of the pixel values of the prediction reference pixels.
  • Alternatively, the predicted value may be the pixel value of any one of the prediction reference pixels.
  • Next, the predictive decoding unit 32 adds the difference value assigned to the decoding target segment to the calculated predicted value, and sets the resulting value as the representative value of the decoding target segment (step S208). Then, the predictive decoding unit 32 sets the pixel values of all the pixels included in the decoding target segment to the representative value set for the decoding target segment (step S209).
  • After step S209, it is confirmed whether the pixel values of the pixels included in the segments have been set for all (M) segments. If pixel values have not been set for all segments, the processes of steps S204 to S209 are executed in the order of the numbers assigned by the number assigning unit 24 ′. If pixel values have been set for all segments, the process proceeds to step S212.
  • In step S212, all the segments for which the pixel values of their member pixels have been set are combined to restore the distance image # 2 ′ (step S212).
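  Steps S202 through S212 can be condensed into the following sketch (the prediction rule is abstracted into a callable, and pixel positions per segment are given directly; both are simplifying assumptions of this illustration):

```python
def predictive_decode(diffs, segments, predict):
    # diffs: difference values in segment-number order (step S202).
    # segments: pixel-position lists per segment, in the same order.
    # predict(i, reps): predicted value of segment i from the
    # representatives restored so far (steps S205-S207, abstracted).
    reps = []
    pixels = {}
    for i, d in enumerate(diffs):       # loop of steps S204-S211
        rep = predict(i, reps) + d      # step S208: representative value
        reps.append(rep)
        for pos in segments[i]:         # step S209: fill the segment
            pixels[pos] = rep
    return pixels                       # step S212: combined image
```

  The key invariant is that `predict` depends only on representatives of earlier-numbered segments, so the decoder can reproduce the encoder's predictions exactly.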
  • the distance image # 2 ′ decoded by the predictive decoding unit 32 in step S25 generally approximates the distance image # 2 input to the moving image encoding apparatus 1.
  • Specifically, the distance image # 2 ′ is the same as an image obtained from the distance image # 2 by changing the distance values of a very small fraction of the pixels included in each segment to the representative value of that segment, so it can be said that the distance image # 2 ′ approximates the distance image # 2.
  • the image division processing unit 21 ′ defines a plurality of segments obtained by dividing the entire area of the texture image # 1 ′. Specifically, the image division processing unit 21 ′ defines a plurality of segments each including a group of pixels each having a similar color.
  • the predictive decoding unit 32 reads the encoded data # 25.
  • the encoded data # 25 is data including at most one representative value # 23a as a distance value for each of a plurality of segments constituting the distance image # 2 'to be decoded. Note that the division pattern of the plurality of segments constituting the distance image # 2 'to be decoded is the same as the division pattern of the plurality of segments defined by the image division processing unit 21'.
  • the moving image decoding apparatus 2 uses a decoding method corresponding to an encoding method in which the representative value of the segment obtained by dividing the distance image # 2 by the moving image encoding apparatus 1 is encoded into encoded data # 25. Since the encoded data # 25 is decoded, the representative value of each segment generated by the video encoding device 1 can be accurately restored. Therefore, the moving image decoding apparatus 2 can accurately restore the distance image # 2 'that approximates the distance image # 2.
  • Since the distance image # 2 ′ restored from the encoded data # 25 by the moving image decoding apparatus 2 is, as described above, similar to the distance image # 2 encoded by the moving image encoding apparatus 1, the moving image decoding apparatus 2 can decode an appropriate distance image.
  • the distance image # 2 'decoded by the video decoding device 2 has further advantages.
  • The contour of the subject in the generated three-dimensional image depends on the shape of the boundary between the subject and the background in the distance image # 2.
  • Even when the positions of the boundary between the subject and the background coincide in the texture image # 1 ′ and the distance image # 2, the boundary positions may no longer coincide after decoding.
  • In general, the texture image reproduces the shape of the edge portion between the subject and the background more faithfully.
  • the position of the boundary between the subject and the background in the distance image # 2 ′ decoded by the moving image decoding apparatus 2 often coincides with the position of the boundary between the subject and the background in the texture image # 1. This is because, in general, the subject color and the background color are significantly different in the texture image # 1, and the boundary between the subject and the background becomes the segment boundary in the texture image # 1.
  • Therefore, in the three-dimensional image reproduced by the stereoscopic image display device from the texture image # 1 ′ and the distance image # 2 ′ output from the moving image decoding apparatus 2 according to the present embodiment, the positions of the boundary between the subject and the background in the texture image # 1 ′ and the distance image # 2 ′ coincide, so the contour of the subject is reproduced faithfully.
  • the video decoding device 2 restores the distance image # 2 ′ from the encoded data # 28 using a decoding method corresponding to the encoding method used by the video encoding device 1. For this reason, the moving image encoding device 1 and the moving image decoding device 2 may determine the encoding method and the decoding method in advance before performing the encoding and decoding processes, respectively.
  • Alternatively, the moving image decoding apparatus 2 may receive information indicating the encoding method together with the encoded data # 28 (encoded data # 25) from the moving image encoding apparatus 1, specify the decoding method corresponding to the encoding method indicated by the received information, and restore the distance image # 2 ′ based on the specified decoding method.
  • information indicating the encoding method may be associated with each segment included in the encoded data # 25.
  • the information indicating the encoding method may include: information indicating the variable-length coding method or fixed-length coding method for converting difference values into codewords; prediction reference pixel specifying method information indicating the method of specifying prediction reference pixels based on the representative pixel; division method information indicating the segment division method by which the image division processing unit 21 divides segments; numbering rule information indicating the order (rule) in which the number assigning unit 24 assigns numbers; and representative pixel specifying method information indicating the method of specifying the representative pixel.
  • For example, when the first codeword of the encoded data # 25 is encoded with a fixed-length coding method and the moving image decoding apparatus 2 receives information indicating that fact together with the encoded data # 28, the moving image decoding apparatus 2 decodes only the first codeword of the encoded data # 25 with the fixed-length coding method, sets the representative value of the first segment to the decoded value, and sets the pixel values of all the pixels included in that segment to the decoded value.
  • In the above description, the moving image decoding apparatus 2 receives the encoded data # 28 including the encoded data # 11 of the texture image and the encoded data # 25 of the distance image, but the present invention is not limited to this.
  • the moving image decoding apparatus 2 may receive the encoded data # 25 of the distance image and the position information of the segment.
  • the number assigning unit 24 ′ assigns a number to each segment divided based on the segment position information in the raster scan order.
  • the number assigning unit 24 ′ generates a segment identification image # 24 ′ in which the number assigned to the segment position information is associated, and outputs the segment identifying image # 24 ′ to the predictive decoding unit 32.
  • In this way, the moving image decoding apparatus 2 can restore the distance image by receiving the segment position information together with the encoded data # 25 of the distance image.
  • In the above description, the moving image encoding apparatus 1 transmits the encoded data # 25 to the moving image decoding apparatus 2, but the encoded data # 25 may instead be supplied as follows: the moving image encoding apparatus 1 and the moving image decoding apparatus 2 are each provided with access means that can access a removable recording medium such as an optical disk drive, and the encoded data # 25 is supplied from the moving image encoding apparatus 1 to the moving image decoding apparatus 2 via the recording medium.
  • That is, the encoding apparatus of the present invention does not necessarily include means for transmitting data, and the decoding apparatus of the present invention does not necessarily include receiving means for receiving data.
  • The moving image encoding apparatus according to the present embodiment encodes texture images using the MVC coding adopted as the MVC standard in H.264/AVC, while it encodes distance images using a coding technique peculiar to the present invention.
  • the moving image encoding apparatus according to the present embodiment is different from the moving image encoding apparatus 1 in that a plurality of sets (N sets) of texture images and distance images are encoded per frame.
  • the N sets of texture images and distance images are images of subjects simultaneously captured by cameras and ranging devices installed at N locations so as to surround the subject. That is, the N sets of texture images and distance images are images for generating a free viewpoint image.
  • each set of texture image and distance image includes the actual data of the texture image and distance image of the set, and includes, as metadata, a camera parameter indicating from which azimuth angle the images were generated by the camera and the distance measuring device.
  • FIG. 19 is a block diagram showing a main configuration of the moving picture encoding apparatus according to the present embodiment.
  • the moving image encoding apparatus 1A includes an image encoding unit 11A, an image decoding unit 12A, a distance image encoding unit 20A, and a packaging unit (transmission means) 28A.
  • the distance image encoding unit 20A includes an image division processing unit 21A, a distance image division processing unit (dividing unit) 22A, a distance value correcting unit (representative value determining unit) 23A, a number assigning unit (number assigning unit) 24A, and A predictive encoding unit (predicted value calculating means, difference value calculating means, encoding means) 25A is provided.
  • the image encoding unit 11A encodes N view components (that is, texture images # 1-1 to # 1-N) by MVC encoding (multi-view video coding) defined in the MVC standard in H.264/AVC, and generates encoded data # 11-1 to # 11-N for the respective view components. Further, the image encoding unit 11A outputs the encoded data # 11-1 to # 11-N, together with the view IDs "1" to "N", which are parameters carried by NAL header extension, to the image decoding unit 12A and the packaging unit 28A.
  • the image decoding unit 12A decodes the texture images # 1′-1 to 1′-N from the encoded data # 11-1 to # 11-N of the texture image # 1 by the decoding method stipulated in the MVC standard. To do.
  • the image division processing unit 21A divides the entire area of the texture image # 1′-j into a plurality of segments (areas). Then, the image division processing unit 21A outputs segment information # 21-j including the position information of each segment.
  • For each segment in the texture image # 1′-j, the distance image division processing unit 22A extracts from the distance image # 2-j a distance value set including the distance values of the pixels included in the corresponding segment (region). Then, the distance image division processing unit 22A generates, from the segment information # 21-j, segment information # 22-j in which the distance value set and the position information are associated with each segment.
  • Further, the distance image division processing unit 22A generates a view ID "j" for the distance image # 2-j, and generates segment information # 22A-j in which the view ID "j" is associated with the segment information # 22-j.
  • the distance value correcting unit 23A calculates, for each segment of the distance image # 2-j, the mode value as the representative value # 23a-j from the distance value set of the segment included in the segment information # 22A-j. Then, the distance value correcting unit 23A replaces the distance value set of each segment included in the segment information # 22A-j with the representative value # 23a-j of the corresponding segment, and outputs the result to the number assigning unit 24A as segment information # 23A-j.
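  The mode-value selection performed by the distance value correcting unit can be sketched as follows (a minimal sketch; tie-breaking by first occurrence is an assumption, since the text does not specify how ties are resolved):

```python
from collections import Counter

def representative_mode(distance_values):
    # Return the most frequent distance value in a segment's distance
    # value set; ties resolve to the first value encountered.
    return Counter(distance_values).most_common(1)[0][0]
```

  Using the mode (rather than the mean) keeps the representative equal to an actually occurring distance value, so the dominant depth of the segment is preserved exactly.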
  • When the segment information # 23A-j is input, the number assigning unit 24A associates, for each of the M_j sets of position information and representative values # 23a-j contained in the segment information # 23A-j, the representative value # 23a-j with the segment number # 24-j corresponding to the position information.
  • the number assigning unit 24A then associates the M j sets of segment numbers # 24-j and representative values # 23a-j with the view ID “j” included in the segment information # 23A-j, and outputs the result as input data # 24A-j to the predictive coding unit 25A.
  • the predictive encoding unit 25A performs predictive encoding processing for each viewpoint based on the M j sets of representative values # 23a-j and segment numbers # 24-j included in the input data # 24A-j, and outputs the encoded data # 25-j to the packaging unit 28A.
  • the predictive coding unit 25A calculates the predicted value of each segment in the order of the segment numbers # 24-j, subtracts the predicted value from the representative value # 23a-j to calculate the difference value, and encodes the difference value.
  • the predictive encoding unit 25A arranges the encoded difference values in the order of the segment numbers # 24-j to generate the encoded data # 25-j.
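  The predict-subtract-arrange step above might be sketched as follows. This is an illustrative Python sketch only; the predictor is passed in as a callback, and the simple previous-value predictor shown at the end is a hypothetical stand-in for whichever predictor the embodiment selects:

```python
def predictive_encode(representatives, predict):
    """Turn representative values # 23a-j into difference values.

    representatives: representative values indexed by segment number # 24-j.
    predict: function(segment_number, previous_values) -> predicted value,
             standing in for the predictor described in the text.
    Returns the difference values in segment-number order, ready to be
    entropy-coded into # 25-j.
    """
    values_so_far = []   # values the decoder will already have reconstructed
    differences = []
    for n, value in enumerate(representatives):
        predicted = predict(n, values_so_far)
        differences.append(value - predicted)
        values_so_far.append(value)   # lossless: the decoder recovers it exactly
    return differences

# Hypothetical predictor: previous segment's value (0 for the first segment)
prev = lambda n, vals: vals[-1] if vals else 0
print(predictive_encode([120, 122, 121, 121], prev))  # [120, 2, -1, 0]
```

  Because the predictor only consults values the decoder will already have reconstructed, the decoder can mirror the loop exactly and recover every representative value.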
  • the predictive encoding unit 25A transmits the encoded data of the distance images # 2-j obtained in this way, for each j from 1 to N, to the packaging unit 28A, with the encoded data as VCL NAL units and the view IDs “j” as non-VCL NAL units.
  • the packaging unit 28A integrates the encoded data # 11-1 to # 11-N of the texture images # 1-1 to # 1-N and the encoded data # 25A, thereby generating the encoded data # 28A. Then, the packaging unit 28A transmits the encoded data # 28A to the video decoding device.
  • FIG. 20 is a block diagram showing a main configuration of the moving picture decoding apparatus according to the present embodiment.
  • the moving picture decoding apparatus 2A includes an image decoding unit 12A, an image division processing unit (dividing means) 21A′, a number assigning unit (number assigning means, allocating means) 24A′, an unpackaging unit (receiving means) 31A, and a predictive decoding unit (predicted value calculating means, pixel value setting means) 32A.
  • the image decoding unit 12A decodes the texture images # 1′-1 to 1′-N from the encoded data # 11-1 to # 11-N of the texture image # 1 by a decoding method defined in the MVC standard. .
  • the unpackaging unit 31A extracts the encoded data # 11-j of the texture image # 1 and the encoded data # 25A of the distance image # 2 from the received encoded data # 28A.
  • the image division processing unit 21A′ divides the entire area of the texture image # 1′-j into a plurality of segments (regions) by the same algorithm as the image division processing unit 21A of the moving image encoding device 1A. Then, the image division processing unit 21A′ generates segment information # 21′-j including the position information of each segment, and outputs it to the number assigning unit 24A′.
  • the number assigning unit 24A′ assigns a number, in raster scan order, to each segment divided based on the segment information # 21′-j, by the same algorithm as the number assigning unit 24A of the moving image encoding device 1A.
  • the number assigning unit 24A′ generates a segment identification image # 24′-j in which the assigned numbers are associated with the segment position information, and outputs it to the predictive decoding unit 32A.
  • the predictive decoding unit 32A extracts the encoded data # 25-j and the view ID “j” from the input encoded data # 25A. Next, the predictive decoding unit 32A performs predictive decoding processing based on the encoded data # 25-j and the segment identification image # 24′-j to restore the distance images # 2′-1 to # 2′-N. Specifically, the predictive decoding unit 32A decodes the distance image # 2′-j as follows.
  • the predictive decoding unit 32A decodes the encoded data # 25-j to generate difference values arranged in order, and assigns the generated difference values, in the order given by the number assigning unit 24A′, to the segments defined by the segment information # 21′-j of the segment identification image # 24′-j. Next, the predictive decoding unit 32A calculates the predicted value of each segment in the order given by the number assigning unit 24A′, adds the assigned difference value to the calculated predicted value, and sets the resulting value as the distance value of the segment. Then, the predictive decoding unit 32A sets the distance value of each segment as the pixel value (distance value) of all the pixels included in the segment, and restores the distance image # 2′-j. The predictive decoding unit 32A associates the restored distance image # 2′-j with the view ID “j” included in the encoded data # 25A, and outputs it to a stereoscopic video display device (not shown) outside the video decoding device 2A.
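  The restore step described above (add each difference to its prediction, then paint every pixel of the segment with the resulting distance value) might be sketched as follows. This is an illustrative Python sketch; the names are hypothetical, and the previous-value predictor is a stand-in for the predictor the embodiment uses:

```python
import numpy as np

def restore_distance_image(segment_map, differences, predict):
    """Rebuild a distance image # 2'-j from per-segment difference values.

    segment_map: integer array giving each pixel's segment number (playing
                 the role of the segment identification image # 24'-j).
    differences: decoded difference values, one per segment, in number order.
    predict:     function(segment_number, values_so_far) -> predicted value.
    """
    values = []
    for n, d in enumerate(differences):
        values.append(predict(n, values) + d)   # distance value of segment n
    image = np.zeros(segment_map.shape, dtype=np.int32)
    for n, v in enumerate(values):
        image[segment_map == n] = v   # every pixel of the segment gets v
    return image

# Hypothetical previous-value predictor, 0 for the first segment
prev = lambda n, vals: vals[-1] if vals else 0
seg_map = np.array([[0, 0, 1],
                    [0, 1, 1]])
print(restore_distance_image(seg_map, [100, 5], prev))
```

  Since the segment map is rederived from the decoded texture image, only the difference values themselves need to be transmitted.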
  • the image decoding unit 12A is the same as the image decoding unit 12 of the video decoding device 2 of the first embodiment, and a description thereof is omitted.
  • in the above description, the moving image encoding device 1A and the moving image decoding device 2A perform encoding processing and decoding processing on N sets of texture images and distance images of a subject captured simultaneously by cameras and ranging devices installed at N locations so as to surround the subject.
  • however, the moving image encoding device 1A and the moving image decoding device 2A can also perform encoding processing and decoding processing on N sets of texture images and distance images generated as follows.
  • that is, the moving image encoding device 1A and the moving image decoding device 2A can also perform encoding processing and decoding processing on N sets of texture images and distance images generated by N sets of cameras and ranging devices installed in one place so that each set of camera and ranging device faces a different direction. In other words, the moving image encoding device 1A and the moving image decoding device 2A can perform the encoding process and the decoding process on N sets of texture images and distance images for generating omnidirectional images, panoramic images, and the like.
  • in this case, each set of texture image and distance image includes, together with the actual data of the texture image and the distance image, camera parameters as metadata indicating the direction in which the camera and ranging device that generated the images were oriented.
  • although the image encoding unit 11A of the moving image encoding apparatus 1A encodes the texture images # 1-1 to 1-N using the MVC encoding method defined in H.264/AVC, the present invention is not limited to this.
  • for example, the image encoding unit 11A of the moving image encoding device 1A may encode the texture images # 1-1 to 1-N using other encoding methods such as a VSP (View Synthesis Prediction) encoding method, an MVD encoding method, or an LVD (Layered Video Depth) encoding method.
  • in that case, the image decoding unit 12A of the video decoding device 2A may be configured to decode the texture images # 1′-1 to 1′-N by a decoding method corresponding to the encoding method employed by the image encoding unit 11A.
  • as described above, an encoding apparatus according to the present invention is an encoding apparatus that encodes an image, and includes: dividing means for dividing the entire area of the image into a plurality of regions; representative value determining means for determining, for each region divided by the dividing means, a representative value from the pixel values of the pixels included in the region; number assigning means for assigning numbers to the plurality of regions in raster scan order; predicted value calculating means for setting each region as the encoding target region in the order of the numbers assigned by the number assigning means, setting, among the pixels included in the encoding target region, the first pixel in raster scan order as the representative pixel, setting, as prediction reference pixels, pixels that are adjacent to pixels included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculating the predicted value of the encoding target region based on at least one of the representative values of the regions having the prediction reference pixels; difference value calculating means for calculating, for each encoding target region, a difference value by subtracting the predicted value calculated by the predicted value calculating means from the representative value determined by the representative value determining means; and encoding means for arranging and encoding the difference values calculated by the difference value calculating means in the order of the numbers assigned by the number assigning means, thereby generating encoded data of the image.
  • the number assigning unit assigns numbers in the raster scan order to the plurality of regions into which the dividing unit has divided the image.
  • the predicted value calculating means sets each region as the encoding target region in the order of the numbers assigned by the number assigning means, sets the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel, and sets, as prediction reference pixels, pixels that are adjacent to pixels included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order.
  • the predicted value calculation means calculates the predicted value of the encoding target region based on at least one of the representative values of the region having the predicted reference pixel.
  • the difference value calculation means subtracts the prediction value calculated by the prediction value calculation means from the representative value determined by the representative value determination means for each encoding target region to calculate a difference value. Then, the encoding unit arranges and encodes the difference values calculated by the difference value calculation unit in the order given by the number assigning unit, and generates encoded data of the image.
  • the order of the areas can be uniquely specified.
  • the representative pixel used when calculating the predicted value of the representative value of each region and the prediction target pixel based on the representative pixel can be uniquely specified. Therefore, the predicted value of the encoding target area determined from the representative value of the area adjacent to the encoding target area can be uniquely calculated.
  • the prediction reference pixel for a certain area needs to be the same at the time of encoding and at the time of decoding. Therefore, a prediction reference pixel for a certain area needs to be decoded before the certain area, that is, needs to be encoded first.
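  The property that the region order, and hence the representative pixel of each region, can be uniquely specified follows from the raster scan itself: a region's number is determined by the raster-scan position of its first pixel, so the encoder and the decoder derive the same numbering independently. A minimal Python sketch of this idea (the function name and the use of arbitrary labels are hypothetical illustrations):

```python
def number_segments(segment_labels):
    """Assign numbers to regions in raster scan order.

    segment_labels: 2-D list of arbitrary region labels, one per pixel.
    Returns a dict mapping each label to its number. A region's number is
    the rank of its first (representative) pixel in raster scan order, so
    the same division pattern always yields the same numbering.
    """
    numbering = {}
    for row in segment_labels:        # raster scan: top-to-bottom,
        for label in row:             # left-to-right within each row
            if label not in numbering:
                numbering[label] = len(numbering)
    return numbering

labels = [["a", "a", "b"],
          ["c", "b", "b"]]
print(number_segments(labels))  # {'a': 0, 'b': 1, 'c': 2}
```

  Because a region's prediction reference pixels all precede its representative pixel in raster scan order, they necessarily belong to regions with smaller numbers, which is exactly the encode-before-use constraint stated above.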
  • an encoding method according to the present invention is an encoding method of an encoding device that encodes an image, and includes: a dividing step of dividing the entire area of the image into a plurality of regions; a representative value determining step of determining, for each divided region, a representative value from the pixel values of the pixels included in the region; a number assigning step of assigning numbers to the plurality of regions in raster scan order; a predicted value calculating step of setting each region as the encoding target region in the order of the numbers assigned in the number assigning step, setting the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel, setting, as prediction reference pixels, pixels that are adjacent to pixels included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculating the predicted value of the encoding target region based on at least one representative value of the regions having the prediction reference pixels; a difference value calculating step of calculating a difference value by subtracting the predicted value calculated in the predicted value calculating step from the representative value determined in the representative value determining step; and an encoding step of arranging and encoding the difference values calculated in the difference value calculating step in the order of the numbers assigned in the number assigning step, thereby generating encoded data of the image.
  • the encoding method according to the present invention has the same effects as the encoding apparatus according to the present invention.
  • it is desirable that the encoding apparatus according to the present invention further include transmission means for associating the encoded data of the image generated by the encoding means with region information defining the plurality of regions, and transmitting them to the outside.
  • according to the above configuration, the transmission means transmits the encoded data of the image generated by the encoding means and the region information defining the plurality of regions in association with each other. Therefore, a device that has received the encoded data and the region information achieves the additional effect of being able to accurately decode the received encoded data by dividing the image into the plurality of regions based on the region information.
  • it is desirable that the encoding means encode the difference value by a variable length encoding method in which the code word is shorter the closer the value to be encoded is to 0.
  • according to the above configuration, the encoding means encodes the difference value by a variable length encoding method in which the code word is shorter the closer the value to be encoded is to 0.
  • when the predicted value of the encoding target region calculated by the predicted value calculating means approximates the representative value of the encoding target region (that is, when the prediction accuracy of the predicted value calculating means is high), the difference value is a very small value. Therefore, when the prediction accuracy of the predicted value calculating means is high, the encoding means achieves the additional effect of further reducing the amount of encoded data by encoding the difference value using the variable length encoding method.
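  One common family of codes with this property maps signed values to non-negative integers and then applies an order-0 Exponential-Golomb code. The sketch below is an illustration of such a code only; the embodiment does not mandate this particular code, and the function names are hypothetical:

```python
def signed_to_unsigned(v):
    """Map a signed difference value to a non-negative integer so that
    values near 0 get small codes: 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ..."""
    return 2 * v if v >= 0 else -2 * v - 1

def exp_golomb(n):
    """Order-0 Exponential-Golomb code word for a non-negative integer."""
    bits = bin(n + 1)[2:]                    # binary representation of n+1
    return "0" * (len(bits) - 1) + bits      # prefix of len-1 zeros, then n+1

for v in (0, -1, 1, -2, 2):
    print(v, exp_golomb(signed_to_unsigned(v)))
```

  With this mapping the code words for 0, ±1, ±2 have lengths 1, 3, 3, 5, 5 bits, so accurate prediction (differences clustered around 0) translates directly into fewer bits.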
  • it is desirable that the prediction reference pixels used by the predicted value calculating means be a pixel group including: the pixel immediately before the representative pixel in raster scan order; the pixels on the scan line one before the representative pixel in raster scan order that are adjacent to pixels included in the encoding target region on the same scan line as the representative pixel; and the pixel on the scan line one before the representative pixel in raster scan order that is adjacent to the pixel next in raster scan order after the last pixel, included in the encoding target region, of the same scan line as the representative pixel.
  • that is, the pixel immediately before the representative pixel in raster scan order is the pixel adjacent to the representative pixel in the left direction.
  • a pixel on the scan line one before the representative pixel in raster scan order that is adjacent to a pixel included in the encoding target region on the same scan line as the representative pixel is a pixel adjacent to that pixel in the upward direction.
  • the pixel on the previous scan line that is adjacent to the last pixel, included in the encoding target region, of the same scan line as the representative pixel is the pixel adjacent to it in the diagonally upper right direction.
  • in other words, pixels in three directions, namely the pixels adjacent to the encoding target region in the left direction, the upward direction, and the upper right direction, are used as prediction reference pixels. Therefore, since the predicted value is calculated with reference to pixels that precede the encoding target region in raster scan order and lie in multiple directions, the predicted value can be predicted with high accuracy.
  • it is desirable that the prediction reference pixels used by the predicted value calculating means be three pixels: the pixel immediately before the representative pixel in raster scan order; any one of the pixels on the scan line one before the representative pixel in raster scan order that are adjacent to pixels included in the encoding target region on the same scan line as the representative pixel; and the pixel on the scan line one before the representative pixel in raster scan order that is adjacent to the pixel next in raster scan order after the last pixel, included in the encoding target region, of the same scan line as the representative pixel.
  • according to the above configuration, the prediction reference pixels are three pixels: the pixel immediately before the representative pixel in raster scan order; any one of the pixels on the scan line one before the representative pixel in raster scan order that are adjacent to pixels included in the encoding target region on the same scan line as the representative pixel; and the pixel on the scan line one before the representative pixel in raster scan order that is adjacent to the pixel next in raster scan order after the last pixel, included in the encoding target region, of the same scan line as the representative pixel.
  • that is, the prediction reference pixels are three pixels in three directions: the pixel adjacent to the encoding target region in the left direction, the pixel adjacent in the upward direction, and the pixel adjacent to the upper right. Therefore, since the prediction reference pixels lie in multiple directions with respect to the encoding target region and form as small a pixel group as possible (three pixels), there is an additional effect that the processing load for calculating the predicted value is reduced and the predicted value can be predicted with high accuracy.
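  A simplified Python sketch of a three-direction median predictor follows. It is an illustration under stated assumptions: for brevity it takes the three neighbours of the representative pixel itself (left, up, upper right) rather than the exact reference-pixel definition above, and all names are hypothetical:

```python
def predicted_value(segment_map, values, rep_y, rep_x):
    """Median predictor over up to three reference pixels.

    segment_map:   2-D list of segment numbers (raster-scan numbering).
    values:        dict mapping already-encoded segment numbers to their
                   representative values.
    (rep_y, rep_x): position of the current segment's representative pixel
                   (its first pixel in raster scan order).
    Only reference pixels whose segments are already encoded (i.e. carry
    an earlier number) contribute to the prediction.
    """
    h, w = len(segment_map), len(segment_map[0])
    candidates = [(rep_y, rep_x - 1),       # left
                  (rep_y - 1, rep_x),       # up
                  (rep_y - 1, rep_x + 1)]   # upper right
    refs = [values[segment_map[y][x]]
            for (y, x) in candidates
            if 0 <= y < h and 0 <= x < w and segment_map[y][x] in values]
    if not refs:
        return 0   # no reference pixel exists, e.g. for the first segment
    refs.sort()
    return refs[len(refs) // 2]   # middle element (upper median if even count)

seg = [[0, 1, 1],
       [2, 2, 1]]
# Segments 0 and 1 are already encoded; predict segment 2, whose
# representative pixel is at row 1, column 0 (no left neighbour).
print(predicted_value(seg, {0: 100, 1: 110}, 1, 0))  # 110
```

  Taking the median rather than a single neighbour is what gives the robustness against one outlying reference value discussed below.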
  • in the encoding apparatus according to the present invention, it is desirable that the predicted value calculating means set the median of the representative values of the regions having the prediction reference pixels as the predicted value of the encoding target region.
  • the predicted value calculation means sets the median of the representative values of the region having the predicted reference pixel as the predicted value of the encoding target region.
  • in general, the representative value of the encoding target region and the representative values of the regions having the prediction reference pixels are approximate, but it is also possible that the representative value of a region having a certain prediction reference pixel is significantly different from the representative value of the encoding target region. In such a case, if the representative value of a region having an arbitrary prediction reference pixel were used as-is as the predicted value of the encoding target region, then when that certain prediction reference pixel is selected, the predicted value would differ greatly from the representative value of the encoding target region, and the accuracy of the predicted value would decrease.
  • in this respect, according to the above configuration, since the median of the representative values of the regions having the prediction reference pixels is set as the predicted value of the encoding target region, even when the representative value of a region having a certain prediction reference pixel is significantly different from the representative value of the encoding target region, there is an additional effect that the predicted value can be predicted with stable accuracy.
  • alternatively, it is desirable that the predicted value calculating means use the average of the representative values of the regions having the prediction reference pixels as the predicted value of the encoding target region.
  • according to the above configuration, the predicted value calculating means sets the average of the representative values of the regions having the prediction reference pixels as the predicted value of the encoding target region. Therefore, even when the representative value of a region having a certain prediction reference pixel is significantly different from the representative value of the encoding target region, there is an additional effect that the predicted value can be predicted with stable accuracy.
  • the predicted value calculation means use any one of the representative values of the region having the prediction reference pixel as the predicted value of the encoding target region.
  • the predicted value calculation means sets one of the representative values of the region having the predicted reference pixel as the predicted value of the encoding target region.
  • in a case where the accuracy of the predicted value does not decrease even when the representative value of a region having a certain prediction reference pixel that is significantly different from the representative value of the encoding target region is used as the predicted value of the encoding target region, setting any one of the representative values of the regions having the prediction reference pixels as the predicted value requires less processing than calculating the median or the average of those representative values. Therefore, the above configuration achieves the additional effect of being able to reduce the processing load for calculating the predicted value while maintaining the accuracy of the predicted value.
  • it is desirable that the transmission means further associate, in addition to the encoded data of the image and the region information, predicted value calculation method information indicating the predicted value calculation method executed by the predicted value calculating means, and transmit them to the outside.
  • according to the above configuration, the transmission means associates, in addition to the encoded data of the image and the region information, predicted value calculation method information indicating the predicted value calculation method executed by the predicted value calculating means, and transmits them to the outside. Therefore, even if a device that has received the encoded data, the region information, and the predicted value calculation method information does not know in advance the predicted value calculation method executed by the predicted value calculating means, it achieves the additional effect of being able to accurately decode the received encoded data by calculating the predicted value based on the predicted value calculation method information.
  • in the encoding apparatus according to the present invention, it is desirable that, for the encoding target region whose number assigned by the number assigning means is the earliest, the encoding means encode the representative value of the encoding target region by a fixed-length encoding method instead of encoding the difference value by the variable length encoding method.
  • according to the above configuration, instead of encoding the difference value of the encoding target region whose number assigned by the number assigning means is the earliest by the variable length encoding method, the encoding means encodes the representative value of that encoding target region by a fixed-length encoding method.
  • since the representative pixel of the encoding target region with the earliest number assigned by the number assigning means is located at the edge of the image, there is no pixel preceding the representative pixel in raster scan order. Consequently, the difference value of the earliest encoding target region can become a very large value, and if it were encoded by the variable length encoding method, the code amount would become very large. According to the above configuration, since the representative value of the earliest encoding target region is encoded by a fixed-length encoding method, there is an additional effect that the code amount can be kept small.
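  The mixed scheme above (fixed-length code for the first region's representative value, variable-length codes for the remaining difference values) might be sketched as follows. This is an illustrative Python sketch with hypothetical names; the toy unary-style code stands in for whatever variable-length code the embodiment selects:

```python
def encode_stream(representatives, diffs, bits=8):
    """Sketch of the mixed encoding.

    representatives: representative values in segment-number order; only the
                     first one is written, with a fixed-length `bits`-bit code.
    diffs:           difference values for segments 1..N-1 (segment 0 has no
                     prediction reference, so it carries no difference value).
    Returns the concatenated bit string.
    """
    def fixed(v):
        return format(v, "0{}b".format(bits))   # fixed-length binary code

    def vlc(v):                                 # toy variable-length code:
        n = 2 * v if v >= 0 else -2 * v - 1     # signed -> non-negative
        return "0" * n + "1"                    # small values -> short words

    return fixed(representatives[0]) + "".join(vlc(d) for d in diffs)

print(encode_stream([200, 201, 199], [1, -2]))  # 110010000010001
```

  The fixed-length slot caps the cost of the unpredictable first value, while every later region still benefits from the short code words that accurate prediction makes possible.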
  • in the encoding apparatus according to the present invention, it is desirable that the dividing means divide the entire area of the texture image into a plurality of regions with a division pattern such that, for each region, the difference between the average value calculated from the pixel values of the pixel group included in the region and the average value calculated from the pixel values of the pixel group included in a region adjacent to the region is equal to or less than a predetermined threshold value, and divide the entire area of the distance image into a plurality of regions with the same division pattern.
  • in general, when a certain region in the texture image is composed of a pixel group of pixels of similar colors, the pixel group included in the corresponding region in the distance image tends to have entirely or substantially the same distance value. That is, the distance value becomes substantially constant in each region in the distance image.
  • therefore, by having the dividing means divide the distance image with the same division pattern as the texture image, and having the representative value determining means determine a representative value from the pixel values of the pixels included in each region, there is an additional effect that the information amount of the distance image can be reduced while generating data from which the distance image can be restored with high accuracy.
  • it is preferable that the encoding apparatus according to the present invention further include transmission means for associating the encoded data of the image generated by the encoding means with encoded data of the texture image obtained by encoding the texture image, and transmitting them to the outside.
  • according to the above configuration, the transmission means associates the encoded data of the image generated by the encoding means with the encoded data of the texture image obtained by encoding the texture image, and transmits them to the outside. Therefore, a device that has received the encoded data of the image and the encoded data of the texture image can divide the distance image into the plurality of regions by dividing the texture image obtained from its encoded data with the division pattern. Therefore, there is an additional effect that the received encoded data of the image can be accurately decoded based on the encoded data of the texture image.
  • a decoding device according to the present invention decodes encoded data of an image that includes, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value that is the difference between a representative value of the pixel values of the pixels included in the region and a predicted value of the representative value of the region, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each predicted value being calculated by setting the region as the encoding target region in the order of the numbers, setting the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel, setting, as prediction reference pixels, pixels that are adjacent to pixels included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and basing the calculation on at least one of them. The decoding device includes: dividing means for dividing the entire area of the image into a plurality of regions based on region information defining the plurality of regions; decoding means for decoding the encoded data and generating the difference values arranged in order; number assigning means for assigning numbers in raster scan order to the plurality of regions divided by the dividing means; allocating means for allocating the difference values, in order from the top, to the plurality of regions in the order of the numbers assigned by the number assigning means; predicted value calculating means for setting each region as the decoding target region in the order of the numbers assigned by the number assigning means, setting the first pixel in raster scan order among the pixels included in the decoding target region as the representative pixel, setting, as prediction reference pixels, pixels that are adjacent to pixels included in the decoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculating the predicted value of the decoding target region based on the pixel value of at least one of the prediction reference pixels; and pixel value setting means for calculating, for each decoding target region, the pixel value of the decoding target region by adding the difference value allocated by the allocating means to the predicted value calculated by the predicted value calculating means, and setting the pixel values of all the pixels included in the decoding target region to the calculated pixel value. The predicted value calculating means and the pixel value setting means repeatedly execute the above processing for each decoding target region in the order of the numbers, thereby restoring the pixel values of the image.
  • the decoding unit decodes the encoded data and generates difference values arranged in order.
  • the allocating means allocates the difference values, in order from the top, to the plurality of regions, obtained by the dividing means dividing the image based on the region information defining the plurality of regions, in the order of the numbers assigned by the number assigning means in raster scan order.
  • the predicted value calculating means sets each region as the decoding target region in the order of the numbers assigned by the number assigning means, sets the first pixel in raster scan order among the pixels included in the decoding target region as the representative pixel, and sets, as prediction reference pixels, pixels that are adjacent to pixels included in the decoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order.
  • the predicted value calculation means calculates a predicted value of the decoding target region based on the pixel value of at least one pixel among the predicted reference pixels.
  • the pixel value setting means calculates the pixel value of the decoding target area by adding the difference value assigned by the assigning means to the prediction value calculated by the prediction value calculating means for each decoding target area, The pixel values of all the pixels included in the decoding target area are set to the calculated pixel values.
  • the prediction value calculation means and the pixel value setting means repeatedly execute the above processing for each decoding target area in the order of the numbers given by the number assignment means, and restore the pixel values of the image.
  • the decoding target area is the same as the plurality of areas into which the image indicated by the encoded data is divided.
  • furthermore, the representative pixel used when calculating the predicted value of the representative value of each decoding target region, and the prediction reference pixels based on it, can be uniquely specified, and the representative pixel of the decoding target region and the prediction reference pixels based on it can be made the same pixels as the representative pixel of the encoding target region corresponding to the decoding target region and the prediction reference pixels based on it. Therefore, there is an effect that the image indicated by the encoded data can be accurately restored.
  • a decoding method according to the present invention is a decoding method of a decoding device that decodes encoded data of an image including, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value that is the difference between a representative value of the pixel values of the pixels included in the region and a predicted value of the representative value of the region, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each predicted value being calculated by setting the region as the encoding target region in the order of the numbers, setting the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel, and basing the calculation on the prediction reference pixels determined accordingly. The decoding method includes: a dividing step of dividing the entire area of the image into a plurality of regions based on region information defining the plurality of regions; a decoding step of decoding the encoded data and generating the difference values arranged in order; a number assigning step of assigning numbers in raster scan order to the plurality of regions divided in the dividing step; an allocating step of allocating the difference values, in order from the top, to the plurality of regions in the order of the numbers assigned in the number assigning step; a predicted value calculating step of setting each region as the decoding target region in the order of the numbers assigned in the number assigning step, setting the first pixel in raster scan order among the pixels included in the decoding target region as the representative pixel, setting the corresponding prediction reference pixels, and calculating the predicted value of the decoding target region; and a pixel value setting step of calculating the pixel value of each decoding target region by adding the allocated difference value to the calculated predicted value, and setting the pixel values of all the pixels included in the decoding target region to the calculated pixel value. The predicted value calculating step and the pixel value setting step are repeatedly executed for each decoding target region in the order of the numbers, thereby restoring the pixel values of the image.
  • the decoding method according to the present invention has the same operational effects as the decoding device according to the present invention.
  • the decoding device preferably further includes receiving means for receiving the encoded data and the region information from the outside.
  • the receiving means receives the encoded data and the region information from the outside. Therefore, even when the decoding device does not hold the region information, an image can be divided into the plurality of regions based on the region information by acquiring the region information from the outside. Therefore, even if the decoding apparatus does not hold the area information, the received encoded data can be accurately decoded.
  • it is desirable that the receiving means receives encoded data encoded by a variable-length encoding method in which the code word becomes shorter as the value to be encoded is closer to 0.
  • according to this configuration, the code amount of the encoded data is small. Therefore, the decoding apparatus has an additional effect that the processing load for decoding the encoded data can be reduced.
  • it is desirable that the prediction value calculation means uses, as the prediction reference pixels, a pixel group that includes: the pixel immediately preceding the representative pixel in raster scan order on the same scan line within the decoding target area; the pixel on the scan line immediately preceding in raster scan order that is adjacent to the representative pixel; and the pixel on that preceding scan line that is adjacent to the pixel following, in raster scan order, the last pixel of the representative pixel's scan line within the decoding target area.
  • according to this configuration, pixels in three directions relative to the decoding target region (the pixel adjacent on the left, the pixel adjacent above, and the pixel near the upper right) are used as the prediction reference pixels. Since the predicted value is calculated with reference to pixels that precede the decoding target region in raster scan order and lie in multiple directions, the predicted value can be predicted with high accuracy.
  • it is desirable that the prediction reference pixels consist of exactly three pixels: the pixel immediately preceding the representative pixel in raster scan order on the same scan line within the decoding target area; the pixel on the immediately preceding scan line that is adjacent to the representative pixel; and the pixel on the immediately preceding scan line that is adjacent to the pixel following, in raster scan order, the last pixel of the representative pixel's scan line within the decoding target area.
  • according to this configuration, since the prediction reference pixels are located in multiple directions relative to the decoding target region and yet form as small a pixel group as possible (three pixels), there is an additional effect that the processing load of the predicted value calculation is reduced while the predicted value can still be predicted with high accuracy.
  • it is desirable that the predicted value calculation means uses the median value of the prediction reference pixels as the predicted value of the decoding target area.
  • according to this configuration, by using the median of the representative values of the regions having the prediction reference pixels as the predicted value of the decoding target area, there is an additional effect that the predicted value can be predicted with stable accuracy.
  • it is desirable that the predicted value calculation means uses the average of the pixel values of the prediction reference pixels as the predicted value of the decoding target region.
  • according to this configuration, by using the average of the representative values of the regions having the prediction reference pixels as the predicted value of the decoding target area, there is an additional effect that the predicted value can be predicted with stable accuracy.
  • it is desirable that the predicted value calculation means uses the pixel value of any one pixel included in the prediction reference pixels as the predicted value of the decoding target region.
  • this configuration is suitable when it does not reduce the accuracy of the predicted value; in that case, there is an additional effect that the processing load for calculating the predicted value is reduced while its accuracy is maintained.
  • it is desirable that the receiving means further receives, in addition to the encoded data and the region information of the image, prediction value calculation method information indicating the prediction value calculation method to be executed by the prediction value calculation means, and that the prediction value calculation means calculates the predicted value based on the calculation method indicated by the received prediction value calculation method information.
  • according to this configuration, even when the decoding device does not know in advance which prediction value calculation method was used, it can calculate the predicted value based on the prediction value calculation method information, so there is an additional effect that the received encoded data can be accurately decoded.
  • it is desirable that, when the first code word in the encoded data is obtained by encoding the representative value of the earliest encoding target area by a fixed-length encoding method, the decoding means decodes the first code word of the encoded data by the fixed-length encoding method, and the pixel value setting means sets the pixel values of all the pixels included in the first area, in the number order assigned by the number assigning means, to the representative value obtained by decoding the first code word.
  • according to this configuration, the decoding apparatus has an additional effect that the processing load for decoding the encoded data can be reduced.
  • it is desirable that the receiving means receives, as the region information, encoded data of a texture image obtained by encoding the texture image, and that the dividing means divides the image using a division pattern that divides the entire area of the texture image decoded from the encoded data of the texture image into a plurality of regions such that, for each region, the difference between the average value calculated from the pixel values of the pixel group included in the region and the average value calculated from the pixel values of the pixel group included in an adjacent region is equal to or less than a predetermined threshold value, and divides the distance image with this same division pattern.
  • the distance value is substantially constant in each area of the distance image divided by the dividing means. Therefore, by using the representative value of each region, the encoded data can be made into data that has a small code amount and can restore the distance image with high accuracy. Therefore, the decoding device can reconstruct the distance image from the encoded data with high accuracy, and can further reduce the processing load for decoding the encoded data.
  • an encoding program that causes a computer to function as each means of the encoding device according to the present invention, a decoding program that causes a computer to function as each means of the decoding device according to the present invention, a computer-readable recording medium on which the encoding program is recorded, and a computer-readable recording medium on which the decoding program is recorded are also included in the scope of the present invention.
  • the data structure of the encoded data of an image according to the present invention includes, for each of a plurality of regions obtained by dividing the entire region of the image by a predetermined division pattern, a difference value that is the difference between a representative value of the pixel values of the pixels included in the region and the predicted value of the representative value of the region, the difference values being arranged in the order of the numbers given to the plurality of regions in raster scan order. The predicted value is calculated by taking the regions as encoding target regions in numerical order, taking the first pixel in raster scan order among the pixels included in an encoding target region as its representative pixel, and referring to pixels that are adjacent to pixels on the same scan line as the representative pixel within the encoding target region and that precede the representative pixel in raster scan order.
  • each block included in the moving image encoding devices 1 and 1A and the moving image decoding devices 2 and 2A may be configured by hardware logic.
  • alternatively, the control of each of the moving image encoding apparatuses 1 and 1A and the moving image decoding apparatuses 2 and 2A may be realized by software using a CPU (Central Processing Unit) as follows.
  • it suffices to supply, to the moving image encoding apparatuses 1 and 1A and the moving image decoding apparatuses 2 and 2A, a recording medium on which the program code (an executable program, an intermediate code program, or a source program) of the control program that realizes each of these controls is recorded so as to be readable by a computer, and to have the apparatuses read and execute the program code recorded on the supplied recording medium.
  • the recording medium for supplying the program code to the moving image encoding apparatuses 1 and 1A and the moving image decoding apparatuses 2 and 2A can be, for example, a tape system such as a magnetic tape or a cassette tape; a disk system including magnetic disks such as a floppy (registered trademark) disk or a hard disk and optical disks such as a CD-ROM, MO, MD, DVD, or CD-R; a card system such as an IC card (including a memory card) or an optical card; or a semiconductor memory system such as a mask ROM, EPROM, EEPROM, or flash ROM.
  • the object of the present invention can also be achieved by supplying the program code to the moving image encoding apparatuses 1 and 1A and the moving image decoding apparatuses 2 and 2A via a communication network.
  • This communication network is not limited to a specific type or form as long as it can supply a program code to the moving image encoding apparatuses 1 and 1A and the moving image decoding apparatuses 2 and 2A.
  • the Internet, intranet, extranet, LAN, ISDN, VAN, CATV communication network, mobile communication network, satellite communication network, etc. may be used.
  • the transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type.
  • wired communication such as IEEE 1394, USB (Universal Serial Bus), power line carrier, cable TV line, telephone line, or ADSL (Asymmetric Digital Subscriber Line) line, as well as wireless communication such as infrared (IrDA or remote control), Bluetooth (registered trademark), 802.11 wireless, HDR, mobile phone network, satellite line, or terrestrial digital network, can also be used.
  • the present invention can be suitably applied to a content generation device that generates 3D-compatible content, a content playback device that plays back 3D-compatible content, and the like.
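The reference pixel selection and median prediction described in the bullets above can be sketched in code. This is a minimal illustration under the assumption that a segment is given as a set of pixel coordinates; all function and variable names are hypothetical, not from the patent:

```python
# Hypothetical decoder-side sketch of the prediction described above. For a
# decoding target segment, the prediction reference pixels are: the pixel left
# of the segment's representative pixel (its first pixel in raster scan order),
# the pixel directly above it, and the pixel above-right of the last pixel of
# the representative pixel's scan line within the segment. The predicted value
# is the median of the values already restored at those positions.

def predict_segment_value(restored, seg_mask, default=128):
    """restored: dict (x, y) -> already-decoded pixel value.
    seg_mask: set of (x, y) pixels forming the decoding target segment."""
    # representative pixel = first pixel of the segment in raster scan order
    rx, ry = min(seg_mask, key=lambda p: (p[1], p[0]))
    # last pixel of the segment on the representative pixel's scan line
    tail_x = max(x for (x, y) in seg_mask if y == ry)
    candidates = [
        (rx - 1, ry),          # pixel immediately preceding in raster scan order
        (rx, ry - 1),          # pixel on the previous scan line, above
        (tail_x + 1, ry - 1),  # pixel above the one after the scan line's tail
    ]
    refs = sorted(restored[p] for p in candidates if p in restored)
    if not refs:
        return default  # e.g. the first segment has no decoded neighbours yet
    return refs[len(refs) // 2]  # median of the available reference values
```

Because all three candidate positions precede the segment's representative pixel in raster scan order, a decoder processing segments in number order has already restored them, which is what makes the prediction reproducible.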

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A moving image encoder apparatus (1) comprises: a range image dividing unit (22) that divides a range image into segments; a range value correcting unit (23) that determines a representative value of each of the segments; a number adding unit (24) that adds numbers to the respective segments in raster scan order; and a predicting/encoding unit (25) that calculates predicted values of the respective segments in numerical order, calculates differential values by subtracting the respective predicted values from the respective representative values of the segments, and that arranges, in numerical order, and encodes the calculated differential values to generate encoded data of the range image.

Description

Encoding device, decoding device, encoding method, decoding method, program, recording medium, and data structure of encoded data
The present invention mainly relates to an encoding device that encodes a distance image (depth image) and a decoding device that decodes a distance image encoded by such an encoding device.
Accurately and efficiently recording the three-dimensional shape of a subject as data is an important theme, and various methods have conventionally been proposed.
One such method records two types of image data in association with each other: a texture image, which is a general two-dimensional image representing the subject space with the colors of each subject and the background, and an image representing the subject space by the distance from the viewpoint to each subject and the background (hereinafter referred to as a "distance image"). More specifically, a distance image is an image that expresses, for each pixel, the distance value (depth value) from the viewpoint to the corresponding point in the subject space.
Such a distance image can be acquired by, for example, a distance measuring device such as a depth camera installed near the camera that records the texture image. Alternatively, a distance image can be obtained by analyzing a plurality of texture images captured by a multi-viewpoint camera, and many such analysis methods have been proposed.
As a standard concerning distance images, the Moving Picture Experts Group (MPEG), a working group of the International Organization for Standardization / International Electrotechnical Commission (ISO/IEC), has defined MPEG-C part 3, which expresses distance values in 256 levels (that is, as 8-bit luminance values). A standard distance image is therefore an 8-bit grayscale image. Since the standard specifies that a higher luminance value is assigned as the distance from the viewpoint becomes shorter, in a standard distance image a subject located nearer appears whiter and a subject located farther appears blacker.
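The 8-bit, near-is-brighter convention just described can be illustrated with a small sketch. The linear mapping and the `z_near`/`z_far` parameters below are assumptions for illustration; the paragraph only fixes the 256-level range and the luminance convention:

```python
# Illustrative quantization of a distance value into an 8-bit luminance per the
# convention described above: nearer points get higher luminance. The linear
# mapping is an assumption; actual mapping parameters would be signaled.

def depth_to_luminance(z, z_near, z_far):
    z = min(max(z, z_near), z_far)        # clamp into the representable range
    t = (z - z_near) / (z_far - z_near)   # 0.0 at the nearest, 1.0 at the farthest
    return round(255 * (1.0 - t))         # near -> white (255), far -> black (0)
```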
If a texture image and a distance image representing the same subject space are obtained, the distance from the viewpoint of each pixel constituting the subject drawn in the texture image is known from the distance image, so the subject can be restored as a three-dimensional shape whose depth is expressed in up to 256 levels. Furthermore, by geometrically projecting the three-dimensional shape onto a two-dimensional plane, the original texture image can be converted into a texture image of the subject space as it would appear if the subject were photographed from another angle within a certain range of the original angle. In other words, since one set of a texture image and a distance image can restore the three-dimensional shape as viewed from an arbitrary angle within a certain range, a free-viewpoint image of the three-dimensional shape can be represented with a small amount of data using at most several sets of texture images and distance images.
Non-Patent Document 1 discloses a technique that can compress and encode video by efficiently eliminating the temporal or spatial redundancy the video contains. When a texture video (a video whose frames are texture images) and a distance video (a video whose frames are distance images) are each encoded by an encoding device using this technique, the redundancy of each video can be eliminated, and the amount of data of each video transmitted to the decoding device can be further reduced.
Here, the present inventor found that the following two characteristics hold between a texture image and a distance image: (1) the edge portions of the subject and the background in the distance image coincide with those in the texture image; and (2) in the distance image, the distance (depth) value is relatively flat inside the edges of the subject and the background (within the regions surrounded by the edges).
Regarding characteristic (1): as long as the texture image contains information by which the subject can be distinguished from its background as an image, the boundary (edge) between the subject and the background is common to the texture video and the distance video. That is, edge information indicating the edge portions of the subject is one major element indicating the correlation between the texture image and the distance image. Regarding characteristic (2): in general, a distance image tends to have lower spatial frequency components than a texture image. For example, even if a person wearing flamboyantly patterned clothes is drawn in the texture image, the depth value of the clothes portion tends to be constant in the distance image. In other words, in a distance image, a single distance value tends to appear over a wider area than in the texture image.
That is, texture images and distance images have the following correlation: when a certain area of the texture image consists of a group of pixels of similar colors, all or almost all of the pixels in the corresponding area of the distance image tend to take the same distance value.
Considering this correlation, if the pixels of the distance image can be partitioned into ranges within which the distance value is constant, the distance value within each range (each partitioned pixel group) becomes substantially constant. For example, by dividing the entire area of the texture image into a plurality of regions so that the difference between the maximum and minimum pixel values of the pixel group included in each region is equal to or less than a predetermined threshold, and dividing the distance image with the same division pattern as the texture image, the distance value becomes substantially constant within each region of the distance image. Each pixel group partitioned so that the distance value is substantially constant (each region formed by dividing the entire area of the texture image and the distance image) is called a segment.
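One way to realize the division rule above (maximum minus minimum pixel value within a region at or below a threshold) is a simple region-growing pass over the texture image. This is an illustrative algorithm, not the patent's specific segmentation method, and the names are hypothetical:

```python
# A minimal region-growing sketch of the segmentation idea above: grow each
# segment from unvisited pixels so that, within a segment, the difference
# between the maximum and minimum texture pixel values stays <= threshold.

def segment_image(pixels, width, height, threshold):
    """pixels: flat list of grayscale values, row-major. Returns a label per pixel."""
    labels = [-1] * (width * height)
    next_label = 0
    for start in range(width * height):
        if labels[start] != -1:
            continue
        lo = hi = pixels[start]
        stack = [start]
        labels[start] = next_label
        while stack:
            i = stack.pop()
            x, y = i % width, i // width
            for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
                if 0 <= nx < width and 0 <= ny < height:
                    j = ny * width + nx
                    v = pixels[j]
                    # absorb the neighbour only if the value range stays small
                    if labels[j] == -1 and max(hi, v) - min(lo, v) <= threshold:
                        lo, hi = min(lo, v), max(hi, v)
                        labels[j] = next_label
                        stack.append(j)
        next_label += 1
    return labels
```

Applying the resulting labels unchanged to the distance image yields segments within which, by the correlation described above, the depth value is expected to be nearly constant.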
By dividing the distance image into segments in this way, there is no need to apply an orthogonal transform or the like to the pixels within a segment when encoding the distance image, so very efficient encoding can be performed. Furthermore, by using a predetermined image segmentation algorithm that uniquely determines the segments of the texture image, there is no need to transmit information about the arrangement or shape of the segments when transmitting the image, which further improves the encoding efficiency.
Thus, by dividing the distance image into segments so that the distance value is substantially constant within each segment, the amount of information within a segment can be compressed, and the distance image can be handled in units of segments rather than pixels. Furthermore, since the distance image is divided based on its corresponding texture image (the texture image at the same time as the distance image), the distance values of neighboring segments in the distance image are often the same or close. Therefore, by exploiting this characteristic and eliminating the spatial redundancy between the segments of the distance image, further information compression is possible.
In Non-Patent Document 1, a texture image is divided into blocks, and spatial redundancy between the blocks is eliminated by intra-picture prediction coding (intra prediction coding). Specifically, the pixels of the texture image are first divided into blocks of 4×4 pixels, 8×8 pixels, 16×16 pixels, and so on. The blocks are then encoded in order from the upper-left block of the image toward the lower-right block. In encoding each block, the values of the pixels in the target block are predicted with reference to the pixels or pixel rows adjacent to the target block inside the blocks adjacent to its left, top, and upper right, which are encoded prior to the target block. The difference obtained by subtracting the predicted value from the actual value of each pixel in the target block is then orthogonally transformed and encoded. If the prediction is accurate, the differences can be expected to be smaller than the actual values themselves, and as a result the number of bits required for encoding can be reduced.
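The block-based intra prediction described in this paragraph can be sketched as follows. This is a deliberately simplified, hypothetical example using a single DC-style prediction (the average of neighbouring pixels); real H.264/AVC intra coding offers several directional modes and then transforms the residual:

```python
# Simplified sketch of intra prediction: a 4x4 block is predicted from the
# already-coded pixels just left of and above it, and only the residual
# (actual - predicted) is passed on to transform and entropy coding.

def intra_residual_4x4(img, bx, by):
    """img: 2D list of pixel rows; (bx, by): top-left corner of a 4x4 block."""
    neighbors = []
    for k in range(4):
        if bx > 0:
            neighbors.append(img[by + k][bx - 1])   # column left of the block
        if by > 0:
            neighbors.append(img[by - 1][bx + k])   # row above the block
    pred = sum(neighbors) // len(neighbors) if neighbors else 128
    # residuals are small when the prediction is accurate, so they encode cheaply
    return [[img[by + r][bx + c] - pred for c in range(4)] for r in range(4)]
```

On a flat image region the residuals are all zero, which is exactly the case where prediction saves the most bits.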
However, the technique described in Non-Patent Document 1 is optimized for texture images, and has the problem that it cannot be applied as-is to a distance image divided into segments as described above.
Specifically, Non-Patent Document 1 divides a texture image into square blocks, whereas the distance image division method devised by the present inventor divides the distance image into segments of arbitrary shape. This is because, with this division method, the encoding efficiency improves as the number of segments decreases; it is therefore desirable that each segment can take a flexible shape without restrictions on its form.
If the unit of image division is a square, the blocks adjacent to the left, top, and upper right of the target block can each be uniquely determined. Furthermore, since a block containing the pixels that the target block references for prediction is guaranteed to be encoded before the target block, the decoding side can reproduce the predicted value. On the other hand, when an image is divided into arbitrary shapes, the segments adjacent to the target segment cannot be uniquely determined, nor can it be determined which segments near the target segment are encoded first. Therefore, even if the technique of Non-Patent Document 1 is applied as-is, the spatial redundancy of a distance image divided into segments cannot be removed.
Moreover, since the above method of dividing a distance image into segments was newly devised by the present inventor, no method for eliminating the spatial redundancy between segments has been considered so far.
The present invention has been made in view of the above problems, and its main object is to realize an encoding device that encodes an image divided into segments of arbitrary shape while eliminating the spatial redundancy between the segments, and a decoding device that decodes a distance image supplied from such an encoding device.
To solve the above problems, an encoding device according to the present invention is an encoding device that encodes an image, comprising: dividing means for dividing the entire area of the image into a plurality of regions; representative value determining means for determining, for each of the plurality of regions divided by the dividing means, a representative value from the pixel values of the pixels included in the region; number assigning means for assigning numbers to the plurality of regions in raster scan order; predicted value calculating means for taking the regions as encoding target regions in the order of the numbers assigned by the number assigning means, taking the first pixel in raster scan order among the pixels included in an encoding target region as its representative pixel, taking as prediction reference pixels those pixels that are adjacent to pixels on the same scan line as the representative pixel within the encoding target region and that precede the representative pixel in raster scan order, and calculating the predicted value of the encoding target region based on at least one representative value of the regions having the prediction reference pixels; difference value calculating means for calculating, for each encoding target region, a difference value by subtracting the predicted value calculated by the predicted value calculating means from the representative value determined by the representative value determining means; and encoding means for arranging the difference values calculated by the difference value calculating means in the order assigned by the number assigning means and encoding them to generate encoded data of the image.
 With the above configuration, the number assigning means assigns numbers in raster-scan order to the plurality of regions into which the dividing means has divided the image. Next, the predicted value calculating means takes each region as the encoding target region in the order of those numbers, takes as the representative pixel the pixel of the encoding target region that comes first in raster-scan order, and takes as prediction reference pixels the pixels that are adjacent to pixels of the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster-scan order. The predicted value calculating means then calculates a predicted value for the encoding target region based on at least one of the representative values of the regions containing the prediction reference pixels. Next, the difference value calculating means calculates, for each encoding target region, a difference value by subtracting the predicted value calculated by the predicted value calculating means from the representative value determined by the representative value determining means. Finally, the encoding means arranges the difference values in the order of the assigned numbers and encodes them, generating encoded data of the image.
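 The encoding flow described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the prediction rule is simplified to the left neighbour of the segment's first raster-scan pixel (standing in for the claim's more general prediction reference pixels), and the fallback value of 0 for a segment with no left neighbour is an assumption; all names are illustrative.

```python
# Sketch of the claimed encoding flow: for each segment, in raster-scan
# number order, predict its representative value from an already-coded
# neighbouring segment and emit only the difference.

def predict(seg_map, representatives, n):
    """Predict segment n's value from the segment containing the pixel
    immediately to the left of segment n's first raster-scan pixel.
    Assumption: segments with no left neighbour are predicted as 0."""
    for y in range(len(seg_map)):
        for x in range(len(seg_map[0])):
            if seg_map[y][x] == n:          # representative pixel of n
                if x == 0:
                    return 0                # assumed fallback
                return representatives[seg_map[y][x - 1]]
    raise ValueError("segment not found")

def encode_representatives(seg_map, representatives):
    """seg_map[y][x] -> segment number assigned in raster-scan order.
    representatives[n] -> representative value of segment n.
    Returns the difference values arranged in segment-number order."""
    diffs = []
    for n in sorted(representatives):       # segment-number order
        diffs.append(representatives[n] - predict(seg_map, representatives, n))
    return diffs
```

For example, a 2x4 image with two segments numbered 0 and 1 and representative values 10 and 12 yields the differences [10, 2]: segment 1 is predicted from segment 0, so only the residual 2 is coded.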
 Consequently, even if the plurality of regions produced by the dividing means have arbitrary shapes, the order of the regions can be uniquely determined. Likewise, the representative pixel used when calculating the predicted value of each region's representative value, and the prediction reference pixels derived from it, can be uniquely identified. The predicted value of an encoding target region, determined from the representative values of neighbouring regions, can therefore be calculated uniquely.
 Thus, even if the plurality of regions produced by the dividing means have arbitrary shapes, the encoding device can eliminate the spatial redundancy between the regions and generate encoded data that is uniquely decodable.
 When decoding encoded data that was produced using the calculated predicted values, the predicted values must be calculated in the same way as at encoding time. That is, the prediction reference pixels for a given region must be the same at encoding and at decoding. The prediction reference pixels for a region must therefore already be decoded before that region is processed; in other words, they must be encoded first.
 By choosing as prediction reference pixels only pixels that precede the pixels of the encoding target region in raster-scan order, as described above, it is guaranteed that at decoding time the encoded data can be decoded correctly from the beginning, in order and without gaps. This has the further effect that the encoding process can be performed efficiently and the amount of memory used during encoding can be reduced.
 To solve the above problems, an encoding method according to the present invention is an encoding method for an encoding device that encodes an image, comprising, in the encoding device: a dividing step of dividing the entire area of the image into a plurality of regions; a representative value determining step of determining, for each of the plurality of regions produced in the dividing step, a representative value from the pixel values of the pixels included in that region; a number assigning step of assigning numbers to the plurality of regions in raster-scan order; a predicted value calculating step of taking each region as an encoding target region in the order of the numbers assigned in the number assigning step, taking as the representative pixel the pixel of the encoding target region that comes first in raster-scan order, taking as prediction reference pixels the pixels that are adjacent to pixels of the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster-scan order, and calculating a predicted value for the encoding target region based on at least one of the representative values of the regions containing the prediction reference pixels; a difference value calculating step of calculating, for each encoding target region, a difference value by subtracting the predicted value calculated in the predicted value calculating step from the representative value determined in the representative value determining step; and an encoding step of arranging the difference values calculated in the difference value calculating step in the order of the numbers assigned in the number assigning step and encoding them to generate encoded data of the image.
 With the above configuration, the encoding method according to the present invention provides the same effects as the encoding device according to the present invention.
 To solve the above problems, a decoding device according to the present invention decodes encoded data of an image that contains, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value between a representative value of the pixel values of the pixels included in that region and a predicted value of that representative value, where the difference values are arranged in the order of numbers assigned to the plurality of regions in raster-scan order, and where each predicted value was calculated by taking each region as an encoding target region in the order of those numbers, taking as the representative pixel the pixel of the encoding target region that comes first in raster-scan order, taking as prediction reference pixels the pixels that are adjacent to pixels of the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster-scan order, and using at least one of the representative values of the regions containing the prediction reference pixels. The decoding device comprises: dividing means for dividing the entire area of the image into a plurality of regions based on region information defining those regions; decoding means for decoding the encoded data and generating the ordered difference values; number assigning means for assigning numbers in raster-scan order to the plurality of regions produced by the dividing means; assigning means for assigning the difference values, from the first onward, to the plurality of regions in the order of the numbers assigned by the number assigning means; predicted value calculating means for taking each region as a decoding target region in the order of those numbers, taking as the representative pixel the pixel of the decoding target region that comes first in raster-scan order, taking as prediction reference pixels the pixels that are adjacent to pixels of the decoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster-scan order, and calculating a predicted value for the decoding target region based on the pixel value of at least one of the prediction reference pixels; and pixel value setting means for calculating, for each decoding target region, the pixel value of the decoding target region by adding the difference value assigned by the assigning means to the predicted value calculated by the predicted value calculating means, and setting the pixel values of all pixels included in the decoding target region to the calculated value. The predicted value calculating means and the pixel value setting means repeat this processing for each decoding target region in the order of the numbers, thereby restoring the pixel values of the image.
 With the above configuration, the decoding means decodes the encoded data and generates the ordered difference values. The assigning means assigns the difference values, from the first onward, to the plurality of regions into which the dividing means has divided the image based on the region information, in the order of the numbers assigned by the number assigning means in raster-scan order. Next, the predicted value calculating means takes each region as the decoding target region in the order of those numbers, takes as the representative pixel the pixel of the decoding target region that comes first in raster-scan order, and takes as prediction reference pixels the pixels that are adjacent to pixels of the decoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster-scan order. The predicted value calculating means then calculates a predicted value for the decoding target region based on the pixel value of at least one of the prediction reference pixels. The pixel value setting means calculates, for each decoding target region, the pixel value of the decoding target region by adding the difference value assigned by the assigning means to the predicted value, and sets the pixel values of all pixels included in the decoding target region to the calculated value. The predicted value calculating means and the pixel value setting means repeat this processing for each decoding target region in the order of the assigned numbers, restoring the pixel values of the image.
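 The decoding flow can be sketched as the mirror image of the encoding sketch. Again this is a simplified illustration, not the claimed implementation: the prediction rule is the left neighbour of the segment's first raster-scan pixel, the fallback of 0 is an assumption, and names are illustrative. Because every prediction reference pixel precedes the region in raster-scan order, the referenced segment always carries a smaller number and has already been reconstructed when it is needed.

```python
# Sketch of the claimed decoding flow: in segment-number order, recompute
# the encoder's prediction, add back the transmitted difference, and set
# every pixel of the segment to the recovered value.

def decode_representatives(seg_map, diffs):
    """seg_map[y][x] -> segment number (the region information);
    diffs[n] -> difference value of segment n, in segment-number order.
    Returns the reconstructed image as a 2-D list of pixel values."""
    h, w = len(seg_map), len(seg_map[0])
    recovered = {}                          # segment number -> value
    for n, diff in enumerate(diffs):        # segment-number order
        pred = 0                            # assumed fallback
        for y in range(h):                  # locate representative pixel
            for x in range(w):
                if seg_map[y][x] == n:
                    if x > 0:               # left neighbour already decoded
                        pred = recovered[seg_map[y][x - 1]]
                    break
            else:
                continue
            break
        recovered[n] = pred + diff
    # set all pixels of each segment to the recovered representative value
    return [[recovered[seg_map[y][x]] for x in range(w)] for y in range(h)]
```

Feeding back the differences produced by the encoding sketch reproduces the per-segment values exactly, which is the unique-decodability property argued above.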
 The partition into decoding target regions is therefore the same as the partition into the plurality of regions of the image represented by the encoded data. In addition, the representative pixel used when calculating the predicted value of each decoding target region, and the prediction reference pixels derived from it, can be uniquely identified, and they coincide with the representative pixel and prediction reference pixels of the corresponding encoding target region. The image represented by the encoded data can therefore be restored accurately.
 To solve the above problems, a decoding method according to the present invention is a decoding method for a decoding device that decodes encoded data of an image containing, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value between a representative value of the pixel values of the pixels included in that region and a predicted value of that representative value, where the difference values are arranged in the order of numbers assigned to the plurality of regions in raster-scan order, and where each predicted value was calculated by taking each region as an encoding target region in the order of those numbers, taking as the representative pixel the pixel of the encoding target region that comes first in raster-scan order, taking as prediction reference pixels the pixels that are adjacent to pixels of the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster-scan order, and using at least one of the representative values of the regions containing the prediction reference pixels. The method comprises, in the decoding device: a dividing step of dividing the entire area of the image into a plurality of regions based on region information defining those regions; a decoding step of decoding the encoded data and generating the ordered difference values; a number assigning step of assigning numbers in raster-scan order to the plurality of regions produced in the dividing step; an assigning step of assigning the difference values, from the first onward, to the plurality of regions in the order of the numbers assigned in the number assigning step; a predicted value calculating step of taking each region as a decoding target region in the order of those numbers, taking as the representative pixel the pixel of the decoding target region that comes first in raster-scan order, taking as prediction reference pixels the pixels that are adjacent to pixels of the decoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster-scan order, and calculating a predicted value for the decoding target region based on the pixel value of at least one of the prediction reference pixels; and a pixel value setting step of calculating, for each decoding target region, the pixel value of the decoding target region by adding the difference value assigned in the assigning step to the predicted value calculated in the predicted value calculating step, and setting the pixel values of all pixels included in the decoding target region to the calculated value. The predicted value calculating step and the pixel value setting step are repeated for each decoding target region in the order of the numbers, thereby restoring the pixel values of the image.
 With the above configuration, the decoding method according to the present invention provides the same effects as the decoding device according to the present invention.
 As described above, the encoding device according to the present invention can eliminate the spatial redundancy between the regions into which an image is divided and generate uniquely decodable encoded data, even when those regions have arbitrary shapes.
 The decoding device according to the present invention can, in turn, accurately restore the image represented by such encoded data.
 Still other objects, features, and advantages of the present invention will be fully understood from the description below. The benefits of the present invention will also become apparent from the following description taken together with the accompanying drawings.
 FIG. 1 is a block diagram showing the configuration of a video encoding device according to an embodiment of the present invention.
 FIG. 2 is a flowchart showing the operation of the video encoding device of FIG. 1.
 FIG. 3 is a diagram showing a specific example of a color texture image input to the video encoding device of FIG. 1.
 FIG. 4 is a diagram showing a specific example of a distance image input to the video encoding device of FIG. 1, namely the distance image input as a pair with the texture image of FIG. 3.
 FIG. 5 is a diagram showing the distribution of the segments that the video encoding device of FIG. 1 defines from the texture image of FIG. 3.
 FIG. 6 is a diagram showing, for each segment of FIG. 5, the segment boundary portions for which the image division processing unit of the video encoding device of FIG. 1 outputs coordinate values as position information to the subsequent stage.
 FIG. 7 shows twelve pixels, three rows by four columns, forming a partial area of a texture image; FIGS. 7(a) and 7(b) show cases where two pixels are vertically or horizontally adjacent, and FIG. 7(c) shows a case where two pixels touch at only one point.
 FIG. 8 is a diagram showing the order in which the video encoding device of FIG. 1 scans a texture image to determine the segment number assigned to each segment.
 FIG. 9 is a diagram schematically showing the segment numbers assigned to the segments defined from the texture image of FIG. 3.
 FIG. 10 is a diagram schematically showing data in which, for each segment (region) defined by dividing the entire area of the texture image, the representative distance value of the corresponding segment in the distance image is associated with the segment number assigned in raster-scan order by the video encoding device of FIG. 1.
 FIG. 11 is a flowchart showing an example of the predictive encoding process executed by the predictive encoding unit of the video encoding device of FIG. 1.
 FIGS. 12(a) to 12(e) show twelve pixels, three rows by four columns, forming a partial area of a distance image, and give specific examples of the representative pixel used when the predictive encoding unit predicts a segment's representative value, and of the prediction reference pixels based on that representative pixel.
 FIG. 13 is a diagram showing the correspondence between values to be encoded and codewords in the exponential-Golomb coding method used by the predictive encoding unit.
 FIG. 14 shows an example of the encoded data generated by the predictive encoding unit.
 FIG. 15 is a diagram schematically showing the data structure of a NAL unit.
 FIG. 16 is a block diagram showing the configuration of a video decoding device according to an embodiment of the present invention.
 FIG. 17 is a flowchart showing the operation of the video decoding device of FIG. 16.
 FIG. 18 is a flowchart showing an example of the predictive decoding process executed by the predictive decoding unit of the video decoding device of FIG. 16.
 FIG. 19 is a block diagram showing the configuration of a video encoding device according to another embodiment of the present invention.
 FIG. 20 is a block diagram showing the configuration of a video decoding device according to another embodiment of the present invention.
 FIG. 21 is a flowchart showing an example of the operation in which the video encoding device of FIG. 1 defines a plurality of segments.
 FIG. 22 is a flowchart showing the subroutine of the segment merging process in the flowchart of FIG. 21.
 <Embodiment 1>
 A video encoding device and a video decoding device according to an embodiment of the present invention are described below with reference to FIGS. 1 to 18.
 First, the video encoding device according to this embodiment is described. Roughly speaking, the video encoding device according to this embodiment generates encoded data for each frame of a three-dimensional video by encoding the texture image and the distance image that constitute that frame.
 The video encoding device according to this embodiment encodes texture images using the coding techniques adopted in the H.264/MPEG-4 AVC standard, while encoding distance images using a coding technique specific to the present invention.
 The coding technique specific to the present invention was developed with attention to the correlation between a texture image and a distance image: when a region of the texture image consists of a group of pixels of similar color, all or nearly all of the pixels in the corresponding region of the distance image strongly tend to take the same distance value.
 In the present invention, the values of the pixels constituting the texture image and the distance image are called pixel values. A pixel value in the texture image represents the luminance and color information of the pixel, while a pixel value in the distance image represents the depth information of the pixel. When the two must be distinguished, a pixel value of the texture image is called a color value and a pixel value of the distance image is called a distance value.
 First, the configuration of the video encoding device according to this embodiment is described with reference to FIG. 1. FIG. 1 is a block diagram showing the main configuration of the video encoding device.
 (Configuration of the video encoding device 1)
 As shown in FIG. 1, the video encoding device 1 includes an image encoding unit 11, an image decoding unit (decoding means) 12, a distance image encoding unit 20, and a packaging unit (transmission means) 28. The distance image encoding unit 20 in turn includes an image division processing unit 21, a distance image division processing unit (dividing means) 22, a distance value correcting unit (representative value determining means) 23, a number assigning unit (number assigning means) 24, and a predictive encoding unit (predicted value calculating means, difference value calculating means, encoding means) 25.
 The image encoding unit 11 encodes the texture image #1 by AVC (Advanced Video Coding) encoding as specified in the H.264/MPEG-4 AVC standard.
 The image decoding unit 12 decodes a texture image #1' from the encoded data #11 of the texture image #1.
 The image division processing unit 21 divides the entire area of the texture image #1 into a plurality of segments (regions) and outputs segment information #21 consisting of the position information of each segment. The position information of a segment is information indicating the position of that segment in the texture image #1.
 Given the distance image #2 and the segment information #21, the distance image division processing unit 22 extracts, for each segment of the texture image #1', a distance value set consisting of the distance values of the pixels included in the corresponding segment (region) of the distance image #2. From the segment information #21, the distance image division processing unit 22 then generates segment information #22 in which the distance value set and the position information are associated for each segment.
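 The extraction step above can be sketched as a simple grouping of co-located pixels. This is an illustrative sketch, not the unit's actual implementation; the data layout (nested lists, a dict keyed by segment id) is an assumption for exposition.

```python
# Sketch of the distance image division processing: for each segment
# defined on the texture image, collect the distance values of the
# pixels at the same positions in the distance image.

def distance_value_sets(seg_map, distance_image):
    """seg_map[y][x] -> segment id from the texture segmentation;
    distance_image[y][x] -> distance value at the same position.
    Returns {segment id: list of that segment's distance values}."""
    sets = {}
    for y, row in enumerate(seg_map):
        for x, seg in enumerate(row):
            sets.setdefault(seg, []).append(distance_image[y][x])
    return sets
```

Each resulting list is the "distance value set" from which the next stage derives a single representative value.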
 For each segment of the distance image #2, the distance value correcting unit 23 computes the mode of the segment's distance value set contained in the segment information #22 as the representative value #23a. That is, when segment i of the distance image #2 contains N pixels, the distance value correcting unit 23 computes the mode of those N distance values. Instead of the mode, the distance value correcting unit 23 may compute the mean or the median of the N distance values as the representative value #23a. When the result of such a calculation, for example a mean or a median, is a fractional value, the distance value correcting unit 23 may round it to an integer by truncation, rounding up, or rounding to the nearest integer.
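 The representative-value choices described above (mode by default, with mean or median rounded to an integer as the stated alternatives) can be sketched with the standard library. Note that `round` is one of the rounding options the text allows; truncation or rounding up would be equally valid.

```python
# Sketch of the representative value computation for one segment's
# distance value set, following the alternatives described in the text.
import statistics

def representative(distance_values, method="mode"):
    """Return an integer representative value for a segment."""
    if method == "mode":
        return statistics.mode(distance_values)
    if method == "mean":
        return round(statistics.mean(distance_values))   # round-to-integer
    if method == "median":
        return round(statistics.median(distance_values))
    raise ValueError(f"unknown method: {method}")
```

For example, a segment whose pixels hold the distance values [5, 5, 5, 7] gets the representative value 5 under the default mode-based rule.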
 The distance value correcting unit 23 then replaces the distance value set of each segment in the segment information #22 with the representative value #23a of the corresponding segment, and outputs the result to the number assigning unit 24 as segment information #23.
 The number assigning unit 24 scans the pixels of the distance image in raster-scan order, assigns a segment number #24, in the order scanned, to each segment that is a region defined by the segment information #23, and associates the number with the corresponding representative value #23a contained in the segment information #23.
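 The numbering rule above amounts to giving each segment a number the first time one of its pixels is reached in a raster scan. A minimal sketch, with an assumed dict-based output format:

```python
# Sketch of raster-scan segment numbering: scan top-to-bottom,
# left-to-right, and number each segment label on first encounter.

def number_segments(label_map):
    """label_map[y][x] -> arbitrary segment label from the segmentation.
    Returns {label: segment number assigned in raster-scan order}."""
    numbers = {}
    for row in label_map:                # top-to-bottom
        for label in row:                # left-to-right within a line
            if label not in numbers:     # first pixel of this segment
                numbers[label] = len(numbers)
    return numbers
```

Because the number depends only on the position of each segment's first raster-scan pixel, the encoder and decoder derive identical numberings from the same region information, which is what makes the ordered difference values unambiguous.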
 The predictive encoding unit 25 performs predictive encoding on the M input pairs of representative value #23a and segment number #24, and outputs the resulting encoded data #25 to the packaging unit 28. Specifically, in the order of the segment numbers #24, the predictive encoding unit 25 calculates a predicted value for each segment, subtracts the predicted value from the representative value to obtain a difference value, and encodes the difference value. The predictive encoding unit 25 then arranges the encoded difference values in segment-number order to form the encoded data #25, which it outputs to the packaging unit 28.
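 The brief description of FIG. 13 states that the predictive encoding unit uses an exponential-Golomb code for the difference values, though the exact value-to-codeword table is defined by that figure. As one plausible reading, the following sketch shows the standard signed exponential-Golomb mapping used by H.264 (se(v)): a signed difference v is mapped to a non-negative code number k, and k is written as a unary prefix followed by the binary form of k+1. This is offered as an illustration of the technique, not as the patent's own table.

```python
# Sketch of signed exponential-Golomb coding for a difference value,
# following the H.264 se(v) convention: v > 0 -> k = 2v - 1,
# v <= 0 -> k = -2v, then code k as '0' * m followed by bin(k + 1),
# where m is one less than the bit length of k + 1.

def signed_exp_golomb(v):
    """Return the codeword for signed integer v as a bit string."""
    k = 2 * v - 1 if v > 0 else -2 * v   # signed-to-unsigned mapping
    bits = bin(k + 1)[2:]                # binary of k + 1, no '0b' prefix
    return "0" * (len(bits) - 1) + bits  # leading-zero prefix + value
```

Small differences, which the spatial prediction makes common, get the shortest codewords: 0 codes to a single bit, while +1 and -1 code to three bits each.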
 The packaging unit 28 associates the input encoded data #11 of the texture image #1 with the encoded data #25 of the distance image #2, and outputs them externally as encoded data #28.
 (Operation of the Moving Image Encoding Device 1)
 Next, the operation of the moving image encoding device 1 will be described with reference to FIG. 2. FIG. 2 is a flowchart showing the operation of the moving image encoding device 1. The operation described here is that of encoding the texture image and the distance image of the t-th frame from the beginning of a moving image consisting of many frames. That is, in order to encode the entire moving image, the moving image encoding device 1 repeats the operation described below a number of times corresponding to the number of frames of the moving image. In the following description of the operation, unless otherwise stated, each of the data #1 to #28 is to be interpreted as data of the t-th frame.
 First, the image encoding unit 11 and the distance image division processing unit 22 receive the texture image #1 and the distance image #2, respectively, from outside the moving image encoding device 1 (step S1). As described above, the pair of the texture image #1 and the distance image #2 received from outside are correlated in image content, as can be seen, for example, by comparing the texture image of FIG. 3 with the distance image of FIG. 4.
 Next, the image encoding unit 11 encodes the texture image #1 by the AVC encoding scheme defined in the H.264/MPEG-4 AVC standard, and outputs the resulting encoded data #11 of the texture image to the packaging unit 28 and the image decoding unit 12 (step S2). In step S2, when the texture image #1 is a B picture or a P picture, the image encoding unit 11 encodes the prediction residual between the texture image #1 and a predicted image, and outputs the encoded prediction residual as the encoded data #11.
 The image decoding unit 12 then decodes the texture image #1' from the encoded data #11 and outputs it to the image division processing unit 21 (step S3). Here, the decoded texture image #1' is not exactly identical to the texture image #1 encoded by the image encoding unit 11. This is because the image encoding unit 11 applies a DCT transform and quantization during encoding, and a quantization error arises when the DCT coefficients obtained by the DCT transform are quantized.
 Incidentally, the timing at which the image decoding unit 12 decodes the texture image differs depending on whether or not the texture image #1 is a B picture; this is explained in detail below.
 Specifically, when the texture image #1 is an I picture, the image decoding unit 12 decodes the texture image #1' without performing inter prediction (inter-frame prediction).
 When the texture image #1 is a P picture, the image decoding unit 12 decodes the prediction residual from the encoded data #11. The image decoding unit 12 then decodes the texture image #1' by adding the prediction residual to a predicted image generated using, as reference pictures, the encoded data #11 of one or more frames preceding the t-th frame.
 Furthermore, when the texture image #1 is a B picture, the image decoding unit 12 decodes the prediction residual from the encoded data #11. The image decoding unit 12 then decodes the texture image #1' by adding the prediction residual to a predicted image generated using, as reference pictures, the encoded data #11 of one or more frames preceding the t-th frame and the encoded data #11 of one or more frames following the t-th frame.
 As can be seen from the above, when the texture image #1 of the t-th frame is an I picture or a P picture, the image decoding unit 12 decodes the texture image #1' of the t-th frame immediately after the encoded data #11 of the t-th frame is generated. On the other hand, when the texture image #1 of the t-th frame is a B picture, the image decoding unit 12 decodes the texture image #1' only after the encoding of the texture image #1 of the T-th frame (T > t, the last frame among the reference pictures) has been completed.
 After step S3, the image division processing unit 21 defines a plurality of segments in the input texture image #1' (step S4). Each segment defined by the image division processing unit 21 is a closed region composed of pixels of similar colors (that is, a group of pixels in which the difference between the maximum pixel value and the minimum pixel value (the difference between the maximum color value and the minimum color value) is at most a predetermined threshold).
 The processing of step S4 is explained with a concrete example. FIG. 5 shows the distribution of the segments that the image division processing unit 21 defines from the texture image #1' of FIG. 3. In FIG. 5, each closed region drawn with the same pattern represents one segment.
 In the texture image #1 of FIG. 3, the girl's hair on either side of the parting is drawn in two colors, brown and light brown. As can be seen from FIG. 5, the image division processing unit 21 defines a closed region consisting of pixels of similar colors, such as brown and light brown, as one segment.
 On the other hand, the skin of the girl's face is likewise drawn in two colors, a skin tone and the pink of the cheeks; as can be seen from FIG. 5, however, the image division processing unit 21 defines the skin-tone region and the pink region as separate segments. This is because the skin tone and the pink are not similar colors (that is, the difference between the skin-tone color value and the pink color value exceeds the predetermined threshold).
 After step S4, the image division processing unit 21 generates segment information #21 consisting of position information for each segment and outputs it to the distance image division processing unit 22 (step S5). The position information of a segment may be, for example, the coordinate values of all pixels included in the segment. That is, when the segments are defined from the texture image #1' of FIG. 3, each closed region in FIG. 6 is defined as one segment, and the position information of a segment is the coordinate values of all pixels constituting the closed region corresponding to that segment.
 Here, supplementary remarks about the segments described above are given with reference to FIG. 7. Parts (a) to (c) of FIG. 7 each show twelve pixels, three vertically by four horizontally, constituting a partial region of a texture image. In FIGS. 7(a) to (c), the color of the pixel labeled "A" and the color of the pixel labeled "B" are assumed to be the same or similar colors, while the colors of the pixels in the other ten sub-regions are assumed to be entirely different from both the color of pixel A and the color of pixel B.
 As described above, each segment is a closed region (a group of connected pixels) composed of pixels of similar colors. The definition of this closed region is explained with reference to FIG. 7.
 In the present invention, when the positional relationship between the two pixels is as in FIG. 7(a) or (b), pixel A and pixel B are regarded as connected. That is, pixel A and pixel B are regarded as connected when they touch each other in the vertical or horizontal direction. In other words, pixel A and pixel B are regarded as connected when they share an edge. In this case, pixel A and pixel B therefore form the same segment.
 On the other hand, when the positional relationship between the two pixels is as in FIG. 7(c), pixel A and pixel B are regarded as not connected. That is, pixel A and pixel B are regarded as not connected when they touch each other only diagonally. In other words, pixel A and pixel B are regarded as not connected when they touch only at a single point. In this case, pixel A and pixel B have the same or similar colors but belong to different segments. Needless to say, when pixel A and pixel B do not touch at all, they belong to separate segments.
 In summary, strictly speaking, "two pixels are adjacent" is synonymous with the Manhattan distance between the coordinates of the two pixels being "1", and "two pixels are not adjacent" is synonymous with the Manhattan distance between the coordinates of the two pixels being "2 or more".
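The adjacency criterion above (edge-sharing, i.e. 4-connectivity) can be stated directly in terms of the Manhattan distance; the following is an illustrative sketch (function name `adjacent` is hypothetical, pixels given as (x, y) coordinates):

```python
def adjacent(p, q):
    """Two pixels are adjacent (connected, FIGS. 7(a)/(b)) iff the Manhattan
    distance between their coordinates is exactly 1, i.e. they share an edge."""
    (x1, y1), (x2, y2) = p, q
    return abs(x1 - x2) + abs(y1 - y2) == 1

# Edge-sharing pixels are connected; diagonal neighbours (FIG. 7(c)) are not.
print(adjacent((0, 0), (0, 1)))  # True  : vertical neighbours
print(adjacent((0, 0), (1, 1)))  # False : touch only at a corner point
```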
 Based on the above, in the present invention, a group of pixels partitioned so that the distance (depth) value is approximately constant (one of the regions formed by dividing the entire area of the texture image and the distance image), in which the pixels are mutually connected, is referred to as a segment.
 Furthermore, when pixel A and pixel B are in the positional relationship shown in FIG. 7(a) or (b), pixel A is also said to be adjacent to pixel B. When pixel A and pixel B are in any of the positional relationships shown in FIGS. 7(a) to (c), pixel A is also said to be close to pixel B. When a pixel constituting a segment is adjacent to a pixel constituting another segment, that segment is said to be adjacent to the other segment. Likewise, when a pixel constituting a segment is close to a pixel constituting another segment, that segment is said to be close to the other segment.
 After step S5, the distance image division processing unit 22 divides the input distance image #2 into a plurality of segments. Specifically, the distance image division processing unit 22 refers to the input segment information #21 to identify the position of each segment in the texture image #1', and divides the distance image #2 into a plurality of segments using the same division pattern as the segment division pattern of the texture image #1' (in the following, the number of segments is assumed to be M).
 Then, for each segment of the distance image #2, the distance image division processing unit 22 extracts the distance value of each pixel included in that segment as a distance value set. Furthermore, the distance image division processing unit 22 associates the distance value set extracted from each segment with the position information of the corresponding segment included in the segment information #21. The distance image division processing unit 22 outputs the resulting segment information #22 to the distance value correction unit 23 (step S6).
 The distance value correction unit 23 calculates, for each segment of the distance image #2, the mode as the representative value #23a from the distance value set of that segment included in the segment information #22. The distance value correction unit 23 then replaces each of the M distance value sets included in the segment information #22 with the representative value #23a of the corresponding segment, and outputs the result to the number assigning unit 24 as segment information #23 (step S7).
 For each of the M pairs of position information and representative value #23a included in the segment information #23, the number assigning unit 24 associates the representative value #23a with a segment number #24 determined by the position information, and outputs the M pairs of representative values #23a and segment numbers #24 to the predictive encoding unit 25 (step S8). Specifically, based on the segment information #23, for each segment i from 1 to M (M: the number of segments), the number assigning unit 24 associates the segment number "i-1" with the representative value #23a of the i-th segment in raster-scan order. Here, the "i-th segment in raster-scan order" is the segment encountered i-th when the pixels of the distance image or the texture image are scanned in raster-scan order as shown in FIG. 8.
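The numbering rule of step S8 can be sketched as follows; this is an illustrative Python sketch (not from the patent), where `label_map` is a hypothetical row-major map giving each pixel's segment label, and numbers 0 to M-1 are assigned in order of first appearance during the raster scan:

```python
def assign_segment_numbers(label_map):
    """Assign segment numbers #24 in raster-scan order: the i-th distinct
    segment encountered (counting from 1) receives the number i-1."""
    numbers = {}
    for row in label_map:       # raster scan: rows top to bottom,
        for label in row:       # pixels left to right within a row
            if label not in numbers:
                numbers[label] = len(numbers)
    return numbers

# Three segments a, b, c; "a" is scanned first, then "b", then "c".
labels = [["a", "a", "b"],
          ["c", "a", "b"]]
print(assign_segment_numbers(labels))  # {'a': 0, 'b': 1, 'c': 2}
```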
 A specific example is described below with reference to FIG. 9.
 FIG. 9 schematically shows the position of each segment of a distance image that is input to the moving image encoding device 1 together with a texture image such as that shown in FIG. 3. In FIG. 9, one closed region represents one segment.
 In the distance image of FIG. 9, the segment R0 located first in raster-scan order is assigned the segment number "0". The segment R1 located second in raster-scan order is assigned the segment number "1". Similarly, the segments R2 and R3 located third and fourth in raster-scan order are assigned the segment numbers "2" and "3", respectively.
 The number assigning unit 24 then outputs M pairs of representative values #23a and segment numbers #24, of which a specific example is shown in FIG. 10, to the predictive encoding unit 25.
 After step S8, the predictive encoding unit 25 performs predictive encoding based on the input M pairs of representative values #23a and segment numbers #24, and outputs the resulting encoded data #25 to the packaging unit 28 (step S9). Specifically, for each segment in the order of the segment numbers #24, the predictive encoding unit 25 calculates a predicted value for the segment, subtracts the predicted value from the representative value to obtain a difference value, and encodes the difference value. The predictive encoding unit 25 then arranges the encoded difference values in the order of the segment numbers #24 to form the encoded data #25.
 In the present invention, it is assumed that segments close to each other have the same or similar distance values. Based on this assumption, the predicted value of a segment's representative value is derived from the representative values of segments close to that segment. This assumption is reasonable because regions inside the same subject usually differ only slightly in distance value from one another.
 The details of the predictive encoding processing executed by the predictive encoding unit 25 in step S9 are explained with reference to FIG. 11. FIG. 11 is a flowchart showing an example of the predictive encoding processing executed by the predictive encoding unit 25.
 First, the segment number #24 "i" is set to "0" (step S101). The segment whose segment number #24 is "i" is then set as the encoding target segment (encoding target region) (step S102). That is, the segment with the first segment number "0" becomes the encoding target segment.
 Next, from among the pixels included in the encoding target segment, the representative pixel of the encoding target segment, which is used to calculate the predicted value, is identified (step S103). Specifically, the pixel of the encoding target segment that is scanned first in raster-scan order in step S8 (the earliest pixel in raster-scan order) is taken as the representative pixel. Although segments in the present invention can have various shapes, as described above, the pixel scanned first in raster-scan order is uniquely determined for a segment of any shape.
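The uniqueness of the representative pixel follows from the raster-scan order being a total order on pixel coordinates. An illustrative sketch (function name `representative_pixel` is hypothetical; pixels are (x, y) coordinates with y the row index, matching the row-major scan of FIG. 8):

```python
def representative_pixel(segment_pixels):
    """Return a segment's representative pixel: the pixel scanned first in
    raster-scan order, i.e. the one with the smallest (row, column) pair."""
    return min(segment_pixels, key=lambda p: (p[1], p[0]))

# An L-shaped segment: the pixel in the top-most row, and left-most within
# that row, is scanned first regardless of the segment's shape.
seg = [(2, 1), (1, 1), (1, 2), (3, 1)]
print(representative_pixel(seg))  # (1, 1)
```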
 After the representative pixel is identified, the prediction reference pixels are identified based on the representative pixel (step S104). Specifically, pixels that are close to the pixels of the encoding target segment on the same scan line as the representative pixel and that precede the representative pixel in raster-scan order are taken as prediction reference pixels. For example, the prediction reference pixels may be taken as the group of pixels consisting of: the pixel immediately preceding the representative pixel in raster-scan order; the pixels on the scan line immediately preceding that of the representative pixel (in raster-scan order) that are adjacent to the pixels of the encoding target segment on the same scan line as the representative pixel; and the pixel on the scan line immediately preceding that of the representative pixel that is adjacent to the pixel immediately following, in raster-scan order, the last pixel of the encoding target segment on the same scan line as the representative pixel.
 Alternatively, for example, the prediction reference pixels may be taken as three pixels: the pixel immediately preceding the representative pixel in raster-scan order; any one of the pixels on the scan line immediately preceding that of the representative pixel that are adjacent to the pixels of the encoding target segment on the same scan line as the representative pixel; and the pixel on the scan line immediately preceding that of the representative pixel that is adjacent to the pixel immediately following, in raster-scan order, the last pixel of the encoding target segment on the same scan line as the representative pixel.
 A concrete example of identifying the prediction reference pixels based on the representative pixel is now explained with reference to FIG. 12. Parts (a) to (e) of FIG. 12 each show twelve pixels, three vertically by four horizontally, constituting a partial region of a distance image. In FIGS. 12(a) to (e), the pixels labeled "A" (referred to as pixels A) constitute the same segment, while the pixels labeled "B", "C", or "D" (referred to as pixels B, C, and D, respectively) and the blank pixels constitute segments different from the segment RA containing the pixels A. Here, the segments to which the pixels other than the pixels A belong may all be the same segment or may all be mutually different segments. In FIGS. 12(a) to (e), the representative pixel in each case and the pixels on the same scan line as the representative pixel (a scan line being, in this embodiment, a pixel row of the image) are hatched.
 In this embodiment, as shown in FIG. 8, pixels are scanned horizontally from the upper left to the lower right of the image. Therefore, here, a pixel on the same scan line as a given pixel means a pixel in the same row as that pixel. The pixel immediately preceding a given pixel in raster-scan order means the pixel one to the left of that pixel, and the pixel immediately following a given pixel in raster-scan order means the pixel one to the right of that pixel. A pixel adjacent to a given pixel and on the scan line immediately preceding that pixel in raster-scan order means the pixel one above that pixel.
 In the following, the encoding target segment is taken to be the segment RA containing the pixels A, and the identification of the representative pixel and the prediction reference pixels of the segment RA is explained for the examples shown in FIGS. 12(a) to (e).
 First, in the case of FIG. 12(a), the representative pixel is the hatched pixel A located at the top among the pixels A. Here, the pixel B one to the left of the representative pixel, the pixel C one above the representative pixel, and the pixel D diagonally above and to the right of the representative pixel are taken as the prediction reference pixels.
 In the case of FIG. 12(b), the representative pixel is the left one of the hatched pixels A. Here, the pixel B one to the left of the representative pixel, the pixel C one above the representative pixel, and the pixel D diagonally above and to the right of the pixel A one to the right of the representative pixel are taken as the prediction reference pixels.
 In the case of FIG. 12(c), the representative pixel is the leftmost of the hatched pixels A. Here, the pixel B one to the left of the representative pixel, the pixel C one above the representative pixel, and the pixel D diagonally above and to the right of the rightmost pixel A in the same row as the representative pixel are taken as the prediction reference pixels.
 In the case of FIG. 12(d), the representative pixel is the left one of the hatched pixels A. Here, the pixel B one to the left of the representative pixel, the pixel C one above the representative pixel, the pixel C one above the pixel one to the right of the representative pixel, and the pixel D diagonally above and to the right of the pixel one to the right of the representative pixel are taken as the prediction reference pixels.
 In the case of FIG. 12(e), the representative pixel is the leftmost of the hatched pixels A. Here, the pixel B one to the left of the representative pixel, the three pixels C respectively located one above the representative pixel and the other pixels A in the same row as the representative pixel, and the pixel D diagonally above and to the right of the rightmost pixel A in the same row as the representative pixel are taken as the prediction reference pixels.
 In this embodiment, pixels are scanned in the order shown in FIG. 8 and segment numbers #24 are assigned accordingly. Since the segments are encoded in the order indicated by the segment numbers #24, the pixels B, C, and D shown in FIG. 12 are guaranteed to be encoded before the encoding target segment (the pixels A included in it).
 After the prediction reference pixels are identified in step S104, the predicted value of the representative value of the encoding target segment is calculated based on the representative values of the segments containing the prediction reference pixels (step S105). For example, when the prediction reference pixels are the pixels B, C, and D as in the example shown in FIG. 12(a), the predicted value Z'_A of the representative value Z_A of the segment RA is calculated based on the representative value Z_B of the segment RB containing the pixel B, the representative value Z_C of the segment RC containing the pixel C, and the representative value Z_D of the segment RD containing the pixel D. Here, Z'_A may be the median of Z_B, Z_C, and Z_D. Z'_A may instead be the average of Z_B, Z_C, and Z_D, or any one of Z_B, Z_C, and Z_D.
 After the predicted value Z'_A of the encoding target segment is calculated, the difference value ΔZ_A is calculated by subtracting the predicted value Z'_A from the representative value Z_A of the encoding target segment (step S106). The calculated difference value ΔZ_A serves as the value representing the distance values of the pixels included in the encoding target segment. As described above, since a distance value has 256 levels and takes a value from 0 to 255, ΔZ_A can take values from -255 to +255.
 Next, the calculated difference value is encoded by a variable-length coding method in which codewords are shorter the closer the value is to 0 (step S107). In this embodiment, the difference value is encoded using the exponential Golomb coding method, which is one such variable-length coding method. FIG. 13 shows the correspondence between difference values and codewords in the exponential Golomb coding method; the right column shows the difference values and the left column shows the codewords obtained by applying exponential Golomb coding to them. As shown in FIG. 13, in the exponential Golomb coding method, the closer the difference value is to 0, that is, the closer the predicted value is to the representative value approximating the actual distance values, the shorter the assigned codeword. The distance image can therefore be transmitted with a reduced amount of information.
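Steps S105 to S107 can be sketched together as follows. This is an illustrative sketch, not the patent's own implementation: it uses the median predictor (one of the options named above) and the standard order-0 signed exponential-Golomb code as used in H.264; the exact sign mapping of the patent's FIG. 13 table is not reproduced here and may differ.

```python
from statistics import median

def predict(z_b, z_c, z_d):
    """Step S105 (one allowed choice): predicted value Z'_A as the median of
    the reference segments' representative values Z_B, Z_C, Z_D."""
    return median([z_b, z_c, z_d])

def signed_exp_golomb(v):
    """Order-0 signed exponential-Golomb codeword for integer v, as a string.
    Values near 0 receive the shortest codewords."""
    k = 2 * v - 1 if v > 0 else -2 * v   # map signed value to unsigned index
    bits = bin(k + 1)[2:]                # binary representation of k + 1
    return "0" * (len(bits) - 1) + bits  # leading zeros, then the binary part

def encode_difference(z_rep, z_pred):
    """Steps S106/S107: difference = representative - predicted, then encode."""
    return signed_exp_golomb(z_rep - z_pred)

for v in (0, 1, -1, 2, -2):
    print(v, signed_exp_golomb(v))
# 0 -> "1", 1 -> "010", -1 -> "011", 2 -> "00100", -2 -> "00101"

# A segment with Z_A = 13 whose neighbours have Z_B = 12, Z_C = 13, Z_D = 15:
print(encode_difference(13, predict(12, 13, 15)))  # difference 0 -> "1"
```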
 After step S107, it is checked whether 'i', the segment number #24, has reached 'M−1' (step S108). If i ≠ M−1 (NO in step S108), the segment number #24 'i' is set to 'i+1' (step S109), and the processing of steps S102 to S107 is executed. If i = M−1 (YES in step S108), the process proceeds to step S110.
 In other words, after step S107 it is checked whether all M segments have been encoded. If not, the processing of steps S102 to S107 is executed in order of segment number #24. Once difference values have been calculated and encoded for all segments, the process proceeds to step S110.
 In step S110, the encoded difference values are arranged in order of segment number #24 to generate encoded data #25. A specific example of encoded data #25 is shown in FIG. 14, in which the difference values '3', '−4', '−1', and '0' are encoded in that order. In this way, the predictive encoding unit 25 compresses the input data to generate encoded data #25, and outputs the generated encoded data #25 to the packaging unit 28 (step S9).
 After step S9, the packaging unit 28 integrates the encoded data #11 output by the image encoding unit 11 in step S2 with the encoded data #25 output by the predictive encoding unit 25 in step S9, and transmits the resulting encoded data #28 to the moving image decoding apparatus described later (step S10).
 Specifically, the packaging unit 28 integrates the texture-image encoded data #11 and the distance-image encoded data #25 in accordance with the NAL unit format defined in the H.264/MPEG-4 AVC standard. More concretely, the integration of encoded data #11 and encoded data #25 is performed as follows.
 FIG. 15 schematically shows the structure of a NAL unit. As shown in FIG. 15, a NAL unit consists of three parts: a NAL header part, an RBSP part, and an RBSP trailing-bit part.
 The packaging unit 28 stores a prescribed numerical value I in the nal_unit_type field (an identifier indicating the type of the NAL unit) of the NAL header part of the NAL unit corresponding to each slice (main slice) of the main picture. This prescribed value I indicates that the encoded data #28 was generated according to the encoding method of the present embodiment (that is, the encoding method that encodes the distance image #2 after calculating a difference value for each segment). As the value I, for example, a value defined as 'undefined' or 'reserved for future extension' in the H.264/MPEG-4 AVC standard can be used.
 The packaging unit 28 then stores the encoded data #11 and the encoded data #25 in the RBSP part of the NAL unit corresponding to the main slice. Further, the packaging unit 28 stores RBSP trailing bits in the RBSP trailing-bit part.
 The packaging unit 28 transmits the NAL unit thus obtained to the moving image decoding apparatus as encoded data #28.
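 The packaging steps above can be sketched as below. This is a hedged illustration, not the actual packaging unit 28: the one-byte NAL header layout (forbidden_zero_bit, nal_ref_idc, nal_unit_type) follows the H.264/MPEG-4 AVC standard, but the concrete nal_unit_type value I and the way the #11 and #25 byte strings are concatenated inside the RBSP are assumptions made for illustration.

```python
def make_nal_unit(nal_unit_type, rbsp_payload, nal_ref_idc=3):
    """Build a NAL unit: 1-byte header + RBSP + RBSP trailing bits.

    Header layout (H.264): forbidden_zero_bit(1) | nal_ref_idc(2) |
    nal_unit_type(5).  The RBSP trailing bits are a single '1' bit
    followed by zero bits up to the next byte boundary; since the
    payload here is already byte-aligned, they amount to one 0x80 byte.
    """
    assert 0 <= nal_unit_type < 32 and 0 <= nal_ref_idc < 4
    header = bytes([(nal_ref_idc << 5) | nal_unit_type])
    trailing = bytes([0x80])        # rbsp_stop_one_bit + alignment zeros
    return header + rbsp_payload + trailing

# Hypothetical: a value I from a range the standard leaves unspecified,
# and an RBSP carrying texture data #11 followed by distance data #25.
I = 24
encoded_28 = make_nal_unit(I, b"#11-bytes" + b"#25-bytes")
```

A real bitstream would additionally insert emulation-prevention bytes (0x00 0x00 0x03) into the RBSP; that step is omitted here for brevity.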
 (Appendix 1)
 In the above embodiment, the image division processing unit 21 defines, from the input texture image #1', a plurality of segments each composed of a pixel group in which the difference between the maximum pixel value and the minimum pixel value is no greater than a predetermined threshold. However, the way segments are defined is not limited to this configuration. For example, the image division processing unit 21 may define, from the input texture image #1', a plurality of segments such that, for each segment, the difference between the average of the pixel values of the pixel group included in that segment and the average of the pixel values of the pixel group included in an adjacent segment is no smaller than a predetermined threshold.
 A specific algorithm for defining a plurality of segments in which the difference between the average values is no smaller than the predetermined threshold is described below with reference to FIGS. 21 and 22.
 FIG. 21 is a flowchart showing the operation by which the moving image encoding apparatus 1 defines a plurality of segments based on the above algorithm. FIG. 22 is a flowchart showing the segment merging subroutine within the flowchart of FIG. 21.
 In the initialization step of the figure, for a texture image that has undergone the smoothing described in (Appendix 2) below, the image division processing unit 21 defines one independent segment (provisional segment) for each pixel included in the texture image, and sets the pixel value of the corresponding pixel itself as the average of all pixel values (the average color) of each provisional segment (step S41).
 Next, the process proceeds to the segment merging step (step S42), in which provisional segments with similar colors are merged. This segment merging process, described in detail below with reference to FIG. 22, is repeated until no further merging occurs.
 The image division processing unit 21 performs the following processing (steps S51 to S55) for every provisional segment.
 First, the image division processing unit 21 determines whether the height and width of the provisional segment of interest are both no greater than a threshold (step S51). If both are determined to be no greater than the threshold (YES in S51), the process proceeds to step S52. If either is determined to exceed the threshold (NO in S51), the processing of step S51 is performed for the next provisional segment of interest. The next provisional segment of interest may be, for example, the provisional segment positioned after the current one in raster-scan order.
 The image division processing unit 21 selects, from among the provisional segments adjacent to the provisional segment of interest, the one whose average color is closest to that of the provisional segment of interest (step S52). As a measure of color closeness, for example, the Euclidean distance between vectors can be used, treating the three RGB components of a pixel value as a three-dimensional vector. As the pixel value of each segment, the average of all pixel values included in that segment is used.
 After step S52, the image division processing unit 21 determines whether the closeness between the provisional segment of interest and the provisional segment judged closest in color is no greater than a certain threshold (step S53). If it is determined to exceed the threshold (NO in step S53), the processing of step S51 is performed for the next provisional segment of interest. If it is determined to be no greater than the threshold (YES in step S53), the process proceeds to step S54.
 After step S53, the image division processing unit 21 merges the two provisional segments (the provisional segment of interest and the provisional segment judged closest in color) into a single provisional segment (step S54). The processing of step S54 thus reduces the number of provisional segments by one.
 After step S54, the average of the pixel values of all pixels included in the merged segment is calculated (step S55). If there are segments that have not yet undergone steps S51 to S55, the processing of step S51 is performed for the next provisional segment of interest.
 After steps S51 to S55 have been completed for all provisional segments, the process proceeds to step S43.
 The image division processing unit 21 compares the number of provisional segments before the processing of step S42 with the number after the processing of step S42 (step S43).
 If the number of provisional segments has decreased (YES in step S43), the process returns to step S42. If the number of provisional segments is unchanged (NO in step S43), the image division processing unit 21 defines each current provisional segment as one segment.
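 Steps S41 to S43, with the inner loop S51 to S55, can be sketched as follows on a tiny RGB grid. This is a simplified, non-authoritative rendering: the dictionary-based bookkeeping, the 4-neighbor adjacency, the merge order, and the threshold values are illustration choices, not the patent's actual implementation.

```python
import math
from collections import defaultdict

def segment_image(pixels, color_thresh, max_size):
    """pixels: dict {(x, y): (r, g, b)}.  Returns {(x, y): segment_id}.

    S41: every pixel starts as its own provisional segment whose average
    color is its own value.  S42: each small-enough segment is merged
    with its closest-colored neighbor when the color distance is within
    the threshold.  S43: stop when a full pass performs no merge.
    """
    seg = {p: i for i, p in enumerate(sorted(pixels))}   # pixel -> segment id

    def averages():
        sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
        for p, s in seg.items():
            acc = sums[s]
            for c in range(3):
                acc[c] += pixels[p][c]
            acc[3] += 1
        return {s: tuple(v / a[3] for v in a[:3]) for s, a in sums.items()}

    while True:
        avg = averages()
        merged = False
        for s in sorted(set(seg.values())):
            members = [p for p, t in seg.items() if t == s]
            if not members:                              # already absorbed
                continue
            xs = [p[0] for p in members]
            ys = [p[1] for p in members]
            if max(xs) - min(xs) + 1 > max_size or max(ys) - min(ys) + 1 > max_size:
                continue                                 # S51: too large, skip
            # S52: adjacent segment with the closest average color
            neigh = {seg[q] for p in members
                     for q in [(p[0] + 1, p[1]), (p[0] - 1, p[1]),
                               (p[0], p[1] + 1), (p[0], p[1] - 1)]
                     if q in seg and seg[q] != s}
            if not neigh:
                continue
            best = min(neigh, key=lambda t: math.dist(avg[s], avg[t]))
            if math.dist(avg[s], avg[best]) <= color_thresh:   # S53
                for p in members:                        # S54: merge the two
                    seg[p] = best
                avg = averages()                         # S55: refresh averages
                merged = True
        if not merged:                                   # S43: no merge -> done
            break
    return seg
```

On a 2x2 grid with a black left column and a white right column, the two black pixels merge and the two white pixels merge, leaving two segments.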
 With the above algorithm, for example, an input texture image of 1024×768 pixels can be divided into several thousand segments (for example, 3000 to 5000).
 As described above, the segments are used to divide the distance image. If a segment becomes too large, it will contain a variety of distance values, producing pixels whose error from the representative value is large, and the coding accuracy of the distance image will consequently drop. The processing of step S51 is therefore not essential to the present invention, but it is desirable to prevent segments from becoming too large by limiting segment size as in step S51.
 In the above embodiment, the image division processing unit 21 defines, from the input texture image #1', a plurality of segments each composed of a pixel group in which the difference between the maximum pixel value and the minimum pixel value is no greater than a predetermined threshold; however, an upper limit may also be placed on the number of pixels included in each segment. In addition, an upper limit may be placed on segment width or height, either together with or instead of the upper limit on the number of pixels.
 When such an upper limit is set, the number of segments defined by the image division processing unit 21 is larger than when no upper limit is set. That is, as the number of segments grows, the segments become correspondingly smaller. By setting an upper limit, the moving image decoding apparatus 2 can therefore decode a distance image that reproduces the original distance image #2 more faithfully.
 (Appendix 2)
 The image division processing unit 21 may apply smoothing to the input texture image #1'. For example, as described in the non-patent document 'C. Lawrence Zitnick, Sing Bing Kang, Matthew Uyttendaele, Simon Winder and Richard Szeliski, "High-quality video view interpolation using a layered representation," ACM Trans. on Graphics, 23(3), 600-608, (2004)', the image division processing unit 21 may repeatedly smooth the texture image #1' to the extent that edge information is not lost.
 The image division processing unit 21 may then divide the smoothed texture image into a plurality of segments, each composed of a pixel group in which the difference between the maximum pixel value and the minimum pixel value is no greater than a predetermined threshold.
 Without this smoothing, segments become small when the texture image #1' contains substantial noise; applying the smoothing suppresses this shrinkage of segment size. That is, by performing smoothing, the code amount of the encoded data #25 can be reduced compared with the case where no smoothing is applied.
 Alternatively, the image division processing unit 21 may be placed before the image encoding unit 11 rather than between the image decoding unit 12 and the distance image division processing unit 22. That is, the image division processing unit 21 may output the input texture image #1 as-is to the downstream image encoding unit 11, while also dividing the texture image #1 into a plurality of segments, each composed of a pixel group in which the difference between the maximum pixel value and the minimum pixel value is no greater than a predetermined threshold, and outputting the segment information #21 to the downstream distance image division processing unit 22.
 (Appendix 3)
 The positions of the distance value correction unit 23 and the number assigning unit 24 may also be interchanged. That is, the order of the processing of steps S7 and S8 shown in FIG. 2 may be swapped.
 In this case, the number assigning unit 24 receives from the distance image division processing unit 22 the segment information #22, in which a distance value set and position information are associated with each segment. The number assigning unit 24 then scans the pixels of the distance image in raster-scan order, assigns segment numbers #24, in scan order, to the segments (the regions delimited by the position information of the segment information #22), and associates each number with the distance value set of the corresponding segment contained in the segment information #22.
 The distance value correction unit 23 receives from the number assigning unit 24 the information in which segment numbers #24 and distance value sets are associated. The distance value correction unit 23 then calculates the mode of each segment's distance value set as the representative value #23a, associates the representative value #23a of each segment with its segment number #24, and outputs them to the predictive encoding unit 25.
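 The mode computation mentioned here can be illustrated in a couple of lines. This is a generic sketch, not the distance value correction unit 23 itself; the function name and sample values are illustrative.

```python
from collections import Counter

def representative_value(distance_values):
    """Representative value #23a of a segment: the mode (the most
    frequent distance value) of the segment's distance value set."""
    return Counter(distance_values).most_common(1)[0][0]

# Most pixels of this hypothetical segment have distance value 12,
# so 12 becomes the segment's representative value.
print(representative_value([12, 12, 13, 12, 200]))
```

Using the mode rather than the mean keeps an outlier pixel (such as the 200 above) from shifting the representative value of the whole segment.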
 (Appendix 4)
 In the above, the number assigning unit 24 receives the segment information #23, containing the segment position information and each segment's representative value #23a, and outputs each segment's representative value #23a and segment number #24 to the predictive encoding unit 25; however, the configuration is not limited to this. For example, the number assigning unit 24 may output the segment position information to the predictive encoding unit 25 in addition to each segment's representative value #23a and segment number #24. In that case, the predictive encoding unit 25 adds the segment position information to the encoded data #25 obtained by encoding the difference values, and outputs the result to the packaging unit 28. The packaging unit 28 may then add the segment position information to the encoded data #28 in place of the texture-image encoded data #11 output by the image encoding unit 11. That is, in this case, the packaging unit 28 transmits to the moving image decoding apparatus encoded data #28 containing the encoded data #25 of the difference values and the segment position information.
 In this case, as will be detailed later, when decoding the encoded data #25, the moving image decoding apparatus decodes it based on the segment position information. Since it suffices for the moving image decoding apparatus to be able to divide segments with the same division pattern as the moving image encoding apparatus 1, it can restore the distance image from the segment position information indicating where the segments lie. In other words, even without the texture-image encoded data #11, the distance image divided into segments based on that texture image can be restored. It is therefore sufficient for the packaging unit 28 to transmit, to the moving image decoding apparatus, segment defining information (region information) that defines the segments together with the encoded data #25. Here, the segment defining information is either the texture-image encoded data #11 or the segment position information.
 (Appendix 5)
 When the predictive encoding unit 25 specifies the prediction reference pixels based on the representative pixel, in the example shown in FIG. 12(c) the pixel C directly above the representative pixel is used as a prediction reference pixel in addition to pixels B and D; however, the configuration is not limited to this. At least one of the pixels directly above the representative pixel and above the pixels A on the same scan line as the representative pixel may be used as a prediction reference pixel. For example, in the example of FIG. 12(c), the pixel directly above the central hatched pixel A (the pixel immediately to the right of the representative pixel) may be used as a prediction reference pixel.
 The predictive encoding unit 25 calculates the prediction value of the encoding target segment based on the representative value of the segment containing the prediction reference pixel, but this is not a limitation. For example, when the pixels included in each segment all have the same pixel value within that segment (when the value can be regarded as constant), the prediction value of the encoding target segment may be calculated based on the pixel value of the prediction reference pixel instead of the representative value of the segment containing it.
 The predictive encoding unit 25 may also encode information indicating the method used to calculate the prediction value and add it to the encoded data #25. In this case, the packaging unit 28 transmits to the moving image decoding apparatus encoded data #28 containing the information indicating the prediction-value calculation method. For example, suppose the predictive encoding unit 25 calculates the prediction value by selecting from the following four calculation methods: (1) 'set the prediction value Z'_A to Z_B'; (2) 'set the prediction value Z'_A to Z_C'; (3) 'set the prediction value Z'_A to Z_D'; (4) 'set the prediction value Z'_A to the average of Z_B, Z_C, and Z_D'. It may then represent these four methods with 2 bits of information and, for each encoding target segment, generate encoded data #25 by associating the selected calculation method with that segment's difference value. Alternatively, for example, the predictive encoding unit 25 may add (5) 'set the prediction value Z'_A to the median of Z_B, Z_C, and Z_D' to the above four methods and use information representing these five methods.
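 The selectable prediction methods (1) to (4) can be sketched as follows. The selection rule shown here (pick the method minimizing the absolute difference) is an assumption for illustration; the patent only states that a 2-bit method index accompanies each segment's difference value.

```python
def choose_predictor(z_a, z_b, z_c, z_d):
    """Try prediction methods (1)-(4) for the representative value Z_A
    of the encoding target segment, given the representative values
    Z_B, Z_C, Z_D of the segments holding the prediction reference
    pixels.  Returns (method_index, difference), where method_index is
    the 2-bit code transmitted alongside the segment's difference value.
    """
    candidates = [
        z_b,                        # (1) Z'_A = Z_B
        z_c,                        # (2) Z'_A = Z_C
        z_d,                        # (3) Z'_A = Z_D
        (z_b + z_c + z_d) / 3.0,    # (4) Z'_A = average of the three
    ]
    method = min(range(4), key=lambda m: abs(z_a - candidates[m]))
    return method, z_a - candidates[method]

# Z_C = 121 is closest to Z_A = 120, so method (2) (index 1) is chosen
# and only the small difference -1 needs to be coded.
method, diff = choose_predictor(z_a=120, z_b=118, z_c=121, z_d=90)
```

Adding method (5), the median, would simply extend the candidate list, at the cost of a third signaling bit per segment.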
 When the representative pixel is, for example, the pixel at the upper-left corner of the distance image, no prediction reference pixel (and hence no segment containing one) exists. In this case, the predictive encoding unit 25 sets the representative value of the segment containing the prediction reference pixel and the pixel value of the prediction reference pixel to 0. That is, when specifying prediction reference pixels based on the representative pixel, if no prediction reference pixel exists, the predictive encoding unit 25 sets both the representative value of the containing segment and the pixel value of the prediction reference pixel to 0.
 FIG. 12 shows cases in which the number of pixels consisting of the representative pixel and the pixels on the same scan line is 1 to 3; of course, the number of pixels is not limited to this, and cases with four or more pixels also exist. Those cases can be processed in the same manner as described for the three examples.
 The predictive encoding unit 25 encodes the difference values by exponential Golomb coding, but the coding method is not limited to this. Exponential Golomb coding makes codewords for values near 0 very short at the cost of very long codewords for values far from 0. Therefore, when the prediction accuracy is not very good, using ordinary Golomb coding instead of exponential Golomb coding gives better results and compresses the information relatively more. In other words, it is desirable to select the coding method based on the prediction accuracy (the distribution of differences between representative values and prediction values).
 Also, no prediction reference pixel exists for the first segment in segment number #24 order, so the prediction value ends up being the representative value of that segment multiplied by −1, which is far from 0. Therefore, for the first segment, a codeword obtained by encoding the segment's representative value itself with a fixed-length code (for example, 8 bits), rather than the difference from the prediction value, may be used. In this case, the amount of information can be compressed further.
 (Appendix 6)
 In the above embodiment, the moving image encoding apparatus 1 encodes the texture image #1 using the AVC coding defined in the H.264/MPEG-4 AVC standard, but the present invention is not limited to this. That is, the image encoding unit 11 of the moving image encoding apparatus 1 may encode the texture image #1 using another coding scheme such as MPEG-2 or MPEG-4, or using the coding scheme being standardized as H.265/HVC.
 (Advantages of the moving image encoding apparatus 1)
 As described above, in the moving image encoding apparatus 1, the image division processing unit 21 defines a plurality of segments dividing the entire region of the texture image, such that the difference between the maximum and minimum pixel values of the pixel group included in each region is no greater than a predetermined threshold. The distance image division processing unit 22 defines a plurality of segments dividing the entire region of the distance image #2 with the same division pattern as the segments defined by the image division processing unit 21. Further, for each segment defined by the distance image division processing unit 22, the distance value correction unit 23 calculates a representative value #23a from the distance values of the pixels included in that segment.
 The distance image encoding unit 20 generates encoded data #25 containing the plural representative values #23a calculated by the distance value correction unit 23.
 With this configuration, the moving image encoding apparatus 1 transmits, as the encoded data #25 of the distance image #2 sent to the moving image decoding apparatus, at most as many representative values #23a as there are segments.
 In contrast, when the distance image is encoded using AVC coding, the code amount of the encoded distance image is clearly larger than that of the encoded data #25.
 For example, when the image division processing unit 21 (distance image division processing unit 22) defines segments by the method described in (Appendix 1) above, a 1024×768-pixel texture image yields approximately 3000 to 5000 segments. In contrast, when the distance image is encoded using AVC coding, DCT transformation and quantization are performed per block (4×4 = 16 pixels), and the total number of blocks is 49152. Moreover, since AVC coding encodes the pixel values of all pixels in a block, the code amount per block of the distance image under AVC coding is also larger than the code amount per segment under the coding scheme of the present embodiment.
 Therefore, compared with a conventional moving image encoding apparatus that AVC-encodes the distance image #2 and transmits it to the moving image decoding apparatus, the moving image encoding apparatus 1 can reduce the code amount of the encoded data of the distance image #2.
 また、動画像符号化装置1では、距離画像分割処理部22が、距離画像#2をセグメントに分割し、距離値修正部23がセグメントに含まれる画素の距離値を近似して代表値を決定し、番号付与部24がセグメントにラスタスキャン順で番号を付与する。そして、予測符号化部25が、セグメント毎に、セグメントと近接し、該セグメントに含まれる画素よりラスタスキャン順が前の画素に基づいて該セグメントの代表値の予測値を算出し、セグメントの代表値から予測値を減算して差分値を算出し、差分値を番号順に並べて符号化して符号化データ#25を生成する。 In the moving image encoding apparatus 1, the distance image division processing unit 22 divides the distance image #2 into segments, the distance value correction unit 23 approximates the distance values of the pixels included in each segment to determine a representative value, and the number assigning unit 24 assigns numbers to the segments in raster-scan order. Then, for each segment, the predictive encoding unit 25 calculates a prediction value of the segment's representative value based on pixels that are adjacent to the segment and precede its pixels in raster-scan order, subtracts the prediction value from the segment's representative value to calculate a difference value, and arranges the difference values in number order and encodes them to generate the encoded data #25.
 動画像符号化装置1は、上記の構成によって、動画像復号装置に伝送する距離画像#2を任意の形状のセグメントに分割する場合であっても、セグメント間の空間的冗長性を圧縮して生成することができる。したがって、動画像符号化装置1は、動画像復号装置に伝送する距離画像#2の符号化データの符号量をさらに削減することができるという効果を奏する。 With the above configuration, the moving image encoding apparatus 1 can generate the encoded data with the spatial redundancy between segments compressed, even when the distance image #2 to be transmitted to the moving image decoding apparatus is divided into segments of arbitrary shape. Therefore, the moving image encoding apparatus 1 has the effect of further reducing the code amount of the encoded data of the distance image #2 transmitted to the moving image decoding apparatus.
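A minimal sketch of the encoder-side flow summarized above, under assumed helper names (`predict` is a stand-in for the actual prediction rule, which in the patent uses pixels preceding each segment in raster-scan order):

```python
# Sketch: turn per-segment representative values into prediction differences,
# in the raster-scan segment order assigned by the numbering unit.
def encode_differences(representatives, predict):
    """representatives: per-segment representative values, in numbering order.
    predict(i, decoded): prediction for segment i from already-coded values."""
    decoded = []                      # representative values reconstructed so far
    diffs = []
    for i, rep in enumerate(representatives):
        pred = predict(i, decoded)
        diffs.append(rep - pred)      # difference value to be entropy-coded
        decoded.append(pred + diffs[-1])
    return diffs

# Example: predict each segment from the previous one (0 for the first).
diffs = encode_differences([10, 12, 11], lambda i, d: d[-1] if d else 0)
print(diffs)  # [10, 2, -1]
```

Because neighboring segments tend to have similar distance values, the differences cluster near zero, which is what makes the subsequent entropy coding effective.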
 (動画像復号装置2)
 次に、本発明の一実施形態に係る動画像復号装置について、図16~図18に基づいて以下に説明する。本実施形態に係る動画像復号装置は、復号すべき動画像を構成する各フレームについて、前述した動画像符号化装置1より伝送された符号化データ#28からテクスチャ画像#1’および距離画像#2’を復号する動画像復号装置である。
(Moving picture decoding apparatus 2)
Next, a moving picture decoding apparatus according to an embodiment of the present invention will be described below with reference to FIGS. 16 to 18. The moving picture decoding apparatus according to the present embodiment decodes, for each frame constituting the moving picture to be decoded, the texture image #1' and the distance image #2' from the encoded data #28 transmitted from the moving picture encoding apparatus 1 described above.
 最初に本実施形態に係る動画像復号装置の構成について図16を参照しながら説明する。図16は、動画像復号装置の要部構成を示すブロック図である。 First, the configuration of the video decoding device according to the present embodiment will be described with reference to FIG. FIG. 16 is a block diagram illustrating a main configuration of the video decoding device.
 図16に示すように、動画像復号装置2は、画像復号部12、画像分割処理部(分割手段)21’、番号付与部(番号付与手段、割当手段)24’、アンパッケージング部(受信手段)31、および、予測復号部(予測値算出手段、画素値設定手段)32を備えている。 As shown in FIG. 16, the moving image decoding apparatus 2 includes an image decoding unit 12, an image division processing unit (dividing means) 21', a number assigning unit (number assigning means, assigning means) 24', an unpackaging unit (receiving means) 31, and a predictive decoding unit (prediction value calculating means, pixel value setting means) 32.
 アンパッケージング部31は、受信した符号化データ#28から、テクスチャ画像#1の符号化データ#11と距離画像#2の符号化データ#25とを抽出する。 The unpackaging unit 31 extracts the encoded data # 11 of the texture image # 1 and the encoded data # 25 of the distance image # 2 from the received encoded data # 28.
 画像復号部12は、符号化データ#11からテクスチャ画像#1’を復号する。画像復号部12は、動画像符号化装置1が備える画像復号部12と同一である。すなわち、画像復号部12は、動画像符号化装置1から動画像復号装置2への符号化データ#28の伝送中に符号化データ#28中にノイズが混入しない限り、動画像符号化装置1の画像復号部12が復号したテクスチャ画像と同一内容のテクスチャ画像#1’を復号するようになっている。 The image decoding unit 12 decodes the texture image #1' from the encoded data #11. This image decoding unit 12 is identical to the image decoding unit 12 provided in the moving image encoding apparatus 1. That is, as long as no noise is mixed into the encoded data #28 during its transmission from the moving image encoding apparatus 1 to the moving image decoding apparatus 2, the image decoding unit 12 decodes a texture image #1' having the same content as the texture image decoded by the image decoding unit 12 of the moving image encoding apparatus 1.
 画像分割処理部21’は、動画像符号化装置1の画像分割処理部21と同じアルゴリズムにより、テクスチャ画像#1’の全体領域を複数のセグメント(領域)に分割する。そして、画像分割処理部21’は、各セグメントの位置情報からなるセグメント情報#21’を生成し、番号付与部24’に出力する。 The image division processing unit 21 ′ divides the entire area of the texture image # 1 ′ into a plurality of segments (areas) using the same algorithm as the image division processing unit 21 of the video encoding device 1. Then, the image division processing unit 21 ′ generates segment information # 21 ′ including the position information of each segment, and outputs it to the number assigning unit 24 ′.
 番号付与部24’は、動画像符号化装置1の番号付与部24と同じアルゴリズムにより、セグメント情報#21’に基づいて分割される各セグメントに対してラスタスキャン順に番号を付与する。番号付与部24’は、セグメントの位置情報に付与した番号を対応付けたセグメント識別用画像#24’を生成し、予測復号部32に出力する。 The number assigning unit 24 'assigns a number to each segment divided based on the segment information # 21' in the raster scan order by the same algorithm as the number assigning unit 24 of the video encoding device 1. The number assigning unit 24 ′ generates a segment identification image # 24 ′ in which the number assigned to the segment position information is associated, and outputs the generated image to the predictive decoding unit 32.
 ここで、セグメント識別用画像#24’とは、各セグメントの位置を示すセグメントの位置情報に番号が対応付けられた情報である。後述するように、予測復号部32は、セグメントの位置情報に基づいて、画像全体における各セグメントの配置、各セグメントが有する画素数を特定し、そして、画像全体の画素数も特定することができる。よって、予測復号部32は、セグメントの位置情報に基づいて、セグメントに分割された画像であって、画像を構成する画素の画素値を示す情報を持たない画像を復元することができる。 Here, the segment identification image #24' is information in which a number is associated with segment position information indicating the position of each segment. As will be described later, based on the segment position information, the predictive decoding unit 32 can identify the arrangement of each segment within the entire image and the number of pixels each segment contains, and can therefore also identify the number of pixels of the entire image. The predictive decoding unit 32 can thus restore, from the segment position information, an image that is divided into segments but carries no information indicating the pixel values of its constituent pixels.
 なお、セグメント識別用画像#24’は、テクスチャ画像#1’をセグメントに分割し、ラスタスキャン順でi番目に位置するセグメントにセグメント番号「i―1」を付与し、テクスチャ画像#1’中の上記i番目に位置するセグメントに含まれる各画素の画素値を「i―1」に置き換えたものであってもよい。この場合、画像分割処理部21’がテクスチャ画像#1’をセグメントに分割し、番号付与部24’が、ラスタスキャン順でi番目のセグメントにセグメント番号「i―1」を付与し、上記i番目のセグメントに含まれる各画素の画素値を「i―1」に置き換えればよい。 Note that the segment identification image #24' may be obtained by dividing the texture image #1' into segments, assigning the segment number "i-1" to the segment located i-th in raster-scan order, and replacing the pixel value of each pixel included in that i-th segment of the texture image #1' with "i-1". In this case, the image division processing unit 21' divides the texture image #1' into segments, and the number assigning unit 24' assigns the segment number "i-1" to the i-th segment in raster-scan order and replaces the pixel value of each pixel included in that i-th segment with "i-1".
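For illustration, the raster-scan numbering of segments and the resulting segment identification image could be sketched as follows (assumed representation: a 2-D list of arbitrary segment labels; this is a sketch, not the patented implementation):

```python
# Re-number arbitrary segment labels so that the segment whose first pixel
# appears earliest in raster-scan order gets number 0, the next gets 1, ...
# The returned 2-D list is a segment identification image in the sense above:
# every pixel of the i-th segment (1-based) holds the value i-1.
def number_segments_raster(labels):
    mapping = {}                          # original label -> raster-scan number
    out = []
    for row in labels:                    # rows top to bottom = raster scan
        out_row = []
        for lab in row:                   # pixels left to right
            if lab not in mapping:
                mapping[lab] = len(mapping)   # next unused number
            out_row.append(mapping[lab])
        out.append(out_row)
    return out

ids = number_segments_raster([['b', 'b', 'a'],
                              ['b', 'a', 'a']])
print(ids)  # [[0, 0, 1], [0, 1, 1]]
```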
 予測復号部32は、入力された符号化データ#25およびセグメント識別用画像#24’に基づいて予測復号処理を施して、距離画像#2’を復元する。具体的には、予測復号部32は、符号化データ#25を復号して、順に並べられた差分値を生成し、生成した差分値を、番号付与部24’が付与した順番で、セグメント識別用画像#24’のセグメント情報#21’で規定される各セグメントに割り当てる。次に、予測復号部32は、番号付与部24’が付与した順番で、セグメント毎に、セグメントの予測値を算出し、算出した予測値に、割り当てられた差分値を加算して、算出した値を各セグメントの距離値として設定する。そして、予測復号部32は、設定したセグメントの距離値を、該セグメントに含まれる全画素の画素値(距離値)として設定し、距離画像#2’を復元する。予測復号部32は、復元した距離画像#2’を動画像復号装置2の外部の立体映像表示装置(図示せず)に出力する。 The predictive decoding unit 32 performs predictive decoding processing based on the input encoded data #25 and the segment identification image #24' to restore the distance image #2'. Specifically, the predictive decoding unit 32 decodes the encoded data #25 to generate the ordered difference values, and assigns the generated difference values, in the order given by the number assigning unit 24', to the segments defined by the segment information #21' of the segment identification image #24'. Next, for each segment in the order given by the number assigning unit 24', the predictive decoding unit 32 calculates the segment's prediction value, adds the assigned difference value to the calculated prediction value, and sets the resulting value as the segment's distance value. The predictive decoding unit 32 then sets each segment's distance value as the pixel value (distance value) of all the pixels included in that segment, thereby restoring the distance image #2'. The predictive decoding unit 32 outputs the restored distance image #2' to a stereoscopic video display device (not shown) outside the moving image decoding apparatus 2.
 (動画像復号装置2の動作)
 次に、動画像復号装置2の動作について、図17を参照しながら以下に説明する。図17は、動画像復号装置2の動作を示すフローチャートである。ここで説明する動画像復号装置2の動作とは、多数のフレームからなる3次元動画像における先頭からtフレーム目のテクスチャ画像および距離画像を復号する動作である。すなわち、動画像復号装置2は、上記動画像全体を復号するために、上記動画像のフレーム数に応じた回数だけ以下に説明する動作を繰り返すことになる。また、以下の説明においては、特に断りがない限り、各データ#1~#28はtフレーム目のデータであると解釈するものとする。
(Operation of the video decoding device 2)
Next, the operation of the video decoding device 2 will be described below with reference to FIG. FIG. 17 is a flowchart showing the operation of the video decoding device 2. The operation of the moving image decoding apparatus 2 described here is an operation of decoding a texture image and a distance image of the t-th frame from the top in a three-dimensional moving image including a large number of frames. That is, the moving image decoding apparatus 2 repeats the operation described below as many times as the number of frames of the moving image in order to decode the entire moving image. Further, in the following description, unless otherwise specified, each data # 1 to # 28 is interpreted as data of the t-th frame.
 最初に、アンパッケージング部31は、動画像符号化装置1より受信した符号化データ#28から、テクスチャ画像の符号化データ#11および距離画像の符号化データ#25を抽出する。そして、アンパッケージング部31は、符号化データ#11を画像復号部12に出力し、符号化データ#25を予測復号部32に出力する(ステップS21)。 First, the unpackaging unit 31 extracts the encoded data # 11 of the texture image and the encoded data # 25 of the distance image from the encoded data # 28 received from the moving image encoding device 1. Then, the unpackaging unit 31 outputs the encoded data # 11 to the image decoding unit 12, and outputs the encoded data # 25 to the predictive decoding unit 32 (Step S21).
 画像復号部12は、入力された符号化データ#11からテクスチャ画像#1’を復号し、画像分割処理部21’と動画像復号装置2の外部の立体映像表示装置(図示せず)とに出力する(ステップS22)。 The image decoding unit 12 decodes the texture image # 1 ′ from the input encoded data # 11, and sends it to the image division processing unit 21 ′ and a stereoscopic video display device (not shown) outside the moving image decoding device 2. Output (step S22).
 画像分割処理部21’は、動画像符号化装置1の画像分割処理部21と同じアルゴリズムで複数のセグメントを規定する。そして、画像分割処理部21’は、各セグメントの位置情報からなるセグメント情報#21’を生成し、番号付与部24’に出力する(ステップS23)。 The image division processing unit 21 ′ defines a plurality of segments with the same algorithm as the image division processing unit 21 of the moving image encoding device 1. Then, the image division processing unit 21 'generates segment information # 21' composed of the position information of each segment, and outputs it to the number assigning unit 24 '(step S23).
 番号付与部24’は、動画像符号化装置1の番号付与部24と同じアルゴリズムにより、セグメント情報#21’に基づいて分割される各セグメントに対してラスタスキャン順に番号を付与する。番号付与部24’は、セグメントの位置情報に付与した番号を対応付けたセグメント識別用画像#24’を生成し、予測復号部32に出力する(ステップS24)。 The number assigning unit 24 'assigns a number to each segment divided based on the segment information # 21' in the raster scan order by the same algorithm as the number assigning unit 24 of the video encoding device 1. The number assigning unit 24 'generates a segment identification image # 24' in which the number assigned to the segment position information is associated, and outputs the segment identifying image # 24 'to the predictive decoding unit 32 (step S24).
 ステップS24の後、予測復号部32は、入力された符号化データ#25およびセグメント識別用画像#24’に基づいて予測復号処理を施して、距離画像#2’を復元する(ステップS25)。具体的には、予測復号部32は、符号化データ#25を復号して、順に並べられた差分値を生成し、生成した差分値を、番号付与部24’が付与した順番で、セグメント識別用画像#24’のセグメント情報#21’で規定される各セグメントに割り当てる。次に、予測復号部32は、番号付与部24’が付与した順番で、セグメント毎に、セグメントの予測値を算出し、算出した予測値に、割り当てられた差分値を加算して、算出した値を各セグメントの距離値として設定する。そして、予測復号部32は、設定したセグメントの距離値を、該セグメントに含まれる全画素の画素値(距離値)として設定し、距離画像#2’を復元する。 After step S24, the predictive decoding unit 32 performs predictive decoding processing based on the input encoded data #25 and the segment identification image #24' to restore the distance image #2' (step S25). Specifically, the predictive decoding unit 32 decodes the encoded data #25 to generate the ordered difference values, and assigns the generated difference values, in the order given by the number assigning unit 24', to the segments defined by the segment information #21' of the segment identification image #24'. Next, for each segment in the order given by the number assigning unit 24', the predictive decoding unit 32 calculates the segment's prediction value, adds the assigned difference value to the calculated prediction value, and sets the resulting value as the segment's distance value. The predictive decoding unit 32 then sets each segment's distance value as the pixel value (distance value) of all the pixels included in that segment, thereby restoring the distance image #2'.
 予測復号部32は、復元した距離画像#2’を動画像復号装置2の外部の立体映像表示装置(図示せず)に出力する。以上のようにして、テクスチャ画像#1’と距離画像#2’を復元することができる。 The predictive decoding unit 32 outputs the restored distance image # 2 'to a stereoscopic video display device (not shown) outside the video decoding device 2. As described above, the texture image # 1 'and the distance image # 2' can be restored.
 (予測復号処理)
 このステップS25において、予測復号部32が実行する予測復号処理の詳細を図18に基づいて説明する。図18は、予測復号部32が実行する予測復号処理の一例を示すフローチャートである。
(Predictive decoding process)
Details of the predictive decoding process executed by the predictive decoding unit 32 in step S25 will be described with reference to FIG. FIG. 18 is a flowchart illustrating an example of the predictive decoding process executed by the predictive decoding unit 32.
 まず、予測復号部32は、アンパッケージング部31から入力された符号化データ#25を、動画像符号化装置1の予測符号化部25が符号化データ#25を生成する際に使用した符号化方法を用いて復号し、順に並べられた差分値を生成する(ステップS201)。つまり、本実施形態では、予測復号部32は、図13に示す指数ゴロム符号化方法を用いて、図14に示す符号化データ#25を復号する。 First, the predictive decoding unit 32 decodes the encoded data #25 input from the unpackaging unit 31 using the encoding method that the predictive encoding unit 25 of the moving image encoding apparatus 1 used when generating the encoded data #25, and generates the ordered difference values (step S201). That is, in this embodiment, the predictive decoding unit 32 decodes the encoded data #25 shown in FIG. 14 using the exponential Golomb coding method shown in FIG. 13.
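As a hedged illustration of step S201, the following sketch decodes a bit string of exponential Golomb code words into signed difference values. The unsigned-to-signed mapping shown is the common zigzag convention (0, 1, -1, 2, -2, ...) and is an assumption here; the exact mapping of FIG. 13 is not reproduced in this text.

```python
# Decode a concatenation of exponential Golomb code words.
# Each code word is: N leading zeros, then N+1 bits starting with '1';
# those N+1 bits, read as binary, minus 1 give the unsigned code number.
def decode_exp_golomb(bits):
    """bits: string of '0'/'1'. Returns the list of signed values decoded."""
    values, pos = [], 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == '0':          # count the leading zeros
            zeros += 1
            pos += 1
        code = int(bits[pos:pos + zeros + 1], 2) - 1   # unsigned code number
        pos += zeros + 1
        # zigzag mapping back to a signed difference value (assumed convention)
        signed = (code + 1) // 2 if code % 2 else -(code // 2)
        values.append(signed)
    return values

# '1' -> 0, '010' -> +1, '011' -> -1
print(decode_exp_golomb('1010011'))  # [0, 1, -1]
```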
 予測復号部32は、順に並べられた差分値を先頭から順番に、番号付与部24’が付与した順番に応じて、セグメント識別用画像#24’のセグメント情報#21’によって規定される各セグメントにそれぞれ割り当てる(ステップS202)。 The predictive decoding unit 32 assigns the ordered difference values, from the beginning in order, to the segments defined by the segment information #21' of the segment identification image #24', according to the order given by the number assigning unit 24' (step S202).
 次に、番号付与部24’が付与した番号である「i」を「0」とする(ステップS203)。そして、番号付与部24’が付与した番号が「i」のセグメントを復号対象セグメント(復号対象領域)とする(ステップS204)。つまり、番号付与部24’が付与した番号が先頭であるセグメントを復号対象セグメントとする。 Next, “i”, which is the number assigned by the number assigning unit 24 ′, is set to “0” (step S 203). Then, the segment with the number “i” assigned by the number assigning unit 24 ′ is set as a decoding target segment (decoding target region) (step S 204). That is, the segment with the head number assigned by the number assigning unit 24 'is set as a decoding target segment.
 次に、復号対象セグメントに含まれる画素の中から、予測値を算出するために用いる、該復号対象セグメントの代表画素を特定する(ステップS205)。具体的には、復号対象セグメントに含まれる画素であって、ステップS24においてラスタスキャン順で最初に走査される画素を代表画素とする。 Next, from among the pixels included in the decoding target segment, the representative pixel of that segment, used for calculating the prediction value, is identified (step S205). Specifically, the pixel that is included in the decoding target segment and is scanned first in raster-scan order in step S24 is taken as the representative pixel.
 予測復号部32は、代表画素を特定した後、動画像符号化装置1の予測符号化部25が用いた予測参照画素を特定する方法と同じ方法を用いて、特定した代表画素に基づいて予測参照画素を特定する(ステップS206)。具体的には、復号対象セグメントに含まれており上記代表画素と同じスキャンラインの画素と近接する画素であって、上記代表画素よりラスタスキャン順が前の画素を予測参照画素とする。例えば、予測参照画素を、上記代表画素のラスタスキャン順の1つ前の画素と、復号対象セグメントに含まれており上記代表画素と同じスキャンラインの画素と隣接する画素であって、上記代表画素のラスタスキャン順で1つ前のスキャンラインの画素と、復号対象セグメントに含まれており上記代表画素と同じスキャンラインの最後尾の画素のラスタスキャン順で1つ後の画素と隣接する画素であって、上記代表画素のラスタスキャン順で1つ前のスキャンラインの画素と、を含む画素群としてもよい。また、例えば、予測参照画素を、上記代表画素のラスタスキャン順の1つ前の画素と、復号対象セグメントに含まれており上記代表画素と同じスキャンラインの画素と隣接する画素であって、上記代表画素のラスタスキャン順で1つ前のスキャンラインの画素の何れか1つの画素と、復号対象セグメントに含まれており上記代表画素と同じスキャンラインの最後尾の画素のラスタスキャン順で1つ後の画素と隣接する画素であって、上記代表画素のラスタスキャン順で1つ前のスキャンラインの画素と、を含む3つの画素としてもよい。 After identifying the representative pixel, the predictive decoding unit 32 identifies the prediction reference pixels based on the identified representative pixel, using the same method of identifying prediction reference pixels that the predictive encoding unit 25 of the moving image encoding apparatus 1 used (step S206). Specifically, pixels that are adjacent to the decoding-target-segment pixels on the same scan line as the representative pixel and that precede the representative pixel in raster-scan order are taken as the prediction reference pixels. For example, the prediction reference pixels may be a pixel group including: the pixel immediately preceding the representative pixel in raster-scan order; the pixels on the scan line one before the representative pixel's, adjacent to the decoding-target-segment pixels that lie on the same scan line as the representative pixel; and the pixel on the scan line one before the representative pixel's that is adjacent to the pixel immediately following, in raster-scan order, the last decoding-target-segment pixel on the same scan line as the representative pixel.
Alternatively, for example, the prediction reference pixels may be three pixels: the pixel immediately preceding the representative pixel in raster-scan order; any one of the pixels on the scan line one before the representative pixel's that are adjacent to the decoding-target-segment pixels on the same scan line as the representative pixel; and the pixel on the scan line one before the representative pixel's that is adjacent to the pixel immediately following, in raster-scan order, the last decoding-target-segment pixel on the same scan line as the representative pixel.
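The first of the two pixel-group choices above can be sketched as follows (assumed representation: `ids` is a segment identification image as described earlier; image-boundary handling is an assumption of this sketch):

```python
# Find the prediction reference pixels of a decoding target segment:
# the pixel just before its representative pixel, the previous-scan-line
# pixels above the segment's run on the representative pixel's line, and
# the previous-scan-line pixel just past the end of that run.
def prediction_reference_pixels(ids, target):
    h, w = len(ids), len(ids[0])
    # representative pixel = first pixel of the segment in raster-scan order
    ry, rx = next((y, x) for y in range(h) for x in range(w)
                  if ids[y][x] == target)
    refs = []
    if rx > 0:
        refs.append((ry, rx - 1))        # pixel just before, same scan line
    if ry > 0:
        x = rx
        while x < w and ids[ry][x] == target:
            refs.append((ry - 1, x))     # previous scan line, above the run
            x += 1
        if x < w:
            refs.append((ry - 1, x))     # above the pixel just after the run
    return refs

print(prediction_reference_pixels([[0, 0, 0, 0],
                                   [0, 1, 1, 0]], 1))
# [(1, 0), (0, 1), (0, 2), (0, 3)]
```

All of these positions precede the representative pixel in raster-scan order, so their values are already decoded when the target segment is processed.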
 ステップS206において予測参照画素を特定した後、予測復号部32は、動画像符号化装置1の予測符号化部25が用いた予測値を算出する方法と同じ方法を用いて、特定した予測参照画素の画素値に基づいて、復号対象セグメントの代表値の予測値を算出する(ステップS207)。例えば、予測値を、予測参照画素の画素値の中央値としてもよい。また、予測値を、予測参照画素の画素値の平均値としてもよい。また、予測値を、予測参照画素の画素値の何れかの値としてもよい。 After identifying the prediction reference pixels in step S206, the predictive decoding unit 32 calculates the prediction value of the representative value of the decoding target segment based on the pixel values of the identified prediction reference pixels, using the same prediction-value calculation method that the predictive encoding unit 25 of the moving image encoding apparatus 1 used (step S207). For example, the prediction value may be the median of the pixel values of the prediction reference pixels. The prediction value may also be the average of those pixel values, or any one of them.
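A minimal sketch of the prediction-value options listed above (median, average, or any one value of the reference pixels); the tie-breaking for an even number of values is an assumption of this sketch:

```python
# Compute a prediction value from the reference pixels' values.
def predict_value(ref_values, mode='median'):
    vals = sorted(ref_values)
    if mode == 'median':
        return vals[len(vals) // 2]       # median (upper of two for even n)
    if mode == 'mean':
        return sum(vals) // len(vals)     # integer average
    return ref_values[0]                  # "any one" of the reference pixels

print(predict_value([30, 10, 20]))            # 20
print(predict_value([30, 10, 20, 40], 'mean'))  # 25
```

Whichever rule is chosen, the encoder and decoder must use the same one so that the decoder's prediction matches the one subtracted at the encoder.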
 予測復号部32は、算出した予測値に、該復号対象セグメントに割り当てられた差分値を加算して、その値を該復号対象セグメントの代表値として設定する(ステップS208)。そして、予測復号部32は、復号対象セグメントに含まれる全画素の画素値を、設定した復号対象セグメントの代表値に設定する(ステップS209)。 The prediction decoding unit 32 adds the difference value assigned to the decoding target segment to the calculated prediction value, and sets the value as a representative value of the decoding target segment (step S208). Then, the predictive decoding unit 32 sets the pixel values of all the pixels included in the decoding target segment to the representative values of the set decoding target segment (step S209).
 ステップS209の後、番号付与部24’が付与した番号である「i」が「M-1」になっているかどうか確認する(ステップS210)。「i=M-1」でなければ(ステップS210でNO)、番号付与部24’が付与した番号である「i」を「i+1」とし(ステップS211)、ステップS204~S209の処理を実行する。一方、「i=M-1」であれば(ステップS210でYES)、ステップS212に進む。 After step S209, it is checked whether "i", the number assigned by the number assigning unit 24', has reached "M-1" (step S210). If "i = M-1" does not hold (NO in step S210), "i" is set to "i+1" (step S211), and the processes of steps S204 to S209 are executed again. If "i = M-1" holds (YES in step S210), the process proceeds to step S212.
 すなわち、ステップS209の後、全て(M個)のセグメントについてセグメントに含まれる画素の画素値を設定したかどうかを確認し、全てのセグメントに対して画素値を設定していなければ、番号付与部24’が付与した番号順にステップS204~S209の処理を実行する。そして、全てのセグメントに対して画素値を設定した場合、ステップS212に進む。 That is, after step S209, it is confirmed whether or not the pixel values of the pixels included in the segments are set for all (M) segments. If the pixel values are not set for all the segments, the number assigning unit The processes of steps S204 to S209 are executed in the order of numbers assigned by 24 '. If pixel values are set for all segments, the process proceeds to step S212.
 ステップS212では、属する画素の画素値を設定した全てのセグメントを合成して、距離画像#2’を復元する(ステップS212)。 In step S212, all the segments in which the pixel values of the belonging pixels are set are combined to restore the distance image # 2 '(step S212).
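The per-segment loop of steps S204 to S212 can be sketched as follows (assumed representation; the prediction here is simplified to "the previous segment's representative value", whereas the actual method uses the prediction reference pixels described above):

```python
# Decode the distance image from a segment identification image and the
# ordered difference values, one representative value per segment.
def decode_segments(ids, diffs, num_segments):
    h, w = len(ids), len(ids[0])
    depth = [[0] * w for _ in range(h)]
    reps = []
    for i in range(num_segments):          # numbering order (S204, S210-S211)
        pred = reps[-1] if reps else 0     # simplified prediction (S207)
        rep = pred + diffs[i]              # prediction + difference (S208)
        reps.append(rep)
        for y in range(h):                 # S209: fill every pixel of segment i
            for x in range(w):
                if ids[y][x] == i:
                    depth[y][x] = rep
    return depth                           # S212: the combined distance image

print(decode_segments([[0, 0, 1],
                       [1, 1, 1]], [10, 2], 2))
# [[10, 10, 12], [12, 12, 12]]
```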
 以上、動画像復号装置2の動作について説明したが、ステップS25にて予測復号部32が復号する距離画像#2’は、一般的に、動画像符号化装置1に入力される距離画像#2に近似する距離画像になる。 The operation of the moving image decoding apparatus 2 has been described above. The distance image #2' that the predictive decoding unit 32 decodes in step S25 is generally a distance image that approximates the distance image #2 input to the moving image encoding apparatus 1.
 これは、前述したように、テクスチャ画像#1と距離画像#2との相関から、「各セグメントが類似する色の画素群で構成されるような複数のセグメントにテクスチャ画像#1’を分割すると、距離画像#2中の単一のセグメントに含まれる全部または略全ての画素が同一の距離値を持つ傾向がある」と言えるからである。すなわち、距離画像#2’は、距離画像#2中のセグメントに含まれる極一部の距離値を該セグメントにおける代表値に変更することにより得られる画像と同一であるので、距離画像#2’と距離画像#2とは近似すると言える。 As described above, this is because it can be said, from the correlation between the texture image #1 and the distance image #2, that "when the texture image #1' is divided into a plurality of segments each composed of a group of pixels of similar colors, all or almost all pixels included in a single segment of the distance image #2 tend to have the same distance value." That is, the distance image #2' is identical to the image obtained by changing the distance values of the very few deviating pixels in each segment of the distance image #2 to that segment's representative value, so the distance image #2' and the distance image #2 can be said to be approximate.
 (動画像復号装置2の利点)
 以上のように、動画像復号装置2は、画像分割処理部21’が、テクスチャ画像#1’の全領域を分割した複数のセグメントを規定する。具体的には、画像分割処理部21’は、各セグメントが類似する色からなる画素群により構成される複数のセグメントを規定する。
(Advantages of the video decoding device 2)
As described above, in the video decoding device 2, the image division processing unit 21 ′ defines a plurality of segments obtained by dividing the entire area of the texture image # 1 ′. Specifically, the image division processing unit 21 ′ defines a plurality of segments each including a group of pixels each having a similar color.
 また、予測復号部32が、符号化データ#25を読み出す。符号化データ#25は、復号すべき距離画像#2’を構成する複数のセグメントの各々について該セグメントにおける代表値#23aをたかだか1つ距離値として含んでいるデータである。なお、復号すべき距離画像#2’を構成する上記複数のセグメントの分割パターンは、画像分割処理部21’が規定した複数のセグメントの分割パターンと同一である。 Also, the predictive decoding unit 32 reads the encoded data # 25. The encoded data # 25 is data including at most one representative value # 23a as a distance value for each of a plurality of segments constituting the distance image # 2 'to be decoded. Note that the division pattern of the plurality of segments constituting the distance image # 2 'to be decoded is the same as the division pattern of the plurality of segments defined by the image division processing unit 21'.
 また、動画像復号装置2は、動画像符号化装置1が距離画像#2を分割したセグメントの代表値を符号化データ#25に符号化する符号化方法に対応する復号方法を用いて、符号化データ#25を復号するため、動画像符号化装置1が生成した各セグメントの代表値を正確に復元することができる。よって、動画像復号装置2は、距離画像#2と近似する距離画像#2’を正確に復元することができる。 In addition, the moving image decoding apparatus 2 uses a decoding method corresponding to an encoding method in which the representative value of the segment obtained by dividing the distance image # 2 by the moving image encoding apparatus 1 is encoded into encoded data # 25. Since the encoded data # 25 is decoded, the representative value of each segment generated by the video encoding device 1 can be accurately restored. Therefore, the moving image decoding apparatus 2 can accurately restore the distance image # 2 'that approximates the distance image # 2.
 動画像復号装置2が符号化データ#25から復元する距離画像#2’は、前述したように、動画像符号化装置1が符号化する距離画像#2と類似しているので、動画像復号装置2は適切な距離画像を復号することができる。 The distance image # 2 ′ restored from the encoded data # 25 by the moving image decoding apparatus 2 is similar to the distance image # 2 encoded by the moving image encoding apparatus 1 as described above. The device 2 can decode an appropriate distance image.
 以上に加えて、動画像復号装置2が復号する距離画像#2’にさらなる利点があることを以下に示す。 In addition to the above, it will be shown below that the distance image # 2 'decoded by the video decoding device 2 has further advantages.
 すなわち、被写体および背景が描画されているテクスチャ画像#1’と距離画像#2とから3次元画像を生成すると、生成される3次元画像における被写体の輪郭は、距離画像#2中の被写体と背景との境界の形状に応じたものとなる。 That is, when a three-dimensional image is generated from the texture image #1', on which the subject and background are drawn, and the distance image #2, the contour of the subject in the generated three-dimensional image follows the shape of the boundary between the subject and the background in the distance image #2.
 一般に、テクスチャ画像#1’と距離画像#2とは、被写体と背景との境界の位置が一致するものの、被写体と背景との境界の位置が一致しないこともある。この場合、カメラ撮影により生成されたテクスチャ画像#1と測距装置により生成された距離画像#2とでは、テクスチャ画像のほうが、被写体と背景とのエッジ部分の形状をより忠実に再現する。 In general, the positions of the boundary between the subject and the background in the texture image #1' and in the distance image #2 coincide, but they may sometimes differ. In that case, between the texture image #1 generated by camera shooting and the distance image #2 generated by the distance measuring device, it is the texture image that reproduces the shape of the edge between the subject and the background more faithfully.
 動画像復号装置2が復号する距離画像#2’において被写体と背景との境界の位置は、テクスチャ画像#1における被写体と背景との境界の位置と一致することが多い。これは、一般に、テクスチャ画像#1において被写体の色と背景の色とは大きく異なるため、テクスチャ画像#1において被写体と背景との境界がセグメントの境界になるためである。 The position of the boundary between the subject and the background in the distance image # 2 ′ decoded by the moving image decoding apparatus 2 often coincides with the position of the boundary between the subject and the background in the texture image # 1. This is because, in general, the subject color and the background color are significantly different in the texture image # 1, and the boundary between the subject and the background becomes the segment boundary in the texture image # 1.
 したがって、本実施形態に係る動画像復号装置2が出力したテクスチャ画像#1’および距離画像#2’から立体映像表示装置で再現される3次元画像は、テクスチャ画像#1’および距離画像#2から再現される3次元画像に略忠実であるばかりか、場合によっては実物の被写体をより忠実に再現した3次元画像となる。 Therefore, the three-dimensional image reproduced on the stereoscopic video display device from the texture image #1' and the distance image #2' output by the moving image decoding apparatus 2 according to this embodiment is not only substantially faithful to the three-dimensional image reproduced from the texture image #1' and the distance image #2, but in some cases is a three-dimensional image that reproduces the real subject even more faithfully.
 (付記事項7)
 上述のように、動画像復号装置2は、動画像符号化装置1が使用する符号化方法に対応する復号方法を用いて、符号化データ#28から距離画像#2’を復元する。そのため、動画像符号化装置1および動画像復号装置2は、符号化、復号化の処理を行う前に、それぞれ予め符号化方法および復号方法を定めていればよい。
(Appendix 7)
As described above, the video decoding device 2 restores the distance image # 2 ′ from the encoded data # 28 using a decoding method corresponding to the encoding method used by the video encoding device 1. For this reason, the moving image encoding device 1 and the moving image decoding device 2 may determine the encoding method and the decoding method in advance before performing the encoding and decoding processes, respectively.
 また、動画像復号装置2は、動画像符号化装置1から符号化データ#28(符号化データ#25)と共に、符号化方法を示す情報を受信し、受信した情報の示す符号化方法に対応する復号方法を特定し、特定した復号方法に基づいて距離画像#2’を復元してもよい。このとき、符号化方法を示す情報を符号化データ#25に含まれるセグメント毎に対応付けておいてもよい。このようにすることにより、動画像符号化装置1は、セグメント毎に最適な符号化方法を用いることができ、そして、動画像復号装置2は、セグメント毎で異なる符号化方法で符号化されていた場合でも、正確にデータを復号することができる。 Alternatively, the moving image decoding apparatus 2 may receive, together with the encoded data #28 (encoded data #25) from the moving image encoding apparatus 1, information indicating the encoding method, identify the decoding method corresponding to the encoding method indicated by the received information, and restore the distance image #2' based on the identified decoding method. In this case, the information indicating the encoding method may be associated with each segment included in the encoded data #25. In this way, the moving image encoding apparatus 1 can use the optimal encoding method for each segment, and the moving image decoding apparatus 2 can decode the data accurately even when different segments were encoded with different encoding methods.
 例えば、上記符号化方法を示す情報とは、差分値を符号語に変換する可変長符号化方法、固定長符号化方法を示す情報、代表画素に基づいて予測参照画素を特定する予測参照画素特定方法を示す予測参照画素特定方法情報、予測参照画素を有するセグメントの代表値に基づいて予測値を算出する予測値算出方法を示す予測値算出方法情報などである。その他、上記符号化方法を示す情報に、画像分割処理部21がセグメントを分割するセグメントの分割方法を示す分割方法情報、番号付与部24が番号を付与する順番(規則)を示す番号付与規則情報、代表画素を特定する代表画素特定方法を示す代表画素特定方法情報なども含めても良い。 For example, the information indicating the encoding method includes: information indicating the variable-length or fixed-length encoding method that converts difference values into code words; prediction-reference-pixel identification method information indicating the method of identifying prediction reference pixels based on a representative pixel; and prediction-value calculation method information indicating the method of calculating a prediction value based on the representative value of the segment containing the prediction reference pixels. In addition, the information indicating the encoding method may also include division method information indicating the method by which the image division processing unit 21 divides the image into segments, numbering rule information indicating the order (rule) in which the number assigning unit 24 assigns numbers, and representative pixel identification method information indicating the method of identifying the representative pixel.
 具体的には、動画像符号化装置1がセグメント番号#24が先頭のセグメントのみ、セグメントの代表値を固定長符号化方法によって符号化した場合、動画像復号装置2は、その旨を示す情報を符号化データ#28と共に受信することにより、符号化データ#25の先頭の符号語のみ固定長符号化方法により復号し、先頭のセグメントの代表値を、復号した値に設定する、つまり、先頭のセグメントに含まれる全画素の画素値を復号した値に設定する。 Specifically, when the moving image encoding apparatus 1 encodes the representative value by the fixed-length encoding method only for the head segment in segment number #24, the moving image decoding apparatus 2, by receiving information to that effect together with the encoded data #28, decodes only the first code word of the encoded data #25 by the fixed-length method and sets the representative value of the head segment to the decoded value, that is, sets the pixel values of all the pixels included in the head segment to the decoded value.
 (付記事項8)
 また、本実施形態では、動画像復号装置2は、テクスチャ画像の符号化データ#11および距離画像の符号化データ#25を含む符号化データ#28を受信しているが、これに限るものではない。例えば、動画像復号装置2は、距離画像の符号化データ#25およびセグメントの位置情報を受信してもよい。この場合、番号付与部24’は、セグメントの位置情報に基づいて分割される各セグメントに対してラスタスキャン順に番号を付与する。そして、番号付与部24’は、セグメントの位置情報に付与した番号を対応付けたセグメント識別用画像#24’を生成し、予測復号部32に出力する。
(Appendix 8)
In this embodiment, the moving image decoding apparatus 2 receives the encoded data #28 including the encoded data #11 of the texture image and the encoded data #25 of the distance image, but the present invention is not limited to this. For example, the moving image decoding apparatus 2 may receive the encoded data #25 of the distance image and the segment position information. In this case, the number assigning unit 24' assigns numbers in raster-scan order to the segments divided based on the segment position information. The number assigning unit 24' then generates a segment identification image #24' in which the assigned numbers are associated with the segment position information, and outputs it to the predictive decoding unit 32.
 For example, even when the moving image encoding apparatus 1 transmits the encoded data #11 of the texture image and the encoded data #25 of the distance image to separate decoding apparatuses, the decoding apparatus that receives the encoded data #25 of the distance image can restore the distance image by receiving the segment position information together with the encoded data #25.
 (Appendix 9)
 In the above embodiment, the moving image encoding apparatus 1 transmits the encoded data #25 to the moving image decoding apparatus 2; however, the moving image encoding apparatus 1 may instead supply the encoded data #25 to the moving image decoding apparatus 2 as follows.
 That is, the moving image encoding apparatus 1 and the moving image decoding apparatus 2 may each be provided with access means, such as an optical disk drive, capable of accessing a removable recording medium, and the encoded data #25 may be supplied from the moving image encoding apparatus 1 to the moving image decoding apparatus 2 via the recording medium. In other words, the encoding apparatus of the present invention does not necessarily include means for transmitting data, and the decoding apparatus of the present invention does not necessarily include receiving means for receiving data.
 <Embodiment 2>
 Next, a moving image encoding apparatus and a moving image decoding apparatus according to another embodiment of the present invention will be described below with reference to FIGS. 19 and 20. First, the moving image encoding apparatus according to the present embodiment will be described.
 The moving image encoding apparatus according to the present embodiment uses MVC encoding, adopted as the MVC standard in H.264/AVC, for encoding texture images, while using the encoding technique specific to the present invention for encoding distance images. The moving image encoding apparatus according to the present embodiment differs from the moving image encoding apparatus 1 in that it encodes a plurality of sets (N sets) of texture images and distance images per frame. Here, the N sets of texture images and distance images are images of a subject captured simultaneously by cameras and ranging devices installed at N locations so as to surround the subject. That is, the N sets of texture images and distance images are images for generating a free-viewpoint image. In addition, each set of texture image and distance image includes, together with the actual data of that set, a camera parameter as metadata indicating the azimuth angle at which the camera and ranging device that generated the images were installed.
 Hereinafter, the configuration of the moving image encoding apparatus of the present embodiment will be described with reference to FIG. 19.
 (Moving Image Encoding Apparatus)
 FIG. 19 is a block diagram showing the main configuration of the moving image encoding apparatus according to the present embodiment. As shown in FIG. 19, the moving image encoding apparatus 1A includes an image encoding unit 11A, an image decoding unit 12A, a distance image encoding unit 20A, and a packaging unit (transmission means) 28A. The distance image encoding unit 20A includes an image division processing unit 21A, a distance image division processing unit (division means) 22A, a distance value correcting unit (representative value determining means) 23A, a number assigning unit (number assigning means) 24A, and a predictive encoding unit (prediction value calculating means, difference value calculating means, encoding means) 25A.
 The image encoding unit 11A encodes N view components (that is, texture images #1-1 to #1-N) by MVC encoding (multi-view video coding) defined in the MVC standard of H.264/AVC, and generates encoded data #11-1 to #11-N for the respective view components. The image encoding unit 11A outputs the encoded data #11-1 to #11-N, together with view IDs "1" to "N", which are parameters of the NAL header extension, to the image decoding unit 12A and the packaging unit 28A.
 The image decoding unit 12A decodes texture images #1′-1 to #1′-N from the encoded data #11-1 to #11-N of the texture images #1 by the decoding method defined in the MVC standard.
 The image division processing unit 21A divides the entire area of the texture image #1′-j into a plurality of segments (regions). The image division processing unit 21A then outputs segment information #21-j consisting of the position information of each segment.
 When the distance image #2-j and the segment information #21-j are input, the distance image division processing unit 22A extracts, for each segment in the texture image #1′-j, a distance value set consisting of the distance values of the pixels included in the corresponding segment (region) of the distance image #2-j. The distance image division processing unit 22A then generates, from the segment information #21-j, segment information #22-j in which the distance value set and the position information are associated for each segment.
 Further, the distance image division processing unit 22A generates the view ID "j" of the distance image #2-j, and generates segment information #22A-j in which the view ID "j" is associated with the segment information #22-j.
 For each segment of the distance image #2-j, the distance value correcting unit 23A calculates the mode (most frequent value) as the representative value #23a-j from the distance value set of that segment included in the segment information #22A-j. The distance value correcting unit 23A then replaces the distance value set of each segment included in the segment information #22A-j with the representative value #23a-j of the corresponding segment, and outputs the result to the number assigning unit 24A as segment information #23A-j.
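 The mode calculation performed by the distance value correcting unit can be illustrated with a short sketch (the document does not specify a tie-breaking rule; `Counter.most_common` breaks ties by order of first occurrence, which is one deterministic choice):

```python
from collections import Counter

def representative_value(distance_values):
    """Return the mode (most frequent value) of a segment's distance values."""
    counts = Counter(distance_values)
    # most_common(1) returns [(value, count)] for the highest count;
    # ties are resolved by first occurrence in the input.
    return counts.most_common(1)[0][0]
```

For example, a segment with distance values [5, 5, 7, 7, 7, 3] gets the representative value 7.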
 When the segment information #23A-j is input, the number assigning unit 24A associates, for each of the Mj pairs of position information and representative values #23a-j included in the segment information #23A-j, the representative value #23a-j with a segment number #24-j corresponding to the position information. The number assigning unit 24A then outputs, to the predictive encoding unit 25A, data #24A-j in which the Mj pairs of segment numbers #24-j and representative values #23a-j are associated with the view ID "j" included in the segment information #23A-j.
 The predictive encoding unit 25A performs predictive encoding processing for each viewpoint based on the Mj pairs of representative values #23a-j and segment numbers #24-j included in the input data #24A-j, and outputs the resulting encoded data #25-j to the packaging unit 28A. Specifically, the predictive encoding unit 25A calculates, for each segment in the order of the segment numbers #24-j, the prediction value of the segment, subtracts the prediction value from the representative value #23a-j to obtain the difference value, and encodes the difference value. The predictive encoding unit 25A then arranges the encoded difference values in the order of the segment numbers #24-j to generate the encoded data #25-j.
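 The core of this per-viewpoint encoding loop can be sketched as follows (a minimal illustration, not the prescribed implementation; the `predict` callback stands in for whichever prediction value calculation method is in use, and the representative values are assumed to be already ordered by segment number):

```python
def predictive_encode(representative_values, predict):
    """Turn ordered segment representative values into ordered difference values.

    `predict(i, decoded)` returns the prediction value for segment `i` from
    the values of already-processed segments (`decoded`), so that the
    decoder can compute the identical prediction.
    """
    differences = []
    decoded = []  # values the decoder will have reconstructed so far
    for i, rep in enumerate(representative_values):
        pred = predict(i, decoded)
        differences.append(rep - pred)  # difference value to be entropy-coded
        decoded.append(rep)             # decoder reconstructs pred + diff == rep
    return differences
```

With a toy prediction rule that predicts each segment from the previous one (and 0 for the first), the representative values [10, 12, 12, 9] become the difference values [10, 2, 0, -3].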
 The predictive encoding unit 25A transmits to the packaging unit 28A encoded data #25A that contains, as VCL NAL units, the encoded data of the distance images #2-j thus obtained for each j from 1 to N, and, as non-VCL NAL units, the view IDs "j".
 The packaging unit 28A generates encoded data #28A by integrating the encoded data #11-1 to #11-N of the texture images #1-1 to #1-N with the encoded data #25A. The packaging unit 28A then transmits the encoded data #28A to the moving image decoding apparatus.
 (Moving Image Decoding Apparatus)
 Next, the configuration of the moving image decoding apparatus of the present embodiment will be described with reference to FIG. 20. FIG. 20 is a block diagram showing the main configuration of the moving image decoding apparatus according to the present embodiment. As shown in FIG. 20, the moving image decoding apparatus 2A includes an image decoding unit 12A, an image division processing unit (division means) 21A′, a number assigning unit (number assigning means, allocation means) 24A′, an unpackaging unit (receiving means) 31A, and a predictive decoding unit (prediction value calculating means, pixel value setting means) 32A.
 The image decoding unit 12A decodes texture images #1′-1 to #1′-N from the encoded data #11-1 to #11-N of the texture images #1 by the decoding method defined in the MVC standard.
 The unpackaging unit 31A extracts the encoded data #11-j of the texture image #1 and the encoded data #25A of the distance image #2 from the received encoded data #28A.
 The image division processing unit 21A′ divides the entire area of the texture image #1′-j into a plurality of segments (regions) using the same algorithm as the image division processing unit 21A of the moving image encoding apparatus 1A. The image division processing unit 21A′ then generates segment information #21′-j consisting of the position information of each segment, and outputs it to the number assigning unit 24A′.
 The number assigning unit 24A′ assigns numbers, in raster scan order, to the segments defined by the segment information #21′-j, using the same algorithm as the number assigning unit 24A of the moving image encoding apparatus 1A. The number assigning unit 24A′ generates a segment identification image #24′-j in which the assigned numbers are associated with the segment position information, and outputs it to the predictive decoding unit 32A.
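 One way to realize a raster-scan numbering rule that encoder and decoder can share is to number segments in the order in which their earliest pixel appears during a raster scan. The sketch below assumes, purely for illustration, that the segmentation is represented as a 2-D list of arbitrary segment labels:

```python
def number_segments(label_map):
    """Assign raster-scan-order numbers (0, 1, 2, ...) to segments.

    `label_map` is a list of rows of segment labels.  Each segment is
    numbered when its first pixel is encountered in the raster scan.
    """
    numbers = {}
    for row in label_map:
        for label in row:
            if label not in numbers:
                numbers[label] = len(numbers)
    return numbers
```

Because both sides scan the same segmentation in the same order, they arrive at identical segment numbers without any numbering being transmitted.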
 The predictive decoding unit 32A extracts the encoded data #25-j and the view IDs "j" from the input encoded data #25A. Next, it performs predictive decoding processing based on the encoded data #25-j and the segment identification images #24′-j to restore the distance images #2′-1 to #2′-N. Specifically, the predictive decoding unit 32A decodes the distance image #2′-j as follows.
 The predictive decoding unit 32A decodes the encoded data #25-j to obtain the ordered sequence of difference values, and assigns the difference values, in the order of the numbers assigned by the number assigning unit 24A′, to the segments defined by the segment information #21′-j of the segment identification image #24′-j. Next, the predictive decoding unit 32A calculates, for each segment in the order of the numbers assigned by the number assigning unit 24A′, the prediction value of the segment, adds the assigned difference value to the calculated prediction value, and sets the resulting value as the distance value of the segment. The predictive decoding unit 32A then sets the distance value of each segment as the pixel value (distance value) of all pixels included in that segment, thereby restoring the distance image #2′-j. The predictive decoding unit 32A associates the restored distance image #2′-j with the view ID "j" included in the encoded data #25A, and outputs it to a stereoscopic video display device (not shown) outside the moving image decoding apparatus 2A.
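 The decoding loop mirrors the encoding side. As a hedged sketch (the `predict` callback is a stand-in for the prediction value calculation method, which must be identical to the one used at encoding time):

```python
def predictive_decode(differences, predict):
    """Reconstruct ordered segment values from ordered difference values.

    `predict(i, decoded)` must compute the same prediction value for
    segment `i` as the encoder did, using only already-decoded segments.
    """
    decoded = []
    for i, diff in enumerate(differences):
        # distance value = prediction value + transmitted difference value
        decoded.append(predict(i, decoded) + diff)
    return decoded
```

With the same toy prediction rule as on the encoding side (previous segment's value, 0 for the first), the difference values [10, 2, 0, -3] are reconstructed to [10, 12, 12, 9].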
 Note that the image decoding unit 12 is the same as the image decoding unit 12 of the moving image decoding apparatus 2 of Embodiment 1, and a description thereof is therefore omitted.
 (Appendix 10)
 In the above embodiment, the moving image encoding apparatus 1A and the moving image decoding apparatus 2A perform encoding processing and decoding processing on N sets of texture images and distance images of a subject captured simultaneously by cameras and ranging devices installed at N locations so as to surround the subject.
 Needless to say, the moving image encoding apparatus 1A and the moving image decoding apparatus 2A can also perform encoding processing and decoding processing on N sets of texture images and distance images generated as follows.
 That is, the moving image encoding apparatus 1A and the moving image decoding apparatus 2A can also perform encoding processing and decoding processing on N sets of texture images and distance images generated by N sets of cameras and ranging devices installed at a single location such that each camera-and-ranging-device set faces a different direction. In other words, the moving image encoding apparatus 1A and the moving image decoding apparatus 2A can also perform encoding processing and decoding processing on N sets of texture images and distance images for generating an omnidirectional image, a panoramic image, or the like.
 In this case, each set of texture image and distance image includes, together with the actual data of that set, a camera parameter as metadata indicating the direction in which the camera and ranging device that generated the images were facing.
 (Appendix 11)
 In Embodiment 2, the image encoding unit 11A of the moving image encoding apparatus 1A encodes the texture images #1-1 to #1-N using MVC encoding defined in the MVC standard of H.264/AVC; however, the present invention is not limited to this.
 That is, the image encoding unit 11A of the moving image encoding apparatus 1A may encode the texture images #1-1 to #1-N using another encoding method, such as a VSP (View Synthesis Prediction) encoding method, an MVD encoding method, or an LVD (Layered Video Depth) encoding method. In this case, the image decoding unit 12A of the moving image decoding apparatus 2A may be configured to decode the texture images #1′-1 to #1′-N by the decoding method corresponding to the encoding method employed by the image encoding unit 11A.
 (Means for Solving the Problems)
 In order to solve the above problems, an encoding apparatus according to the present invention is an encoding apparatus that encodes an image, comprising: division means for dividing the entire area of the image into a plurality of regions; representative value determining means for determining, for each of the plurality of regions divided by the division means, a representative value from the pixel values of the pixels included in that region; number assigning means for assigning numbers to the plurality of regions in raster scan order; prediction value calculating means for taking the regions as encoding target regions in the order of the numbers assigned by the number assigning means, taking, among the pixels included in the encoding target region, the earliest pixel in raster scan order as the representative pixel, taking, as prediction reference pixels, pixels that are close to pixels of the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculating a prediction value of the encoding target region based on at least one of the representative values of the regions having the prediction reference pixels; difference value calculating means for calculating, for each encoding target region, a difference value by subtracting the prediction value calculated by the prediction value calculating means from the representative value determined by the representative value determining means; and encoding means for encoding the difference values calculated by the difference value calculating means, arranged in the order of the numbers assigned by the number assigning means, to generate encoded data of the image.
 According to the above configuration, the number assigning means assigns numbers in raster scan order to the plurality of regions into which the division means has divided the image. Next, the prediction value calculating means takes the regions as encoding target regions in the order of the numbers assigned by the number assigning means, takes, among the pixels included in the encoding target region, the earliest pixel in raster scan order as the representative pixel, and takes, as prediction reference pixels, pixels that are close to pixels of the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order. The prediction value calculating means then calculates the prediction value of the encoding target region based on at least one of the representative values of the regions having the prediction reference pixels. Next, the difference value calculating means calculates, for each encoding target region, a difference value by subtracting the prediction value calculated by the prediction value calculating means from the representative value determined by the representative value determining means. The encoding means then encodes the difference values calculated by the difference value calculating means, arranged in the order of the numbers assigned by the number assigning means, to generate the encoded data of the image.
 Therefore, even if the plurality of regions divided by the division means have arbitrary shapes, the order of the regions can be uniquely specified. In addition, the representative pixel used when calculating the prediction value of the representative value of each region, and the prediction reference pixels based on it, can be uniquely specified. Consequently, the prediction value of the encoding target region, determined from the representative values of regions close to the encoding target region, can be uniquely calculated.
 Accordingly, even if the plurality of regions divided by the division means have arbitrary shapes, there is the effect that spatial redundancy between the regions can be eliminated and uniquely decodable encoded data can be generated.
 Furthermore, when decoding encoded data that was encoded using the calculated prediction values, the prediction values must be calculated by the same method as at encoding time. That is, the prediction reference pixels for a given region must be the same at encoding time and at decoding time. Therefore, the prediction reference pixels for a given region must be decoded before that region; in other words, they must be encoded first.
 Thus, as described above, by taking pixels that precede the pixels of the encoding target region in raster scan order as prediction reference pixels, it is guaranteed that at decoding time the encoded data can be decoded normally, in order from the beginning, without omissions. This yields the further effect that efficient encoding processing can be performed and the amount of memory used during encoding processing can be reduced.
 In order to solve the above problems, an encoding method according to the present invention is an encoding method of an encoding apparatus that encodes an image, comprising, in the encoding apparatus: a division step of dividing the entire area of the image into a plurality of regions; a representative value determining step of determining, for each of the plurality of regions divided in the division step, a representative value from the pixel values of the pixels included in that region; a number assigning step of assigning numbers to the plurality of regions in raster scan order; a prediction value calculating step of taking the regions as encoding target regions in the order of the numbers assigned in the number assigning step, taking, among the pixels included in the encoding target region, the earliest pixel in raster scan order as the representative pixel, taking, as prediction reference pixels, pixels that are close to pixels of the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculating a prediction value of the encoding target region based on at least one of the representative values of the regions having the prediction reference pixels; a difference value calculating step of calculating, for each encoding target region, a difference value by subtracting the prediction value calculated in the prediction value calculating step from the representative value determined in the representative value determining step; and an encoding step of encoding the difference values calculated in the difference value calculating step, arranged in the order of the numbers assigned in the number assigning step, to generate encoded data of the image.
 According to the above configuration, the encoding method according to the present invention has the same operational effects as the encoding apparatus according to the present invention.
 The encoding apparatus according to the present invention preferably further comprises transmission means for associating the encoded data of the image generated by the encoding means with region information defining the plurality of regions, and transmitting them to the outside.
 According to the above configuration, the transmission means associates the encoded data of the image generated by the encoding means with the region information defining the plurality of regions, and transmits them to the outside. Therefore, there is the further effect that an apparatus receiving the encoded data and the region information can accurately decode the received encoded data by dividing the image into the plurality of regions based on the region information.
 In the encoding apparatus according to the present invention, the encoding means preferably encodes the difference values by a variable-length encoding method in which the closer the value to be encoded is to 0, the shorter the code word.
 According to the above configuration, the encoding means encodes the difference values by a variable-length encoding method in which the closer the value to be encoded is to 0, the shorter the code word. Here, when the prediction value of the encoding target region calculated by the prediction value calculating means approximates the representative value of that region (that is, when the prediction accuracy of the prediction value calculating means is high), the difference value becomes a very small value. Therefore, when the prediction accuracy of the prediction value calculating means is high, there is the further effect that the code amount of the encoded data can be reduced by encoding the difference values with the variable-length encoding method.
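 One well-known family of codes with this property is the Exp-Golomb code used for entropy coding in H.264/AVC. The sketch below is an illustration of the principle, not the specific code prescribed by this document: signed difference values are first mapped so that values near 0 get small code numbers, which then receive short code words.

```python
def signed_to_unsigned(v):
    # Zigzag-style mapping: 0 -> 0, 1 -> 1, -1 -> 2, 2 -> 3, -2 -> 4, ...
    return 2 * v - 1 if v > 0 else -2 * v

def exp_golomb(v):
    """Exp-Golomb code word for a signed difference value, as a bit string."""
    k = signed_to_unsigned(v) + 1          # code number + 1
    bits = bin(k)[2:]                      # binary representation of k
    return "0" * (len(bits) - 1) + bits    # zero prefix + binary suffix
```

For example, 0 is coded as "1" (1 bit), while 1 and -1 are coded as "010" and "011" (3 bits each), so sequences dominated by small differences compress well.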
 In the encoding apparatus according to the present invention, the prediction value calculating means preferably takes, as the prediction reference pixels, a pixel group including: the pixel immediately preceding the representative pixel in raster scan order; the pixels on the scan line immediately preceding, in raster scan order, that of the representative pixel, which are adjacent to pixels of the encoding target region on the same scan line as the representative pixel; and the pixel on the scan line immediately preceding, in raster scan order, that of the representative pixel, which is adjacent to the pixel immediately following, in raster scan order, the last pixel of the encoding target region on the same scan line as the representative pixel.
 According to the above configuration, the prediction value calculating means takes, as the prediction reference pixels, a pixel group including: the pixel immediately preceding the representative pixel in raster scan order; the pixels on the scan line immediately preceding, in raster scan order, that of the representative pixel, which are adjacent to pixels of the encoding target region on the same scan line as the representative pixel; and the pixel on the scan line immediately preceding, in raster scan order, that of the representative pixel, which is adjacent to the pixel immediately following, in raster scan order, the last pixel of the encoding target region on the same scan line as the representative pixel.
 Here, in order to improve the accuracy of prediction of the prediction value of the encoding target region, it is desirable to refer to pixels of as many regions as possible that are adjacent or close to the encoding target region. However, as described above, in order to decode the encoded data efficiently, only pixels encoded before the encoding target region may be referenced.
 Taking the raster scan order to run from the upper left to the lower right of the image: the pixel immediately preceding the representative pixel in raster scan order is the pixel adjacent to the left of the representative pixel; the preceding-scan-line pixels adjacent to the pixels of the encoding target region on the representative pixel's scan line are the pixels adjacent immediately above those pixels; and the preceding-scan-line pixel adjacent to the pixel immediately following, in raster scan order, the last pixel of the region on that scan line is the pixel diagonally above and to the right of that last pixel.
 In other words, the prediction reference pixels lie in three directions around the encoding target region: adjacent to its left, adjacent above it, and diagonally above its right end (toward the right of the region). Because the prediction value is calculated from pixels that precede the encoding target region in raster scan order and yet surround it in multiple directions, a further effect is achieved that the prediction value can be predicted with high accuracy.
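As a concrete illustration, the three-direction pixel group can be gathered from a per-pixel region label map. The following is a minimal sketch under assumptions not stated in the specification (the function name, the label-map representation, and the assumption that the region's pixels on the representative pixel's scan line form a contiguous run are all illustrative):

```python
def prediction_reference_pixels(labels, region_id):
    """Collect the three-direction prediction reference pixels of a region.

    labels    : 2-D list mapping each (row, col) to a region id
    region_id : id of the encoding target region

    Returns the representative pixel and the reference pixels: the pixel
    left of the representative pixel, the pixels directly above the
    region's pixels on the representative pixel's scan line, and the pixel
    diagonally above-right of the last such pixel.  All reference pixels
    precede the representative pixel in raster scan order.
    """
    h, w = len(labels), len(labels[0])
    # Representative pixel: first pixel of the region in raster scan order.
    rep = next((r, c) for r in range(h) for c in range(w)
               if labels[r][c] == region_id)
    r, c = rep
    # Last pixel of the region on the representative pixel's scan line
    # (assumed to be a contiguous run starting at the representative pixel).
    last = c
    while last + 1 < w and labels[r][last + 1] == region_id:
        last += 1
    refs = []
    if c > 0:                          # left neighbour of the representative
        refs.append((r, c - 1))
    if r > 0:
        refs.extend((r - 1, x) for x in range(c, last + 1))  # pixels above
        if last + 1 < w:               # diagonally above-right of the run
            refs.append((r - 1, last + 1))
    return rep, refs

# Example: region 3 occupies (1,1) and (1,2) of a 2x4 label map.
rep, refs = prediction_reference_pixels([[1, 1, 2, 2], [1, 3, 3, 2]], 3)
# rep is (1, 1); refs are (1,0) left, (0,1),(0,2) above, (0,3) above-right.
```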
 In the encoding device according to the present invention, it is preferable that the prediction value calculation means uses exactly three prediction reference pixels: the pixel immediately preceding the representative pixel in raster scan order; any one of the pixels on the preceding scan line adjacent to the pixels of the encoding target region on the same scan line as the representative pixel; and the pixel on the preceding scan line adjacent to the pixel immediately following, in raster scan order, the last pixel of the region on that scan line.
 According to the above configuration, the prediction value calculation means uses as the prediction reference pixels these three pixels: the one to the left of the representative pixel, one chosen from the pixels above the region's pixels on the representative pixel's scan line, and the one diagonally above the right end of the region on that scan line.
 In other words, three pixels in three directions, adjacent to the left of the encoding target region, adjacent above it, and diagonally above its right end (toward the right of the region), serve as the prediction reference pixels. Since the prediction reference pixels lie in multiple directions around the encoding target region while being as few as possible (three pixels), a further effect is achieved that the processing load of calculating the prediction value is reduced while the prediction value can still be predicted with high accuracy.
 In the encoding device according to the present invention, it is preferable that the prediction value calculation means uses the median of the representative values of the regions containing the prediction reference pixels as the prediction value of the encoding target region.
 According to the above configuration, the prediction value calculation means uses the median of the representative values of the regions containing the prediction reference pixels as the prediction value of the encoding target region.
 Normally the representative value of the encoding target region and the representative values of the regions containing the prediction reference pixels are close to one another, but the representative value of some region containing a prediction reference pixel may differ greatly from that of the encoding target region. If the representative value of the region containing a selected prediction reference pixel were used directly as the prediction value of the encoding target region, then selecting that particular region would make the prediction value differ greatly from the representative value of the encoding target region, lowering the accuracy of the prediction value.
 Therefore, even when the representative value of some region containing a prediction reference pixel differs greatly from the representative value of the encoding target region, using the median of the representative values of the regions containing the prediction reference pixels as the prediction value achieves the further effect that the prediction value can be predicted with consistently stable accuracy.
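This robustness can be seen with a small worked example (an illustrative sketch only; the three-value case corresponds to the three-pixel variant above, and the function name is an assumption):

```python
def predict_median(ref_representative_values):
    """Median of the representative values of the regions containing the
    prediction reference pixels; robust to a single outlying region."""
    vals = sorted(ref_representative_values)
    n = len(vals)
    mid = n // 2
    if n % 2:                                 # odd count: middle element
        return vals[mid]
    return (vals[mid - 1] + vals[mid]) / 2    # even count: mean of middle two

# One reference region's representative value (200) differs greatly from
# the other two, yet the median prediction stays near the plausible values.
print(predict_median([52, 200, 55]))   # -> 55
```

Had the outlying value 200 been used directly as the prediction value, the difference value would have been large; the median keeps it small.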
 In the encoding device according to the present invention, it is preferable that the prediction value calculation means uses the average of the representative values of the regions containing the prediction reference pixels as the prediction value of the encoding target region.
 According to the above configuration, the prediction value calculation means uses the average of the representative values of the regions containing the prediction reference pixels as the prediction value of the encoding target region. Therefore, even when the representative value of some region containing a prediction reference pixel differs greatly from the representative value of the encoding target region, a further effect is achieved that the prediction value can be predicted with stable accuracy.
 In the encoding device according to the present invention, it is preferable that the prediction value calculation means uses any one of the representative values of the regions containing the prediction reference pixels as the prediction value of the encoding target region.
 According to the above configuration, the prediction value calculation means uses any one of the representative values of the regions containing the prediction reference pixels as the prediction value of the encoding target region.
 Here, if the accuracy of the prediction value does not suffer even when the representative value of a region containing a prediction reference pixel that differs greatly from the representative value of the encoding target region is used as the prediction value, this configuration achieves the further effect that the processing load of calculating the prediction value is reduced while its accuracy is maintained.
 In other words, when there is no difference in prediction accuracy between using any one of the representative values of the regions containing the prediction reference pixels as the prediction value of the encoding target region and using their median or average, this configuration achieves the further effect that the processing load of calculating the prediction value is reduced while the accuracy of the prediction value is maintained.
 In the encoding device according to the present invention, it is preferable that the transmission means transmits to the outside, in association with the encoded data of the image and the region information, prediction value calculation method information indicating the prediction value calculation method executed by the prediction value calculation means.
 According to the above configuration, the transmission means transmits the prediction value calculation method information to the outside in association with the encoded data of the image and the region information. Therefore, even when a device that receives the encoded data, the region information, and the prediction value calculation method information does not know the calculation method executed by the prediction value calculation means, it can calculate the prediction values on the basis of that information, achieving the further effect that the received encoded data can be decoded accurately.
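One way to picture the signalled method information is as a small identifier on which the receiver dispatches. The following sketch is purely illustrative; the method ids and the dispatch table are assumptions, not the patent's signalling syntax:

```python
import statistics

# Hypothetical ids for the prediction value calculation method information.
PREDICTORS = {
    0: statistics.median,             # median: robust to one outlying region
    1: lambda v: sum(v) / len(v),     # average of the representative values
    2: lambda v: v[0],                # any one value: cheapest to compute
}

def predict(method_id, ref_values):
    """Apply the prediction method named by the transmitted method info."""
    return PREDICTORS[method_id](list(ref_values))
```

A receiver that does not know the encoder's choice in advance simply looks up `method_id` from the received stream and applies the corresponding rule.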
 In the encoding device according to the present invention, it is preferable that, for the encoding target region to which the number assigning means assigned the earliest number, the encoding means encodes that region's representative value by a fixed-length encoding method instead of encoding its difference value by the variable-length encoding method.
 According to the above configuration, for the earliest-numbered encoding target region, the encoding means encodes the representative value by a fixed-length encoding method instead of encoding the difference value by the variable-length encoding method. When the representative pixel of the earliest-numbered encoding target region lies at the edge of the image, no pixel precedes it in raster scan order. In that case the prediction value of the earliest encoding target region cannot be predicted accurately, so its difference value becomes very large, and encoding a large difference value with a variable-length encoding method would make the code amount very large.
 Accordingly, for the earliest encoding target region, whose prediction value cannot be predicted accurately, the representative value is encoded by a fixed-length encoding method instead of calculating a difference value and encoding it by the variable-length encoding method. This achieves the further effect that the code amount of the encoded data can be reduced further.
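The mixed scheme can be sketched as follows. The choice of an 8-bit fixed-length code and of signed Exp-Golomb as the variable-length code are assumptions for illustration; the patent only requires a fixed-length code for the first region and some variable-length code for the rest:

```python
def encode_region_values(representatives, predictions, bits=8):
    """Encode per-region values as a bit string: the first region's
    representative value with a fixed-length code, every later region's
    difference from its prediction with a variable-length code whose
    codewords shrink as the value approaches 0 (signed Exp-Golomb)."""
    def exp_golomb_signed(v):
        # Map signed to unsigned (0,1,-1,2,... -> 0,1,2,3,...), then
        # Exp-Golomb: len-1 leading zeros followed by binary of u+1.
        u = 2 * v - 1 if v > 0 else -2 * v
        s = bin(u + 1)[2:]
        return "0" * (len(s) - 1) + s

    out = format(representatives[0], "0{}b".format(bits))  # fixed length
    for rep, pred in zip(representatives[1:], predictions[1:]):
        out += exp_golomb_signed(rep - pred)               # variable length
    return out

# First region: value 100 in 8 bits. Later regions: small differences
# (+1 and -1) cost only 3 bits each.
print(encode_region_values([100, 101, 101], [None, 100, 102]))
# -> 01100100 010 011 (shown without spaces)
```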
 In the encoding device according to the present invention, when the image is a distance image paired with a texture image, it is preferable that the dividing means divides the entire area of the distance image into a plurality of regions using the same division pattern as the one that divides the entire area of the texture image into a plurality of regions such that, for each region, the difference between the average calculated from the pixel values of the pixel group in that region and the average calculated from the pixel values of the pixel group in a region adjacent to it is at most a predetermined threshold.
 Here, texture images and distance images are correlated: when an area of the texture image consists of a group of pixels of similar colors, all or nearly all of the pixels in the corresponding area of the distance image strongly tend to take the same distance (depth) value. Therefore, by partitioning the texture image so that pixel values stay within a limited range in each region and dividing the distance image with the same partition, the distance value becomes nearly constant within each region of the distance image.
 According to the above configuration, when the image is a distance image paired with a texture image, the dividing means divides the entire area of the distance image into a plurality of regions with the same division pattern used for the texture image. Therefore, by having the representative value determination means determine a representative value from the pixel values of the pixels in each region, a further effect is achieved that the amount of information of the distance image is reduced while data from which the distance image can be restored with high accuracy is generated.
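Reusing the texture image's division pattern for the paired distance image might look like the following sketch (the label-map representation, the function name, and the choice of a rounded mean as the representative value are assumptions; the patent leaves the representative-value rule to the representative value determination means):

```python
def depth_representatives(labels, depth):
    """Apply the texture image's division pattern (a per-pixel label map)
    to the paired distance image and take one representative value
    (here: the mean, rounded) per region."""
    sums, counts = {}, {}
    for row_l, row_d in zip(labels, depth):
        for lab, d in zip(row_l, row_d):
            sums[lab] = sums.get(lab, 0) + d
            counts[lab] = counts.get(lab, 0) + 1
    return {lab: round(sums[lab] / counts[lab]) for lab in sums}

# Two texture-derived regions; the depth values inside each region are
# nearly constant, so one representative value per region loses little.
print(depth_representatives([[0, 0], [1, 1]], [[10, 12], [30, 30]]))
# -> {0: 11, 1: 30}
```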
 It is preferable that the encoding device according to the present invention further includes transmission means for transmitting to the outside, in association with each other, the encoded data of the image generated by the encoding means and encoded data of the texture image obtained by encoding the texture image.
 According to the above configuration, the transmission means transmits the encoded data of the image and the encoded data of the texture image to the outside in association with each other. A device that receives both can divide the texture image into the plurality of regions according to the division pattern and thereby divide the distance image into the same plurality of regions. This achieves the further effect that the received encoded data of the image can be decoded accurately on the basis of the encoded data of the texture image.
 To solve the above problem, a decoding device according to the present invention decodes encoded data of an image that contains, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value between a representative value of the pixel values of the pixels in that region and a prediction value of that representative value, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each prediction value being calculated, with the regions taken as encoding target regions in that number order, on the basis of at least one representative value of a region containing a prediction reference pixel, where the representative pixel is the earliest pixel in raster scan order among the pixels of the encoding target region and the prediction reference pixels are pixels close to pixels of the encoding target region on the same scan line as the representative pixel that precede the representative pixel in raster scan order. The decoding device includes: dividing means for dividing the entire area of the image into a plurality of regions on the basis of region information defining the plurality of regions; decoding means for decoding the encoded data and generating the difference values arranged in order; number assigning means for assigning numbers in raster scan order to the plurality of regions divided by the dividing means; assigning means for assigning the difference values, from the first onward, to the plurality of regions in the order of the numbers assigned by the number assigning means; prediction value calculation means for taking the regions as decoding target regions in the order of the assigned numbers, taking as the representative pixel the earliest pixel in raster scan order among the pixels of the decoding target region, taking as prediction reference pixels pixels close to pixels of the decoding target region on the same scan line as the representative pixel that precede the representative pixel in raster scan order, and calculating a prediction value of the decoding target region on the basis of the pixel value of at least one of the prediction reference pixels; and pixel value setting means for calculating, for each decoding target region, the pixel value of the decoding target region by adding the difference value assigned by the assigning means to the prediction value calculated by the prediction value calculation means, and setting the pixel values of all the pixels in the decoding target region to the calculated pixel value; wherein the prediction value calculation means and the pixel value setting means repeat the above processing for each decoding target region in the number order, thereby restoring the pixel values of the image.
 According to the above configuration, the decoding means decodes the encoded data and generates the difference values arranged in order. The assigning means assigns the difference values, from the first onward, to the plurality of regions into which the dividing means divided the image on the basis of the region information, in the order of the numbers assigned in raster scan order by the number assigning means. Next, the prediction value calculation means takes the regions as decoding target regions in that number order, takes as the representative pixel the earliest pixel in raster scan order among the pixels of the decoding target region, and takes as prediction reference pixels pixels close to pixels of the decoding target region on the same scan line as the representative pixel that precede the representative pixel in raster scan order. The prediction value calculation means then calculates the prediction value of the decoding target region on the basis of the pixel value of at least one of the prediction reference pixels. The pixel value setting means calculates, for each decoding target region, the pixel value of the decoding target region by adding the assigned difference value to the calculated prediction value, and sets the pixel values of all the pixels in the decoding target region to that value. The prediction value calculation means and the pixel value setting means repeat this processing for each decoding target region in the assigned number order, restoring the pixel values of the image.
 As a result, the partition into decoding target regions is the same as the partition into the plurality of regions of the image indicated by the encoded data. Moreover, the representative pixel used when calculating the prediction value of each decoding target region's representative value, and the prediction reference pixels based on it, can be identified uniquely, and they can be made identical to the representative pixel and prediction reference pixels of the corresponding encoding target region. This has the effect that the image indicated by the encoded data can be restored accurately.
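The decoder's region-by-region loop can be sketched as follows. This is a simplified illustration, not the claimed implementation: the prediction here uses a single already-reconstructed neighbour of the representative pixel (left, falling back to above), and region ids are assumed to already be numbered in the raster scan order of their representative pixels:

```python
def decode_image(labels, diffs, first_value):
    """Restore pixel values region by region in number order.

    labels      : per-pixel region label map with ids 0..n-1, numbered in
                  raster scan order of each region's representative pixel
    diffs       : one difference value per region, in that same order
                  (diffs[0] is unused: the first region's representative
                  value arrives directly, fixed-length coded, as first_value)
    """
    h, w = len(labels), len(labels[0])
    img = [[None] * w for _ in range(h)]
    for region in range(len(diffs)):
        # Representative pixel: first pixel of the region in raster order.
        r, c = next((r, c) for r in range(h) for c in range(w)
                    if labels[r][c] == region)
        if region == 0:
            value = first_value               # no earlier pixel to predict from
        else:
            pred = img[r][c - 1] if c > 0 else img[r - 1][c]
            value = pred + diffs[region]      # prediction + difference value
        for rr in range(h):                   # set every pixel of the region
            for cc in range(w):
                if labels[rr][cc] == region:
                    img[rr][cc] = value
    return img

# Three regions on a 2x3 image: region 0 carries its value (10) directly,
# regions 1 and 2 are reconstructed as prediction + difference.
print(decode_image([[0, 0, 1], [0, 2, 1]], [0, 5, -3], 10))
# -> [[10, 10, 15], [10, 7, 15]]
```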
 To solve the above problem, a decoding method according to the present invention is a decoding method of a decoding device that decodes encoded data of an image containing, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value between a representative value of the pixel values of the pixels in that region and a prediction value of that representative value, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each prediction value being calculated, with the regions taken as encoding target regions in that number order, on the basis of at least one representative value of a region containing a prediction reference pixel, where the representative pixel is the earliest pixel in raster scan order among the pixels of the encoding target region and the prediction reference pixels are pixels close to pixels of the encoding target region on the same scan line as the representative pixel that precede the representative pixel in raster scan order. The method includes, in the decoding device: a dividing step of dividing the entire area of the image into a plurality of regions on the basis of region information defining the plurality of regions; a decoding step of decoding the encoded data and generating the difference values arranged in order; a number assigning step of assigning numbers in raster scan order to the plurality of regions divided in the dividing step; an assigning step of assigning the difference values, from the first onward, to the plurality of regions in the order of the numbers assigned in the number assigning step; a prediction value calculation step of taking the regions as decoding target regions in that number order, taking as the representative pixel the earliest pixel in raster scan order among the pixels of the decoding target region, taking as prediction reference pixels pixels close to pixels of the decoding target region on the same scan line as the representative pixel that precede the representative pixel in raster scan order, and calculating a prediction value of the decoding target region on the basis of the pixel value of at least one of the prediction reference pixels; and a pixel value setting step of calculating, for each decoding target region, the pixel value of the decoding target region by adding the difference value assigned in the assigning step to the prediction value calculated in the prediction value calculation step, and setting the pixel values of all the pixels in the decoding target region to the calculated pixel value; the prediction value calculation step and the pixel value setting step being repeated for each decoding target region in the number order, thereby restoring the pixel values of the image.
 According to the above configuration, the decoding method according to the present invention provides the same operational effects as the decoding device according to the present invention.
 It is preferable that the decoding device according to the present invention further includes receiving means for receiving the encoded data and the region information from the outside.
 According to the above configuration, the receiving means receives the encoded data and the region information from the outside. Therefore, even when the decoding device does not itself hold the region information, it can obtain the region information externally and divide the image into the plurality of regions on the basis of it. This achieves the further effect that the received encoded data can be decoded accurately even when the decoding device does not hold the region information.
 In the decoding device according to the present invention, it is preferable that the receiving means receives the encoded data encoded by a variable-length encoding method in which the closer the value to be encoded is to 0, the shorter the codeword.
 According to the above configuration, the receiving means receives the encoded data encoded by a variable-length encoding method in which the closer the value to be encoded is to 0, the shorter the codeword. As described above, when the encoding predicts accurately, the code amount of the encoded data is small. This achieves the further effect that the decoding device can reduce the processing load of decoding the encoded data.
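One well-known family of codes with this property is Exp-Golomb, where the codeword for 0 is a single bit and the codeword length grows only logarithmically. The following decoder sketch illustrates the property; it is one example of such a code, not a code mandated by the specification:

```python
def decode_exp_golomb_signed(bits):
    """Decode a bit string of signed Exp-Golomb codewords: the closer a
    value is to 0, the shorter its codeword (0 costs a single bit)."""
    values, i = [], 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":                   # count leading-zero prefix
            zeros += 1
            i += 1
        u = int(bits[i:i + zeros + 1], 2) - 1   # read zeros+1 more bits
        i += zeros + 1
        # Unsigned back to signed: 0,1,2,3,... -> 0,1,-1,2,...
        values.append((u + 1) // 2 if u % 2 else -(u // 2))
    return values

# "1" decodes to 0 in one bit; "010" and "011" decode to +1 and -1
# in three bits each.
print(decode_exp_golomb_signed("1010011"))   # -> [0, 1, -1]
```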
 In the decoding device according to the present invention, it is preferable that the prediction value calculation means uses as the prediction reference pixels a pixel group including: the pixel immediately preceding the representative pixel in raster scan order; the pixels on the scan line immediately preceding that of the representative pixel that are adjacent to the pixels of the decoding target region on the same scan line as the representative pixel; and the pixel on the scan line immediately preceding that of the representative pixel that is adjacent to the pixel immediately following, in raster scan order, the last pixel of the decoding target region on the same scan line as the representative pixel.
 According to the above configuration, the predicted value calculation means uses as the prediction reference pixels a pixel group including: the pixel immediately preceding the representative pixel in raster scan order; the pixels on the immediately preceding scan line adjacent to the pixels of the decoding target region on the same scan line as the representative pixel; and the pixel on the immediately preceding scan line adjacent to the pixel immediately following, in raster scan order, the last pixel of the decoding target region on that scan line.
 That is, the prediction reference pixels are taken from three directions: the pixels adjacent to the decoding target region on its left, the pixels adjacent above it, and the pixel diagonally above-right (toward the right of the decoding target region). Because the predicted value is calculated with reference to pixels in multiple directions, all of which precede the decoding target region in raster scan order, the further effect of predicting the value with high accuracy is achieved.
 In the decoding device according to the present invention, it is preferable that the predicted value calculation means uses as the prediction reference pixels three pixels: the pixel immediately preceding the representative pixel in raster scan order; any one of the pixels on the immediately preceding scan line adjacent to the pixels of the decoding target region on the same scan line as the representative pixel; and the pixel on the immediately preceding scan line adjacent to the pixel immediately following, in raster scan order, the last pixel of the decoding target region on that scan line.
 According to the above configuration, the predicted value calculation means uses as the prediction reference pixels three pixels: the pixel immediately preceding the representative pixel in raster scan order; any one of the pixels on the immediately preceding scan line adjacent to the pixels of the decoding target region on the same scan line as the representative pixel; and the pixel on the immediately preceding scan line adjacent to the pixel immediately following, in raster scan order, the last pixel of the decoding target region on that scan line.
 That is, three pixels in three directions serve as the prediction reference pixels: one adjacent to the decoding target region on its left, one adjacent above it, and one diagonally above-right (toward the right of the decoding target region). Since the prediction reference pixels lie in multiple directions relative to the decoding target region yet form as small a pixel group as possible (three pixels), the further effects of reducing the processing load of calculating the predicted value and of predicting the value with high accuracy are both achieved.
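 As an illustrative sketch (coordinates and function name are not from the patent text), the three-pixel variant above can be located as follows, given the representative pixel's position and the column of the region's last pixel on that scan line, with pixels addressed as (row, column) in raster order:

```python
def prediction_reference_pixels(rep_row, rep_col, last_col):
    """Three prediction reference pixels for a target region whose
    representative pixel sits at (rep_row, rep_col) and whose last pixel
    on that scan line sits at column last_col."""
    left        = (rep_row,     rep_col - 1)   # raster-scan predecessor (to the left)
    above       = (rep_row - 1, rep_col)       # on the previous scan line, above
    above_right = (rep_row - 1, last_col + 1)  # previous line, past the region's last pixel
    return [left, above, above_right]
```

 All three positions precede the target region in raster scan order, so a decoder processing regions in number order has already reconstructed their values.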
 In the decoding device according to the present invention, it is preferable that the predicted value calculation means uses the median of the pixel values of the prediction reference pixels as the predicted value of the decoding target region.
 According to the above configuration, the predicted value calculation means uses the median of the pixel values of the prediction reference pixels as the predicted value of the decoding target region.
 As described above, even when the representative value of a region containing one of the prediction reference pixels differs greatly from the representative value of the decoding target region, taking the median of those regions' representative values as the predicted value yields the further effect of predicting with consistently stable accuracy.
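 A minimal sketch of the median predictor described above, using Python's standard library; with three reference values this also matches the three-pixel variant discussed earlier:

```python
from statistics import median

def predict_median(ref_values):
    """Predicted value of the target region = median of the reference
    pixels' (reconstructed) values."""
    return median(ref_values)

# A single outlying reference does not drag the prediction away:
# predict_median([10, 200, 12]) -> 12
```

 This robustness to one aberrant neighbor is exactly the "stable accuracy" effect claimed for the median variant.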
 In the decoding device according to the present invention, it is preferable that the predicted value calculation means uses the average of the pixel values of the prediction reference pixels as the predicted value of the decoding target region.
 According to the above configuration, the predicted value calculation means uses the average of the pixel values of the prediction reference pixels as the predicted value of the decoding target region.
 As described above, even when the representative value of a region containing one of the prediction reference pixels differs greatly from the representative value of the decoding target region, taking the average of those regions' representative values as the predicted value yields the further effect of predicting with consistently stable accuracy.
 In the decoding device according to the present invention, it is preferable that the predicted value calculation means uses the pixel value of any one of the prediction reference pixels as the predicted value of the decoding target region.
 According to the above configuration, the predicted value calculation means uses the pixel value of any one of the prediction reference pixels as the predicted value of the decoding target region.
 Here, in cases where prediction accuracy does not suffer even if the predicted value is taken from a region whose representative value differs greatly from that of the decoding target region, this configuration yields the further effect of reducing the processing load of calculating the predicted value while maintaining its accuracy.
 In the decoding device according to the present invention, it is preferable that the receiving means receives, in addition to the encoded data of the image and the region information, predicted value calculation method information indicating the calculation method to be executed by the predicted value calculation means, and that the predicted value calculation means calculates the predicted value on the basis of the calculation method indicated by the received predicted value calculation method information.
 According to the above configuration, the receiving means receives, in addition to the encoded data of the image and the region information, predicted value calculation method information indicating the calculation method to be executed by the predicted value calculation means, and the predicted value calculation means calculates the predicted value on the basis of the calculation method indicated by that information. Therefore, even when the decoding device does not know in advance which calculation method was used on the encoding side, it can calculate the predicted value on the basis of the predicted value calculation method information, achieving the further effect of accurately decoding the received encoded data.
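 The method-signaling idea above can be sketched as a dispatch on the received method information. The identifiers "median", "mean", and "first" are illustrative stand-ins, not values defined by the patent:

```python
from statistics import mean, median

# Hypothetical mapping from received method info to a predictor function.
PREDICTORS = {
    "median": median,                  # robust to one outlying reference
    "mean":   mean,                    # averages all references
    "first":  lambda refs: refs[0],    # cheapest: copy a single reference
}

def predict(method_info, ref_values):
    """Apply the predictor named by the received method information."""
    return PREDICTORS[method_info](ref_values)
```

 As long as encoder and decoder interpret the same method identifiers, the decoder reproduces the encoder's predictions exactly, which is what makes the differential code decodable.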
 In the decoding device according to the present invention, it is preferable that, when the leading codeword of the encoded data is the representative value of the earliest encoding target region encoded by a fixed-length encoding method, the decoding means decodes the leading codeword of the encoded data by the fixed-length encoding method, and the pixel value setting means sets the pixel values of all pixels included in the region that is first in the order of numbers assigned by the number assigning means to the representative value obtained by decoding the leading codeword.
 According to the above configuration, when the leading codeword of the encoded data is the representative value of the earliest encoding target region encoded by a fixed-length encoding method, the decoding means decodes the leading codeword by the fixed-length encoding method, and the pixel value setting means sets the pixel values of all pixels included in the first region in the assigned number order to the representative value obtained by decoding that leading codeword.
 As described above, when the representative value of the earliest encoding target region is encoded by a fixed-length encoding method, the code amount of the encoded data is small. The decoding device therefore achieves the further effect of reducing the processing load of decoding the encoded data.
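 The reconstruction order implied above can be sketched as follows. The first region's representative value arrives as-is (fixed-length coded), and every later region is recovered by adding its decoded difference to a predicted value. For brevity this hypothetical sketch predicts each region from the previous region's value; the patent's predictor instead uses the reference pixels described earlier:

```python
def reconstruct(first_value, diffs):
    """Recover region representative values from the leading value plus
    the sequence of difference values, in assigned number order."""
    values = [first_value]
    for d in diffs:
        prediction = values[-1]   # stand-in for reference-pixel prediction
        values.append(prediction + d)
    return values

# reconstruct(100, [2, -1, 0]) -> [100, 102, 101, 101]
```

 Each region's pixels are then all set to that region's reconstructed representative value, as the pixel value setting means does.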
 In the decoding device according to the present invention, it is preferable that, when the image is a distance image paired with a texture image, the receiving means receives, as the region information, encoded data of the texture image obtained by encoding the texture image, and the dividing means divides the entire area of the distance image into a plurality of regions using a division pattern that divides the entire area of the texture image decoded from that encoded data into a plurality of regions such that, for each region, the difference between the average value calculated from the pixel values of the pixel group included in that region and the average value calculated from the pixel values of the pixel group included in a region adjacent to it is at most a predetermined threshold.
 According to the above configuration, when the image is a distance image paired with a texture image, the receiving means receives encoded data of the texture image as the region information, and the dividing means divides the entire area of the distance image into a plurality of regions using a division pattern that divides the entire area of the decoded texture image into regions such that, for each region, the difference between its pixel-value average and the pixel-value average of an adjacent region is at most a predetermined threshold.
 As described above, the distance value is substantially constant within each region of the distance image divided by the dividing means. Therefore, by using one representative value per region, the encoded data can be kept small in code amount while still allowing the distance image to be restored with high accuracy. The decoding device thus achieves the further effects of restoring the distance image from the encoded data with high accuracy and of reducing the processing load of decoding the encoded data.
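 A loose, one-scan-line illustration of threshold-driven segmentation (greatly simplified from the patent's 2-D division pattern, and not the patent's exact criterion): pixels are grouped into a run while they stay within a threshold of the run's mean, so each resulting region has a near-constant value that a single representative can stand for:

```python
def segment_row(pixels, threshold):
    """Group a row of pixel values into runs of near-constant value."""
    regions, current = [], [pixels[0]]
    for p in pixels[1:]:
        run_mean = sum(current) / len(current)
        if abs(p - run_mean) <= threshold:
            current.append(p)      # pixel fits the current region
        else:
            regions.append(current)
            current = [p]          # start a new region at the jump
    regions.append(current)
    return regions

# segment_row([10, 11, 10, 50, 51], threshold=5)
#   -> [[10, 11, 10], [50, 51]]
```

 For a depth map, such regions typically follow object boundaries, which is why a per-region representative value preserves the distance image well.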
 An encoding program that causes a computer to function as each means of the encoding device according to the present invention, a decoding program that causes a computer to function as each means of the decoding device according to the present invention, and computer-readable recording media on which the encoding program and the decoding program are respectively recorded are also included within the scope of the present invention.
 Furthermore, the scope of the present invention also includes a data structure of encoded data of an image, the data structure containing, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value that is the difference between the representative value of the pixel values of the pixels included in that region and a predicted value of that representative value, wherein the difference values are arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each predicted value is calculated, taking the regions as encoding target regions in that number order, on the basis of at least one representative value of a region containing a prediction reference pixel, the prediction reference pixels being pixels that precede the representative pixel in raster scan order and are adjacent to pixels of the encoding target region on the same scan line as the representative pixel, and the representative pixel being the earliest pixel in raster scan order among the pixels included in the encoding target region.
 (Programs, etc.)
 Finally, each block included in the moving image encoding devices 1 and 1A and the moving image decoding devices 2 and 2A may be implemented in hardware logic. Alternatively, the control of the moving image encoding devices 1 and 1A and the moving image decoding devices 2 and 2A may be realized in software using a CPU (Central Processing Unit), as follows.
 That is, it suffices that the program code (an executable program, an intermediate code program, or a source program) of a control program realizing the control of the moving image encoding devices 1 and 1A and the moving image decoding devices 2 and 2A is recorded in a computer-readable form, and that the moving image encoding devices 1 and 1A and the moving image decoding devices 2 and 2A (or a CPU or MPU) read out and execute the program code recorded on the supplied recording medium.
 The recording medium that supplies the program code to the moving image encoding devices 1 and 1A and the moving image decoding devices 2 and 2A may be, for example, a tape medium such as a magnetic tape or cassette tape; a disk medium including magnetic disks such as floppy (registered trademark) disks and hard disks and optical discs such as CD-ROM/MO/MD/DVD/CD-R; a card medium such as an IC card (including memory cards) or an optical card; or a semiconductor memory medium such as mask ROM/EPROM/EEPROM/flash ROM.
 The object of the present invention can also be achieved by configuring the moving image encoding devices 1 and 1A and the moving image decoding devices 2 and 2A to be connectable to a communication network. In this case, the program code is supplied to the moving image encoding devices 1 and 1A and the moving image decoding devices 2 and 2A via the communication network. This communication network is not limited to any particular kind or form as long as it can supply the program code to those devices; for example, the Internet, an intranet, an extranet, a LAN, ISDN, a VAN, a CATV communication network, a mobile communication network, or a satellite communication network may be used.
 The transmission medium constituting the communication network may likewise be any medium capable of carrying the program code and is not limited to a particular configuration or kind. Wired media such as IEEE 1394, USB (Universal Serial Bus), power line carrier, cable TV lines, telephone lines, and ADSL (Asymmetric Digital Subscriber Line) lines can be used, as can wireless media such as infrared (IrDA, remote control), Bluetooth (registered trademark), 802.11 wireless, HDR, mobile telephone networks, satellite links, and terrestrial digital broadcast networks.
 The present invention is not limited to the embodiments described above; various modifications are possible within the scope of the claims, and embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included within the technical scope of the present invention.
 The present invention can be suitably applied to content generation devices that generate 3D-compatible content, content playback devices that play back 3D-compatible content, and the like.
[Description of Reference Numerals]
1, 1A       Moving image encoding device
2, 2A       Moving image decoding device
11          Image encoding unit
12          Image decoding unit (decoding means)
20, 20A     Distance image encoding unit
21, 21A     Image division processing unit
21', 21A'   Image division processing unit (dividing means)
22, 22A     Distance image division processing unit (dividing means)
23, 23A     Distance value correction unit (representative value determining means)
24, 24A     Number assigning unit (number assigning means)
24', 24A'   Number assigning unit (number assigning means, assigning means)
25, 25A     Predictive encoding unit (predicted value calculating means, difference value calculating means, encoding means)
28, 28A     Packaging unit (transmission means)
31, 31A     Unpackaging unit (receiving means)
32, 32A     Predictive decoding unit (predicted value calculating means, pixel value setting means)

Claims (29)

  1.  An encoding device for encoding an image, comprising:
     dividing means for dividing the entire area of the image into a plurality of regions;
     representative value determining means for determining, for each of the plurality of regions divided by the dividing means, a representative value from the pixel values of the pixels included in that region;
     number assigning means for assigning numbers to the plurality of regions in raster scan order;
     predicted value calculating means for taking the regions as encoding target regions in the order of the numbers assigned by the number assigning means, taking as a representative pixel the earliest pixel in raster scan order among the pixels included in the encoding target region, taking as prediction reference pixels the pixels that precede the representative pixel in raster scan order and are adjacent to pixels of the encoding target region on the same scan line as the representative pixel, and calculating a predicted value of the encoding target region on the basis of at least one representative value of a region containing a prediction reference pixel;
     difference value calculating means for calculating, for each encoding target region, a difference value by subtracting the predicted value calculated by the predicted value calculating means from the representative value determined by the representative value determining means; and
     encoding means for encoding the difference values calculated by the difference value calculating means, arranged in the order assigned by the number assigning means, to generate encoded data of the image.
  2.  The encoding device according to claim 1, further comprising transmission means for associating the encoded data of the image generated by the encoding means with region information defining the plurality of regions and transmitting them to an external destination.
  3.  The encoding device according to claim 1 or 2, wherein the encoding means encodes the difference values by a variable-length encoding method in which the codeword becomes shorter as the value to be encoded approaches 0.
  4.  The encoding device according to any one of claims 1 to 3, wherein the predicted value calculating means uses as the prediction reference pixels a pixel group including: the pixel immediately preceding the representative pixel in raster scan order; the pixels on the scan line immediately preceding the representative pixel's scan line in raster scan order that are adjacent to the pixels of the encoding target region on the same scan line as the representative pixel; and the pixel on the immediately preceding scan line that is adjacent to the pixel immediately following, in raster scan order, the last pixel of the encoding target region on the same scan line as the representative pixel.
  5.  The encoding device according to any one of claims 1 to 3, wherein the predicted value calculating means uses as the prediction reference pixels three pixels: the pixel immediately preceding the representative pixel in raster scan order; any one of the pixels on the immediately preceding scan line adjacent to the pixels of the encoding target region on the same scan line as the representative pixel; and the pixel on the immediately preceding scan line adjacent to the pixel immediately following, in raster scan order, the last pixel of the encoding target region on that scan line.
  6.  The encoding device according to any one of claims 1 to 5, wherein the predicted value calculating means uses the median of the representative values of the regions containing the prediction reference pixels as the predicted value of the encoding target region.
  7.  The encoding device according to any one of claims 1 to 5, wherein the predicted value calculating means uses the average of the representative values of the regions containing the prediction reference pixels as the predicted value of the encoding target region.
  8.  The encoding device according to any one of claims 1 to 5, wherein the predicted value calculating means uses any one of the representative values of the regions containing the prediction reference pixels as the predicted value of the encoding target region.
  9.  The encoding device according to claim 2, wherein the transmission means further associates, in addition to the encoded data of the image and the region information, predicted value calculation method information indicating the calculation method executed by the predicted value calculating means, and transmits them to the external destination.
  10.  The encoding device according to claim 3, wherein, instead of encoding by the variable-length encoding method the difference value of the encoding target region assigned the earliest number by the number assigning means, the encoding means encodes the representative value of that earliest encoding target region by a fixed-length encoding method.
  11.  The encoding device according to claim 1, wherein, when the image is a distance image paired with a texture image, the dividing means divides the entire area of the distance image into a plurality of regions using the same division pattern as a division pattern that divides the entire area of the texture image into a plurality of regions such that, for each region, the difference between the average value calculated from the pixel values of the pixel group included in that region and the average value calculated from the pixel values of the pixel group included in a region adjacent to it is at most a predetermined threshold.
  12.  The encoding device according to claim 11, further comprising transmission means for associating the encoded data of the image generated by the encoding means with encoded data of the texture image obtained by encoding the texture image, and transmitting them to an external destination.
  13.  A decoding device for decoding encoded data of an image, the encoded data containing, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value that is the difference between the representative value of the pixel values of the pixels included in that region and a predicted value of that representative value, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each predicted value having been calculated, taking the regions as encoding target regions in that number order, on the basis of at least one representative value of a region containing a prediction reference pixel, the prediction reference pixels being pixels that precede the representative pixel in raster scan order and are adjacent to pixels of the encoding target region on the same scan line as the representative pixel, and the representative pixel being the earliest pixel in raster scan order among the pixels included in the encoding target region, the decoding device comprising:
     dividing means for dividing the entire area of the image into a plurality of regions on the basis of region information defining the plurality of regions;
     decoding means for decoding the encoded data to generate the difference values arranged in order;
     number assigning means for assigning numbers in raster scan order to the plurality of regions divided by the dividing means;
     assigning means for assigning the difference values, in order from the head, to the plurality of regions in the order of the numbers assigned by the number assigning means;
     predicted value calculating means for taking the regions as decoding target regions in the order of the numbers assigned by the number assigning means, taking as a representative pixel the earliest pixel in raster scan order among the pixels included in the decoding target region, taking as prediction reference pixels the pixels that precede the representative pixel in raster scan order and are adjacent to pixels of the decoding target region on the same scan line as the representative pixel, and calculating a predicted value of the decoding target region on the basis of the pixel value of at least one of the prediction reference pixels; and
     pixel value setting means for calculating, for each decoding target region, the pixel value of the decoding target region by adding the difference value assigned by the assigning means to the predicted value calculated by the predicted value calculating means, and setting the pixel values of all pixels included in the decoding target region to the calculated pixel value,
     wherein the predicted value calculating means and the pixel value setting means repeatedly execute the above processing for each decoding target region in the number order, thereby restoring the pixel values of the image.
  14.  The decoding device according to claim 13, further comprising receiving means for receiving the encoded data and the region information from outside.
  15.  The decoding device according to claim 14, wherein the receiving means receives the encoded data encoded by a variable-length coding method in which the closer a value to be encoded is to 0, the shorter its codeword.
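Claim 15 only requires that codewords shrink as the encoded value approaches 0; it does not name a specific code. One common construction with exactly this property (an illustrative assumption, not the claimed method itself) is a zigzag signed-to-unsigned mapping followed by order-0 exponential-Golomb coding:

```python
def zigzag(v):
    """Map a signed value to an unsigned index: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4.
    Values closer to 0 receive smaller indices and hence shorter codewords."""
    return (v << 1) if v >= 0 else (-v << 1) - 1

def exp_golomb(u):
    """Order-0 exponential-Golomb codeword for an unsigned integer:
    a unary run of zeros equal to len(bin(u+1)) - 1, then bin(u+1)."""
    b = bin(u + 1)[2:]
    return "0" * (len(b) - 1) + b

def encode_diff(v):
    """Variable-length codeword for a signed difference value."""
    return exp_golomb(zigzag(v))
```

For example, `encode_diff(0)` is the single bit `"1"`, while larger magnitudes get progressively longer codewords, matching the property the claim requires.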
  16.  The decoding device according to any one of claims 13 to 15, wherein the predicted value calculating means takes as the prediction reference pixels a pixel group including: the pixel immediately preceding the representative pixel in raster scan order; the pixels on the scan line immediately preceding, in raster scan order, that of the representative pixel which are adjacent to pixels that are included in the decoding target region and on the same scan line as the representative pixel; and the pixel on the scan line immediately preceding that of the representative pixel which is adjacent to the pixel immediately following, in raster scan order, the last pixel that is included in the decoding target region and on the same scan line as the representative pixel.
  17.  The decoding device according to any one of claims 13 to 15, wherein the predicted value calculating means takes as the prediction reference pixels three pixels: the pixel immediately preceding the representative pixel in raster scan order; any one of the pixels on the scan line immediately preceding that of the representative pixel which are adjacent to pixels that are included in the decoding target region and on the same scan line as the representative pixel; and the pixel on the scan line immediately preceding that of the representative pixel which is adjacent to the pixel immediately following, in raster scan order, the last pixel that is included in the decoding target region and on the same scan line as the representative pixel.
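One reading of the reference-pixel geometry in claims 16 and 17 can be sketched as follows. The function names are invented, and the exact extent of "adjacent" (whether it includes diagonal neighbours) is an assumption; the sketch takes a region whose representative pixel is (y, x0) and whose run of pixels on that scan line ends at (y, x1):

```python
def reference_pixels_full(y, x0, x1, width):
    """Claim-16 style set: the left neighbour of the representative pixel,
    plus the previous-scanline pixels adjacent (including diagonally) to the
    region's run x0..x1, which also covers the pixel above the position just
    after the run's last pixel.  Positions outside the image are omitted."""
    refs = []
    if x0 > 0:
        refs.append((y, x0 - 1))                       # just before the representative
    if y > 0:
        for x in range(max(x0 - 1, 0), min(x1 + 1, width - 1) + 1):
            refs.append((y - 1, x))                    # previous scanline
    return refs

def reference_pixels_three(y, x0, x1, width):
    """Claim-17 style restriction to three pixels: left of the representative,
    one pixel directly above it (one admissible choice among several), and the
    pixel above the position just after the run end."""
    refs = []
    if x0 > 0:
        refs.append((y, x0 - 1))
    if y > 0:
        refs.append((y - 1, x0))
        if x1 + 1 < width:
            refs.append((y - 1, x1 + 1))
    return refs
```

For the first region of the image (y == 0, x0 == 0) both sets are empty, which is consistent with claim 22's special handling of the first codeword.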
  18.  The decoding device according to any one of claims 13 to 17, wherein the predicted value calculating means uses the median of the pixel values of the prediction reference pixels as the predicted value of the decoding target region.
  19.  The decoding device according to any one of claims 13 to 17, wherein the predicted value calculating means uses the average of the pixel values of the prediction reference pixels as the predicted value of the decoding target region.
  20.  The decoding device according to any one of claims 13 to 17, wherein the predicted value calculating means uses the pixel value of any one pixel included in the prediction reference pixels as the predicted value of the decoding target region.
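The three predictors of claims 18 to 20 can be sketched in one helper. Keeping the result an integer (via `median_low` and rounding) is an assumption the claims leave open:

```python
import statistics

def predict(ref_values, mode="median"):
    """Predicted value of a target region from reference-pixel values.
    'median' corresponds to claim 18, 'mean' to claim 19, and 'single'
    (any one reference pixel; here the first) to claim 20."""
    if mode == "median":
        return statistics.median_low(ref_values)   # lower median keeps integers
    if mode == "mean":
        return round(sum(ref_values) / len(ref_values))
    return ref_values[0]
```

The median predictor is robust when one reference pixel lies across an object boundary and therefore differs sharply from the others, which is a common motivation for offering it alongside the mean.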
  21.  The decoding device according to claim 14, wherein
     the receiving means receives, in addition to the encoded data of the image and the region information, predicted value calculation method information indicating the calculation method of the predicted value to be executed by the predicted value calculating means, and
     the predicted value calculating means calculates the predicted value on the basis of the calculation method indicated by the predicted value calculation method information received by the receiving means.
  22.  The decoding device according to claim 15, wherein, when the first codeword in the encoded data is the representative value of the earliest encoding target region encoded by a fixed-length coding method,
     the decoding means decodes the first codeword of the encoded data by the fixed-length coding method, and
     the pixel value setting means sets the pixel values of all pixels included in the region that is first in the order of the numbers assigned by the numbering means to the representative value obtained by the decoding means decoding the first codeword.
  23.  The decoding device according to claim 14, wherein, when the image is a distance image paired with a texture image,
     the receiving means receives, as the region information, encoded data of the texture image obtained by encoding the texture image, and
     the dividing means divides the entire area of the distance image into a plurality of regions with a division pattern that divides the entire area of the texture image decoded from the encoded data of the texture image into a plurality of regions such that, for each region, the difference between the average value calculated from the pixel values of the pixel group included in that region and the average value calculated from the pixel values of the pixel group included in a region adjacent to that region is at most a predetermined threshold.
  24.  An encoding method for an encoding device that encodes an image, the method comprising, in the encoding device:
     a dividing step of dividing the entire area of the image into a plurality of regions;
     a representative value determining step of determining, for each of the plurality of regions divided in the dividing step, a representative value from the pixel values of the pixels included in the region;
     a numbering step of assigning numbers, in raster scan order, to the plurality of regions;
     a predicted value calculating step of taking the regions as encoding target regions in the order of the numbers assigned in the numbering step, taking as the representative pixel the earliest pixel, in raster scan order, among the pixels included in the encoding target region, taking as prediction reference pixels the pixels that are adjacent to pixels included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculating a predicted value of the encoding target region on the basis of at least one representative value of a region containing a prediction reference pixel;
     a difference value calculating step of calculating, for each encoding target region, a difference value by subtracting the predicted value calculated in the predicted value calculating step from the representative value determined in the representative value determining step; and
     an encoding step of encoding the difference values calculated in the difference value calculating step, arranged in the order assigned in the numbering step, to generate encoded data of the image.
  25.  A decoding method for a decoding device that decodes encoded data of an image, the encoded data containing, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value that is the difference between a representative value of the pixel values of the pixels included in the region and a predicted value of the representative value of the region, wherein the difference values are arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each predicted value has been calculated by taking the regions as encoding target regions in the order of the numbers, taking as the representative pixel the earliest pixel, in raster scan order, among the pixels included in the encoding target region, taking as prediction reference pixels the pixels that are adjacent to pixels included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculating the predicted value on the basis of at least one representative value of a region containing a prediction reference pixel, the method comprising, in the decoding device:
     a dividing step of dividing the entire area of the image into the plurality of regions on the basis of region information defining the plurality of regions;
     a decoding step of decoding the encoded data to generate the difference values arranged in order;
     a numbering step of assigning numbers, in raster scan order, to the plurality of regions divided in the dividing step;
     an assigning step of assigning the difference values, in order from the top, to the plurality of regions in the order of the numbers assigned in the numbering step;
     a predicted value calculating step of taking the regions as decoding target regions in the order of the numbers assigned in the numbering step, taking as the representative pixel the earliest pixel, in raster scan order, among the pixels included in the decoding target region, taking as prediction reference pixels the pixels that are adjacent to pixels included in the decoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculating a predicted value of the decoding target region on the basis of the pixel value of at least one of the prediction reference pixels; and
     a pixel value setting step of calculating, for each decoding target region, a pixel value of the decoding target region by adding the difference value assigned in the assigning step to the predicted value calculated in the predicted value calculating step, and setting the pixel values of all pixels included in the decoding target region to the calculated pixel value,
     wherein the predicted value calculating step and the pixel value setting step are repeated for each decoding target region in the order of the numbers, thereby restoring the pixel values of the image.
  26.  A program for causing a computer to operate as the encoding device according to any one of claims 1 to 12, the program causing the computer to function as each of the means described above.
  27.  A program for causing a computer to operate as the decoding device according to any one of claims 13 to 23, the program causing the computer to function as each of the means described above.
  28.  A computer-readable recording medium on which at least one of the program according to claim 26 and the program according to claim 27 is recorded.
  29.  A data structure of encoded data of an image, wherein
     the encoded data contains, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value that is the difference between a representative value of the pixel values of the pixels included in the region and a predicted value of the representative value of the region,
     the difference values are arranged in the order of numbers assigned to the plurality of regions in raster scan order, and
     each predicted value has been calculated by taking the regions as encoding target regions in the order of the numbers, taking as the representative pixel the earliest pixel, in raster scan order, among the pixels included in the encoding target region, taking as prediction reference pixels the pixels that are adjacent to pixels included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculating the predicted value on the basis of at least one representative value of a region containing a prediction reference pixel.
PCT/JP2011/073134 2010-11-04 2011-10-06 Encoder apparatus, decoder apparatus, encoding method, decoding method, program, recording medium, and data structure of encoded data WO2012060179A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2010247423 2010-11-04
JP2010-247423 2010-11-04

Publications (1)

Publication Number Publication Date
WO2012060179A1 (en) 2012-05-10

Family

ID=46024299

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2011/073134 WO2012060179A1 (en) 2010-11-04 2011-10-06 Encoder apparatus, decoder apparatus, encoding method, decoding method, program, recording medium, and data structure of encoded data

Country Status (1)

Country Link
WO (1) WO2012060179A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9241900B2 (en) 2010-11-11 2016-01-26 Novaliq Gmbh Liquid pharmaceutical composition for the treatment of a posterior eye disease

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09289638A (en) * 1996-04-23 1997-11-04 Nec Corp Three-dimensional image encoding/decoding system
WO2004071102A1 (en) * 2003-01-20 2004-08-19 Sanyo Electric Co,. Ltd. Three-dimensional video providing method and three-dimensional video display device
JP2008193530A (en) * 2007-02-06 2008-08-21 Canon Inc Image recorder, image recording method and program



Similar Documents

Publication Publication Date Title
JP6788699B2 (en) Effective partition coding with high partitioning degrees of freedom
JP6814783B2 (en) Valid predictions using partition coding
KR102390298B1 (en) Image processing apparatus and method
CN111418206B (en) Video encoding method based on intra prediction using MPM list and apparatus therefor
JP5960693B2 (en) Generation of high dynamic range image from low dynamic range image
CN104054343B (en) Picture decoding apparatus, picture coding device
TW201143458A (en) Dynamic image encoding device and dynamic image decoding device
CN112235584B (en) Method and device for image division and method and device for encoding and decoding video sequence image
KR20150020175A (en) Method and apparatus for processing video signal
US20170041623A1 (en) Method and Apparatus for Intra Coding for a Block in a Coding System
JP6212890B2 (en) Moving picture coding apparatus, moving picture coding method, and moving picture coding program
KR20220019241A (en) Video or image coding based on adaptive loop filter
US8189673B2 (en) Method of and apparatus for predicting DC coefficient of video data unit
JP7180679B2 (en) Video encoding device, video encoding method, video encoding program, video decoding device, video decoding method, and video decoding program
WO2012060179A1 (en) Encoder apparatus, decoder apparatus, encoding method, decoding method, program, recording medium, and data structure of encoded data
KR20150113713A (en) Method and device for creating inter-view merge candidates
CN111434116B (en) Image decoding method and apparatus based on affine motion prediction using constructed affine MVP candidates in image coding system
WO2012060168A1 (en) Encoder apparatus, decoder apparatus, encoding method, decoding method, program, recording medium, and encoded data
WO2012060172A1 (en) Movie image encoding device, movie image decoding device, movie image transmitting system, method of controlling movie image encoding device, method of controlling movie image decoding device, movie image encoding device controlling program, movie image decoding device controlling program, and recording medium
JP4860763B2 (en) Image encoding apparatus, image encoding apparatus control method, control program, and recording medium
WO2012128209A1 (en) Image encoding device, image decoding device, program, and encoded data
WO2012060171A1 (en) Movie image encoding device, movie image decoding device, movie image transmitting system, method of controlling movie image encoding device, method of controlling movie image decoding device, movie image encoding device controlling program, movie image decoding device controlling program, and recording medium
CN117337565A (en) Intra-frame prediction method and device based on multiple DIMD modes
JP2022120186A (en) Video decoding method
CN117356091A (en) Intra-frame prediction method and apparatus using auxiliary MPM list

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 11837831; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 11837831; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)