WO2012060179A1 - Encoding device, decoding device, encoding method, decoding method, program, recording medium, and data structure of encoded data - Google Patents
- Publication number
- WO2012060179A1 (PCT/JP2011/073134)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pixel
- value
- representative
- image
- encoding
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
- H04N13/239—Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N2013/0074—Stereoscopic image analysis
- H04N2013/0081—Depth or disparity estimation from stereoscopic image signals
Definitions
- the present invention mainly relates to an encoding device that encodes a distance image (Depth Image) and a decoding device that decodes the distance image encoded by such an encoding device.
- a texture image is a general two-dimensional image that represents the subject space by the color of each subject and background, whereas a distance image represents the subject space by the distance from the viewpoint to each subject and background.
- distance image is an image expressing a distance value (depth value) from the viewpoint to a corresponding point in the object space for each pixel.
- This distance image can be acquired by a distance measuring device such as a depth camera installed in the vicinity of the camera that records the texture image.
- a distance image can be acquired by analyzing a plurality of texture images obtained by photographing with a multi-viewpoint camera, and many analysis methods have been proposed.
- as a standard for distance images, MPEG-C Part 3, defined by the Moving Picture Experts Group (MPEG), a working group of ISO/IEC, expresses distance values in 256 levels (i.e., as 8-bit luminance values). That is, the standard distance image is an 8-bit grayscale image.
- a subject located in front is expressed as white and a subject located in the back is expressed in black.
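The 8-bit representation described above can be illustrated with a minimal sketch that maps a distance to one of 256 levels so that nearer subjects become whiter. The linear mapping and the names `depth_to_level`, `z_near`, and `z_far` are assumptions made for illustration only; the text does not state the quantization formula.

```python
def depth_to_level(z, z_near, z_far):
    """Map a distance z to an 8-bit level (hypothetical linear mapping).

    Nearer points map to larger (whiter) values and farther points to
    smaller (blacker) values, matching the convention described in the text.
    """
    if not z_near < z_far:
        raise ValueError("z_near must be smaller than z_far")
    z = min(max(z, z_near), z_far)        # clamp to the representable range
    t = (z_far - z) / (z_far - z_near)    # 1.0 at the nearest point, 0.0 at the farthest
    return int(round(255 * t))


nearest_level = depth_to_level(1.0, z_near=1.0, z_far=5.0)
farthest_level = depth_to_level(5.0, z_near=1.0, z_far=5.0)
```

With this mapping, every level falls in the range 0 to 255, so the result fits in a single 8-bit grayscale sample.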
- the distance from the viewpoint of each pixel constituting the subject image drawn in the texture image is known from the distance image, so the subject can be restored as a three-dimensional shape whose depth is expressed in at most 256 levels. Furthermore, by geometrically projecting the three-dimensional shape onto a two-dimensional plane, the original texture image can be converted into a texture image of the subject space as photographed from another angle within a certain range of the original angle. In other words, since one set of a texture image and a distance image can restore the three-dimensional shape as viewed from an arbitrary angle within a certain range, a free-viewpoint image of the three-dimensional shape can be represented with a small amount of data by using multiple sets of texture images and distance images.
- Non-Patent Document 1 discloses a technique capable of compressing and encoding video by efficiently eliminating temporal or spatial redundancy in the video.
- a texture video (a video having a texture image as each frame)
- a distance video (a video having a distance image as each frame)
- the present inventor has found that the following two characteristics hold between the texture image and the distance image. (1) The edge portions of the subject and background in the distance image coincide with the edge portions of the subject and background in the texture image. (2) In the distance image, the distance depth value is relatively flat inside the edges of the subject and background (in the region surrounded by the edges).
- the characteristic (1) will be described. As long as the texture image includes information that allows the subject to be distinguished from the background as an image, the boundary (edge) between the subject and the background is common to the texture video and the distance video. That is, the edge information indicating the edge portion of the subject is one major element indicating the correlation between the texture image and the distance image.
- the characteristic (2) will be described. The distance image tends to be an image having lower spatial frequency components than the texture image. For example, even if a person wearing clothes with a fancy pattern is drawn in the texture image, the distance depth value of the clothes portion tends to be constant in the distance image. In other words, in the distance image there is a strong tendency for a single distance depth value to appear over a wider area than in the texture image.
- the distance depth value is substantially constant within that range (within the segmented pixel group).
- when the entire region of the texture image is divided into a plurality of regions so that the difference between the maximum pixel value and the minimum pixel value of the pixel group included in each region is equal to or less than a predetermined threshold, and the distance image is divided with the same pattern as the texture image division pattern, the distance depth value becomes substantially constant in each region of the distance image.
- a pixel group (each region formed by dividing the entire region of the texture image and the distance image) divided so that the distance depth value becomes substantially constant is referred to as a segment.
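One way to obtain segments that satisfy the max-min threshold condition described above is greedy region growing, sketched below. This is an illustrative assumption, not the division procedure of the invention: the function name `segment_image`, the use of single-channel (grayscale) values, and the 4-neighbour growth rule are all hypothetical.

```python
from collections import deque


def segment_image(img, threshold):
    """Divide img (a list of rows of grayscale values) into segments.

    Each segment is grown from its first pixel in raster scan order, and a
    neighbouring pixel is added only if the segment's (max - min) pixel value
    stays at or below `threshold`. Returns a label map of the same shape.
    """
    h, w = len(img), len(img[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if labels[y][x] != -1:
                continue                      # pixel already belongs to a segment
            lo = hi = img[y][x]
            labels[y][x] = next_label
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] == -1:
                        v = img[ny][nx]
                        if max(hi, v) - min(lo, v) <= threshold:
                            lo, hi = min(lo, v), max(hi, v)
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            next_label += 1
    return labels


texture = [[10, 11, 50],
           [10, 12, 51]]
label_map = segment_image(texture, threshold=5)
```

Applying the same label map to the distance image at the same time instant yields the regions in which the distance depth value is expected to be nearly constant.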
- the distance image can be handled in units of segments rather than in units of pixels. Further, since the distance image is divided based on the corresponding texture image (the texture image at the same time as the distance image), the distance values of adjacent segments in the distance image are increasingly likely to be the same or close values. Therefore, further information compression is possible by using this characteristic to eliminate the spatial redundancy between segments in the distance image.
- in Non-Patent Document 1, a texture image is divided into blocks, and spatial redundancy between blocks is eliminated by intra prediction encoding. Specifically, first, the pixels included in the texture image are grouped into blocks of 4 × 4 pixels, 8 × 8 pixels, 16 × 16 pixels, or the like. Next, the blocks are encoded in order from the upper left block to the lower right block of the image. In the encoding of each block, the values of the pixels included in the encoding target block are predicted with reference to the pixels or pixel columns that adjoin the encoding target block and lie inside the blocks adjacent to its left, top, and upper right, which are encoded prior to the encoding target block.
- the difference obtained by subtracting the predicted value from the actual value of each pixel included in the encoding target block is orthogonally transformed and encoded. If the prediction accuracy is good, the difference can be expected to be smaller than the actual value itself, and as a result the number of bits required for encoding can be reduced.
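The benefit of encoding a prediction difference instead of the actual value can be seen in a small sketch. The function names and sample values are hypothetical, and the orthogonal transform and entropy coding stages are deliberately omitted.

```python
def prediction_residual(actual, predicted):
    """Subtract the predicted value from each actual pixel value."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [a - p for a, p in zip(actual, predicted)]


def reconstruct(residual, predicted):
    """Decoder side: add the same predicted values back to recover the pixels."""
    return [r + p for r, p in zip(residual, predicted)]


# Hypothetical row of four pixels, all predicted from an already-encoded
# neighbouring pixel whose value is 120.
actual = [120, 121, 119, 122]
predicted = [120, 120, 120, 120]
residual = prediction_residual(actual, predicted)
```

Because the decoder can form the same prediction from previously decoded pixels, only the small residual values need to be transmitted.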
- the technique of Non-Patent Document 1 is optimized for texture images, and there is a problem that it cannot be applied as it is to a distance image divided into the above-described segment units.
- in Non-Patent Document 1, a texture image is divided into blocks each having a square shape.
- in the distance image dividing method proposed by the present inventors, the distance image is divided into segments of arbitrary shapes. This is because, with this division method, the smaller the number of segments, the better the coding efficiency. Therefore, it is desirable that each segment can take a flexible shape without any restriction.
- when the unit for dividing the image is a square, the blocks adjacent to the left, top, and upper right of the encoding target block can be uniquely determined. Furthermore, since it is guaranteed that a block including pixels that the encoding target block refers to for prediction is encoded prior to the encoding target block, the decoding side can reproduce the predicted value.
- when the image is divided into segments of arbitrary shapes, the segments adjacent to the encoding target segment cannot be uniquely determined. Further, it cannot be determined which of the segments adjacent to the encoding target segment has been encoded in advance. Therefore, even if the technique described in Non-Patent Document 1 is applied as it is, the spatial redundancy of a distance image divided into segment units cannot be removed.
- the present invention has been made in view of the above problems, and a main object thereof is to realize an encoding apparatus that performs encoding by eliminating spatial redundancy between the segments of an image divided into segments of arbitrary shapes. Another object of the present invention is to realize a decoding device that decodes a distance image supplied from such an encoding device.
- an encoding apparatus according to the present invention is an encoding apparatus that encodes an image, comprising: dividing means for dividing the entire area of the image into a plurality of regions; representative value determining means for determining, for each region divided by the dividing means, a representative value from the pixel values of the pixels included in the region; number assigning means for assigning numbers to the plurality of regions in raster scan order; predicted value calculation means that sets each region as the encoding target region in the order of the numbers assigned by the number assigning means, sets the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel, sets as predicted reference pixels the pixels that are adjacent to a pixel included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculates a predicted value of the encoding target region based on at least one of the representative values of the regions having the predicted reference pixels; difference value calculation means for calculating a difference value by subtracting the calculated predicted value from the representative value of each encoding target region; and encoding means for arranging and encoding the difference values calculated by the difference value calculation means in the order of the numbers assigned by the number assigning means, thereby generating encoded data of the image.
- the number assigning unit assigns numbers in the raster scan order to the plurality of regions into which the dividing unit has divided the image.
- the prediction value calculation means sets each area as the encoding target area in the order of the numbers given by the number assignment means, sets the first pixel in the raster scan order among the pixels included in the encoding target area as the representative pixel, and sets as a predicted reference pixel each pixel that is adjacent to a pixel included in the encoding target region on the same scan line as the representative pixel and whose raster scan order is earlier than that of the representative pixel.
- the predicted value calculation means calculates the predicted value of the encoding target region based on at least one of the representative values of the region having the predicted reference pixel.
- the difference value calculation means subtracts the prediction value calculated by the prediction value calculation means from the representative value determined by the representative value determination means for each encoding target region to calculate a difference value. Then, the encoding unit arranges and encodes the difference values calculated by the difference value calculation unit in the order given by the number assigning unit, and generates encoded data of the image.
- the order of the areas can be uniquely specified.
- the representative pixel used when calculating the predicted value of the representative value of each region, and the predicted reference pixels based on that representative pixel, can be uniquely specified. Therefore, the predicted value of the encoding target area, determined from the representative values of the areas adjacent to the encoding target area, can be uniquely calculated.
- the prediction reference pixel for a certain area needs to be the same at the time of encoding and at the time of decoding. Therefore, a prediction reference pixel for a certain area needs to be decoded before the certain area, that is, needs to be encoded first.
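How the numbering, the representative pixel, and the predicted reference pixels can all be determined uniquely from the division pattern alone is sketched below. The label map format, the function names, and the interpretation of "adjacent" as 4-neighbour adjacency (left and above) are assumptions made for illustration.

```python
def number_segments(labels):
    """Assign numbers to segments in raster scan order of first appearance."""
    numbers = {}
    for row in labels:
        for lab in row:
            if lab not in numbers:
                numbers[lab] = len(numbers)
    return numbers


def representative_pixel(labels, lab):
    """Return (row, column) of the segment's first pixel in raster scan order."""
    for y, row in enumerate(labels):
        for x, value in enumerate(row):
            if value == lab:
                return y, x
    raise ValueError(f"segment {lab!r} not found")


def reference_pixels(labels, lab):
    """Pixels adjacent to the segment's pixels on the representative pixel's
    scan line that precede the representative pixel in raster scan order."""
    y, x = representative_pixel(labels, lab)
    refs = []
    if x > 0:
        refs.append((y, x - 1))               # left neighbour of the representative pixel
    if y > 0:
        # pixels directly above the segment's pixels on this scan line
        refs.extend((y - 1, cx) for cx, value in enumerate(labels[y]) if value == lab)
    return refs


# Hypothetical 3 x 3 division pattern with three arbitrarily shaped segments.
labels = [["a", "a", "b"],
          ["a", "a", "b"],
          ["c", "c", "b"]]
numbers = number_segments(labels)
```

Every reference pixel precedes the representative pixel in raster scan order, so it always belongs to a segment with a smaller number, which is what allows the decoder to have that segment available first.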
- an encoding method according to the present invention is an encoding method of an encoding device that encodes an image, comprising: a dividing step in which the encoding device divides the entire region of the image into a plurality of regions; a representative value determining step of determining, for each region, a representative value from the pixel values of the pixels included in the region; a numbering step of assigning numbers to the plurality of regions in raster scan order; a prediction value calculating step of setting each region as the encoding target region in the order of the numbers given in the numbering step, setting the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel, setting as predicted reference pixels the pixels that are adjacent to a pixel included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculating a predicted value of the encoding target region based on at least one of the representative values of the regions having the predicted reference pixels; a difference value calculation step of calculating a difference value by subtracting the predicted value calculated in the prediction value calculating step from the representative value determined in the representative value determining step; and an encoding step of arranging and encoding the difference values calculated in the difference value calculation step in the order of the numbers given in the numbering step, thereby generating encoded data of the image.
- the encoding method according to the present invention has the same effects as the encoding apparatus according to the present invention.
- a decoding device according to the present invention decodes encoded data of an image, the encoded data including, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value that is the difference between a representative value of the pixel values of the pixels included in the region and a predicted value of that representative value, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each predicted value having been calculated by setting the regions as the encoding target region in the order of the numbers and setting the first pixel in raster scan order among the pixels included in the encoding target region as the representative pixel. The decoding device comprises: dividing means for dividing the entire area of the image into a plurality of regions based on region information defining the plurality of regions; decoding means for decoding the encoded data and generating difference values arranged in order; number assigning means for assigning numbers in raster scan order to the plurality of regions divided by the dividing means; allocating means for assigning the difference values, in order from the first, to the plurality of regions in the order of the numbers assigned by the number assigning means; predicted value calculation means that sets each region as the decoding target region in the order of the numbers, sets the first pixel in raster scan order among the pixels included in the decoding target region as the representative pixel, sets as predicted reference pixels the pixels that are adjacent to a pixel included in the decoding target region on the same scan line as the representative pixel and that precede the representative pixel in raster scan order, and calculates a predicted value of the decoding target region based on the pixel value of at least one of the predicted reference pixels; and pixel value setting means that, for each decoding target region, calculates the pixel value of the decoding target region by adding the difference value assigned by the allocating means to the predicted value calculated by the predicted value calculation means, and sets the pixel values of all the pixels included in the decoding target region to the calculated pixel value. The predicted value calculation means and the pixel value setting means repeatedly execute this processing for each decoding target region in the order of the numbers, thereby restoring the pixel values of the image.
- the decoding unit decodes the encoded data and generates difference values arranged in order.
- the allocating means assigns the difference values, in order from the first, to the plurality of regions into which the dividing means has divided the image based on the region information defining the plurality of regions, following the order of the numbers assigned by the number assigning means in raster scan order.
- the prediction value calculation means sets each area as the decoding target area in the order of the numbers given by the number assigning means, sets the first pixel in the raster scan order among the pixels included in the decoding target area as the representative pixel, and sets as a predicted reference pixel each pixel that is adjacent to a pixel included in the decoding target region on the same scan line as the representative pixel and whose raster scan order is earlier than that of the representative pixel.
- the predicted value calculation means calculates a predicted value of the decoding target region based on the pixel value of at least one pixel among the predicted reference pixels.
- the pixel value setting means calculates the pixel value of the decoding target area by adding the difference value assigned by the assigning means to the prediction value calculated by the prediction value calculating means for each decoding target area, The pixel values of all the pixels included in the decoding target area are set to the calculated pixel values.
- the prediction value calculation means and the pixel value setting means repeatedly execute the above processing for each decoding target area in the order of the numbers given by the number assignment means, and restore the pixel values of the image.
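The decoding procedure described above can be sketched as a single raster scan over the division pattern. The prediction rule used here (the restored value of the left neighbour of the representative pixel, otherwise the pixel directly above it, otherwise 0) is an assumption chosen for illustration, as are the function name `decode_segments` and the label map format; the text only requires that at least one predicted reference pixel be used.

```python
def decode_segments(labels, diffs):
    """Restore an image from a division pattern and ordered difference values.

    labels: list of rows of segment labels (the region information).
    diffs:  difference values arranged in segment number order.
    """
    h, w = len(labels), len(labels[0])
    image = [[None] * w for _ in range(h)]
    decoded = set()
    remaining = iter(diffs)
    for y in range(h):
        for x in range(w):
            lab = labels[y][x]
            if lab in decoded:
                continue
            # (y, x) is the representative pixel of the next segment in number order.
            if x > 0:
                pred = image[y][x - 1]        # left neighbour, already restored
            elif y > 0:
                pred = image[y - 1][x]        # pixel directly above, already restored
            else:
                pred = 0                      # first segment: nothing to refer to
            value = pred + next(remaining)
            # Set every pixel of the decoding target segment to the restored value.
            for yy in range(h):
                for xx in range(w):
                    if labels[yy][xx] == lab:
                        image[yy][xx] = value
            decoded.add(lab)
    return image


labels = [["a", "a", "b"],
          ["a", "a", "b"],
          ["c", "c", "b"]]
restored = decode_segments(labels, [100, 4, -2])
```

The reference pixels are always already restored when they are read, because they belong to segments with smaller numbers that were filled in completely earlier in the scan.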
- the decoding target area is the same as the plurality of areas into which the image indicated by the encoded data is divided.
- the representative pixel used when calculating the predicted value of the representative value of each decoding target area, and the predicted reference pixels based on it, can be uniquely specified, and they can be made the same pixels as the representative pixel of the encoding target region corresponding to the decoding target region and the predicted reference pixels based on it. Therefore, there is an effect that the image indicated by the encoded data can be accurately restored.
- a decoding method according to the present invention decodes encoded data of an image, the encoded data including, for each of a plurality of areas obtained by dividing the entire area of the image with a predetermined division pattern, a difference value that is the difference between a representative value of the pixel values of the pixels included in the area and a predicted value of that representative value, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order, and each predicted value having been calculated by setting the areas as the encoding target area in the order of the numbers and setting the first pixel in raster scan order among the pixels included in the encoding target area as the representative pixel. The decoding method includes: a division step of dividing the entire area of the image into a plurality of regions; a decoding step of decoding the encoded data and generating difference values arranged in order; a number assigning step of assigning numbers in raster scan order to the plurality of regions divided in the division step; an assigning step of assigning the difference values, in order from the first, to the plurality of regions in the order of the numbers assigned in the number assigning step; a prediction value calculation step of setting each region as the decoding target region in the order of the numbers given in the number assigning step, setting the first pixel in raster scan order among the pixels included in the decoding target area as the representative pixel, and calculating a predicted value of the decoding target region; and a pixel value setting step. The prediction value calculation step and the pixel value setting step are repeatedly executed for each decoding target region to restore the pixel values of the image.
- the decoding method according to the present invention has the same operational effects as the decoding device according to the present invention.
- the encoding apparatus has the effect that, even if the plurality of regions into which an image is divided have arbitrary shapes, it can generate encoded data that can be uniquely decoded and from which the spatial redundancy between the regions has been eliminated.
- the decoding device has an effect that the image indicated by the encoded data can be accurately restored.
- FIG. 4 is a diagram showing the distribution of each segment defined by the moving image encoding apparatus of FIG. 1 from the texture image of FIG. 3.
- FIG. 6 is a diagram illustrating a segment boundary portion in which an image division processing unit of the moving image encoding device in FIG.
- FIG. 7 shows 12 pixels, 3 vertical × 4 horizontal, that constitute a partial area of the texture image.
- FIGS. 7A and 7B show cases where two pixels are adjacent vertically or horizontally.
- FIG. 7C shows a case where two pixels are in contact at only one point.
- a diagram showing the order in which a texture image is scanned in order to determine the value of the segment number that the moving image encoding device of FIG. 1 assigns.
- a diagram schematically showing data in which each segment number, given in raster scan order by the moving image encoding device of FIG. 1, is associated with the representative value of the distance values in the corresponding segment of the distance image.
- a flowchart showing an example of the predictive encoding process performed by the predictive encoding unit of the moving image encoding device of FIG. 1.
- (a) to (e) of FIG. 12 show 12 pixels, 3 vertical × 4 horizontal, constituting a partial region of the distance image, and are diagrams showing specific examples of the representative pixel that the predictive encoding unit uses to predict the representative value of a segment and the prediction reference pixels based on this representative pixel.
- the moving picture coding apparatus is, roughly speaking, a device that generates coded data for each frame constituting a three-dimensional moving picture by coding the texture image and the distance image constituting the frame.
- the moving picture encoding apparatus uses the encoding technique employed in the H.264 / MPEG-4 AVC standard for encoding texture images, while it uses the encoding technique peculiar to the present invention for encoding distance images.
- the above encoding technique unique to the present invention is an encoding technique developed by paying attention to the fact that there is a correlation between a texture image and a distance image.
- when a certain area in the texture image is composed of a group of pixels of similar colors, all or almost all of the pixels included in the corresponding area in the distance image tend to have the same distance value.
- the values of the pixels constituting the texture image and the distance image are referred to as pixel values.
- the pixel value in the texture image indicates information regarding the luminance and color of each pixel.
- the pixel value in the distance image indicates information related to the distance depth that each pixel has.
- the pixel value of the texture image is referred to as a color value
- the pixel value of the distance image is referred to as a distance value.
- FIG. 1 is a block diagram illustrating a configuration of a main part of a video encoding device.
- the moving image encoding apparatus 1 includes an image encoding unit 11, an image decoding unit (decoding unit) 12, a distance image encoding unit 20, and a packaging unit (transmission unit) 28.
- the distance image encoding unit 20 includes an image division processing unit 21, a distance image division processing unit (dividing means) 22, a distance value correcting unit (representative value determining means) 23, a number assigning unit (number assigning means) 24, and a prediction encoding unit (prediction value calculation means, difference value calculation means, encoding means) 25.
- the image encoding unit 11 encodes the texture image # 1 by the AVC (Advanced Video Coding) encoding defined in the H.264 / MPEG-4 AVC standard.
- the image decoding unit 12 decodes the texture image # 1 'from the encoded data # 11 of the texture image # 1.
- the image division processing unit 21 divides the entire area of the texture image # 1 into a plurality of segments (areas). Then, the image division processing unit 21 outputs segment information # 21 including position information of each segment.
- the segment position information is information indicating the position of the segment in the texture image # 1.
- when the distance image # 2 and the segment information # 21 are input, the distance image division processing unit 22 extracts, for each segment in the texture image # 1 ′, a distance value set consisting of the distance values of the pixels included in the corresponding segment (region) in the distance image # 2. Then, the distance image division processing unit 22 generates, from the segment information # 21, segment information # 22 in which the distance value set and the position information are associated with each segment.
- the distance value correction unit 23 calculates the mode value as the representative value # 23a from the distance value set of the segment included in the segment information # 22 for each segment of the distance image # 2. That is, when the segment i in the distance image # 2 includes N pixels, the distance value correcting unit 23 calculates the mode value from the N distance values.
- instead of the mode value, the distance value correcting unit 23 may calculate the average or the median of the N distance values as the representative value # 23a.
- when the average, median, or the like turns out to be a non-integer value, the distance value correcting unit 23 may further convert it to an integer by rounding down, rounding up, or rounding to the nearest integer.
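The alternatives for the representative value (mode, average, or median, with optional rounding) can be sketched with Python's standard `statistics` module. The function name `representative_value` and its parameter names are hypothetical, and "rounding to the nearest integer" is implemented here as round-half-up for non-negative distance values, which is an assumption.

```python
import math
import statistics


def representative_value(distances, method="mode", rounding="nearest"):
    """Compute a segment's representative distance value.

    method:   "mode", "mean", or "median".
    rounding: "down", "up", or "nearest"; applied to mean and median only.
    """
    if method == "mode":
        # On Python 3.8+ a tie returns the first mode encountered.
        return statistics.mode(distances)
    if method == "mean":
        value = statistics.mean(distances)
    elif method == "median":
        value = statistics.median(distances)
    else:
        raise ValueError(f"unknown method: {method}")
    if rounding == "down":
        return math.floor(value)
    if rounding == "up":
        return math.ceil(value)
    if rounding == "nearest":
        return math.floor(value + 0.5)        # round half up (non-negative input assumed)
    raise ValueError(f"unknown rounding: {rounding}")


segment_distances = [10, 10, 12, 13]          # hypothetical distance value set, N = 4
mode_value = representative_value(segment_distances)
```

The mode always yields a value that actually occurs in the segment, whereas the mean and median may need the rounding step to stay within the 8-bit integer range.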
- the distance value correcting unit 23 replaces the distance value set of each segment included in the segment information # 22 with the representative value # 23a of the corresponding segment, and outputs it to the number assigning unit 24 as the segment information # 23.
- the number assigning unit 24 scans the pixels included in the distance image in the raster scan order, assigns segment numbers # 24 in the scanned order to the segments (the areas divided according to the segment information # 23), and associates each segment number with the corresponding representative value # 23a included in the segment information # 23.
- the predictive encoding unit 25 performs predictive encoding processing based on the input M sets of representative values # 23a and segment numbers # 24, and outputs the obtained encoded data # 25 to the packaging unit 28. Specifically, for each segment in the order of segment number # 24, the predictive encoding unit 25 calculates the predicted value of the segment, subtracts the predicted value from the representative value to obtain the difference value, and encodes the difference value. Then, the predictive encoding unit 25 arranges the encoded difference values in the order of the segment number # 24 to obtain encoded data # 25, and outputs the encoded data # 25 to the packaging unit 28.
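The predictive encoding loop can be sketched as follows. The prediction rule (the representative value of the segment containing the left neighbour of the representative pixel, otherwise the segment directly above it, otherwise 0) is an assumption for illustration, as are the names `predictive_encode`, `labels`, and `rep_values`; the final entropy coding of the difference values is omitted.

```python
def predictive_encode(labels, rep_values):
    """Return difference values arranged in segment number order.

    labels:     list of rows of segment labels (the division pattern).
    rep_values: dict mapping each segment label to its representative value.
    """
    seen = set()
    diffs = []
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab in seen:
                continue
            seen.add(lab)                     # (y, x) is this segment's representative pixel
            if x > 0:
                pred = rep_values[labels[y][x - 1]]   # segment of the left neighbour
            elif y > 0:
                pred = rep_values[labels[y - 1][x]]   # segment of the pixel above
            else:
                pred = 0                               # first segment has no reference
            diffs.append(rep_values[lab] - pred)
    return diffs


labels = [["a", "a", "b"],
          ["a", "a", "b"],
          ["c", "c", "b"]]
rep_values = {"a": 100, "b": 104, "c": 98}
diffs = predictive_encode(labels, rep_values)
```

Since adjacent segments tend to have close distance values, all differences after the first are small, which is where the bit savings come from.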
- the packaging unit 28 associates the encoded data # 11 of the texture image # 1 and the encoded data # 25 of the distance image # 2 and outputs them as encoded data # 28 to the outside.
- FIG. 2 is a flowchart showing the operation of the moving image encoding apparatus 1.
- the operation of the moving image encoding apparatus 1 described here is the operation of encoding the texture image and the distance image of the t-th frame from the head of a moving image including a large number of frames. That is, the moving image encoding apparatus 1 repeats the operation described below as many times as the number of frames of the moving image in order to encode the entire moving image.
- each data # 1 to # 28 is interpreted as data of the t-th frame.
- the image encoding unit 11 and the distance image division processing unit 22 respectively receive the texture image # 1 and the distance image # 2 from the outside of the moving image encoding device 1 (step S1).
- the texture image # 1 and the distance image # 2 received from the outside as a pair are correlated in image content, as can be seen, for example, by comparing the texture image of FIG. 3 with the corresponding distance image.
- the image encoding unit 11 encodes the texture image # 1 by the AVC encoding method stipulated in the H.264 / MPEG-4 AVC standard, and outputs the obtained texture image encoded data # 11 to the packaging unit 28 and the image decoding unit 12 (step S2).
- when the texture image # 1 is a B picture or a P picture in step S2, the image encoding unit 11 encodes the prediction residual between the texture image # 1 and the predicted image, and outputs the encoded prediction residual as encoded data # 11.
- the image decoding unit 12 decodes the texture image # 1 'from the encoded data # 11 and outputs it to the image division processing unit 21 (step S3).
- the decoded texture image # 1 ' is not completely identical to the texture image # 1 encoded by the image encoding unit 11. This is because the image encoding unit 11 performs a DCT (discrete cosine transform) and quantization during the encoding process, and a quantization error occurs when the DCT coefficients obtained by the transform are quantized.
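The origin of the quantization error can be shown with a minimal uniform quantizer. This is a simplified sketch with a hypothetical coefficient and step size, not the quantizer defined in the H.264 / MPEG-4 AVC standard.

```python
def quantize(coefficient, step):
    """Map a transform coefficient to an integer quantization level."""
    return round(coefficient / step)


def dequantize(level, step):
    """Recover an approximate coefficient from its quantization level."""
    return level * step


coefficient = 37.0                            # hypothetical DCT coefficient
step = 8                                      # hypothetical quantization step
restored = dequantize(quantize(coefficient, step), step)
error = restored - coefficient
```

The restored coefficient differs from the original by at most half a step, and this loss is why the decoded texture image is only an approximation of the input.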
- the timing at which the image decoding unit 12 decodes the texture image differs depending on whether or not the texture image # 1 is a B picture. This will be described in detail.
- when the texture image # 1 is an I picture, the image decoding unit 12 decodes the texture image # 1 ′ without performing inter prediction (inter-screen prediction).
- when the texture image # 1 is a P picture, the image decoding unit 12 decodes the prediction residual from the encoded data # 11. Then, the image decoding unit 12 decodes the texture image # 1 ′ by adding the prediction residual to the predicted image generated using the encoded data # 11 of one or more frames before the t-th frame as reference pictures.
- when the texture image # 1 is a B picture, the image decoding unit 12 decodes the prediction residual from the encoded data # 11. Then, the image decoding unit 12 decodes the texture image # 1 ′ by adding the prediction residual to the predicted image generated using, as reference pictures, the encoded data # 11 of one or more frames before the t-th frame and the encoded data # 11 of one or more frames after the t-th frame.
- when the texture image # 1 is not a B picture, the timing at which the image decoding unit 12 decodes the texture image # 1 ′ of the t-th frame is immediately after the encoded data # 11 of the t-th frame is generated.
- when the texture image # 1 is a B picture, the timing at which the image decoding unit 12 decodes the texture image # 1 ′ is after the encoding process for the texture image # 1 of the T-th (> t) frame (the last frame among the reference pictures) is completed.
- the image division processing unit 21 defines a plurality of segments from the input texture image # 1 '(step S4).
- Each segment defined by the image division processing unit 21 is a closed region made up of pixels of similar colors, that is, a pixel group in which the difference between the maximum pixel value and the minimum pixel value (the difference between the maximum color value and the minimum color value) is equal to or less than a predetermined threshold value.
- FIG. 5 is a diagram showing the distribution of each segment defined by the image division processing unit 21 from the texture image # 1 ′ of FIG.
- the closed region drawn by the same pattern indicates one segment.
- The hair on the left and right of the girl's head is drawn in two colors, brown and light brown.
- the image division processing unit 21 defines a closed region made up of pixels of similar colors such as brown and light brown as one segment.
- the skin portion of the girl's face is also drawn in two colors, the skin color and the pink color of the cheek portion.
- Each pink area is defined as a segment separate from the skin-colored area. This is because the skin color and the pink color are not similar (that is, the difference between the skin color value and the pink color value exceeds the predetermined threshold value).
- After the process of step S4, the image division processing unit 21 generates segment information # 21 including the position information of each segment and outputs it to the distance image division processing unit 22 (step S5).
- An example of the position information of a segment is the set of coordinate values of all the pixels included in the segment. That is, when each segment is defined from the texture image # 1 ′ in FIG. 3, each closed region in FIG. 6 is defined as one segment, and the position information of a segment consists of the coordinate values of all the pixels constituting the closed region corresponding to that segment.
- (a) to (c) of FIG. 7 each show a partial region of the texture image consisting of 12 pixels (3 × 4).
- the color of the pixel labeled “A” and the color of the pixel labeled “B” are the same or similar.
- The colors of the other ten pixels in the partial region are completely different from the colors of the pixel A and the pixel B.
- each segment is a closed region (a group of connected pixels) made up of pixels of similar colors.
- the definition of the closed region will be described with reference to FIG.
- The pixel A and the pixel B are regarded as connected when their positional relationship is that of (a) or (b) of FIG. 7. That is, the pixel A and the pixel B are regarded as connected when they are in contact with each other in the vertical or horizontal direction. In other words, when the pixel A and the pixel B share a side, they are regarded as connected, and in this case they belong to the same segment.
- When the positional relationship is that of (c) of FIG. 7, the pixel A and the pixel B are not connected. That is, when the pixel A and the pixel B are in contact with each other only in an oblique direction, they are regarded as not connected. In other words, when the pixel A and the pixel B touch only at a single point, they are regarded as not connected. In this case, the pixel A and the pixel B have the same or similar colors but belong to different segments. Needless to say, when the pixel A and the pixel B are not in contact with each other at all, they belong to separate segments.
- Strictly speaking, two pixels being adjacent to each other is synonymous with the Manhattan distance between the coordinates of the two pixels being “1”, and two pixels not being adjacent to each other is synonymous with the Manhattan distance between their coordinates being “2 or more”.
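The connectivity rule can be sketched in code as follows. The sketch assumes pixels are given as (row, column) tuples and that the pixels passed in have already been judged to have similar colors; the function names are illustrative and do not appear in the original text.

```python
from collections import deque


def manhattan(p, q):
    """Manhattan distance between two (row, col) pixel coordinates."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def is_adjacent(p, q):
    """True when the two pixels share a side (Manhattan distance 1)."""
    return manhattan(p, q) == 1


def connected_groups(pixels):
    """Split similar-colored pixels into groups connected through shared sides."""
    remaining = set(pixels)
    groups = []
    while remaining:
        start = remaining.pop()
        group = {start}
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            # Only the four side neighbours count; diagonal contact does not.
            for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if n in remaining:
                    remaining.remove(n)
                    group.add(n)
                    queue.append(n)
        groups.append(group)
    return groups
```

Two similar-colored pixels that touch only diagonally, such as (0, 0) and (1, 1), end up in two separate groups, whereas (0, 0) and (0, 1) form one group.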
- In this specification, among the pixel groups obtained by dividing the entire region of the texture image and the distance image so that the distance (depth) value is substantially constant, a pixel group whose pixels have a connection relationship is referred to as a segment.
- When the pixel A and the pixel B are in the positional relationship shown in FIGS. 7A and 7B, the pixel A is also referred to as being adjacent to the pixel B.
- When the pixel A and the pixel B are in any of the positional relationships shown in FIGS. 7A to 7C, the pixel A is also referred to as being close to the pixel B.
- When a pixel included in a segment is adjacent to a pixel included in another segment, the segment is referred to as being adjacent to the other segment.
- When a pixel included in a segment is close to a pixel included in another segment, the segment is referred to as being close to the other segment.
- The distance image division processing unit 22 divides the input distance image # 2 into a plurality of segments. Specifically, the distance image division processing unit 22 refers to the input segment information # 21, specifies the position of each segment in the texture image # 1 ′, and divides the distance image # 2 into a plurality of segments with the same division pattern as that of the texture image # 1 ′ (in the following description, the number of segments is assumed to be M).
- the distance image division processing unit 22 extracts the distance value of each pixel included in the segment as a distance value set for each segment of the distance image # 2. Furthermore, the distance image division processing unit 22 associates the distance value set extracted from the corresponding segment with the position information of each segment included in the segment information # 21. And the distance image division
- the distance value correction unit 23 calculates the mode value as the representative value # 23a from the distance value set of the segment included in the segment information # 22 for each segment of the distance image # 2. Then, the distance value correcting unit 23 replaces each of the M distance value sets included in the segment information # 22 with the representative value # 23a of the corresponding segment, and outputs it as the segment information # 23 to the number assigning unit 24 ( Step S7).
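The computation of step S7 can be sketched as follows, assuming each distance value set is given as a list of integers from 0 to 255. The function names are illustrative, and the tie-breaking rule (choosing the smallest of several equally frequent values) is an assumption, since the text does not state how ties are resolved.

```python
from collections import Counter


def representative_value(distance_values):
    """Return the mode of a segment's distance values.

    Assumption: when several values are equally frequent,
    the smallest one is chosen.
    """
    counts = Counter(distance_values)
    highest = max(counts.values())
    return min(v for v, n in counts.items() if n == highest)


def correct_distance_values(distance_value_sets):
    """Replace each of the M distance value sets by its representative value."""
    return [representative_value(values) for values in distance_value_sets]
```

For example, a segment whose distance values are 10, 10, 12, 11, 10 is represented by the single value 10.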
- For each of the M pairs of position information and representative value # 23a included in the segment information # 23, the number assigning unit 24 associates the representative value # 23a with the segment number # 24 corresponding to the position information, and outputs the M pairs of representative value # 23a and segment number # 24 to the predictive encoding unit 25 (step S8). Specifically, based on the segment information # 23, for each i from 1 to M (M: the number of segments), the number assigning unit 24 associates the segment number “i − 1” with the representative value # 23a of the i-th segment in the raster scan order.
- Here, the “i-th segment in the raster scan order” is the segment that is reached i-th when the distance image or the texture image is scanned in the raster scan order as shown in FIG. 8.
- FIG. 9 is a diagram schematically showing the position of each segment of the distance image input to the moving image encoding device 1 together with the texture image as shown in FIG. In FIG. 9, one closed region indicates one segment.
- segment number “0” is assigned to the segment R0 located at the head in the raster scan order. Further, the segment number “1” is assigned to the segment R1 that is positioned second in the raster scan order. Similarly, segment numbers “2” and “3” are respectively assigned to the third and fourth segments R2 and R3 in the raster scan order.
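The numbering rule can be sketched as follows. The sketch assumes the segmentation is available as a 2D list in which each entry is an arbitrary label identifying the segment of that pixel; this data layout and the function name are assumptions made for illustration.

```python
def assign_segment_numbers(label_map):
    """Number segments 0, 1, 2, ... in the order a raster scan first reaches them."""
    numbers = {}
    for row in label_map:          # top to bottom
        for label in row:          # left to right within a row
            if label not in numbers:
                numbers[label] = len(numbers)
    return numbers


label_map = [
    ["x", "x", "y"],
    ["z", "x", "y"],
]
numbers = assign_segment_numbers(label_map)
```

Here the segment labeled "x" is reached first and receives segment number 0, "y" receives 1, and "z" receives 2.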
- the number assigning unit 24 outputs the M sets of representative values # 23a and the segment number # 24 whose specific examples are shown in FIG. 10 to the predictive encoding unit 25.
- The predictive encoding unit 25 performs predictive encoding processing based on the input M pairs of representative value # 23a and segment number # 24, and outputs the obtained encoded data # 25 to the packaging unit 28 (step S9). Specifically, for each segment in the order of segment number # 24, the predictive encoding unit 25 calculates a predicted value for the segment, subtracts the predicted value from the representative value to obtain a difference value, and encodes the difference value. Then, the predictive encoding unit 25 arranges the encoded difference values in the order of the segment number # 24 to obtain encoded data # 25.
- FIG. 11 is a flowchart illustrating an example of the prediction encoding process performed by the prediction encoding unit 25.
- First, the variable “i” representing the segment number # 24 is set to “0” (step S101). Then, the segment whose segment number # 24 is “i” is set as the encoding target segment (encoding target region) (step S102). That is, the segment with segment number “0” is the first encoding target segment.
- the representative pixel of the encoding target segment used for calculating the predicted value is specified from the pixels included in the encoding target segment (step S103). Specifically, the pixel included in the encoding target segment and scanned first in the raster scan order in step S8 (the first pixel in the raster scan order) is set as the representative pixel.
- Although segments in the present invention take various shapes as described above, the pixel scanned first in the raster scan order is uniquely determined regardless of the shape of the segment.
- Next, the prediction reference pixel is specified based on the representative pixel (step S104). Specifically, a pixel that is close to a pixel included in the encoding target segment and located on the same scan line as the representative pixel, and that precedes the representative pixel in the raster scan order, is used as a prediction reference pixel.
- More specifically, the prediction reference pixel may be any pixel of a pixel group consisting of: the pixel immediately preceding the representative pixel in the raster scan order; the pixels on the scan line immediately preceding that of the representative pixel that are adjacent to pixels of the encoding target segment on the same scan line as the representative pixel; and the pixel on the scan line immediately preceding that of the representative pixel that is adjacent to the pixel following, in the raster scan order, the last pixel of the encoding target segment on the same scan line as the representative pixel.
- Alternatively, three prediction reference pixels may be used: the pixel immediately preceding the representative pixel in the raster scan order; any one of the pixels on the preceding scan line that are adjacent to pixels of the encoding target segment on the same scan line as the representative pixel; and the pixel on the preceding scan line that is adjacent to the pixel following the last pixel of the encoding target segment on the same scan line as the representative pixel.
- pixels A denoted by “A” in FIGS. 12A to 12E are pixels that constitute the same segment.
- the pixels labeled “B”, “C”, or “D” (referred to as pixels B, C, and D, respectively) and “blank” pixels are different from the segment RA having the pixel A. It constitutes a segment.
- The segments to which the pixels other than the pixel A belong may all be the same segment, or may all be different segments. In FIGS. 12A to 12E, the representative pixel in each case and the pixels on the same scanning line as the representative pixel (scan line: a pixel row in the present embodiment) are hatched.
- a pixel on the same scanning line as a certain pixel means a pixel in the same row as the certain pixel.
- a pixel immediately before a certain pixel in the raster scan order means a pixel one pixel to the left of the certain pixel.
- a pixel whose raster scan order is one after a certain pixel means a pixel one right of a certain pixel.
- A pixel that is adjacent to a certain pixel and lies on the scan line immediately preceding that of the certain pixel in the raster scan order means the pixel one pixel above the certain pixel.
- In the following, the encoding target segment is assumed to be the segment RA having the pixel A, and the identification of the representative pixel and the prediction reference pixels of the segment RA is described for the examples shown in FIGS. 12A to 12E.
- In the example of FIG. 12A, the representative pixel is the hatched pixel A located at the top of the pixels A. The pixel B one pixel to the left of the representative pixel, the pixel C one pixel above the representative pixel, and the pixel D diagonally above and to the right of the representative pixel are used as prediction reference pixels.
- In the example of FIG. 12B, the representative pixel is the left one of the hatched pixels A. The pixel B one pixel to the left of the representative pixel, the pixel C one pixel above the representative pixel, and the pixel D diagonally above and to the right of the pixel A one pixel to the right of the representative pixel are used as prediction reference pixels.
- In the example of FIG. 12C, the representative pixel is the leftmost of the hatched pixels A. The pixel B one pixel to the left of the representative pixel, the pixel C one pixel above the representative pixel, and the pixel D diagonally above and to the right of the rightmost pixel A in the same row as the representative pixel are used as prediction reference pixels.
- In the example of FIG. 12D, the representative pixel is the left one of the hatched pixels A. The pixel B one pixel to the left of the representative pixel, the pixel C one pixel above the representative pixel, and the pixel C diagonally above and to the right of the representative pixel are used as prediction reference pixels.
- In the example of FIG. 12E, the representative pixel is the leftmost of the hatched pixels A. The pixel B one pixel to the left of the representative pixel, the three pixels C each located one pixel above a pixel A in the same row as the representative pixel, and the pixel D diagonally above and to the right of the rightmost pixel A in the same row as the representative pixel are used as prediction reference pixels.
- Pixels are scanned in the order shown in FIG. 8, and the segment numbers # 24 are assigned accordingly. Because the segments are encoded in the order indicated by the segment number # 24, the segments containing the pixel B, the pixel C, and the pixel D illustrated in FIG. 12 are guaranteed to have been encoded before the encoding target segment (the segment containing the pixel A).
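The identification of the representative pixel and of one possible set of prediction reference pixels (B to the left, C above, D above and to the right of the last segment pixel in the row) can be sketched as follows. The sketch assumes the segmentation is given as a 2D list of segment numbers; the function names are illustrative, and `None` is returned for a reference pixel that would fall outside the image.

```python
def representative_pixel(seg_map, target):
    """Return (row, col) of the first pixel of segment `target` in raster scan order."""
    for r, row in enumerate(seg_map):
        for c, seg in enumerate(row):
            if seg == target:
                return r, c
    raise ValueError("segment not found")


def prediction_reference_pixels(seg_map, target):
    """Return candidate reference pixels B, C and D for segment `target`."""
    height, width = len(seg_map), len(seg_map[0])
    r, c_first = representative_pixel(seg_map, target)
    # Last pixel of the target segment on the representative pixel's scan line.
    c_last = max(c for c in range(width) if seg_map[r][c] == target)

    def if_inside(rr, cc):
        return (rr, cc) if 0 <= rr < height and 0 <= cc < width else None

    return {
        "B": if_inside(r, c_first - 1),      # left of the representative pixel
        "C": if_inside(r - 1, c_first),      # above the representative pixel
        "D": if_inside(r - 1, c_last + 1),   # above-right of the last pixel in the row
    }


seg_map = [
    [0, 0, 0, 1, 1],
    [2, 3, 3, 3, 1],
    [2, 2, 3, 1, 1],
]
refs = prediction_reference_pixels(seg_map, 3)
```

In this example the reference pixels of segment 3 belong to segments 2, 0, and 1, all of which carry smaller segment numbers and have therefore already been encoded.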
- the predicted value of the representative value of the encoding target segment is calculated based on the representative value of the segment having the predicted reference pixel (step S105). For example, when the prediction reference pixel is pixel B, pixel C, and pixel D as in the example illustrated in FIG. 12A, the representative value Z_B of the segment RB having the pixel B and the representative of the segment RC having the pixel C Based on the value Z_C and the representative value Z_D of the segment RD having the pixel D, the predicted value Z′_A of the representative value Z_A of the segment RA is calculated.
- Z′_A may be a median value of Z_B, Z_C, and Z_D.
- Z′_A may be an average value of Z_B, Z_C, and Z_D.
- Z′_A may be any value of Z_B, Z_C, and Z_D.
- Next, the difference value ΔZ_A is calculated by subtracting the predicted value Z′_A from the representative value Z_A of the encoding target segment (step S106).
- The calculated difference value ΔZ_A is the value used to represent the distance value of the pixels included in the encoding target segment. As described above, the distance value has 256 levels and takes a value from 0 to 255. Therefore, ΔZ_A can take a value from −255 to +255.
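Steps S105 and S106 can be sketched as follows for the median and average variants of the predicted value. The function names and the `method` parameter are illustrative, and rounding the average down to an integer is an assumption, since the text does not specify how a fractional average is handled.

```python
from statistics import median


def predict(z_b, z_c, z_d, method="median"):
    """Predicted value Z'_A from the representative values of the reference segments."""
    if method == "median":
        return median([z_b, z_c, z_d])
    if method == "average":
        return (z_b + z_c + z_d) // 3   # assumption: integer average, rounded down
    raise ValueError("unknown method")


def difference(z_a, z_pred):
    """Difference value: representative value minus predicted value."""
    return z_a - z_pred


z_pred = predict(100, 104, 120)      # median of the three representative values
delta = difference(107, z_pred)      # small when the prediction is good
```

With representative values restricted to 0 to 255, `difference` can return anything from −255 (Z_A = 0, Z′_A = 255) to +255 (Z_A = 255, Z′_A = 0).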
- the calculated difference value is encoded by a variable-length encoding method in which the code word is shorter as the value is closer to 0 (step S107).
- the difference value is encoded using an exponential Golomb encoding method which is one of variable length encoding methods.
- FIG. 13 shows the correspondence between the difference value and the code word in the exponential Golomb encoding method.
- the difference value is shown in the right column, and the code word when exponent Golomb coding is performed on the difference value is shown in the left column.
- the codeword to be assigned becomes shorter as the difference value is closer to 0, that is, as the predicted value is closer to the representative value approximating the actual distance value. Therefore, it is possible to transmit the distance image while reducing the amount of information to be transmitted.
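An exponential Golomb code for signed values can be sketched as follows. The mapping from signed difference values to code numbers used here is the one H.264 / MPEG-4 AVC uses for its signed Exp-Golomb syntax elements; whether the table of FIG. 13 uses exactly this mapping is an assumption, and the function names are illustrative.

```python
def exp_golomb_unsigned(k):
    """Order-0 exponential Golomb codeword for a non-negative integer k."""
    bits = bin(k + 1)[2:]                  # binary representation of k + 1
    return "0" * (len(bits) - 1) + bits    # prefix of zeros, then the bits


def exp_golomb_signed(v):
    """Codeword for a signed value; values closer to 0 get shorter codewords.

    Assumed mapping: v > 0 maps to 2v - 1, v <= 0 maps to -2v.
    """
    k = 2 * v - 1 if v > 0 else -2 * v
    return exp_golomb_unsigned(k)


def encode_differences(differences):
    """Concatenate the codewords of the difference values in segment-number order."""
    return "".join(exp_golomb_signed(v) for v in differences)
```

Under this mapping, a difference of 0 costs one bit, ±1 costs three bits, and ±4 costs seven bits, so accurate predictions translate directly into a shorter bit string.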
- After step S107, it is checked whether or not all M segments have been encoded. If not all segments have been encoded, the processes of steps S102 to S107 are executed for the next segment in the order of segment number # 24. Once difference values have been calculated and encoded for all segments, the process proceeds to step S110.
- In step S110, the encoded difference values are arranged in the order of segment number # 24 to generate encoded data # 25.
- FIG. 14 shows a specific example of the encoded data # 25, in which the difference values “3”, “−4”, “−1”, and “0” are encoded in order.
- the predictive encoding unit 25 compresses the input data to generate encoded data # 25, and outputs the generated encoded data # 25 to the packaging unit 28 (step S9).
- the packaging unit 28 integrates the encoded data # 11 output from the image encoding unit 11 in step S2 and the encoded data # 25 output from the predictive encoding unit 25 in step S9. Then, the obtained encoded data # 28 is transmitted to a moving picture decoding apparatus to be described later (step S10).
- The packaging unit 28 integrates the texture image encoded data # 11 and the distance image encoded data # 25 in accordance with the NAL unit format defined in the H.264 / MPEG-4 AVC standard. More specifically, the integration of the encoded data # 11 and the encoded data # 25 is performed as follows.
- FIG. 15 is a diagram schematically showing the configuration of the NAL unit. As shown in FIG. 15, the NAL unit is composed of three parts: a NAL header part, an RBSP part, and an RBSP trailing bit part.
- the packaging unit 28 stores a specified numerical value I in the nal_unit_type (identifier indicating the type of NAL unit) field of the NAL header portion of the NAL unit corresponding to each slice (main slice) of the main picture.
- The prescribed numerical value I indicates that the encoded data # 28 is encoded data generated by the encoding method according to the present embodiment (that is, the encoding method that encodes the distance image # 2 after calculating a difference value for each segment).
- As the numerical value I, for example, a value defined as “undefined” or “for future expansion” in the H.264 / MPEG-4 AVC standard can be used.
- the packaging unit 28 stores the encoded data # 11 and the encoded data # 25 in the RBSP unit of the NAL unit corresponding to the main slice. Further, the packaging unit 28 stores the RBSP trailing bit in the RBSP trailing bit unit.
- the packaging unit 28 transmits the NAL unit thus obtained to the video decoding device as encoded data # 28.
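How the numerical value I would sit in the NAL header can be sketched as follows. In H.264 / MPEG-4 AVC the one-byte NAL header consists of a 1-bit forbidden_zero_bit, a 2-bit nal_ref_idc, and a 5-bit nal_unit_type. The value 24 used below is only a hypothetical choice for I taken from the range the standard leaves unspecified, and the function name is illustrative. The layout of the encoded data # 11 and # 25 inside the RBSP part is not sketched because the text does not specify it.

```python
def nal_header(nal_unit_type, nal_ref_idc=0):
    """Build the one-byte H.264 NAL unit header (forbidden_zero_bit is always 0)."""
    if not 0 <= nal_unit_type <= 31:
        raise ValueError("nal_unit_type must fit in 5 bits")
    if not 0 <= nal_ref_idc <= 3:
        raise ValueError("nal_ref_idc must fit in 2 bits")
    return bytes([(nal_ref_idc << 5) | nal_unit_type])


NUMERICAL_VALUE_I = 24   # hypothetical value chosen for illustration only
header = nal_header(NUMERICAL_VALUE_I, nal_ref_idc=3)
```

A decoder that reads this header can recognize from nal_unit_type that the RBSP part carries data produced by the encoding method of the present embodiment.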
- In the above description, the image division processing unit 21 defines, from the input texture image # 1 ′, a plurality of segments each composed of a pixel group in which the difference between the maximum pixel value and the minimum pixel value is equal to or less than a predetermined threshold value.
- However, the method of defining the segments is not limited to this configuration. For example, the image division processing unit 21 may define, from the input texture image # 1 ′, a plurality of segments such that, for each segment, the difference between the average value calculated from the pixel values of the pixel group included in the segment and the average value calculated from the pixel values of the pixel group included in each segment adjacent to that segment is equal to or greater than a predetermined threshold value.
- FIG. 21 is a flowchart showing an operation in which the video encoding device 1 defines a plurality of segments based on the above algorithm.
- FIG. 22 is a flowchart showing a subroutine of segment combination processing in the flowchart of FIG.
- In the initialization step in the figure, for the texture image that has undergone the smoothing process described in (Appendix 2), the image division processing unit 21 defines each of the pixels included in the texture image as one independent segment (provisional segment), and sets the pixel value of the corresponding pixel itself as the average value (average color) of all pixel values in each provisional segment (step S41).
- Next, the process proceeds to the segment combination processing step (step S42), in which provisional segments having similar colors are combined.
- This segment combination process is described in detail below with reference to FIG. 22. The combination process is repeated until no further combination occurs.
- the image division processing unit 21 performs the following processing (steps S51 to S55) for all provisional segments.
- The image division processing unit 21 determines whether or not the height and width of the provisional segment of interest are both equal to or less than a threshold value (step S51). If both are equal to or less than the threshold value (YES in S51), the process proceeds to step S52. On the other hand, if either one is larger than the threshold value (NO in S51), the process of step S51 is performed for the provisional segment to be focused on next. The provisional segment to be focused on next may be, for example, the provisional segment positioned next to the provisional segment of interest in the raster scan order.
- The image division processing unit 21 selects, from among the provisional segments adjacent to the provisional segment of interest, the provisional segment whose average color is closest to the average color of the provisional segment of interest (step S52).
- As an index of the closeness of colors, for example, the Euclidean distance between vectors obtained by regarding the three RGB values of a pixel value as a three-dimensional vector can be used.
- As the pixel value of each segment, the average value of all pixel values included in the segment is used.
- The image division processing unit 21 determines whether or not the color distance between the provisional segment of interest and the provisional segment determined to be closest in color is equal to or less than a certain threshold value (step S53). If the distance is larger than the threshold value (NO in step S53), the process of step S51 is performed for the provisional segment to be focused on next. On the other hand, if the distance is equal to or less than the threshold value (YES in step S53), the process proceeds to step S54.
- The image division processing unit 21 combines the two provisional segments (the provisional segment of interest and the provisional segment determined to be closest to it in color) into one provisional segment (step S54). The number of provisional segments is reduced by 1 by the process of step S54.
- After step S54, the average value of the pixel values of all the pixels included in the combined provisional segment is calculated (step S55). If there is a provisional segment that has not yet been subjected to the processes of steps S51 to S55, the process of step S51 is performed for the provisional segment to be focused on next.
- After the processes of steps S51 to S55 have been completed for all the provisional segments, the process proceeds to step S43.
- The image division processing unit 21 compares the number of provisional segments before the process of step S42 with the number of provisional segments after the process of step S42 (step S43).
- If the number of provisional segments has decreased (YES in step S43), the process returns to step S42. On the other hand, if the number of provisional segments has not changed (NO in step S43), the image division processing unit 21 defines each current provisional segment as one segment.
- When the input texture image is an image of 1024 × 768 dots, it can be divided into several thousand segments (for example, 3000 to 5000 segments).
- step S51 is not essential, but it is desirable to prevent the segment size from becoming too large by limiting the segment size as in step S51.
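The combination procedure of steps S41 to S43 and S51 to S55 can be sketched as follows. This is a minimal sketch under several assumptions: the image is a 2D list of (R, G, B) tuples, provisional segments are visited in the order of their initial raster-scan labels, a single `size_limit` is used for both height and width, and the function and parameter names are illustrative. It is not tuned for large images.

```python
import math


def merge_similar_segments(image, color_threshold, size_limit):
    """Return a 2D list of segment labels for a 2D list of (R, G, B) pixels."""
    h, w = len(image), len(image[0])
    # Step S41: every pixel starts as its own provisional segment.
    labels = [[r * w + c for c in range(w)] for r in range(h)]
    members = {r * w + c: [(r, c)] for r in range(h) for c in range(w)}
    mean = {r * w + c: tuple(float(v) for v in image[r][c])
            for r in range(h) for c in range(w)}

    def adjacent_segments(seg):
        found = set()
        for r, c in members[seg]:
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= rr < h and 0 <= cc < w and labels[rr][cc] != seg:
                    found.add(labels[rr][cc])
        return found

    while True:                                   # one pass = step S42
        count_before = len(members)
        for seg in sorted(members):               # snapshot of current labels
            if seg not in members:
                continue                          # already combined into another segment
            rows = [r for r, _ in members[seg]]
            cols = [c for _, c in members[seg]]
            # Step S51: skip segments whose height or width exceeds the limit.
            if (max(rows) - min(rows) + 1 > size_limit
                    or max(cols) - min(cols) + 1 > size_limit):
                continue
            candidates = adjacent_segments(seg)
            if not candidates:
                continue
            # Step S52: adjacent segment with the closest average color.
            best = min(candidates, key=lambda s: math.dist(mean[seg], mean[s]))
            # Step S53: combine only if the colors are close enough.
            if math.dist(mean[seg], mean[best]) > color_threshold:
                continue
            # Step S54: combine the two provisional segments.
            for r, c in members[best]:
                labels[r][c] = seg
            members[seg].extend(members.pop(best))
            del mean[best]
            # Step S55: recompute the average color of the combined segment.
            n = len(members[seg])
            mean[seg] = tuple(sum(image[r][c][k] for r, c in members[seg]) / n
                              for k in range(3))
        # Step S43: stop when a full pass combined nothing.
        if len(members) == count_before:
            return labels


image = [
    [(200, 0, 0), (0, 0, 200)],
    [(210, 0, 0), (0, 0, 190)],
]
labels = merge_similar_segments(image, color_threshold=20, size_limit=10)
```

In this 2 × 2 example the two reddish pixels end up in one segment and the two bluish pixels in another.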
- In the above description, the image division processing unit 21 defines, from the input texture image # 1 ′, segments each composed of a pixel group in which the difference between the maximum pixel value and the minimum pixel value is equal to or less than a predetermined threshold value. In addition, an upper limit may be set for the number of pixels included in each segment.
- An upper limit may also be provided for the width or height of the segment, either together with or instead of the upper limit on the number of pixels.
- With such limits, the moving image decoding apparatus 2 can decode a distance image that more faithfully reproduces the original distance image # 2.
- the image division processing unit 21 may perform a smoothing process on the input texture image # 1 ′.
- For example, as described in the non-patent document “C. Lawrence Zitnick, Sing Bing Kang, Matthew Uyttendaele, Simon Winder and Richard Szeliski, “High-quality video view interpolation using a layered representation,” ACM Trans. on Graphics, 23 (3), 600-608, (2004)”, the image division processing unit 21 may repeatedly smooth the texture image # 1 ′ to such an extent that the edge information is not lost.
- the image division processing unit 21 converts the texture image after the smoothing process into a plurality of segments each composed of a pixel group in which the difference between the maximum pixel value and the minimum pixel value is equal to or less than a predetermined threshold value. It may be divided.
- Performing the smoothing process keeps the segments from becoming too small. That is, by performing the smoothing process, the code amount of the encoded data # 25 can be reduced as compared with the case where the smoothing process is not performed.
- The image division processing unit 21 may be arranged before the image encoding unit 11 instead of between the image decoding unit 12 and the distance image division processing unit 22. That is, the image division processing unit 21 may output the input texture image # 1 as it is to the subsequent image encoding unit 11, divide the texture image # 1 into a plurality of segments each composed of a pixel group in which the difference between the maximum pixel value and the minimum pixel value is equal to or less than a predetermined threshold value, and output segment information # 21 to the distance image division processing unit 22 in the subsequent stage.
- the number assigning unit 24 receives the segment information # 22 in which the distance value set and the position information are associated with each segment from the distance image division processing unit 22. Then, the number assigning unit 24 scans the pixels included in the distance image in the raster scan order, and performs segment numbers in the scanned order for each segment that is an area divided by the position information of the segment information # 22. # 24 is assigned and associated with the distance value set of each segment included in the segment information # 22.
- the distance value correcting unit 23 receives information in which the segment number # 24 and the distance value set are associated with each other from the number assigning unit 24. Then, the distance value correcting unit 23 calculates the mode value as the representative value # 23a from the distance value set of each segment. Then, the distance value correction unit 23 associates the segment number # 24 with the segment representative value # 23a and outputs the segment number # 24 to the prediction encoding unit 25.
- In the above description, the number assigning unit 24 receives the segment information # 23 including the segment position information and the representative value # 23a of each segment, and outputs the representative value # 23a and the segment number # 24 of each segment to the predictive encoding unit 25.
- the number assigning unit 24 may output the segment position information to the predictive encoding unit 25 in addition to the representative value # 23a and the segment number # 24 of each segment.
- the predictive encoding unit 25 adds segment position information to encoded data # 25 obtained by encoding the difference value, and outputs the result to the packaging unit 28.
- The packaging unit 28 may add the position information of the segment to the encoded data # 28 instead of the encoded data # 11 of the texture image output from the image encoding unit 11. That is, in this case, the packaging unit 28 transmits the encoded data # 28 including the encoded data # 25 obtained by encoding the difference value and the segment position information to the video decoding device.
- the moving picture decoding apparatus decodes the encoded data # 25 based on the position information of the segment.
- The moving image decoding apparatus only needs to be able to divide the image into segments with the same division pattern as that of the moving image encoding apparatus 1, and can therefore restore a distance image based on segment position information indicating the position of each segment. That is, even when the encoded data # 11 of the texture image is not available, the distance image divided into segments based on the texture image can be restored. Therefore, it is sufficient for the packaging unit 28 to transmit the segment defining information (region information) defining the segments and the encoded data # 25 to the video decoding device.
- the segment defining information is the texture image encoded data # 11 or the segment position information.
- When the predictive encoding unit 25 specifies the prediction reference pixels based on the representative pixel, in the example illustrated in FIG. 12C, the pixel C one pixel above the representative pixel is used as a prediction reference pixel in addition to the pixel B and the pixel D. However, the present invention is not limited to this.
- At least one of the pixels located one pixel above the representative pixel or above a pixel A on the same scanning line as the representative pixel may be used as a prediction reference pixel.
- For example, the pixel one pixel above the center pixel of the hatched pixels A (the pixel one to the right of the representative pixel) may be used as a prediction reference pixel.
- the predictive encoding unit 25 calculates the predictive value of the encoding target segment based on the representative value of the segment having the predictive reference pixel, but is not limited thereto. For example, when the pixel values of the pixels included in each segment are the same value in the same segment (when the values can be regarded as constant), the pixel value of the predicted reference pixel is used instead of the representative value of the segment having the predicted reference pixel Based on the above, the predicted value of the encoding target segment may be calculated.
- the prediction encoding unit 25 may encode information indicating a calculation method of the prediction value and add it to the encoded data # 25.
- the packaging unit 28 transmits encoded data # 28 including information indicating the calculation method of the predicted value to the video decoding device.
- For example, when the predictive coding unit 25 calculates the predicted value by selecting from four predicted value calculation methods, namely (1) "predicted value Z′_A is Z_B", (2) "predicted value Z′_A is Z_C", (3) "predicted value Z′_A is Z_D", and (4) "predicted value Z′_A is the average value of Z_B, Z_C, and Z_D", information indicating which of those four calculation methods was selected may be associated with the difference value of the encoding target segment to generate encoded data # 25.
- Further, for example, the predictive encoding unit 25 may add (5) "predicted value Z′_A is the median value of Z_B, Z_C, and Z_D" to the above four calculation methods and use information representing these five calculation methods.
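The selection among the five calculation methods can be sketched as follows. This is only an illustrative sketch: the mode indices 1 to 5, the rule of picking the method with the smallest absolute difference, the integer average, and the function names `choose_predictor` and `reconstruct` are assumptions for illustration and are not specified in the text.

```python
from statistics import median

# Hypothetical mode indices for the five calculation methods (1) to (5).
PREDICTORS = {
    1: lambda z_b, z_c, z_d: z_b,
    2: lambda z_b, z_c, z_d: z_c,
    3: lambda z_b, z_c, z_d: z_d,
    4: lambda z_b, z_c, z_d: (z_b + z_c + z_d) // 3,   # integer average (assumption)
    5: lambda z_b, z_c, z_d: median([z_b, z_c, z_d]),
}

def choose_predictor(z_a, z_b, z_c, z_d):
    """Return (mode, difference) for the method whose prediction is closest to z_a.

    Assumed selection rule: smallest absolute difference, ties go to the lowest mode.
    """
    best = None
    for mode in sorted(PREDICTORS):
        diff = z_a - PREDICTORS[mode](z_b, z_c, z_d)
        if best is None or abs(diff) < abs(best[1]):
            best = (mode, diff)
    return best

def reconstruct(mode, diff, z_b, z_c, z_d):
    """Decoder side: rebuild the representative value from the signalled mode and difference."""
    return PREDICTORS[mode](z_b, z_c, z_d) + diff
```

Signalling the mode together with the difference value lets the decoder repeat exactly the same prediction, at the price of a few extra bits per segment.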
- When there is neither a prediction reference pixel nor a segment having a prediction reference pixel, the predictive encoding unit 25 sets both the representative value and the pixel value of the prediction reference pixel to 0. That is, when specifying the prediction reference pixel based on the representative pixel, the prediction encoding unit 25 sets the representative value of the segment having the prediction reference pixel and the pixel value of the prediction reference pixel to 0 if no prediction reference pixel exists.
- FIG. 12 shows cases where the number of pixels consisting of the representative pixel and the pixels on the same scanning line as the representative pixel is 1 to 3, but the number of pixels is of course not limited to this and may be 4 or more. Those cases can be processed in the same manner as described in the three examples.
- The predictive encoding unit 25 encodes the difference value by the exponential Golomb encoding method; however, the encoding method is not limited to this.
- The exponential Golomb coding method makes the codewords for values near 0 very short at the cost of making the codewords for values far from 0 very long. For this reason, when the prediction accuracy is not very good, using general Golomb coding instead of exponential Golomb coding can compress the amount of information more effectively. That is, it is desirable to select the encoding method based on the prediction accuracy (the distribution of differences between representative values and predicted values).
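The trade-off between the two code families can be illustrated with a small sketch. The signed-to-unsigned mapping (0, 1, -1, 2, -2, ...) is an assumption borrowed from common practice; the text does not state which mapping is used. The second function is Golomb-Rice coding, a special case of Golomb coding whose parameter is a power of two, and the parameter `r` is likewise an illustrative choice.

```python
def to_unsigned(v):
    """Map a signed difference to a non-negative index (assumed order: 0, 1, -1, 2, -2, ...)."""
    return 2 * v - 1 if v > 0 else -2 * v

def exp_golomb(v):
    """Order-0 exponential Golomb codeword of a signed value, as a bit string."""
    bits = bin(to_unsigned(v) + 1)[2:]
    return "0" * (len(bits) - 1) + bits          # zero prefix, then the binary value

def golomb_rice(v, r):
    """Golomb-Rice codeword with divisor 2**r: unary quotient, '0' stop bit, r-bit remainder."""
    k = to_unsigned(v)
    quotient = k >> r
    remainder = format(k & ((1 << r) - 1), "0{}b".format(r)) if r > 0 else ""
    return "1" * quotient + "0" + remainder

# Compare codeword lengths for a small and a moderately large difference.
for value in (0, 40):
    print(value, len(exp_golomb(value)), len(golomb_rice(value, 4)))
```

With `r = 4`, the difference 0 costs 1 bit in exponential Golomb and 5 bits in Golomb-Rice, while the difference 40 costs 13 bits and 9 bits respectively. The advantage of Golomb-Rice holds only for differences of moderate size relative to the chosen parameter, which is why the parameter would have to follow the observed distribution of differences.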
- the prediction value is a value obtained by multiplying the representative value of that segment by ⁇ 1, so that the value is far from zero. Therefore, for the first segment, a code word obtained by encoding the representative value of the segment itself with a fixed-length encoding method (for example, 8 bits) instead of the difference from the predicted value may be used. In this case, the amount of information can be further compressed.
- In the above description, the moving image encoding apparatus 1 encodes the texture image # 1 using AVC encoding defined in the H.264 / MPEG-4 AVC standard, but the present invention is not limited to this. That is, the image encoding unit 11 of the moving image encoding apparatus 1 may encode the texture image # 1 using another encoding method such as MPEG-2 or MPEG-4.
- The texture image # 1 may also be encoded using the encoding method established as the H.265 / HEVC standard.
- The image division processing unit 21 defines a plurality of segments obtained by dividing the entire region of the texture image # 2, such that, for the pixel group included in each segment, the difference between the maximum pixel value and the minimum pixel value is equal to or less than a predetermined threshold value.
- the distance image division processing unit 22 defines a plurality of segments obtained by dividing the entire area of the distance image # 2 with the same division pattern as the plurality of segment division patterns defined by the image division processing unit 21. Further, for each segment defined by the distance image division processing unit 22, the distance value correction unit 23 calculates a representative value # 23a from the distance value of each pixel included in the segment.
- the distance image encoding unit 20 generates encoded data # 25 including a plurality of representative values # 23a calculated by the distance value correcting unit 23.
- As the encoded data # 25 of the distance image # 2 transmitted to the moving image decoding apparatus, the moving image encoding apparatus 1 transmits at most as many representative values # 23a as there are segments.
- the code amount of the encoded data of the distance image is clearly larger than the code amount of the encoded data # 25.
- When the image segmentation processing unit 21 defines a plurality of segments by the method described in (Appendix 1) above and the texture image is an image of 1024 × 768 dots, the number of pixels included in each segment is about 3000 to 5000.
- the encoding method of this embodiment is also used for the code amount per block of the distance image when AVC encoding is used. In this case, the code amount per segment of the distance image becomes larger.
- The moving image encoding apparatus 1 can therefore reduce the code amount of the encoded data of the distance image # 2 compared to a conventional moving image encoding apparatus that AVC-encodes the distance image # 2 and transmits it to the moving image decoding apparatus.
- the distance image division processing unit 22 divides the distance image # 2 into segments, and the distance value correction unit 23 approximates the distance values of the pixels included in the segments to determine representative values.
- the number assigning unit 24 assigns numbers to the segments in the raster scan order.
- The predictive encoding unit 25 calculates a predicted value of the representative value of each segment based on pixels that are close to the segment and whose raster scan order is earlier than that of the pixels included in the segment. It then subtracts the predicted value from the representative value to calculate a difference value, arranges the difference values in numerical order, and encodes them to generate encoded data # 25.
- With the above configuration, the moving image encoding apparatus 1 can generate encoded data in which the spatial redundancy between segments is compressed, even when the distance image # 2 transmitted to the moving image decoding apparatus is divided into segments of arbitrary shape. Therefore, the moving image encoding device 1 has the effect that the code amount of the encoded data of the distance image # 2 transmitted to the moving image decoding device can be further reduced.
- The moving picture decoding apparatus decodes the texture image # 1 ′ and the distance image # 2 ′ from the encoded data # 28 transmitted from the moving picture encoding apparatus 1 described above, for each frame constituting the moving picture to be decoded.
- FIG. 16 is a block diagram illustrating a main configuration of the video decoding device.
- The moving image decoding apparatus 2 includes an image decoding unit 12, an image division processing unit (dividing unit) 21 ', a number assigning unit (numbering unit, assigning unit) 24 ', an unpackaging unit (receiving unit) 31, and a predictive decoding unit (predicted value calculating means, pixel value setting means) 32.
- the unpackaging unit 31 extracts the encoded data # 11 of the texture image # 1 and the encoded data # 25 of the distance image # 2 from the received encoded data # 28.
- the image decoding unit 12 decodes the texture image # 1 'from the encoded data # 11.
- The image decoding unit 12 is the same as the image decoding unit 12 included in the moving image encoding device 1. That is, as long as no noise is mixed into the encoded data # 28 while it is transmitted from the moving image encoding apparatus 1 to the moving image decoding apparatus 2, the image decoding unit 12 decodes a texture image # 1 ′ having the same content as the texture image decoded by the image decoding unit 12 of the moving image encoding device 1.
- the image division processing unit 21 ′ divides the entire area of the texture image # 1 ′ into a plurality of segments (areas) using the same algorithm as the image division processing unit 21 of the video encoding device 1. Then, the image division processing unit 21 ′ generates segment information # 21 ′ including the position information of each segment, and outputs it to the number assigning unit 24 ′.
- The number assigning unit 24 ' assigns a number to each segment divided based on the segment information # 21 ', in the raster scan order, by the same algorithm as the number assigning unit 24 of the video encoding device 1.
- the number assigning unit 24 ′ generates a segment identification image # 24 ′ in which the number assigned to the segment position information is associated, and outputs the generated image to the predictive decoding unit 32.
- the segment identification image # 24 ' is information in which a number is associated with segment position information indicating the position of each segment.
- Based on the segment position information, the predictive decoding unit 32 can specify the arrangement of each segment in the entire image and the number of pixels included in each segment, and can also specify the number of pixels in the entire image. Therefore, based on the segment position information, the predictive decoding unit 32 can restore an image that is divided into segments but does not yet carry the pixel values of the pixels that form it.
- The segment identification image # 24 ′ may be an image obtained by dividing the texture image # 1 ′ into segments, assigning the segment number “i-1” to the i-th segment in the raster scan order, and replacing the pixel value of each pixel of the texture image # 1 ′ included in the i-th segment with “i-1”.
- That is, the image division processing unit 21 ′ may divide the texture image # 1 ′ into segments, and the number assigning unit 24 ′ may assign the segment number “i-1” to the i-th segment in the raster scan order and replace the pixel value of each pixel included in the i-th segment with “i-1”.
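Building such a segment identification image can be sketched as below. The sketch assumes that the segmentation result is available as a two-dimensional list of arbitrary segment labels, and that "raster scan order" of segments means the order in which the first pixel of each segment is reached; the function name `number_segments` is hypothetical.

```python
def number_segments(labels):
    """Return an image in which every pixel of the i-th segment holds the number i-1.

    labels: 2-D list of arbitrary hashable segment labels (assumed input format).
    Segments are numbered in the order their first pixel is met in a raster scan.
    """
    numbers = {}                      # segment label -> assigned number
    ident = []
    for row in labels:                # top to bottom
        out_row = []
        for label in row:             # left to right
            if label not in numbers:
                numbers[label] = len(numbers)
            out_row.append(numbers[label])
        ident.append(out_row)
    return ident

# Usage: three segments labelled "x", "y", "z" by some segmentation step.
print(number_segments([["x", "x", "y"],
                       ["z", "x", "y"]]))
```

Because the encoder and the decoder run the same numbering on the same decoded texture image, the numbers need not be transmitted.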
- The predictive decoding unit 32 performs a predictive decoding process based on the input encoded data # 25 and the segment identification image # 24 ' to restore the distance image # 2 '. Specifically, the predictive decoding unit 32 decodes the encoded data # 25 to generate difference values arranged in order, and assigns the generated difference values, in the order of the numbers given by the number assigning unit 24 ′, to the segments defined by the segment identification image # 24 '.
- the predictive decoding unit 32 sets the distance value of the set segment as the pixel value (distance value) of all the pixels included in the segment, and restores the distance image # 2 '.
- the prediction decoding unit 32 outputs the restored distance image # 2 ′ to a stereoscopic video display device (not shown) outside the moving image decoding device 2.
- FIG. 17 is a flowchart showing the operation of the video decoding device 2.
- the operation of the moving image decoding apparatus 2 described here is an operation of decoding a texture image and a distance image of the t-th frame from the top in a three-dimensional moving image including a large number of frames. That is, the moving image decoding apparatus 2 repeats the operation described below as many times as the number of frames of the moving image in order to decode the entire moving image. Further, in the following description, unless otherwise specified, each data # 1 to # 28 is interpreted as data of the t-th frame.
- the unpackaging unit 31 extracts the encoded data # 11 of the texture image and the encoded data # 25 of the distance image from the encoded data # 28 received from the moving image encoding device 1. Then, the unpackaging unit 31 outputs the encoded data # 11 to the image decoding unit 12, and outputs the encoded data # 25 to the predictive decoding unit 32 (Step S21).
- The image decoding unit 12 decodes the texture image # 1 ′ from the input encoded data # 11, and outputs it to the image division processing unit 21 ′ and to a stereoscopic video display device (not shown) outside the moving image decoding device 2 (step S22).
- the image division processing unit 21 ′ defines a plurality of segments with the same algorithm as the image division processing unit 21 of the moving image encoding device 1. Then, the image division processing unit 21 'generates segment information # 21' composed of the position information of each segment, and outputs it to the number assigning unit 24 '(step S23).
- the number assigning unit 24 'assigns a number to each segment divided based on the segment information # 21' in the raster scan order by the same algorithm as the number assigning unit 24 of the video encoding device 1.
- the number assigning unit 24 'generates a segment identification image # 24' in which the number assigned to the segment position information is associated, and outputs the segment identifying image # 24 'to the predictive decoding unit 32 (step S24).
- The predictive decoding unit 32 performs a predictive decoding process based on the input encoded data # 25 and the segment identification image # 24 ' to restore the distance image # 2 ' (step S25). Specifically, the predictive decoding unit 32 decodes the encoded data # 25 to generate difference values arranged in order, and assigns the generated difference values, in the order of the numbers given by the number assigning unit 24 ′, to the segments defined by the segment identification image # 24 '.
- the predictive decoding unit 32 outputs the restored distance image # 2 'to a stereoscopic video display device (not shown) outside the video decoding device 2. As described above, the texture image # 1 'and the distance image # 2' can be restored.
- FIG. 18 is a flowchart illustrating an example of the predictive decoding process executed by the predictive decoding unit 32.
- The predictive decoding unit 32 decodes the encoded data # 25 input from the unpackaging unit 31, using the decoding method corresponding to the encoding method that the predictive encoding unit 25 of the video encoding device 1 used to generate the encoded data # 25, and thereby generates the difference values arranged in order (step S201). That is, in this embodiment, the predictive decoding unit 32 decodes the encoded data # 25 illustrated in FIG. 14 using the exponential Golomb encoding method illustrated in FIG.
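Step S201 can be sketched as a parser for a concatenation of exponential Golomb codewords. The sketch assumes order-0 codewords, a bit string as input, and the signed mapping 0, 1, -1, 2, -2, ... for the decoded indices; none of these details is fixed by the text, and the function name `decode_exp_golomb_stream` is hypothetical.

```python
def decode_exp_golomb_stream(bits):
    """Decode concatenated order-0 exponential Golomb codewords into signed differences.

    bits: string of '0'/'1' characters (assumed representation of encoded data).
    """
    values = []
    pos = 0
    while pos < len(bits):
        # Count the leading zeros of the current codeword.
        zeros = 0
        while pos + zeros < len(bits) and bits[pos + zeros] == "0":
            zeros += 1
        end = pos + 2 * zeros + 1
        if end > len(bits):
            raise ValueError("truncated codeword")
        # The remaining zeros + 1 bits hold the unsigned index plus one.
        k = int(bits[pos + zeros:end], 2) - 1
        # Undo the assumed signed mapping: odd index -> positive, even index -> non-positive.
        values.append((k + 1) // 2 if k % 2 else -(k // 2))
        pos = end
    return values

print(decode_exp_golomb_stream("101001100100"))
```

Since every codeword announces its own length through its zero prefix, the difference values can be recovered in order without any separators between them.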
- The predictive decoding unit 32 assigns the difference values, taken in order from the top, to the segments defined by the segment information # 21 ′ of the segment identification image # 24 ′, according to the order of the numbers given by the number assigning unit 24 ′ (step S202).
- Next, “i”, which is the number assigned by the number assigning unit 24 ′, is set to “0” (step S203). Then, the segment with the number “i” assigned by the number assigning unit 24 ′ is set as the decoding target segment (decoding target region) (step S204). That is, the segment with the head number assigned by the number assigning unit 24 ' is first set as the decoding target segment.
- The representative pixel of the decoding target segment, which is used for calculating the predicted value, is specified from the pixels included in the decoding target segment (step S205). Specifically, among the pixels included in the decoding target segment, the pixel scanned first in the raster scan order of step S24 is set as the representative pixel.
- After identifying the representative pixel, the predictive decoding unit 32 specifies the prediction reference pixel based on the identified representative pixel, using the same method as the method of specifying the prediction reference pixel used by the predictive coding unit 25 of the video encoding device 1 (step S206). Specifically, a pixel that is adjacent to a pixel included in the decoding target segment and lying on the same scan line as the representative pixel, and whose raster scan order is earlier than that of the representative pixel, is set as a prediction reference pixel.
- The prediction reference pixel may be selected from a pixel group consisting of pixels adjacent to the pixels of the decoding target segment that lie on the same scan line as the representative pixel, namely: the pixel one before the representative pixel in the raster scan order; the pixels on the scan line preceding that of the representative pixel in the raster scan order; and the pixel on that preceding scan line that is adjacent to the pixel following, in the raster scan order, the last pixel of the decoding target segment on the same scan line as the representative pixel.
- For example, the prediction reference pixels may be three pixels, each adjacent to a pixel of the decoding target segment lying on the same scan line as the representative pixel: the pixel immediately before the representative pixel in the raster scan order; one of the pixels on the scan line preceding that of the representative pixel in the raster scan order; and the pixel on that preceding scan line that is adjacent to the pixel following, in the raster scan order, the last pixel of the decoding target segment on the same scan line as the representative pixel.
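One possible reading of this geometry is sketched below. It assumes the segment identification image is a two-dimensional list of segment numbers, takes the pixel directly above the representative pixel as the pixel chosen from the preceding scan line, and treats positions outside the image as missing. The names `reference_pixels`, `B`, `C`, and `D` follow the pixel labels of FIG. 12 only loosely and are illustrative.

```python
def reference_pixels(ident, seg):
    """Locate the representative pixel of segment `seg` and three candidate reference pixels.

    ident: 2-D list of segment numbers (assumed format of the identification image).
    Returns ((y, x), {"B": ..., "C": ..., "D": ...}); a missing position is None.
    """
    height, width = len(ident), len(ident[0])

    def inside(y, x):
        return (y, x) if 0 <= y < height and 0 <= x < width else None

    for y in range(height):                      # raster scan: first hit is the representative
        for x in range(width):
            if ident[y][x] != seg:
                continue
            end = x                              # last pixel of the segment's run on this line
            while end + 1 < width and ident[y][end + 1] == seg:
                end += 1
            return (y, x), {
                "B": inside(y, x - 1),           # immediately before the representative pixel
                "C": inside(y - 1, x),           # preceding scan line, above the representative
                "D": inside(y - 1, end + 1),     # preceding scan line, above the pixel after the run
            }
    raise ValueError("segment number not present in the image")

ident = [[0, 0, 1],
         [2, 2, 1],
         [2, 2, 3]]
print(reference_pixels(ident, 2))
```

All three positions precede the representative pixel in the raster scan order, so the segments they belong to have smaller numbers and have already been decoded when the target segment is processed.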
- Using the same method as the method of calculating the predicted value used by the prediction encoding unit 25 of the video encoding device 1, the prediction decoding unit 32 calculates a predicted value of the representative value of the decoding target segment from the specified prediction reference pixel (step S207).
- The predicted value may be the median value of the pixel values of the prediction reference pixels.
- The predicted value may be the average value of the pixel values of the prediction reference pixels.
- The predicted value may be any one of the pixel values of the prediction reference pixels.
- the prediction decoding unit 32 adds the difference value assigned to the decoding target segment to the calculated prediction value, and sets the value as a representative value of the decoding target segment (step S208). Then, the predictive decoding unit 32 sets the pixel values of all the pixels included in the decoding target segment to the representative values of the set decoding target segment (step S209).
- After step S209, it is confirmed whether the pixel values of the pixels included in the segments have been set for all (M) segments. If pixel values have not been set for all the segments, the processes of steps S204 to S209 are executed for the remaining segments in the order of the numbers assigned by the number assigning unit 24 '. If pixel values have been set for all segments, the process proceeds to step S212.
- In step S212, all the segments in which the pixel values of their pixels have been set are combined to restore the distance image # 2 '.
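The loop of steps S203 to S212 can be sketched as follows. This is a simplified sketch under stated assumptions: segments are numbered 0 to M-1 in the order their first pixel is met in a raster scan, the three reference pixels are the pixel before the representative pixel, the pixel above it, and the pixel above the position following the segment's run on that scan line, a missing reference pixel counts as 0, and the median is used as the predicted value. The function name `predictive_decode` is hypothetical.

```python
from statistics import median

def predictive_decode(ident, diffs):
    """Restore a distance image from a segment identification image and decoded differences."""
    height, width = len(ident), len(ident[0])
    depth = [[0] * width for _ in range(height)]

    # Collect the pixels of each segment in raster scan order.
    pixels = {}
    for y in range(height):
        for x in range(width):
            pixels.setdefault(ident[y][x], []).append((y, x))

    def value(y, x):
        # A reference pixel outside the image is treated as 0.
        return depth[y][x] if 0 <= y < height and 0 <= x < width else 0

    for i, diff in enumerate(diffs):             # S203, S204: segments in number order
        y, x = pixels[i][0]                      # S205: representative pixel
        end = x
        while end + 1 < width and ident[y][end + 1] == i:
            end += 1
        refs = [value(y, x - 1), value(y - 1, x), value(y - 1, end + 1)]   # S206
        rep = median(refs) + diff                # S207, S208: prediction plus difference
        for py, px in pixels[i]:                 # S209: fill the whole segment
            depth[py][px] = rep
    return depth                                 # S212: combined distance image

ident = [[0, 0, 1],
         [2, 2, 1],
         [2, 2, 3]]
print(predictive_decode(ident, [100, 104, -10, 5]))
```

Each segment only refers to pixels of segments with smaller numbers, so a single pass in number order is sufficient.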
- The distance image # 2 ′ decoded by the prediction decoding unit 32 in step S25 is generally a distance image that approximates the distance image # 2 input to the video encoding device 1.
- This is because the distance image # 2 ′ is the same as the image obtained by changing, in the distance image # 2, the distance values of a very small part of the pixels included in each segment to the representative value of that segment. The distance image # 2 ′ can therefore be said to approximate the distance image # 2.
- the image division processing unit 21 ′ defines a plurality of segments obtained by dividing the entire area of the texture image # 1 ′. Specifically, the image division processing unit 21 ′ defines a plurality of segments each including a group of pixels each having a similar color.
- the predictive decoding unit 32 reads the encoded data # 25.
- the encoded data # 25 is data including at most one representative value # 23a as a distance value for each of a plurality of segments constituting the distance image # 2 'to be decoded. Note that the division pattern of the plurality of segments constituting the distance image # 2 'to be decoded is the same as the division pattern of the plurality of segments defined by the image division processing unit 21'.
- the moving image decoding apparatus 2 uses a decoding method corresponding to an encoding method in which the representative value of the segment obtained by dividing the distance image # 2 by the moving image encoding apparatus 1 is encoded into encoded data # 25. Since the encoded data # 25 is decoded, the representative value of each segment generated by the video encoding device 1 can be accurately restored. Therefore, the moving image decoding apparatus 2 can accurately restore the distance image # 2 'that approximates the distance image # 2.
- As described above, the distance image # 2 ′ restored from the encoded data # 25 by the moving image decoding apparatus 2 is similar to the distance image # 2 encoded by the moving image encoding apparatus 1. Therefore, the moving image decoding device 2 can decode an appropriate distance image.
- the distance image # 2 'decoded by the video decoding device 2 has further advantages.
- The contour of the subject in the generated three-dimensional image depends on the shape of the boundary between the subject and the background in the distance image # 2.
- Although the position of the boundary between the subject and the background should match between the texture image # 1 ' and the distance image # 2, in some cases the positions of the boundary do not match.
- the texture image reproduces the shape of the edge portion between the subject and the background more faithfully.
- the position of the boundary between the subject and the background in the distance image # 2 ′ decoded by the moving image decoding apparatus 2 often coincides with the position of the boundary between the subject and the background in the texture image # 1. This is because, in general, the subject color and the background color are significantly different in the texture image # 1, and the boundary between the subject and the background becomes the segment boundary in the texture image # 1.
- the three-dimensional image reproduced by the stereoscopic image display device from the texture image # 1 ′ and the distance image # 2 ′ output from the moving image decoding apparatus 2 according to the present embodiment is the texture image # 1 ′ and the distance image # 2.
- the video decoding device 2 restores the distance image # 2 ′ from the encoded data # 28 using a decoding method corresponding to the encoding method used by the video encoding device 1. For this reason, the moving image encoding device 1 and the moving image decoding device 2 may determine the encoding method and the decoding method in advance before performing the encoding and decoding processes, respectively.
- Alternatively, the video decoding device 2 may receive information indicating the encoding method together with the encoded data # 28 (encoded data # 25) from the video encoding device 1, specify the decoding method corresponding to the encoding method indicated by the received information, and restore the distance image # 2 ′ based on the specified decoding method.
- information indicating the encoding method may be associated with each segment included in the encoded data # 25.
- The information indicating the encoding method includes information indicating the variable-length or fixed-length encoding method used for converting a difference value into a code word, and prediction reference pixel specifying method information indicating the method of specifying a prediction reference pixel based on a representative pixel. It may further include the following:
- division method information indicating the segment division method by which the image division processing unit 21 divides the image into segments;
- numbering rule information indicating the order (rule) in which the number assigning unit 24 assigns the numbers;
- representative pixel specifying method information indicating the method of specifying a representative pixel.
- When the moving image decoding apparatus 2 receives, together with the encoded data # 28, information indicating that the representative value of the first segment has been encoded by a fixed-length encoding method, it decodes only the first code word of the encoded data # 25 by the fixed-length encoding method, sets the representative value of the first segment to the decoded value, and sets the pixel values of all the pixels included in that segment to the decoded value.
- In the above description, the moving image decoding apparatus 2 receives the encoded data # 28 including the encoded data # 11 of the texture image and the encoded data # 25 of the distance image; however, the present invention is not limited to this.
- the moving image decoding apparatus 2 may receive the encoded data # 25 of the distance image and the position information of the segment.
- the number assigning unit 24 ′ assigns a number to each segment divided based on the segment position information in the raster scan order.
- the number assigning unit 24 ′ generates a segment identification image # 24 ′ in which the number assigned to the segment position information is associated, and outputs the segment identifying image # 24 ′ to the predictive decoding unit 32.
- In this way, the moving image decoding apparatus 2 can restore the distance image by receiving the segment position information together with the encoded data # 25 of the distance image.
- In the above description, the moving image encoding apparatus 1 transmits the encoded data # 25 to the moving image decoding apparatus 2; however, the encoded data # 25 may instead be supplied as follows.
- The moving image encoding apparatus 1 and the moving image decoding apparatus 2 may each be provided with access means capable of accessing a removable recording medium, such as an optical disk drive, and the encoded data # 25 may be supplied from the moving image encoding apparatus 1 to the moving image decoding apparatus 2 via the recording medium.
- the encoding apparatus of the present invention does not necessarily include a means for transmitting data
- the decoding apparatus of the present invention does not necessarily include a receiving means for receiving data.
- The moving image encoding apparatus according to the present embodiment encodes texture images using the MVC coding adopted as the MVC standard in H.264 / AVC, while it encodes distance images using the encoding technique peculiar to the present invention.
- the moving image encoding apparatus according to the present embodiment is different from the moving image encoding apparatus 1 in that a plurality of sets (N sets) of texture images and distance images are encoded per frame.
- the N sets of texture images and distance images are images of subjects simultaneously captured by cameras and ranging devices installed at N locations so as to surround the subject. That is, the N sets of texture images and distance images are images for generating a free viewpoint image.
- Each set of texture image and distance image includes the actual data of the texture image and distance image of the set and, as metadata, a camera parameter indicating from which azimuth angle the images were generated by the camera and the distance measuring device.
- FIG. 19 is a block diagram showing a main configuration of the moving picture encoding apparatus according to the present embodiment.
- the moving image encoding apparatus 1A includes an image encoding unit 11A, an image decoding unit 12A, a distance image encoding unit 20A, and a packaging unit (transmission means) 28A.
- the distance image encoding unit 20A includes an image division processing unit 21A, a distance image division processing unit (dividing unit) 22A, a distance value correcting unit (representative value determining unit) 23A, a number assigning unit (number assigning unit) 24A, and A predictive encoding unit (predicted value calculating means, difference value calculating means, encoding means) 25A is provided.
- The image encoding unit 11A encodes the N view components (that is, the texture images # 1-1 to # 1-N) by MVC encoding (multi-view video encoding) defined in the MVC standard in H.264 / AVC, and generates encoded data # 11-1 to # 11-N of the respective view components. Further, the image encoding unit 11A outputs the encoded data # 11-1 to # 11-N to the image decoding unit 12A and the packaging unit 28A together with the view IDs “1” to “N”, which are parameters of the NAL header extension.
- The image decoding unit 12A decodes the texture images # 1′-1 to # 1′-N from the encoded data # 11-1 to # 11-N of the texture image # 1 by the decoding method stipulated in the MVC standard.
- the image division processing unit 21 divides the entire area of the texture image # 1'-j into a plurality of segments (areas). Then, the image division processing unit 21 outputs segment information # 21-j including the position information of each segment.
- For each segment in the texture image # 1′-j, the distance image division processing unit 22A extracts, from the distance image # 2-j, a distance value set consisting of the distance values of the pixels included in the corresponding segment (region). Then, the distance image division processing unit 22A generates, from the segment information # 21-j, segment information # 22-j in which the distance value set and the position information are associated with each segment.
- The distance image division processing unit 22A generates the view ID “j” of the distance image # 2-j, and generates segment information # 22A-j in which the view ID “j” is associated with the segment information # 22-j.
- For each segment of the distance image # 2-j, the distance value correcting unit 23A calculates the mode value of the distance value set of the segment included in the segment information # 22A-j as the representative value # 23a-j. Then, the distance value correcting unit 23A replaces the distance value set of each segment included in the segment information # 22A-j with the representative value # 23a-j of the corresponding segment, and outputs the result to the number assigning unit 24A as segment information # 23A-j.
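Taking the mode of a distance value set can be sketched as below. The input format (a plain list of distance values per segment), the tie-breaking rule (the smallest of equally frequent values), and the function name `representative_value` are assumptions for illustration; the text does not say how ties are resolved.

```python
from collections import Counter

def representative_value(distance_values):
    """Return the mode of a segment's distance values (ties: smallest value, an assumption)."""
    counts = Counter(distance_values)
    top = max(counts.values())
    return min(v for v, c in counts.items() if c == top)

# Usage: one segment whose pixels mostly lie at distance 12, with a few outliers.
print(representative_value([12, 12, 13, 12, 40]))
```

Unlike the average, the mode is always a value that actually occurs in the segment and is not pulled toward a few outlying pixels.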
- When the segment information # 23A-j is input, the number assigning unit 24A associates, for each of the M_j sets of position information and representative value # 23a-j contained in the segment information # 23A-j, the representative value # 23a-j with the segment number # 24-j corresponding to the position information.
- The number assigning unit 24A then associates the M_j sets of segment numbers # 24-j and representative values # 23a-j with the view ID “j” included in the segment information # 23A-j, and outputs the resulting data # 24A-j to the predictive coding unit 25A.
- The predictive encoding unit 25A performs predictive encoding processing for each viewpoint based on the M_j sets of representative values # 23a-j and segment numbers # 24-j included in the input data # 24A-j, and outputs the obtained encoded data # 25-j to the packaging unit 28A.
- Specifically, the predictive coding unit 25A calculates the predicted value of each segment in the order of the segment numbers # 24-j, subtracts the predicted value from the representative value # 23a-j to calculate the difference value, and encodes the difference value.
- The predictive encoding unit 25A arranges the encoded difference values in the order of the segment numbers # 24-j to generate encoded data # 25-j.
- The prediction encoding unit 25A transmits to the packaging unit 28A the encoded data of the distance image # 2-j obtained in this way for each j from 1 to N as a VCL NAL unit, and the view ID “j” as a non-VCL NAL unit.
- the packaging unit 28A integrates the encoded data # 11-1 to # 11-N of the texture images # 1-1 to # 1-N and the encoded data # 25A to thereby convert the encoded data # 28A. Generate. Then, the packaging unit 28A transmits the encoded data # 28A to the video decoding device.
- FIG. 20 is a block diagram showing a main configuration of the moving picture decoding apparatus according to the present embodiment.
- The moving picture decoding apparatus 2A includes an image decoding unit 12A, an image division processing unit (dividing means) 21A′, a number assigning unit (number assigning means, assigning means) 24A′, an unpackaging unit (receiving means) 31A, and a predictive decoding unit (predicted value calculating means, pixel value setting means) 32A.
- The image decoding unit 12A decodes the texture images # 1′-1 to # 1′-N from the encoded data # 11-1 to # 11-N of the texture image # 1 by a decoding method defined in the MVC standard.
- the unpackaging unit 31A extracts the encoded data # 11-j of the texture image # 1 and the encoded data # 25A of the distance image # 2 from the received encoded data # 28A.
- the image division processing unit 21A ' divides the entire region of the texture image # 1'-j into a plurality of segments (regions) by the same algorithm as the image division processing unit 21A of the moving image encoding device 1A. Then, the image division processing unit 21A ′ generates segment information # 21′-j including the position information of each segment, and outputs it to the number assigning unit 24A ′.
- the number assigning unit 24A 'assigns a number to each segment divided based on the segment information # 21'-j in the raster scan order by the same algorithm as the number assigning unit 24A of the moving image encoding device 1A.
- The number assigning unit 24A' generates a segment identification image # 24'-j in which the position information of each segment is associated with the number assigned to it, and outputs the image to the predictive decoding unit 32A.
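- As a rough sketch of how a segment identification image could be built, the following numbers segments by the raster-scan position of their first pixel. The input format (a 2-D list of arbitrary labels) and the rule "order by first pixel" are assumptions made for illustration; the text only states that numbers are assigned in raster scan order.

```python
def number_segments(labels):
    """Renumber segments by the raster-scan position of their first pixel.

    labels: 2-D list in which equal entries mark pixels of the same segment
            (a hypothetical input format).
    Returns the segment identification image and the label -> number mapping.
    """
    mapping = {}
    for row in labels:              # scan lines, top to bottom
        for lab in row:             # pixels, left to right
            if lab not in mapping:
                mapping[lab] = len(mapping)
    ident = [[mapping[lab] for lab in row] for row in labels]
    return ident, mapping

labels = [
    ["b", "b", "a"],
    ["c", "b", "a"],
    ["c", "c", "a"],
]
ident, mapping = number_segments(labels)
```

- Because the encoder and the decoder run the same deterministic numbering on the same segmentation, no segment numbers need to be transmitted.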
- the predictive decoding unit 32A extracts the encoded data # 25-j and the view ID “j” from the input encoded data # 25A. Next, predictive decoding processing is performed based on the encoded data # 25-j and the segment identification image # 24'-j to restore the distance images # 2'-1 to # 2'-N. Specifically, the prediction decoding unit 32A decodes the distance image # 2'-j as follows.
- The predictive decoding unit 32A decodes the encoded data # 25-j to generate difference values arranged in order, and assigns the generated difference values, in the order of the numbers given by the number assigning unit 24A′, to the segments defined in the segment identification image # 24'-j. Next, for each segment in the order given by the number assigning unit 24A′, the predictive decoding unit 32A calculates the predicted value of the segment, adds the assigned difference value to the calculated predicted value, and sets the sum as the distance value of the segment. The predictive decoding unit 32A then sets the distance value of each segment as the pixel value (distance value) of all the pixels included in the segment, thereby restoring the distance image # 2'-j. The predictive decoding unit 32A associates the restored distance image # 2′-j with the view ID “j” included in the encoded data # 25A and outputs it to a stereoscopic video display device (not shown) outside the video decoding device 2A.
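- The restoration loop described above can be sketched as follows. As in any illustration of this kind, the predictor is a hypothetical placeholder (previous segment's restored value, 0 for the first segment), and the function and variable names are not taken from the source.

```python
def decode_distance_image(diffs, ident, predict):
    """Rebuild per-segment distance values and paint them into an image.

    diffs:   decoded difference values, in segment-number order.
    ident:   segment identification image (2-D list of segment numbers).
    predict: callable (segment_number, restored_so_far) -> predicted value.
    """
    restored = []
    for n, diff in enumerate(diffs):
        # distance value = predicted value + assigned difference value
        restored.append(predict(n, restored) + diff)
    # Every pixel of a segment receives that segment's distance value.
    image = [[restored[num] for num in row] for row in ident]
    return image, restored

def previous_segment_predictor(n, earlier):
    # Hypothetical predictor; must match the one used by the encoder.
    return earlier[-1] if earlier else 0

ident = [[0, 0, 1], [2, 0, 1], [2, 2, 1]]
image, restored = decode_distance_image([100, 2, -1], ident, previous_segment_predictor)
```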
- The image decoding unit 12A is the same as the image decoding unit 12 of the video decoding device 2 of the first embodiment, so its description is omitted.
- In the description above, the moving image encoding device 1A and the moving image decoding device 2A perform the encoding process and the decoding process on N sets of texture images and distance images of a subject captured simultaneously by cameras and ranging devices installed at N locations so as to surround the subject.
- The moving image encoding device 1A and the moving image decoding device 2A can also perform the encoding process and the decoding process on N sets of texture images and distance images generated as follows.
- The moving image encoding device 1A and the moving image decoding device 2A can perform the encoding process and the decoding process on N sets of texture images and distance images generated by N sets of cameras and ranging devices installed in one place so that each set faces a different direction. That is, the moving image encoding device 1A and the moving image decoding device 2A can perform the encoding process and the decoding process on N sets of texture images and distance images for generating omnidirectional images, panoramic images, and the like.
- In this case, each set of texture image and distance image includes, as metadata alongside the actual data of the texture image and the distance image, camera parameters indicating the direction in which the camera and the ranging device that generated the set were facing.
- Although the image encoding unit 11A of the moving image encoding device 1A encodes the texture images # 1-1 to # 1-N using the MVC encoding defined in the MVC standard of H.264/AVC, the present invention is not limited to this.
- The image encoding unit 11A of the moving image encoding device 1A may encode the texture images # 1-1 to # 1-N using another encoding method such as a VSP (View Synthesis Prediction) encoding method, an MVD encoding method, or an LVD (Layered Video Depth) encoding method.
- In that case, the image decoding unit 12A of the video decoding device 2A may be configured to decode the texture images # 1′-1 to # 1′-N by a decoding method corresponding to the encoding method employed by the image encoding unit 11A.
- An encoding apparatus according to the present invention is an encoding apparatus that encodes an image, and includes: dividing means for dividing the entire area of the image into a plurality of regions; representative value determining means for determining, for each region divided by the dividing means, a representative value from the pixel values of the pixels included in the region; number assigning means for assigning numbers to the plurality of regions in raster scan order; predicted value calculating means for setting each region as an encoding target region in the order of the numbers given by the number assigning means, setting the first pixel in the raster scan order among the pixels included in the encoding target region as a representative pixel, setting as prediction reference pixels the pixels that are adjacent to a pixel included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in the raster scan order, and calculating a predicted value of the encoding target region based on at least one of the representative values of the regions having the prediction reference pixels; difference value calculating means for calculating, for each encoding target region, a difference value by subtracting the predicted value calculated by the predicted value calculating means from the representative value determined by the representative value determining means; and encoding means for arranging and encoding the difference values calculated by the difference value calculating means in the order given by the number assigning means to generate encoded data of the image.
- the number assigning unit assigns numbers in the raster scan order to the plurality of regions into which the dividing unit has divided the image.
- The predicted value calculating means sets each region as the encoding target region in the order of the numbers given by the number assigning means, sets the first pixel in the raster scan order among the pixels included in the encoding target region as the representative pixel, and sets as prediction reference pixels the pixels that are adjacent to a pixel included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in the raster scan order.
- the predicted value calculation means calculates the predicted value of the encoding target region based on at least one of the representative values of the region having the predicted reference pixel.
- the difference value calculation means subtracts the prediction value calculated by the prediction value calculation means from the representative value determined by the representative value determination means for each encoding target region to calculate a difference value. Then, the encoding unit arranges and encodes the difference values calculated by the difference value calculation unit in the order given by the number assigning unit, and generates encoded data of the image.
- the order of the areas can be uniquely specified.
- the representative pixel used when calculating the predicted value of the representative value of each region and the prediction target pixel based on the representative pixel can be uniquely specified. Therefore, the predicted value of the encoding target area determined from the representative value of the area adjacent to the encoding target area can be uniquely calculated.
- the prediction reference pixel for a certain area needs to be the same at the time of encoding and at the time of decoding. Therefore, a prediction reference pixel for a certain area needs to be decoded before the certain area, that is, needs to be encoded first.
- An encoding method according to the present invention is an encoding method of an encoding device that encodes an image, and includes, in the encoding device: a dividing step of dividing the entire region of the image into a plurality of regions; a representative value determining step of determining, for each region, a representative value from the pixel values of the pixels included in the region; a number assigning step of assigning numbers to the plurality of regions in raster scan order; a predicted value calculating step of setting each region as an encoding target region in the order of the numbers given in the number assigning step, setting the first pixel in the raster scan order among the pixels included in the encoding target region as a representative pixel, setting as prediction reference pixels the pixels that are adjacent to a pixel included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in the raster scan order, and calculating a predicted value of the encoding target region based on at least one of the representative values of the regions having the prediction reference pixels; a difference value calculating step of calculating a difference value by subtracting the predicted value calculated in the predicted value calculating step from the representative value determined in the representative value determining step; and an encoding step of arranging and encoding the difference values calculated in the difference value calculating step in the order given in the number assigning step to generate encoded data of the image.
- the encoding method according to the present invention has the same effects as the encoding apparatus according to the present invention.
- It is desirable that the encoding apparatus further include transmission means for associating the encoded data of the image generated by the encoding means with the region information defining the plurality of regions, and transmitting them to the outside.
- The transmission means transmits the encoded data of the image generated by the encoding means and the region information defining the plurality of regions in association with each other. This has the further effect that a device that has received the encoded data and the region information can accurately decode the received encoded data by dividing the image into the plurality of regions based on the region information.
- It is desirable that the encoding means encode the difference value by a variable-length encoding method in which the code word is shorter the closer the value to be encoded is to 0.
- the encoding means encodes the difference value by a variable length encoding method in which the code word is shorter as the value to be encoded is closer to 0.
- When the predicted value of the encoding target region calculated by the predicted value calculating means approximates the representative value of the encoding target region (that is, when the prediction accuracy of the predicted value calculating means is high), the difference value is very small. Therefore, when the prediction accuracy of the predicted value calculating means is high, encoding the difference value by such a variable-length encoding method has the further effect of reducing the code amount of the encoded data.
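- One well-known code with the stated property (shorter code words for values closer to 0) is the signed Exp-Golomb code used in H.264/AVC. The text does not name a specific code, so the choice of Exp-Golomb here is an assumption made purely for illustration.

```python
def signed_to_unsigned(v):
    # Map 0, 1, -1, 2, -2, ... to 0, 1, 2, 3, 4, ...
    return 2 * v - 1 if v > 0 else -2 * v

def exp_golomb(v):
    """Return the signed Exp-Golomb code word of integer v as a bit string."""
    code_num = signed_to_unsigned(v) + 1
    bits = bin(code_num)[2:]
    # Prefix of (len - 1) zeros, followed by the binary value itself.
    return "0" * (len(bits) - 1) + bits

for value in (0, 1, -1, 2, -41):
    print(value, exp_golomb(value), len(exp_golomb(value)))
```

- A difference value of 0 costs a single bit, whereas a difference of -41 costs 13 bits, which is why accurate prediction directly reduces the code amount.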
- It is desirable that the predicted value calculating means use, as the prediction reference pixels, a pixel group consisting of: the pixel immediately before the representative pixel in the raster scan order; the pixels on the previous scan line in the raster scan order of the representative pixel that are adjacent to the pixels included in the encoding target region on the same scan line as the representative pixel; and the pixel on the previous scan line that is adjacent to the pixel following, in the raster scan order, the last pixel included in the encoding target region on the same scan line as the representative pixel.
- the pixel immediately before the representative pixel in the raster scan order is a pixel adjacent in the left direction of the representative pixel.
- A pixel on the previous scan line in the raster scan order that is adjacent to a pixel included in the encoding target region on the same scan line as the representative pixel is a pixel adjacent in the upward direction to that pixel of the encoding target region.
- The pixel on the previous scan line that is adjacent to the pixel following, in the raster scan order, the last pixel included in the encoding target region on the same scan line as the representative pixel is the pixel adjacent to that last pixel in the diagonally upper-right direction.
- In other words, pixels in three directions are used as the prediction reference pixels: the pixel adjacent to the encoding target region on the left, the pixels adjacent above, and the pixel adjacent at the upper right (a pixel in the right direction of the encoding target region). Because the predicted value is calculated with reference to pixels that precede the encoding target region in the raster scan order and lie in multiple directions, the predicted value can be predicted with high accuracy.
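- The left, upper, and upper-right reference pixels can be located as in the following sketch. It assumes a segment identification image given as a 2-D list of region numbers, and it treats "the pixels of the region on the representative pixel's scan line" as the contiguous run starting at the representative pixel; both are assumptions made for illustration.

```python
def prediction_reference_pixels(ident, n):
    """Return (row, col) positions of the prediction reference pixels of region n."""
    height, width = len(ident), len(ident[0])
    # Representative pixel: first pixel of region n in raster-scan order.
    r, c = next((y, x) for y in range(height) for x in range(width)
                if ident[y][x] == n)
    # Last pixel of the region's run on the representative pixel's scan line.
    last = c
    while last + 1 < width and ident[r][last + 1] == n:
        last += 1
    refs = []
    if c > 0:
        refs.append((r, c - 1))                               # left neighbour
    if r > 0:
        refs.extend((r - 1, x) for x in range(c, last + 1))   # pixels directly above
        if last + 1 < width:
            refs.append((r - 1, last + 1))                    # upper-right neighbour
    return refs

ident = [
    [0, 0, 0, 1],
    [0, 2, 2, 1],
    [3, 2, 2, 1],
]
refs = prediction_reference_pixels(ident, 2)
ref_regions = sorted({ident[y][x] for y, x in refs})
```

- All returned pixels precede the representative pixel in raster-scan order, so the regions that contain them have already been processed when region n is encoded or decoded.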
- It is desirable that the predicted value calculating means use, as the prediction reference pixels, three pixels: the pixel immediately before the representative pixel in the raster scan order; any one of the pixels on the previous scan line in the raster scan order of the representative pixel that are adjacent to the pixels included in the encoding target region on the same scan line as the representative pixel; and the pixel on the previous scan line that is adjacent to the pixel following, in the raster scan order, the last pixel included in the encoding target region on the same scan line as the representative pixel.
- With this configuration, the predicted value calculating means uses these three pixels as the prediction reference pixels.
- That is, three pixels in three directions are used as the prediction reference pixels: a pixel adjacent to the encoding target region on the left, a pixel adjacent above, and a pixel adjacent at the upper right (a pixel in the right direction of the encoding target region). Because the prediction reference pixels lie in multiple directions with respect to the encoding target region and form as small a pixel group as possible (three pixels), there is the further effect that the processing load for calculating the predicted value is reduced while the predicted value can still be predicted with high accuracy.
- It is desirable that the predicted value calculating means set the median of the representative values of the regions having the prediction reference pixels as the predicted value of the encoding target region.
- the predicted value calculation means sets the median of the representative values of the region having the predicted reference pixel as the predicted value of the encoding target region.
- In general, the representative value of the encoding target region and the representative values of the regions having the prediction reference pixels are close to one another, but the representative value of the region having a certain prediction reference pixel may differ greatly from the representative value of the encoding target region. If the representative value of the region having a single prediction reference pixel were used as is as the predicted value of the encoding target region, and that particular prediction reference pixel were selected, the predicted value would differ greatly from the representative value of the encoding target region and the accuracy of the predicted value would fall.
- In the above configuration, by contrast, the median of the representative values of the regions having the prediction reference pixels is used as the predicted value of the encoding target region.
- the predicted value calculation means uses the average value of the representative values of the region having the predicted reference pixel as the predicted value of the encoding target region.
- The predicted value calculating means sets the average value of the representative values of the regions having the prediction reference pixels as the predicted value of the encoding target region. This has the further effect that, even when the representative value of the region having a certain prediction reference pixel differs significantly from the representative value of the encoding target region, the predicted value can be predicted with stable accuracy.
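- A small numerical illustration of the median and average predictors follows. The representative values are invented for the example; one of the three reference regions is deliberately given a value far from the encoding target region.

```python
from statistics import mean, median

# Hypothetical representative values of the regions holding the reference pixels.
neighbor_reps = [100, 102, 20]   # the third region is an outlier
target_rep = 101                 # representative value of the encoding target region

pred_median = median(neighbor_reps)
pred_mean = mean(neighbor_reps)

diff_median = target_rep - pred_median   # difference value with the median predictor
diff_mean = target_rep - pred_mean       # difference value with the average predictor
diff_worst = target_rep - 20             # difference if the outlier alone were used
```

- In this example both the median and the average give a smaller difference value than using the outlying region's representative value directly.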
- It is desirable that the predicted value calculating means use any one of the representative values of the regions having the prediction reference pixels as the predicted value of the encoding target region.
- the predicted value calculation means sets one of the representative values of the region having the predicted reference pixel as the predicted value of the encoding target region.
- the accuracy of the predicted value does not decrease even if the representative value of the region having a certain prediction reference pixel that is significantly different from the representative value of the encoding target region is used as the predicted value of the encoding target region.
- Setting any one of the representative values of the regions having the prediction reference pixels as the predicted value of the encoding target region avoids calculating the median or the average of those representative values.
- The above configuration therefore has the further effect of reducing the processing load for calculating the predicted value while maintaining the accuracy of the predicted value.
- It is desirable that the transmission means further associate, with the encoded data of the image and the region information, predicted value calculation method information indicating the predicted value calculation method executed by the predicted value calculating means, and transmit them to the outside.
- In addition to the encoded data of the image and the region information, the transmission means associates the predicted value calculation method information indicating the predicted value calculation method executed by the predicted value calculating means and transmits it to the outside. Therefore, even if a device that has received the encoded data, the region information, and the predicted value calculation method information does not know the predicted value calculation method executed by the predicted value calculating means, it can calculate the predicted value based on the predicted value calculation method information, with the further effect that the received encoded data can be accurately decoded.
- It is desirable that, for the encoding target region with the earliest number assigned by the number assigning means, the encoding means encode the representative value of that region by a fixed-length encoding method instead of encoding its difference value by the variable-length encoding method.
- With this configuration, the encoding means encodes the representative value of the earliest-numbered encoding target region by a fixed-length encoding method instead of encoding the difference value of that region by the variable-length encoding method.
- Because the representative pixel of the encoding target region with the earliest number assigned by the number assigning means is located at the edge of the image, no pixel precedes the representative pixel in the raster scan order. The difference value of the earliest encoding target region therefore becomes a very large value, and encoding it by the variable-length encoding method makes the code amount very large. In the above configuration, by contrast, the representative value of the earliest encoding target region is encoded by a fixed-length encoding method.
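- The trade-off can be illustrated with the following sketch. It assumes 8-bit distance values and uses the signed Exp-Golomb code as the variable-length code; neither the bit depth nor the specific code is stated in the text, so both are assumptions.

```python
def exp_golomb_signed(v):
    """Signed Exp-Golomb code word of v (an assumed example of a variable-length code)."""
    mapped = 2 * v - 1 if v > 0 else -2 * v
    bits = bin(mapped + 1)[2:]
    return "0" * (len(bits) - 1) + bits

def encode_regions(representatives, diffs, bits=8):
    """First region: fixed-length representative value; others: variable-length differences."""
    first = format(representatives[0], f"0{bits}b")
    rest = [exp_golomb_signed(d) for d in diffs[1:]]
    return [first] + rest

# Hypothetical data: with no earlier region, the first difference equals the value itself.
codes = encode_regions([200, 202, 201], [200, 2, -1])
variable_length_cost = len(exp_golomb_signed(200))   # cost if region 0 were coded as a difference
```

- Under these assumptions the first region costs 8 bits with the fixed-length code instead of 17 bits with the variable-length code, while the small differences of the later regions remain cheap.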
- It is desirable that the dividing means determine a division pattern that divides the entire area of the texture image into a plurality of regions such that, for each region, the difference between the average value calculated from the pixel values of the pixel group included in the region and the average value calculated from the pixel values of the pixel group included in a region adjacent to the region is equal to or less than a predetermined threshold value, and divide the entire area of the distance image into a plurality of regions with the same division pattern.
- When a certain region of the texture image consists of a pixel group of similar colors, all or substantially all of the pixels included in the corresponding region of the distance image tend to have similar distance values, so the distance value is substantially constant within each region of the distance image.
- According to the above configuration, the dividing means divides the entire area of the texture image into a plurality of regions based on the average values calculated from the pixel values of the pixel group included in each region, and divides the distance image with the same division pattern.
- The representative value determining means then determines the representative value from the pixel values of the pixels included in each region. This has the further effect of reducing the information amount of the distance image while generating data from which the distance image can be restored with high accuracy.
- It is desirable that the encoding device further include transmission means for associating the encoded data of the image generated by the encoding means with the encoded data of the texture image obtained by encoding the texture image, and transmitting them to the outside.
- The transmission means associates the encoded data of the image generated by the encoding means with the encoded data of the texture image obtained by encoding the texture image, and transmits them to the outside. A device that has received the encoded data of the image and the encoded data of the texture image can therefore divide the distance image into the plurality of regions by dividing the encoded data of the texture image into the plurality of regions with the division pattern. This has the further effect that the encoded data of the received image can be accurately decoded based on the encoded data of the texture image.
- A decoding device according to the present invention is a decoding device that decodes encoded data of an image. The encoded data includes, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value that is the difference between a representative value of the pixel values of the pixels included in the region and a predicted value of the representative value of the region, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order. The predicted value is a value calculated by setting each region as an encoding target region in the order of the numbers, setting the first pixel in the raster scan order among the pixels included in the encoding target region as a representative pixel, setting as prediction reference pixels the pixels that are adjacent to a pixel included in the encoding target region on the same scan line as the representative pixel and that precede the representative pixel in the raster scan order, and using at least one of the representative values of the regions having the prediction reference pixels.
- The decoding device includes: dividing means for dividing the entire area of the image into a plurality of regions based on region information defining the plurality of regions; decoding means for decoding the encoded data and generating difference values arranged in order; number assigning means for assigning numbers to the plurality of regions divided by the dividing means in raster scan order; assigning means for assigning the difference values, in order from the beginning, to the plurality of regions in the order of the numbers assigned by the number assigning means; predicted value calculating means for setting each region as a decoding target region in the order of the numbers given by the number assigning means, setting the first pixel in the raster scan order among the pixels included in the decoding target region as a representative pixel, setting as prediction reference pixels the pixels that are adjacent to a pixel included in the decoding target region on the same scan line as the representative pixel and that precede the representative pixel in the raster scan order, and calculating a predicted value of the decoding target region based on the pixel value of at least one of the prediction reference pixels; and pixel value setting means for calculating, for each decoding target region, the pixel value of the decoding target region by adding the difference value assigned by the assigning means to the predicted value calculated by the predicted value calculating means, and setting the pixel values of all the pixels included in the decoding target region to the calculated pixel value. The predicted value calculating means and the pixel value setting means repeatedly execute this processing for each decoding target region in the order of the numbers, thereby restoring the pixel values of the image.
- the decoding unit decodes the encoded data and generates difference values arranged in order.
- The assigning means assigns the difference values, in order from the beginning, to the plurality of regions into which the dividing means has divided the image based on the region information defining the plurality of regions, in the order of the numbers given by the number assigning means in raster scan order.
- The predicted value calculating means sets each region as the decoding target region in the order of the numbers given by the number assigning means, sets the first pixel in the raster scan order among the pixels included in the decoding target region as the representative pixel, and sets as prediction reference pixels the pixels that are adjacent to a pixel included in the decoding target region on the same scan line as the representative pixel and that precede the representative pixel in the raster scan order.
- the predicted value calculation means calculates a predicted value of the decoding target region based on the pixel value of at least one pixel among the predicted reference pixels.
- The pixel value setting means calculates, for each decoding target region, the pixel value of the decoding target region by adding the difference value assigned by the assigning means to the predicted value calculated by the predicted value calculating means, and sets the pixel values of all the pixels included in the decoding target region to the calculated pixel value.
- the prediction value calculation means and the pixel value setting means repeatedly execute the above processing for each decoding target area in the order of the numbers given by the number assignment means, and restore the pixel values of the image.
- the decoding target area is the same as the plurality of areas into which the image indicated by the encoded data is divided.
- The representative pixel used when calculating the predicted value of the representative value of each decoding target region, and the prediction target pixel based on it, can be uniquely specified, and the representative pixel of the decoding target region and the prediction target pixel based on it can be made the same pixels as the representative pixel of the encoding target region corresponding to the decoding target region and the prediction target pixel based on it. This has the effect that the image indicated by the encoded data can be accurately restored.
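- A decoding method according to the present invention is a decoding method for decoding encoded data of an image. The encoded data includes, for each of a plurality of regions obtained by dividing the entire area of the image with a predetermined division pattern, a difference value that is the difference between a representative value of the pixel values of the pixels included in the region and a predicted value of the representative value of the region, the difference values being arranged in the order of numbers assigned to the plurality of regions in raster scan order, and the predicted value being calculated in the same manner as described for the decoding device, with each region set as an encoding target region in the order of the numbers and the first pixel in the raster scan order among the pixels included in the encoding target region set as the representative pixel.
- The decoding method includes: a dividing step of dividing the entire area of the image into a plurality of regions based on region information defining the plurality of regions; a decoding step of decoding the encoded data and generating difference values arranged in order; a number assigning step of assigning numbers to the plurality of regions divided in the dividing step in raster scan order; an assigning step of assigning the difference values, in order from the beginning, to the plurality of regions in the order of the numbers assigned in the number assigning step; a predicted value calculating step of setting each region as a decoding target region in the order of the numbers given in the number assigning step, setting the first pixel in the raster scan order among the pixels included in the decoding target region as a representative pixel, setting as prediction reference pixels the pixels that are adjacent to a pixel included in the decoding target region on the same scan line as the representative pixel and that precede the representative pixel in the raster scan order, and calculating a predicted value of the decoding target region based on the pixel value of at least one of the prediction reference pixels; and a pixel value setting step of calculating, for each decoding target region, the pixel value of the decoding target region by adding the assigned difference value to the calculated predicted value, and setting the pixel values of all the pixels included in the decoding target region to the calculated pixel value. The predicted value calculating step and the pixel value setting step are repeatedly executed for each decoding target region in the order of the numbers to restore the pixel values of the image.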
- the decoding method according to the present invention has the same operational effects as the decoding device according to the present invention.
- the decoding device preferably further includes receiving means for receiving the encoded data and the region information from the outside.
- the receiving means receives the encoded data and the region information from the outside. Therefore, even when the decoding device does not hold the region information, an image can be divided into the plurality of regions based on the region information by acquiring the region information from the outside. Therefore, even if the decoding apparatus does not hold the area information, the received encoded data can be accurately decoded.
- It is desirable that the receiving means receive the encoded data encoded by a variable-length encoding method in which the code word is shorter the closer the value to be encoded is to 0.
- the receiving means receives the encoded data encoded by the variable length encoding method in which the code word is shorter as the value to be encoded is closer to 0.
- the code amount of the encoded data is small. Therefore, the decoding apparatus has an additional effect that the processing load for decoding the encoded data can be reduced.
- it is desirable that the prediction value calculation means uses, as the prediction reference pixels, a pixel group that includes: the pixel immediately preceding the representative pixel in raster scan order; the pixels on the scan line immediately preceding that of the representative pixel that are adjacent to the pixels included in the decoding target area and located on the same scan line as the representative pixel; and the pixel on that preceding scan line that is adjacent to the pixel following, in raster scan order, the last pixel included in the decoding target area on the same scan line as the representative pixel.
- the predicted value calculation means uses the pixel group described above as the prediction reference pixels.
- pixels in three directions thus serve as the prediction reference pixels: the pixel adjacent to the left of the decoding target region, the pixels adjacent above it, and the pixel near its upper right. Because the prediction value is calculated with reference to pixels that precede the decoding target region in raster scan order and lie in multiple directions, the prediction value can be predicted with high accuracy.
- alternatively, the prediction value calculation means uses, as the prediction reference pixels, a pixel group consisting of: the pixel immediately preceding the representative pixel in raster scan order; one of the pixels on the scan line immediately preceding that of the representative pixel that are adjacent to the pixels included in the decoding target area and located on the same scan line as the representative pixel; and the pixel on that preceding scan line that is adjacent to the pixel following, in raster scan order, the last pixel included in the decoding target area on the same scan line as the representative pixel.
- because the prediction reference pixels are located in multiple directions with respect to the decoding target region and form as small a pixel group as possible (three pixels), there is an additional effect that the processing load of the prediction value calculation is reduced while the prediction value is still predicted with high accuracy.
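The two choices of prediction reference pixels described above can be illustrated with a short Python sketch. The function name `reference_pixels`, the label-map input, and the `reduced` flag are hypothetical; in the reduced variant the sketch picks the pixel directly above the representative pixel as the single upper pixel, which is an assumption since the text only says "one of the pixels".

```python
def reference_pixels(labels, region, reduced=False):
    """Return (row, column) coordinates of the prediction reference pixels."""
    h, w = len(labels), len(labels[0])
    # Representative pixel: first pixel of the region in raster scan order.
    y, x = next((r, c) for r in range(h) for c in range(w)
                if labels[r][c] == region)
    # Columns of the region's pixels on the representative pixel's scan line.
    row_xs = [c for c in range(w) if labels[y][c] == region]
    refs = []
    if x > 0:
        refs.append((y, x - 1))  # left: pixel just before the representative pixel
    if y > 0:
        above = [(y - 1, c) for c in row_xs]  # pixels directly above that row
        refs.extend(above[:1] if reduced else above)
        if row_xs[-1] + 1 < w:
            refs.append((y - 1, row_xs[-1] + 1))  # upper right of the last pixel
    return refs
```

Pixels that would fall outside the image are simply omitted here; how the boundary case is actually handled is not stated in the text.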
- the predicted value calculation means uses the median value of the predicted reference pixels as the predicted value of the decoding target area.
- the predicted value calculation means sets the median value of the predicted reference pixels as the predicted value of the decoding target area.
- by using the median of the representative values of the regions containing the prediction reference pixels as the predicted value of the decoding target area, there is an additional effect that the predicted value can be predicted with stable accuracy.
- the predicted value calculation means uses the average value of the pixel values of the predicted reference pixels as the predicted value of the decoding target region.
- the predicted value calculation means sets the average value of the pixel values of the predicted reference pixels as the predicted value of the decoding target region.
- by using the average of the representative values of the regions containing the prediction reference pixels as the predicted value of the decoding target area, there is an additional effect that the predicted value can be predicted with stable accuracy.
- the predicted value calculation means uses a pixel value of any pixel included in the predicted reference pixel as a predicted value of the decoding target region.
- the predicted value calculation means sets the pixel value of any pixel included in the predicted reference pixel as the predicted value of the decoding target region.
- when the above configuration is used in cases where it does not lower the accuracy of the predicted value, the processing load for calculating the predicted value can be reduced while the accuracy of the predicted value is maintained.
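The three ways of deriving the predicted value described above (median, average, or the value of a single reference pixel) can be compared in a small Python sketch. The function name `predict` and the method labels are hypothetical, and two details are assumptions: the lower median is taken when the number of values is even, and the average is rounded down to an integer.

```python
from statistics import median_low


def predict(ref_values, method="median"):
    """Compute a predicted value from the reference pixel values (sketch)."""
    if not ref_values:
        raise ValueError("at least one reference pixel value is required")
    if method == "median":
        return median_low(ref_values)  # assumption: lower median for even counts
    if method == "average":
        return sum(ref_values) // len(ref_values)  # assumption: floor of the mean
    if method == "single":
        return ref_values[0]  # any one reference pixel; here the first
    raise ValueError(f"unknown method: {method}")
```

For reference values 10, 12 and 20 the three methods give 12, 14 and 10 respectively, which shows how the single-pixel variant trades robustness for the least computation.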
- in addition to the encoded data and the region information of the image, the reception unit further receives prediction value calculation method information indicating the prediction value calculation method to be executed by the prediction value calculation unit.
- the prediction value calculation means calculates the prediction value based on the calculation method indicated by the prediction value calculation method information received by the reception means.
- the reception unit further receives prediction value calculation method information indicating the prediction value calculation method to be executed by the prediction value calculation unit.
- the prediction value calculation means calculates the prediction value based on the calculation method indicated by the prediction value calculation method information received by the reception means. Therefore, even when the decoding device does not know in advance which calculation method the prediction value calculation unit should execute, it can calculate the prediction value based on the prediction value calculation method information, with the additional effect that the received encoded data can be accurately decoded.
- when the first code word in the encoded data is obtained by encoding the representative value of the first encoding target area by a fixed-length encoding method, it is desirable that the decoding means decodes the first code word of the encoded data by the fixed-length method, and that the pixel value setting means sets the pixel values of all pixels included in the first area, in the order of the numbers assigned by the number assigning means, to the representative value that the decoding means obtained by decoding the first code word.
- in that case, the decoding means decodes the first code word of the encoded data by the fixed-length method, and the pixel value setting means sets the pixel values of all pixels included in the first area in that number order to the representative value decoded from the first code word.
- the decoding apparatus has an additional effect that the processing load for decoding the encoded data can be reduced.
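A stream whose first code word is fixed-length and whose remaining code words are variable-length can be parsed as in the following Python sketch. The 8-bit width, the Exp-Golomb-style code for the remaining difference values, and the name `decode_stream` are all assumptions made for illustration; the source only states that the first code word is fixed-length.

```python
def decode_stream(bits, width=8):
    """Split a bit string into the first representative value and the differences."""
    # First code word: fixed-length representative value of the first region.
    first = int(bits[:width], 2)
    diffs, i = [], width
    # Remaining code words: signed Exp-Golomb-style differences (assumption).
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":
            zeros += 1
            i += 1
        k = int(bits[i:i + zeros + 1], 2)
        i += zeros + 1
        u = k - 1
        diffs.append(u // 2 if u % 2 == 0 else -(u + 1) // 2)
    return first, diffs
```

Every pixel of the first region would then be set to `first`, and the later regions would be restored from their predicted values plus the entries of `diffs`.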
- the receiving means receives, as the region information, encoded data of the texture image obtained by encoding the texture image.
- it is desirable that the dividing means divides the image using a division pattern that divides the entire area of the texture image decoded from the encoded data of the texture image into a plurality of regions such that, for each region, the difference between the average value calculated from the pixel values of the pixel group included in that region and the average value calculated from the pixel values of the pixel group included in an adjacent region is equal to or less than a predetermined threshold value.
- the receiving means receives, as the region information, the encoded data of the texture image obtained by encoding the texture image.
- the dividing unit divides the distance image into a plurality of regions using this division pattern, that is, the pattern under which, for each region of the texture image decoded from the encoded data of the texture image, the difference between the average value calculated from the pixel values of the pixel group included in that region and the average value calculated from the pixel values of the pixel group included in an adjacent region is equal to or less than the predetermined threshold value.
- the distance value is substantially constant in each area of the distance image divided by the dividing means. Therefore, by using the representative value of each region, the encoded data can be made into data that has a small code amount and can restore the distance image with high accuracy. Therefore, the decoding device can reconstruct the distance image from the encoded data with high accuracy, and can further reduce the processing load for decoding the encoded data.
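The criterion on adjacent-region averages stated above can be evaluated for a given division pattern as in the following Python sketch. It only measures the differences; it does not implement the segmentation itself, which the text does not specify. The function names, the 2-D list inputs, and the use of 4-neighbour adjacency are assumptions.

```python
def region_means(texture, labels):
    """Average texture pixel value of each region."""
    sums, counts = {}, {}
    for trow, lrow in zip(texture, labels):
        for value, lab in zip(trow, lrow):
            sums[lab] = sums.get(lab, 0) + value
            counts[lab] = counts.get(lab, 0) + 1
    return {lab: sums[lab] / counts[lab] for lab in sums}


def adjacent_mean_differences(texture, labels):
    """Absolute difference of average values for every pair of adjacent regions."""
    means = region_means(texture, labels)
    h, w = len(labels), len(labels[0])
    diffs = {}
    for y in range(h):
        for x in range(w):
            # Right and lower neighbours cover every 4-adjacent pair once.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w and labels[ny][nx] != labels[y][x]:
                    pair = tuple(sorted((labels[y][x], labels[ny][nx])))
                    diffs[pair] = abs(means[pair[0]] - means[pair[1]])
    return diffs
```

A division pattern could then be checked against a threshold with `all(d <= threshold for d in adjacent_mean_differences(texture, labels).values())`.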
- an encoding program that causes a computer to function as each unit of the encoding device according to the present invention, a decoding program that causes a computer to function as each unit of the decoding device according to the present invention, a computer-readable recording medium on which the encoding program is recorded, and a computer-readable recording medium on which the decoding program is recorded are also included in the scope of the present invention.
- the data structure of the encoded data of the image includes, for each of a plurality of regions obtained by dividing the entire region of the image by a predetermined division pattern, a difference value that is the difference between a representative value of the pixel values of the pixels included in the region and a predicted value of the representative value of the region; the difference values are arranged in the order of the numbers assigned to the plurality of regions in raster scan order, and the predicted value is
- the above areas are set as encoding target areas in numerical order, and the first pixel in the raster scan order among the pixels included in the encoding target area is set as the representative pixel, and the same scan as the above representative pixel is included in the encoding target area.
- each block included in the moving image encoding device 1, 1A and the moving image decoding device 2, 2A may be configured by hardware logic.
- each control of the moving image encoding apparatuses 1 and 1A and the moving image decoding apparatuses 2 and 2A may be realized by software using a CPU (Central Processing Unit) as follows.
- a recording medium on which the program code (executable program, intermediate code program, source program) of the control program that realizes the control of each of the moving image encoding apparatuses 1 and 1A and the moving image decoding apparatuses 2 and 2A is recorded so as to be readable by a computer need only be supplied to those apparatuses.
- the moving image encoding device 1, 1A and the moving image decoding device 2, 2A may read and execute the program code recorded on the supplied recording medium.
- the recording medium for supplying the program code to the moving image encoding apparatus 1, 1A and the moving image decoding apparatus 2, 2A may be, for example, a tape system such as a magnetic tape or a cassette tape; a disk system including magnetic disks such as a floppy (registered trademark) disk or hard disk and optical disks such as CD-ROM, MO, MD, DVD, or CD-R; a card system such as an IC card (including a memory card) or an optical card; or a semiconductor memory system such as mask ROM, EPROM, EEPROM, or flash ROM.
- the object of the present invention can also be achieved when the program code is supplied to the moving image encoding apparatuses 1 and 1A and the moving image decoding apparatuses 2 and 2A via a communication network.
- This communication network is not limited to a specific type or form as long as it can supply a program code to the moving image encoding apparatuses 1 and 1A and the moving image decoding apparatuses 2 and 2A.
- the Internet, intranet, extranet, LAN, ISDN, VAN, CATV communication network, mobile communication network, satellite communication network, etc. may be used.
- the transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type.
- for example, wired media such as IEEE 1394, USB (Universal Serial Bus), power line carrier, cable TV line, telephone line, or ADSL (Asymmetric Digital Subscriber Line) line, and wireless media such as infrared (IrDA or remote control), Bluetooth (registered trademark), 802.11 wireless, HDR, mobile phone network, satellite line, or terrestrial digital network can be used.
- the present invention can be suitably applied to a content generation device that generates 3D-compatible content, a content playback device that plays back 3D-compatible content, and the like.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present invention relates to a moving image encoding device (1) comprising: a distance image dividing unit (22) that divides a distance image into segments; a distance value correction unit (23) that determines a representative value for each of the segments; and a number assigning unit (24) that assigns numbers to the respective segments in raster scan order. The moving image encoding device also comprises a prediction/encoding unit (25) that calculates predicted values of the respective segments in numerical order, calculates difference values by subtracting the respective predicted values from the respective representative values of the segments, and arranges the calculated difference values in numerical order and encodes them to generate encoded data of the distance image.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010-247423 | 2010-11-04 | ||
JP2010247423 | 2010-11-04 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012060179A1 (fr) | 2012-05-10 |
Family
ID=46024299
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/073134 WO2012060179A1 (fr) | 2010-11-04 | 2011-10-06 | Encoding device, decoding device, encoding method, decoding method, program, recording medium, and data structure of encoded data |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2012060179A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9241900B2 (en) | 2010-11-11 | 2016-01-26 | Novaliq Gmbh | Liquid pharmaceutical composition for the treatment of a posterior eye disease |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09289638A (ja) * | 1996-04-23 | 1997-11-04 | Nec Corp | Three-dimensional image encoding/decoding system |
WO2004071102A1 (fr) * | 2003-01-20 | 2004-08-19 | Sanyo Electric Co,. Ltd. | Method for producing three-dimensional video and three-dimensional video display device |
JP2008193530A (ja) * | 2007-02-06 | 2008-08-21 | Canon Inc | Image recording apparatus, image recording method, and program |
-
2011
- 2011-10-06 WO PCT/JP2011/073134 patent/WO2012060179A1/fr active Application Filing
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6788699B2 (ja) | Effective partition coding with a high degree of partitioning freedom | |
JP6814783B2 (ja) | Effective prediction using partition coding | |
KR102390298B1 (ko) | Image processing apparatus and method | |
JP5960693B2 (ja) | Generation of high dynamic range images from low dynamic range images | |
CN104054343B (zh) | Image decoding device, image encoding device | |
JP2021503215A (ja) | Intra-prediction-based image coding method using an MPM list, and apparatus therefor | |
TW201143458A (en) | Dynamic image encoding device and dynamic image decoding device | |
CN112235584B (zh) | Method and device for image partitioning, and method and device for encoding and decoding video sequence images | |
KR20150020175A (ko) | Video signal processing method and apparatus | |
US20170041623A1 (en) | Method and Apparatus for Intra Coding for a Block in a Coding System | |
KR20220019241A (ko) | Adaptive loop filter based video or image coding | |
JP6212890B2 (ja) | Moving image encoding device, moving image encoding method, and moving image encoding program | |
US8189673B2 (en) | Method of and apparatus for predicting DC coefficient of video data unit | |
CN111434116B (zh) | Image decoding method and apparatus based on affine motion prediction using constructed affine MVP candidates in an image coding system | |
JP7180679B2 (ja) | Video encoding device, video encoding method, video encoding program, video decoding device, video decoding method, and video decoding program | |
WO2012060179A1 (fr) | Encoding device, decoding device, encoding method, decoding method, program, recording medium, and data structure of encoded data | |
WO2012128209A1 (fr) | Image encoding device, image decoding device, program, and encoded data | |
KR20150113713A (ko) | Method and apparatus for deriving inter-view motion merge candidates | |
CN117178549A (zh) | Intra prediction method and apparatus based on intra prediction mode derivation | |
WO2012060168A1 (fr) | Encoding apparatus, decoding apparatus, encoding method, decoding method, program, recording medium, and encoded data | |
WO2012060172A1 (fr) | Moving image encoding device, moving image decoding device, moving image transmission system, method for controlling moving image encoding device, method for controlling moving image decoding device, control program for moving image encoding device, control program for moving image decoding device, and recording medium | |
WO2011122168A1 (fr) | Image encoder apparatus, image decoder apparatus, image encoder apparatus control method, image decoder apparatus control method, control programs, and recording medium | |
EP4412212A1 (fr) | MDMVR-based image coding method and device | |
WO2012060171A1 (fr) | Moving image encoding device, moving image decoding device, moving image transmission system, method for controlling moving image encoding device, method for controlling moving image decoding device, control program for moving image encoding device, control program for moving image decoding device, and recording medium | |
CN117337565A (zh) | Intra prediction method and apparatus based on multiple DIMD modes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 11837831; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: PCT application non-entry in European phase | Ref document number: 11837831; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: JP |