WO2011122168A1 - Image encoder apparatus, image decoder apparatus, image encoder apparatus control method, image decoder apparatus control method, control programs and recording medium

Info

Publication number: WO2011122168A1
Application number: PCT/JP2011/053727
Authority: WO
Inventors: 内海　端
Original assignee: シャープ株式会社
Priority date: 2010-03-29
Filing date: 2011-02-21
Publication date: 2011-10-06
Also published as: JP4860763B2; JP2011211351A

Abstract

An image encoding device includes a depth value analyzing unit (14) and a depth value converting unit (15). The depth value analyzing unit (14) acquires a depth value for each of a plurality of unit areas in an original image, counts the number of occurrences of each acquired depth value in the original image, and creates a depth value distribution, that is, a distribution of the number of occurrences for each depth value. Based on the shape of the created distribution, it divides the range from the minimum to the maximum depth value into a plurality of sections and assigns a different code to each section for the depth values that the section contains. The depth value converting unit (15) encodes the depth values, and thereby the original image, using the codes assigned by the depth value analyzing unit (14), and outputs the encoded image in association with the number of codes used to encode it. Because the depth values can be encoded from a single original image, they can be encoded with little delay and by a simple process.

Description

Image encoding device, image decoding device, control method of image encoding device, control method of image decoding device, control program, and recording medium

The present invention relates to an image encoding device, an image decoding device, a control method for an image encoding device, a control method for an image decoding device, a program, and a recording medium that encode a depth image indicating a depth value of the image.

Recently, by using images from a plurality of viewpoints (multi-viewpoint images), a highly realistic video expression that cannot be obtained only by a single viewpoint image that is an image from one viewpoint direction has been realized. Examples of video expression using a plurality of viewpoint images include stereoscopic image display and arbitrary viewpoint image display.

Stereoscopic image display uses two images with parallax. The observer views one image with the right eye and the other with the left eye; although each image is planar, the observer perceives a three-dimensional space in the brain. This will be described specifically with reference to FIG. 6, which is an explanatory diagram showing an overview of stereoscopic image display. As shown in FIG. 6, when the observer views the image 501 with the left eye and the image 502 with the right eye, the two images 501 and 502 with parallax are combined in the observer's brain 503, and the objects 504 and 505 in the images appear to exist three-dimensionally.

Arbitrary viewpoint image display creates and displays an image of a subject from an arbitrary viewpoint, using a plurality of viewpoint images taken from different viewpoints and the distance between the camera and the subject in each viewpoint image. This will be described specifically with reference to FIG. 7, which is an explanatory diagram showing an overview of arbitrary viewpoint image display. As shown in FIG. 7, images from arbitrary viewpoints (in the example shown in FIG. 7, viewpoint images 604v and 605v) are created and displayed from a plurality of viewpoint images 601v, 602v, and 603v having different viewpoints and depth images 601d, 602d, and 603d that indicate the distance between the camera and the subject in each viewpoint image. This makes it possible to display an image of the subject from a viewpoint from which it was not actually shot.

Non-Patent Document 1 describes a method of generating a viewpoint image at an arbitrary viewpoint (an arbitrary viewpoint image). The method uses two viewpoint images and the depth images corresponding to them. Specifically, (1) the depth images are projected onto the virtual viewpoint, (2) the projected depth image is smoothed, (3) pixel values of the actual images are mapped onto the smoothed depth image, and (4) pixels at the remaining positions are repaired using surrounding pixels. In this way, using the viewpoint images of two viewpoints and their depth images, an image from an arbitrary viewpoint near those viewpoints can be generated.

The technique for generating an arbitrary viewpoint image can also improve the stereoscopic image display described above. This will be described with reference to FIG. 8, which is a diagram for explaining the principle by which the arbitrary viewpoint image generation technique improves stereoscopic display. As shown in FIG. 8, assume that subjects 704 and 705 are photographed by two cameras 701 and 702 installed at an interval 706 to obtain viewpoint images 701v and 702v. When the interval 706 is larger than the interval between the left and right eyes of a human (generally about 65 mm), viewing the viewpoint image 701v with the left eye and the viewpoint image 702v with the right eye yields a blurred stereoscopic image or an image that cannot be perceived as three-dimensional at all.

Therefore, a viewpoint image 703v, in which the subjects are viewed from a viewpoint position 703 separated from the camera 701 by a distance 707 equal to the distance between the left and right eyes of a human, is created; an appropriate stereoscopic image can then be observed by using the viewpoint image 701v and the viewpoint image 703v.

Likewise, even when the distance between the two cameras 701 and 702 is too small compared with the distance between the left and right eyes of a human, generating a viewpoint image at a point separated from the camera 701 or the camera 702 by a distance corresponding to the distance between the left and right eyes makes it possible to observe a stereoscopic image with a sufficient stereoscopic effect.

Furthermore, by using the above-described principle, it is possible to observe a stereoscopic image from an arbitrary viewpoint and adjust the stereoscopic effect when observing the stereoscopic image from an arbitrary viewpoint.

As described above, if a plurality of viewpoint images and corresponding depth images (depth information) are used, the expression function of image display can be improved. However, since a depth image is required, there arises a problem that the amount of data during recording and transmission increases.

To address this problem, Patent Document 1 discloses a technique for encoding depth values in which, when depth information is transmitted, the transmission amount is preferentially assigned to frequency components with high perceptual sensitivity, in accordance with the temporal and spatial frequency characteristics of visual perception of depth changes. In Patent Document 1, when the amount of depth information is compressed, the code amount is assigned with attention to the sensitivity of human depth perception, so that the amount of information is reduced while the quality of the depth information (that is, the quality of a viewpoint image generated using the depth information) is maintained.

Patent Document 1: Japanese Patent Publication JP 2001-61164 A (published on March 6, 2001)

However, because the method of Patent Document 1 analyzes the temporal and spatial frequency characteristics of the depth information in order to encode it, the amount of processing increases compared with encoding the depth information as it is, and processing is delayed. In particular, obtaining the temporal frequency characteristics requires analyzing depth information over a plurality of frames, so a delay of several frames or more necessarily occurs in the processing.

Because a delay of several frames or more occurs, the method cannot be applied to applications that encode and decode in real time.

The present invention has been made in view of the above problems, and an object of the present invention is to realize an image encoding device or the like that encodes a depth value by a simple process with little delay.

In order to solve the above problem, an image encoding device according to the present invention is an image encoding device that encodes a depth value for each unit region in an original image, and includes: depth value acquisition means for acquiring the depth value; depth value distribution creating means for counting the number of appearances, in the image, of each depth value acquired by the depth value acquisition means and creating a depth value distribution that is a distribution of the number of appearances for each depth value; dividing means for dividing, based on the shape of the depth value distribution created by the depth value distribution creating means, the range from the minimum value to the maximum value of the depth values in the depth value distribution into a plurality of sections; code allocating means for allocating, to the depth values included in each section divided by the dividing means, a code that differs from section to section; encoding means for encoding the original image by encoding the depth values with the codes allocated by the code allocating means, and outputting an encoded image; and output means for outputting the encoded image output from the encoding means in association with the number of codes, that is, the number of codes used for encoding the encoded image.

Further, a control method for an image encoding device according to the present invention is a control method for an image encoding device that encodes a depth value for each unit region in an original image, and includes: a depth value acquisition step of acquiring the depth value; a depth value distribution creating step of counting the number of appearances, in the image, of each depth value acquired in the depth value acquisition step and creating a depth value distribution that is a distribution of the number of appearances for each depth value; a dividing step of dividing, based on the shape of the depth value distribution created in the depth value distribution creating step, the range from the minimum value to the maximum value of the depth values in the depth value distribution into a plurality of sections; a code allocating step of allocating, to the depth values included in each section divided in the dividing step, a code that differs from section to section; an encoding step of encoding the original image by encoding the depth values with the codes allocated in the code allocating step, and outputting an encoded image; and an output step of outputting the encoded image output in the encoding step in association with the number of codes, that is, the number of codes used for encoding the encoded image.

According to the above configuration or method, the range from the minimum value to the maximum value of the depth values is divided based on the shape of the distribution of the number of appearances for each depth value in the original image, a code that differs from section to section is assigned to the depth values included in each divided section, and the original image is encoded.

Thus, the depth value can be encoded while maintaining the shape characteristic of the distribution of the number of occurrences of the depth value in one original image, so that the image feature can be maintained. In addition, since the distribution of the number of appearances of the depth value is divided and a code is assigned to each divided section, the original image can be encoded by a simple process.
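As a non-limiting illustration of the configuration described above, the following Python sketch replaces each depth value with the code of the section that contains it and keeps the number of codes alongside the result. The function name `encode_depth_values`, the representation of sections as a sorted list of lower bounds, and the hand-picked boundaries are assumptions made for this example only; the manner in which the sections are derived from the shape of the distribution is not shown here.

```python
from bisect import bisect_right


def encode_depth_values(depth_values, section_starts):
    """Replace each depth value with the index (code) of its section.

    section_starts is a sorted list of section lower bounds (a hypothetical
    representation); its first entry must not exceed the minimum depth value.
    """
    return [bisect_right(section_starts, d) - 1 for d in depth_values]


# Toy 8-bit depth values and three hand-picked sections:
# [0, 64), [64, 192) and [192, 255].
depth_values = [0, 10, 63, 64, 150, 192, 255]
section_starts = [0, 64, 192]

codes = encode_depth_values(depth_values, section_starts)
num_codes = len(section_starts)  # output in association with the encoded image

print(codes)      # [0, 0, 0, 1, 1, 2, 2]
print(num_codes)  # 3
```

Because only a lookup per depth value is involved, the sketch operates on a single image and needs no information from other frames.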

In addition, since the depth value can be encoded from only one original image, it is possible to prevent the delay in determining the encoded depth value for an image that arises when the depth value is encoded using a plurality of images.

In the image encoding device according to the present invention, the dividing means preferably determines the number of sections into which the range from the minimum value to the maximum value of the depth values in the depth value distribution is divided, using the number of local maximum values of the number of appearances in the depth value distribution created by the depth value distribution creating means.

According to the above configuration, the number of sections into which the range from the minimum value to the maximum value of the depth values is divided is determined using the number of local maximum values of the number of appearances in the depth value distribution. Since the number of local maximum values characterizes the shape of the depth value distribution, determining the number of sections from it allows the shape of the distribution of depth values after encoding to approximate the shape of the distribution before encoding.

This makes it possible to encode the depth value without impairing the shape characteristics of the depth value distribution.

One way to determine the number of sections from the number of local maximum values is, for example, to use the sum of the number of local maximum values and the number of intervals between local maximum values as the number of sections. This makes it possible to represent the local maximum values and the depth values between them, so the depth values can be encoded while maintaining the shape of the distribution of the number of appearances.
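The counting rule mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of the embodiment: a local maximum is taken to be a bin whose count is strictly greater than both neighbours (plateaus are ignored), and "the number between local maximum values" is read as the number of gaps between adjacent local maxima. The function names are hypothetical.

```python
def count_local_maxima(histogram):
    """Count bins whose appearance count exceeds both neighbouring bins.

    Bins outside the histogram are treated as having a count of zero.
    Plateaus (equal neighbouring counts) are not counted in this sketch.
    """
    padded = [0] + list(histogram) + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )


def number_of_sections(histogram):
    """Number of local maxima plus the number of gaps between them."""
    peaks = count_local_maxima(histogram)
    gaps = max(peaks - 1, 0)
    return peaks + gaps


# Toy distribution with three peaks (at bins 2, 7 and 10).
histogram = [0, 4, 9, 3, 0, 0, 2, 7, 1, 0, 5, 0]
print(count_local_maxima(histogram))  # 3
print(number_of_sections(histogram))  # 5
```

With k local maxima this reading yields 2k - 1 sections, one for each peak and one for each region between adjacent peaks.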

In the image encoding device according to the present invention, the code allocating means may assign different codes to the maximum value and the minimum value of the depth values in the range from the minimum value to the maximum value of the depth values in the depth value distribution, and may assign a different code to each section obtained by dividing the remaining range excluding the maximum value and the minimum value.

According to the above configuration, different codes are assigned to the maximum value and the minimum value of the depth values, and a different code is assigned to each section obtained by dividing the remaining range excluding the maximum value and the minimum value. Therefore, the number of appearances of the maximum value and the minimum value of the depth values does not change before and after encoding. The maximum value and the minimum value of the depth values indicate the farthest subject and the closest subject in the image. Since their numbers of appearances are not changed by encoding, the areas occupied by the farthest subject and the nearest subject in the image remain unchanged, and the image can be encoded without impairing its characteristics.
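The following Python sketch illustrates this allocation with integer depth values. It is only an example: the remaining range is split into equal-width sections, which is an assumption made here for simplicity, and the name `assign_codes_keep_extremes` and the parameter `inner_sections` are hypothetical.

```python
def assign_codes_keep_extremes(depth_values, inner_sections):
    """Give the minimum and maximum depth values their own codes.

    Code 0 is reserved for the minimum, code inner_sections + 1 for the
    maximum, and the values strictly between them share codes
    1..inner_sections (equal-width split, assumed for this sketch).
    """
    lo, hi = min(depth_values), max(depth_values)
    span = hi - lo - 1  # number of integer values strictly between lo and hi
    codes = []
    for d in depth_values:
        if d == lo:
            codes.append(0)
        elif d == hi:
            codes.append(inner_sections + 1)
        else:
            codes.append(1 + (d - lo - 1) * inner_sections // span)
    return codes


depth_values = [10, 10, 11, 50, 90, 129, 130, 130, 130]
codes = assign_codes_keep_extremes(depth_values, inner_sections=2)
print(codes)  # [0, 0, 1, 1, 2, 2, 3, 3, 3]

# The minimum and maximum appear as often after encoding as before.
print(codes.count(0) == depth_values.count(10))   # True
print(codes.count(3) == depth_values.count(130))  # True
```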

In the image encoding device according to the present invention, the dividing means may obtain, in the range from the minimum value to the maximum value of the depth values in the depth value distribution, the inclination of the depth value distribution for each predetermined range, starting from the depth value at which the number of appearances is at its maximum, and may divide a predetermined range in which the inclination is large into more sections than a predetermined range in which the inclination is small.

A large inclination in the depth value distribution indicates that the number of appearances of the depth value is changing rapidly. According to the above configuration, the dividing means obtains the inclination for each predetermined range in the depth value distribution and divides a predetermined range with a large inclination into more sections than a predetermined range with a small inclination. More codes can thus be assigned to a range in which the number of appearances changes rapidly, and fewer codes to a range in which it changes little. Encoding can therefore be performed with a smaller number of codes without impairing the perspective of the image.
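A simplified Python sketch of inclination-based division is given below. It rests on several assumptions that are not taken from the description: the predetermined ranges are fixed-width and start at bin 0 rather than at the depth value with the maximum number of appearances, the inclination is approximated by the change in count across a range divided by its width, and a single threshold decides between one section and `fine_splits` sections. All names and parameters are hypothetical.

```python
def slope_based_sections(histogram, range_width, threshold, fine_splits=2):
    """Return the start bin of every section.

    Each fixed-width range whose approximate inclination exceeds `threshold`
    is divided into `fine_splits` sections; other ranges stay one section.
    """
    starts = []
    for begin in range(0, len(histogram), range_width):
        end = min(begin + range_width, len(histogram)) - 1
        # Approximate inclination: change in count across the range.
        slope = abs(histogram[end] - histogram[begin]) / max(end - begin, 1)
        splits = fine_splits if slope > threshold else 1
        step = (end - begin + 1) / splits
        for k in range(splits):
            starts.append(begin + int(k * step))
    return starts


# Bins 4..7 rise steeply, so that range is split in two; the others are not.
histogram = [0, 2, 4, 6, 40, 80, 120, 160, 50, 50, 50, 50]
print(slope_based_sections(histogram, range_width=4, threshold=10))
# [0, 4, 6, 8]
```

The returned list of section starts could be passed to any routine that maps depth values to section indices.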

In the image encoding device according to the present invention, the original image may be a moving image, and the output means may output the difference between the number of codes used for encoding the encoded image output from the encoding means and the number of codes used for encoding the encoded image output by the encoding means immediately before that encoded image.

According to the above configuration, the output means outputs the difference between the number of codes used for encoding the encoded image output from the encoding means and the number of codes used for encoding the encoded image output from the encoding means immediately before it. Information indicating this difference has a smaller information amount than information indicating the number of codes itself. Therefore, information indicating the number of codes used for encoding the encoded image can be output with a smaller amount of information.
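The difference output described above can be sketched in Python as follows. The treatment of the first frame (sent as-is, since no preceding frame exists) is an assumption of this example, and the function names are hypothetical.

```python
def code_count_deltas(code_counts):
    """First frame as-is, later frames as the difference from the frame before."""
    if not code_counts:
        return []
    deltas = [code_counts[0]]
    for prev, cur in zip(code_counts, code_counts[1:]):
        deltas.append(cur - prev)
    return deltas


def restore_code_counts(deltas):
    """Inverse of code_count_deltas: accumulate the differences."""
    counts = []
    for d in deltas:
        counts.append(d if not counts else counts[-1] + d)
    return counts


per_frame_code_counts = [8, 8, 9, 8, 8]
deltas = code_count_deltas(per_frame_code_counts)
print(deltas)                       # [8, 0, 1, -1, 0]
print(restore_code_counts(deltas))  # [8, 8, 9, 8, 8]
```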

In order to solve the above problems, an image decoding device according to the present invention includes: acquisition means for acquiring an encoded image obtained by encoding an original image and the number of codes, that is, the number of codes used for encoding the encoded image; and decoding means for decoding the encoded image by obtaining a conversion ratio, which is the ratio between the number of codes acquired by the acquisition means and the number of gradations for expressing the depth values of the original image, and multiplying the depth values of the encoded image acquired by the acquisition means by the obtained conversion ratio.

In addition, the control method of the image decoding device according to the present invention includes: an acquisition step of acquiring an encoded image obtained by encoding an original image and the number of codes, that is, the number of codes used for encoding the encoded image; and a decoding step of decoding the encoded image by obtaining a conversion ratio, which is the ratio between the number of codes acquired in the acquisition step and the number of gradations for expressing the depth values of the original image, and multiplying the depth values of the encoded image acquired in the acquisition step by the obtained conversion ratio.

According to the above configuration or method, the encoded image is decoded by multiplying its depth values by the conversion ratio, which is the ratio between the number of codes used to encode the encoded image and the number of gradations for expressing the depth values of the original image. The encoded image can therefore be decoded into an image whose depth values are expressed with the intended number of gradations.

Note that the image encoding device and the image decoding device may be realized by a computer. In this case, a control program for the image encoding device that realizes the image encoding device on the computer by causing the computer to operate as each of the above means, a control program for the image decoding device that realizes the image decoding device on the computer, and a computer-readable recording medium on which at least one of them is recorded also fall within the scope of the present invention.
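As an illustration of decoding by a conversion ratio, the Python sketch below multiplies each code by the number of gradations divided by the number of codes. The direction of the ratio (gradations over codes), the default of 256 gradations, and the truncation to an integer are assumptions made for this example; the function name is hypothetical.

```python
def decode_depth_values(encoded, num_codes, num_gradations=256):
    """Scale each code back to the gradation range of the original image.

    The ratio is taken as num_gradations / num_codes (an assumed reading).
    """
    ratio = num_gradations / num_codes
    return [int(code * ratio) for code in encoded]


# Eight codes (3 bits) restored to an 8-bit range: the ratio is 32.
print(decode_depth_values([0, 1, 7], num_codes=8))  # [0, 32, 224]
```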

As described above, the image encoding device according to the present invention includes: depth value acquisition means for acquiring a depth value for each unit region in an original image; depth value distribution creating means for counting the number of appearances, in the image, of each depth value acquired by the depth value acquisition means and creating a depth value distribution that is a distribution of the number of appearances for each depth value; dividing means for dividing, based on the shape of the depth value distribution created by the depth value distribution creating means, the range from the minimum value to the maximum value of the depth values in the depth value distribution into a plurality of sections; code allocating means for allocating, to the depth values included in each section divided by the dividing means, a code that differs from section to section; encoding means for encoding the original image by encoding the depth values with the codes allocated by the code allocating means, and outputting an encoded image; and output means for outputting the encoded image output from the encoding means in association with the number of codes, that is, the number of codes used for encoding the encoded image.

The control method of the image encoding device according to the present invention includes: a depth value acquisition step of acquiring a depth value for each unit region in an original image; a depth value distribution creating step of counting the number of appearances, in the image, of each depth value acquired in the depth value acquisition step and creating a depth value distribution that is a distribution of the number of appearances for each depth value; a dividing step of dividing, based on the shape of the depth value distribution created in the depth value distribution creating step, the range from the minimum value to the maximum value of the depth values in the depth value distribution into a plurality of sections; a code allocating step of allocating, to the depth values included in each section divided in the dividing step, a code that differs from section to section; an encoding step of encoding the original image by encoding the depth values with the codes allocated in the code allocating step, and outputting an encoded image; and an output step of outputting the encoded image output in the encoding step in association with the number of codes, that is, the number of codes used for encoding the encoded image.

Thus, the depth value can be encoded while maintaining the shape characteristic of the distribution of the number of occurrences of the depth value in one original image, so that the image feature can be maintained. Further, since the distribution of the number of appearances of depth values is divided and a code is assigned to each divided section, the original image can be encoded by a simple process.

In addition, since the depth value can be encoded from only one original image, there is an effect that the delay in determining the encoded depth value for an image, which arises when the depth value is encoded using a plurality of images, can be prevented.

The image decoding device according to the present invention includes: acquisition means for acquiring an encoded image obtained by encoding an original image and the number of codes, that is, the number of codes used for encoding the encoded image; and decoding means for decoding the encoded image by obtaining a conversion ratio, which is the ratio between the number of codes acquired by the acquisition means and the number of gradations for expressing the depth values of the original image, and multiplying the depth values of the encoded image acquired by the acquisition means by the obtained conversion ratio.

In addition, the control method of the image decoding device according to the present invention includes: an acquisition step of acquiring an encoded image obtained by encoding an original image and the number of codes, that is, the number of codes used for encoding the encoded image; and a decoding step of decoding the encoded image by obtaining a conversion ratio, which is the ratio between the number of codes acquired in the acquisition step and the number of gradations for expressing the depth values of the original image, and multiplying the depth values of the encoded image acquired in the acquisition step by the obtained conversion ratio.

This produces an effect that the encoded image can be decoded into the original image of the depth value expressed by the number of gradations to be expressed.

Further objects, features, and excellent points of the present invention will be fully understood from the following description. The advantages of the present invention will become apparent from the following description with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram illustrating a main configuration of an image encoding device according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining processing in the depth value analysis unit of the image encoding device.
FIG. 3 is a flowchart showing the flow of processing of the image encoding device.
FIG. 4 is a block diagram illustrating a main configuration of an image decoding device according to an embodiment of the present invention.
FIG. 5 is a diagram for explaining the method of restoring depth values in the depth value decoding unit of the image decoding device.
FIG. 6 is an explanatory diagram showing an overview of stereoscopic image display.
FIG. 7 is an explanatory diagram showing an overview of arbitrary viewpoint image display.
FIG. 8 is a diagram for explaining the principle by which the arbitrary viewpoint image generation technique improves stereoscopic display.

Hereinafter, the present invention will be described in more detail with reference to examples, but the present invention is not limited thereto.

An embodiment of the present invention will be described with reference to FIGS. 1 to 5 as follows.

(Configuration of image encoding device)
FIG. 1 is a block diagram showing a main configuration of an image encoding device 1 according to the present embodiment. The image encoding device 1 is a device that encodes a plurality of viewpoint images (for example, an image for the right eye and an image for the left eye) and the corresponding depth values to reduce the amount of information, and transmits the result. The right-eye image is an image assumed to be viewed by the observer with the right eye, and the left-eye image is an image assumed to be viewed by the observer with the left eye. By observing the right-eye image and the left-eye image at the same time, the observer can view the subject displayed in the image in three dimensions.

As shown in FIG. 1, the image encoding device 1 includes a depth value encoding unit 10, a viewpoint image acquisition unit 11, a viewpoint image encoding unit 12, a depth calculation unit 13, a depth image encoding unit 16, a shooting condition information acquisition unit 17, and a multiplexing unit 18. The depth value encoding unit 10 includes a depth value analysis unit (depth value acquisition means, depth value distribution creating means, dividing means, code allocating means) 14 and a depth value conversion unit (encoding means, output means) 15.

The viewpoint image acquisition unit 11 acquires viewpoint images such as a right eye image and a left eye image, and transmits the viewpoint images to the viewpoint image encoding unit 12 and the depth calculation unit 13. Note that the viewpoint image acquired by the viewpoint image acquisition unit 11 may be a moving image or a still image.

The viewpoint image encoding unit 12 compresses and encodes the acquired viewpoint image based on a predetermined encoding method. Examples of the encoding method include JPEG (Joint Photographic Experts Group) and JPEG2000 if the viewpoint image is a still image, and MPEG (Moving Picture Experts Group)-2, MPEG-4, and AVC (Advanced Video Coding)/H.264 if the viewpoint image is a moving image.

The depth calculation unit 13 calculates a depth value from the right eye image and the left eye image acquired from the viewpoint image acquisition unit 11. Then, the calculated depth value is transmitted to the depth value encoding unit 10. In the present embodiment, the depth value calculated by the depth calculation unit 13 is expressed as a depth image (original image) having 8-bit luminance information. Note that the number of bits representing the depth value is not limited to 8 bits.

One method of calculating the depth value is, for example, to obtain the movement amount of each corresponding pixel (unit region) between the right-eye image and the left-eye image using block matching and to calculate the depth value from the obtained movement amount. The depth value does not have to be calculated for each pixel, but may be calculated with a plurality of pixels as one unit.
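By way of example only, block matching for a single block can be sketched in Python as below. The sum of absolute differences (SAD) as the matching cost, the purely horizontal search, the block size, and the search range are assumptions of this sketch; the description does not specify them. Converting the resulting movement amount into a depth value is not shown.

```python
def block_disparity(left, right, row, col, block=3, max_shift=4):
    """Horizontal movement amount of one block between two images.

    The block whose top-left corner is (row, col) in `left` is compared with
    blocks shifted to the left in `right`; the shift with the smallest sum of
    absolute differences (SAD) is returned.
    """
    best_shift, best_cost = 0, float("inf")
    for shift in range(max_shift + 1):
        if col - shift < 0:
            break  # the shifted block would leave the image
        cost = sum(
            abs(left[row + r][col + c] - right[row + r][col - shift + c])
            for r in range(block)
            for c in range(block)
        )
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


# Toy images: the same 3x3 patch sits two pixels further left in `right`.
left = [[0, 0, 0, 0, 9, 5, 7, 0] for _ in range(3)]
right = [[0, 0, 9, 5, 7, 0, 0, 0] for _ in range(3)]
print(block_disparity(left, right, row=0, col=4))  # 2
```

A larger movement amount generally corresponds to a subject closer to the cameras, which is why the depth value can be derived from it.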

The depth value may also be measured using a distance measuring camera. A distance measuring camera can measure the distance to a subject; for example, it irradiates the subject with infrared rays and measures the time until the rays are reflected back, from which the distance from the camera to the subject is estimated. By irradiating infrared rays densely over a plane and receiving the reflected light with an imaging device that can detect infrared rays, a depth image can be obtained. Here, the depth image is an image in which the distance (depth value) between the camera and the subject is represented by luminance information for each unit region (for example, each pixel).

The depth value encoding unit 10 encodes the depth value acquired from the depth calculation unit 13. Specifically, the depth value analysis unit 14 analyzes the distribution of the luminance values that represent the acquired depth values, divides the range from the minimum value to the maximum value of the analyzed depth value distribution into a plurality of sections, and assigns a different code to each divided section. It then transmits the depth values included in each divided section and the code assigned to each section to the depth value conversion unit 15. Details of the depth value analysis will be described later.
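The relationship between the measured round-trip time and the distance can be illustrated with the short Python sketch below. It assumes an idealized measurement in which the ray travels to the subject and back at the speed of light in vacuum; the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in vacuum


def distance_from_round_trip(seconds):
    """Distance to the subject: the ray covers the distance twice."""
    return SPEED_OF_LIGHT_M_PER_S * seconds / 2


# A reflection received 20 nanoseconds after emission is about 3 m away.
print(round(distance_from_round_trip(20e-9), 3))  # 2.998
```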

The depth value conversion unit 15 encodes the depth values included in each section with the code assigned to that section, using the depth values and assigned codes it has acquired. It then transmits the encoded depth value, which is the depth value after encoding, and depth depth information, which is information indicating the number of assigned codes, to the depth image encoding unit 16. For example, when the encoded depth value is represented by 3 bits, the depth depth information is “3”, and the number of codes, which is the number of assigned code types, is “8”, the number of gradations that can be represented by 3 bits. Details of the encoding of the depth value and the depth depth information will be described later.
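The arithmetic in the 3-bit example can be checked with the following Python sketch. The helper names are hypothetical, and the second function, which finds the smallest bit depth able to hold a given number of codes, is an assumption added for illustration.

```python
def num_codes_from_bit_depth(bits):
    """Number of gradations representable with the given number of bits."""
    return 1 << bits


def bit_depth_for_codes(num_codes):
    """Smallest number of bits that can distinguish num_codes codes."""
    return max(1, (num_codes - 1).bit_length())


print(num_codes_from_bit_depth(3))  # 8
print(bit_depth_for_codes(8))       # 3
print(bit_depth_for_codes(5))       # 3
```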

The depth image encoding unit 16 compresses and encodes the acquired encoded depth value. The encoded depth value can be expressed as an image because the luminance value is encoded. Therefore, the depth image encoding unit 16 can further compress and encode the encoded depth value by an encoding method similar to the encoding method in the viewpoint image encoding unit 12 described above.

Note that when the viewpoint image acquired by the viewpoint image acquisition unit 11 is a moving image, the following encoding method may be employed. When the viewpoint image is a moving image, the depth value changes in units of frames, and therefore depth depth information indicating the number of codes assigned to the depth values also changes in units of frames. Therefore, the difference between the depth and depth information in the immediately preceding frame is taken and encoded.

Also, the depth depth information is unlikely to change abruptly between temporally adjacent frames, so the smaller the difference value, the higher its appearance frequency. Therefore, the code amount can be further reduced by generating an optimum code, such as a Huffman code, based on this property.
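The frame-to-frame difference coding described above can be sketched as follows. This is only a minimal illustration: the function names are hypothetical, and sending the first frame's value as-is is an assumption, since the text does not state how the first frame is handled. The Huffman coding of the differences is not shown.

```python
def delta_encode(bit_depths):
    """Return the first value followed by frame-to-frame differences.

    Assumption: the first frame has no predecessor, so its value is kept as-is.
    """
    if not bit_depths:
        return []
    deltas = [bit_depths[0]]
    for prev, cur in zip(bit_depths, bit_depths[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the per-frame values by accumulating the differences."""
    values = []
    total = 0
    for d in deltas:
        total += d
        values.append(total)
    return values
```

Because adjacent frames rarely change their bit depth, most differences are 0, which is the property an entropy code such as a Huffman code could exploit.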

The imaging condition information acquisition unit 17 acquires the imaging condition information of the viewpoint image and transmits it to the multiplexing unit 18. The imaging condition information includes the distance between the camera that has captured the right-eye image and the camera that has captured the left-eye image, the imaging direction (angle) of each camera, the focal length of each camera, and the like.

The multiplexing unit 18 multiplexes the encoded viewpoint image acquired from the viewpoint image encoding unit 12, the encoded depth value and depth depth information acquired from the depth image encoding unit 16, and the imaging condition information, and transmits the multiplexed data to an external decoding device.

(Processing in the depth value analysis unit 14 and the depth value conversion unit 15)
Next, the processing in the depth value analysis unit 14 will be described with reference to FIG. 2. FIG. 2 is a diagram for explaining processing in the depth value analysis unit 14.

As shown in FIG. 2A, when there is a viewpoint image including a person 201 in the foreground, a house 202 in the middle distance, and a mountain range 203 and the sky 204 in the distance, the depth value of each subject in the viewpoint image can be expressed as a depth image (FIG. 2B) in which the distance from the camera is expressed as luminance information. Usually, the viewpoint image is composed of 8 bits of luminance information per pixel and two kinds of color difference information of 8 bits each per pixel. The depth image is composed of 8 bits of luminance information per pixel.

Then, the depth value analysis unit 14 counts the number of appearances of each luminance value indicating a depth value in the depth image of FIG. 2B. As shown in FIG. 2C, the depth value distribution, which is the result of the counting, can be expressed as a diagram (histogram) in which the horizontal axis is the depth value and the vertical axis is the number of appearances. As is apparent from FIG. 2C, the number of appearances is biased toward specific depth values. Therefore, even if the depth value originally expressed by 8 bits (256 gradations) is expressed with fewer than 256 gradations, the features of the shape of the depth value distribution can be maintained. The depth value analysis unit 14 therefore determines a number of gradations smaller than 256 that can maintain the shape characteristics of the depth value distribution, and divides the depth value distribution into that number of sections. A different code is assigned to each divided section.
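As a rough illustration of the counting step, the depth value distribution can be built as below. This is a sketch under assumptions: the depth image is taken to be a list of rows of integers in the range 0 to 255, and the function name is hypothetical.

```python
def depth_histogram(depth_image, levels=256):
    """Count how many times each depth (luminance) value appears.

    depth_image: iterable of rows, each row an iterable of ints in [0, levels).
    Returns a list where index = depth value, element = number of appearances.
    """
    counts = [0] * levels
    for row in depth_image:
        for value in row:
            counts[value] += 1
    return counts
```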

More specifically, when the depth value distribution has three local maximum values as shown in FIG. 2C, it is determined that the depth value can be expressed with 5 gradations: three gradations corresponding to the number of local maximum values, plus two gradations expressing the regions between them. Then, 3 is determined as the number of bits representing the depth value, since 3 is the minimum number of bits that can represent five gradations. The determined number of bits is used as the depth depth information. The depth value distribution is then divided so that the depth value can be expressed with the 8 gradations that the determined 3 bits can take. Specifically, the range from the minimum value 0 to the maximum value 255 of the depth value expressed in 8 bits is equally divided into 8 sections, 8 being the number of gradations that can be expressed in 3 bits, and the values from the minimum value 0 to the maximum value 7 that can be expressed in 3 bits are assigned to the sections in ascending order.
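The determination of the number of bits and the equal division can be sketched as follows. The function names are hypothetical. The peak definition (a rise followed by a fall, with a flat top counted once) and the handling of a distribution with fewer than two peaks are assumptions; the text only describes the case of two or more local maximum values, and a real histogram would likely need smoothing before peaks are counted.

```python
def count_local_maxima(hist):
    """Count peaks in a histogram; a flat-topped peak is counted once."""
    peaks = 0
    prev = 0          # values outside the histogram are treated as 0
    rising = False
    for cur in list(hist) + [0]:
        if cur > prev:
            rising = True
        elif cur < prev:
            if rising:
                peaks += 1
            rising = False
        prev = cur
    return peaks


def required_bits(num_peaks):
    """Minimum bits for (peaks + gaps between peaks) gradations."""
    # Assumption: with fewer than 2 peaks, fall back to a single bit.
    gradations = 2 * num_peaks - 1 if num_peaks >= 2 else 1
    return max(1, (gradations - 1).bit_length())


def quantize_uniform(value, bits, src_bits=8):
    """Map a src_bits-wide depth value to its equal-width section code."""
    return value >> (src_bits - bits)
```

With three peaks, `required_bits(3)` gives 3, and `quantize_uniform(v, 3)` maps each run of 32 consecutive 8-bit values to one of the codes 0 to 7.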

The depth value conversion unit 15 encodes the depth value into the assigned code.

In this way, the depth image in which the number of bits representing the depth value has been reduced has coarse intermediate gradations, but the relationship among the depth values of the regions that make up the main parts of the image, namely the foreground person 201, the middle-distance house 202, and the distant mountain range 203 and sky 204, is maintained as it was before the number of bits was reduced. Therefore, the perspective distortion in the image can be reduced.

In addition, when converting 256 gradations to 8 gradations as described above, a method other than dividing the 256 gradations equally into 8 may be used. For example, the 3-bit minimum value 0 and maximum value 7 are assigned as codes to the minimum value 0 and the maximum value 255 of the depth values expressed in 8 bits. Then, the remaining range of depth values 1 to 254 is equally divided, and the remaining 3-bit values 1 to 6 are assigned as codes in ascending order. According to this method, when the depth values before conversion are particularly concentrated at the minimum and maximum values, or when the appearance frequency of the minimum and maximum values differs greatly from that of the adjacent depth values, assigning dedicated codes to the minimum value and the maximum value makes it possible to accurately maintain the perspective relationship with the adjacent sections and to further suppress the perspective distortion in the image.
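A minimal sketch of this alternative assignment is shown below. The function name is hypothetical, and because 254 values do not divide evenly into 6 sections, the integer rounding used here is an assumption rather than something the text specifies.

```python
def quantize_with_endpoints(value, bits=3, src_max=255):
    """Give 0 and src_max their own codes; split the rest evenly."""
    top = (1 << bits) - 1            # largest code, 7 for 3 bits
    if value == 0:
        return 0
    if value == src_max:
        return top
    inner_codes = top - 1            # codes 1..6
    inner_values = src_max - 1       # depth values 1..254
    # Assumed rounding: proportional integer division over the inner range.
    return 1 + (value - 1) * inner_codes // inner_values
```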

Alternatively, instead of assigning the depth values 0 to 255 or 1 to 254 expressed in 8 bits evenly to the 3-bit values as described above, more values may be assigned to regions where the appearance frequency changes greatly, according to the distribution of depth values.

For example, the slope of the histogram is calculated for each fixed section starting from a local maximum value; more values are assigned to sections with a large slope, that is, sections with a large change in depth value, and fewer values are assigned to sections with a small slope, that is, sections with a small change in depth value. With such an allocation method, the perspective in the image can be reproduced without reducing the accuracy with which changes in depth value are expressed, even if the number of expressible depth gradations is reduced.

By the above processing, the depth value converting unit 15 can express the depth value expressed in 8 bits as 3 bits, and the information amount per depth value can be reduced by 5 bits. Then, the depth value conversion unit 15 transmits the depth value obtained by reducing the number of bits and the depth depth information indicating the number of bits after the reduction.

The image encoding device 1 is connected to a recording device or a network control device not shown in FIG. 1, and the encoded data multiplexed by the multiplexing unit 18 may be stored in a recording medium such as a magnetic disk, an optical disk, or a semiconductor memory, or may be transmitted to a remote receiving device via a network.

(Processing flow of image encoding device)
Next, a processing flow of the image encoding device 1 will be described with reference to FIG. 3. FIG. 3 is a flowchart showing a process flow of the image encoding device 1.

As shown in FIG. 3, first, the viewpoint image acquisition unit 11 acquires a viewpoint image (S1). Next, the depth calculation unit 13 calculates the depth value of the viewpoint image acquired by the viewpoint image acquisition unit 11 (S2, depth value acquisition step). Then, the depth value analysis unit 14 analyzes the depth value calculated by the depth calculation unit 13 and obtains a distribution of depth values (S3, depth value distribution creation step). Then, the number of bits representing the depth value is determined from the distribution of depth values (S4, division step, code allocation step). Thereafter, the depth value conversion unit 15 converts the depth value so as to be expressed by the number of bits determined by the depth value analysis unit 14 (S5, encoding step, output step).

Thereafter, the depth image encoding unit 16 further compresses and encodes the converted depth value and the like (S6), and the multiplexing unit 18 multiplexes the encoded depth value, the separately encoded viewpoint image, and the imaging condition information (S7) and transmits them to other devices. This completes the processing of the image encoding device 1.

(Configuration of image decoding device)
Next, the image decoding device 2 according to the present embodiment will be described with reference to FIG. 4. FIG. 4 is a block diagram illustrating a main configuration of the image decoding device 2. The image decoding device 2 is a device that performs a decoding process on the encoded data transmitted from the image encoding device 1 via a recording medium or a network (not shown), and restores a plurality of viewpoint images, the depth information corresponding to the viewpoint images, the shooting condition information, and the like.

The separation unit 51 acquires encoded data and separates it into encoded data of viewpoint images, encoded data of depth values, and shooting condition information. Then, the encoded data of the separated viewpoint image is transmitted to the viewpoint image decoding unit 52, and the encoded data of the depth value is transmitted to the depth image decoding unit 53.

The viewpoint image decoding unit 52 decodes the acquired encoded data of the viewpoint image based on a corresponding method, and transmits the decoded viewpoint images (right-eye image and left-eye image).

The depth image decoding unit 53 decodes the encoded data of the acquired depth value and transmits the depth value and the depth depth information to the depth value decoding unit 54.

Depth value decoding unit (acquisition means, decoding means) 54 uses the acquired depth depth information to decode (restore) the acquired depth value into a depth value before conversion. In the present embodiment, decoding in the depth value decoding unit 54 is referred to as restoration in order to distinguish it from decoding in the depth image decoding unit 53. Details of the restoration will be described later.

When the viewpoint image is a moving image and the depth depth information is expressed as a difference from the depth depth information of the immediately preceding frame, the depth value decoding unit 54 may add the difference value decoded by the depth image decoding unit 53 to the depth depth information of the immediately preceding frame to obtain the depth depth information of the current frame.

(Processing in Depth Value Decoding Unit 54)
Next, a depth value decoding (restoring) method in the depth value decoding unit 54 will be described with reference to FIG. 5. FIG. 5 is a diagram for explaining a depth value restoration method in the depth value decoding unit 54. Originally, the depth value is expressed by 8 bits, and when expressed as an image, it is as shown in FIG. 5A. When a depth value whose number of bits has been reduced to 3 is restored so as to be expressed again with 8 bits, it is restored using the reduced number of bits. Specifically, each depth value represented by 3 bits is multiplied by 32, the number of gradations that can be represented by 5 bits, so that the depth value represented by 3 bits can be represented by 8 bits.
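The multiplication described above amounts to the following sketch, in which the function name and parameter names are hypothetical. The factor is the ratio between the 256 original gradations and the 8 codes, which is 32 for the 3-bit case.

```python
def restore_uniform(code, bits, dst_bits=8):
    """Scale a reduced-bit depth code back to the dst_bits-wide range."""
    ratio = 1 << (dst_bits - bits)   # 256 / 8 = 32 when bits == 3
    return code * ratio
```

Note that the largest 3-bit code, 7, is restored to 224 rather than 255 with this method, so the restored range covers 0 to 224.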

Alternatively, the 8-bit minimum value 0 and maximum value 255 may be assigned to the minimum value 0 and the maximum value 7 of the 3-bit depth value, and the remaining depth values 1 to 6 expressed in 3 bits may be restored evenly over the 254 gradations that exclude the 8-bit minimum and maximum values, by multiplying them by 42, the quotient of 254/6.
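This endpoint-preserving restoration can be sketched as follows. The function name is hypothetical, and applying the factor 42 directly to the codes 1 to 6 is one reading of the text, stated here as an assumption.

```python
def restore_with_endpoints(code, bits=3, dst_max=255):
    """Restore 0 and the top code to the range ends; scale the rest by 254 // 6."""
    top = (1 << bits) - 1                 # 7 for 3 bits
    if code == 0:
        return 0
    if code == top:
        return dst_max
    step = (dst_max - 1) // (top - 1)     # 254 // 6 = 42
    return code * step
```

Under this reading the eight codes are restored to 0, 42, 84, 126, 168, 210, 252, and 255.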

This makes it possible to express the depth information that was expressed in 3 bits in 8 bits.

As shown in FIG. 5B, the image represented by the restored depth value has coarse intermediate gradations, that is, a coarse expression resolution. However, the main parts of the image can almost reproduce the perspective that existed before the number of bits was reduced. Therefore, the perspective distortion in the image can be reduced.

The restored depth value can be used to generate an image of an arbitrary viewpoint from a plurality of viewpoint images, or to adjust a stereoscopic effect when displayed as a stereoscopic image.

As described above, according to the present embodiment, only the depth value corresponding to the viewpoint image is encoded with a reduced expression resolution, while the viewpoint image is encoded with its expression resolution maintained. Therefore, the depth value can be encoded and the amount of data can be reduced without affecting the quality of the image itself.

(Other)
As described above, the image encoding device according to the present invention is an image encoding device that encodes a depth value for each unit region in an original image, and is characterized by including: depth value acquisition means for acquiring the depth value; depth value distribution creating means for counting the number of appearances, in the original image, of the depth value acquired by the depth value acquisition means and creating a depth value distribution, which is a distribution of the number of appearances for each depth value; dividing means for dividing the range from the minimum value to the maximum value of the depth value in the depth value distribution into a plurality of sections, based on the shape of the depth value distribution created by the depth value distribution creating means; code allocating means for allocating, to the depth values included in each section divided by the dividing means, a code that differs for each section; encoding means for encoding the original image by encoding the depth value with the code allocated by the code allocating means and outputting the encoded image; and output means for outputting, in association with each other, the encoded image output from the encoding means and the number of codes used for encoding the encoded image.

As a method of determining the number of divisions of the range from the minimum value to the maximum value of the depth value using the number of local maximum values, for example, the number obtained by adding the number of local maximum values and the number of intervals between local maximum values may be used as the number of divisions. This makes it possible to represent the local maximum values and the depth values between them, so that the depth value can be encoded while maintaining the shape of the distribution of the number of appearances of the depth value.

Although the embodiments of the present invention have been described in detail with reference to the drawings, the specific configuration is not limited to these embodiments, and designs and the like within a scope not departing from the gist of the present invention are also included in the scope of the claims.

The present invention is not limited to the above-described embodiment, and various modifications can be made within the scope of the claims. That is, embodiments obtained by combining technical means appropriately changed within the scope of the claims are also included in the technical scope of the present invention.

The specific embodiments or examples given in the detailed description of the invention are intended only to clarify the technical contents of the present invention. The invention should not be construed narrowly as being limited to such specific examples, and can be implemented with various modifications within the spirit of the present invention and the scope of the claims.

Finally, each block of the image encoding device 1 and the image decoding device 2, particularly the viewpoint image acquisition unit 11, the viewpoint image encoding unit 12, the depth calculation unit 13, the depth value analysis unit 14, the depth value conversion unit 15, the depth image encoding unit 16, the imaging condition information acquisition unit 17, the multiplexing unit 18, the separation unit 51, the viewpoint image decoding unit 52, the depth image decoding unit 53, and the depth value decoding unit 54, may be realized in hardware by a logic circuit formed on an integrated circuit (IC chip), or may be realized in software using a CPU (central processing unit).

In the latter case, the image encoding device 1 and the image decoding device 2 include a CPU that executes the instructions of a control program realizing each function, a ROM (read only memory) that stores the program, a RAM (random access memory) into which the program is expanded, and a storage device (recording medium) such as a memory that stores the program and various data. The object of the present invention can also be achieved by supplying, to the image encoding device 1 and the image decoding device 2, a recording medium on which the program code (executable format program, intermediate code program, source program) of the control programs for the image encoding device 1 and the image decoding device 2, which are software realizing the functions described above, is recorded in a computer-readable manner, and by having the computer (or CPU or MPU (microprocessor unit)) read and execute the program code recorded on the recording medium.

Examples of the recording medium include tapes such as magnetic tapes and cassette tapes; disks including magnetic disks such as floppy (registered trademark) disks and hard disks, and optical discs such as CD-ROM (compact disc read-only memory), MO (magneto-optical disc), MD (Mini Disc), DVD (digital versatile disc), and CD-R (CD Recordable); cards such as IC cards (including memory cards) and optical cards; semiconductor memories such as mask ROM, EPROM (erasable programmable read-only memory), EEPROM (electrically erasable and programmable read-only memory), and flash ROM; and logic circuits such as PLD (programmable logic device) and FPGA (field programmable gate array).

Further, the image encoding device 1 and the image decoding device 2 may be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited as long as it can transmit the program code. For example, the Internet, an intranet, an extranet, a LAN (local area network), ISDN (integrated services digital network), VAN (value-added network), a CATV (community antenna television) communication network, a virtual private network, a telephone line network, a mobile communication network, a satellite communication network, and the like can be used. The transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type. For example, wired media such as IEEE (Institute of Electrical and Electronics Engineers) 1394, USB, power line carrier, cable TV line, telephone line, and ADSL (asymmetric digital subscriber line) line can be used, as can wireless media such as infrared (for example, IrDA (Infrared Data Association) or remote control), Bluetooth (registered trademark), IEEE 802.11 wireless, HDR (high data rate), NFC (near field communication), DLNA (Digital Living Network Alliance), mobile phone network, satellite line, and terrestrial digital network. The present invention can also be realized in the form of a computer data signal embedded in a carrier wave in which the program code is embodied by electronic transmission.

Since depth values corresponding to viewpoint images can be compressed with simple processing and little delay, the present invention is suitable for a device that transmits image data to a device that performs image processing using depth values, such as creating a stereoscopic image or an image at an arbitrary viewpoint.

DESCRIPTION OF SYMBOLS
1 Image encoding device
2 Image decoding device
14 Depth value analysis unit (depth value acquisition means, depth value distribution creating means, dividing means, code allocating means)
15 Depth value conversion unit (encoding means, output means)
54 Depth value decoding unit (acquisition means, decoding means)

Claims

An image encoding device for encoding a depth value for each unit area in an original image, comprising:
depth value acquisition means for acquiring the depth value;
depth value distribution creating means for counting the number of appearances, in the original image, of the depth value acquired by the depth value acquisition means and creating a depth value distribution that is a distribution of the number of appearances for each depth value;
dividing means for dividing the range from the minimum value to the maximum value of the depth value in the depth value distribution into a plurality of sections, based on the shape of the depth value distribution created by the depth value distribution creating means;
code assigning means for assigning, to the depth values included in each section divided by the dividing means, a code that differs for each section;
encoding means for encoding the original image by encoding the depth value with the code assigned by the code assigning means, and outputting an encoded image; and
output means for outputting, in association with each other, the encoded image output by the encoding means and the number of codes, which is the number of codes used for encoding the encoded image.
The image encoding device according to claim 1, wherein the dividing means determines the number of sections into which the range from the minimum value to the maximum value of the depth value in the depth value distribution is divided, using the number of local maximum values of the number of appearances in the depth value distribution created by the depth value distribution creating means.
The image encoding device according to claim 2, wherein, when the number of local maximum values is 2 or more, the dividing means determines the number of divisions of the range from the minimum value to the maximum value of the depth value in the depth value distribution by calculating the sum of the number of local maximum values and the number obtained by subtracting 1 from the number of local maximum values.
The image encoding device according to any one of claims 1 to 3, wherein the assigning means assigns different codes to the maximum value and the minimum value of the depth value, and assigns a different code to each section obtained by dividing the range that remains after the maximum value and the minimum value are excluded from the range from the minimum value to the maximum value of the depth value in the depth value distribution.
The image encoding device according to claim 1, wherein the dividing means obtains the slope of the depth value distribution for each predetermined range starting from a local maximum value of the number of appearances of the depth value, within the range from the minimum value to the maximum value of the depth value in the depth value distribution, and divides a predetermined range in which the slope of the depth value distribution is large into more sections than a predetermined range in which the slope of the depth value distribution is small.
The image encoding device according to claim 1, wherein the original image is a moving image, and
the output means outputs the difference between the number of codes used for encoding the encoded image output by the encoding means and the number of codes used for encoding the encoded image output by the encoding means immediately before that encoded image.
An image decoding device comprising:
obtaining means for obtaining an encoded image obtained by encoding an original image and the number of codes, which is the number of codes used for encoding the encoded image; and
decoding means for obtaining a conversion ratio, which is the ratio between the number of codes obtained by the obtaining means and the number of gradations for expressing the depth value of the original image, and decoding the encoded image by multiplying the depth value of the encoded image obtained by the obtaining means by the obtained conversion ratio.
An image encoding device control program for operating the image encoding device according to any one of claims 1 to 6, wherein the image encoding device control program causes a computer to function as each of the above means.
An image decoding device control program for operating the image decoding device according to claim 7, wherein the image decoding device control program causes a computer to function as each of the above-described means.
A computer-readable recording medium on which at least one of the image encoding device control program according to claim 8 and the image decoding device control program according to claim 9 is recorded.
A control method of an image encoding device for encoding a depth value for each unit area in an original image, comprising:
a depth value acquiring step of acquiring the depth value;
a depth value distribution creating step of counting the number of appearances, in the original image, of the depth value acquired in the depth value acquiring step and creating a depth value distribution that is a distribution of the number of appearances for each depth value;
a dividing step of dividing the range from the minimum value to the maximum value of the depth value in the depth value distribution into a plurality of sections, based on the shape of the depth value distribution created in the depth value distribution creating step;
a code assigning step of assigning, to the depth values included in each section divided in the dividing step, a code that differs for each section;
an encoding step of encoding the original image by encoding the depth value with the code assigned in the code assigning step, and outputting an encoded image; and
an output step of outputting, in association with each other, the encoded image output in the encoding step and the number of codes, which is the number of codes used for encoding the encoded image.
A control method of an image decoding device, comprising:
an acquisition step of acquiring an encoded image obtained by encoding an original image and the number of codes, which is the number of codes used for encoding the encoded image; and
a decoding step of obtaining a conversion ratio, which is the ratio between the number of codes acquired in the acquisition step and the number of gradations for expressing the depth value of the original image, and decoding the encoded image by multiplying the depth value of the encoded image acquired in the acquisition step by the obtained conversion ratio.