WO2011102131A1 - Image encoding device, image encoding method, program and integrated circuit - Google Patents

Image encoding device, image encoding method, program and integrated circuit

Info

Publication number
WO2011102131A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
image
unit
correction
encoding
Prior art date
Application number
PCT/JP2011/000875
Other languages
French (fr)
Japanese (ja)
Inventor
耕治 有村
重里 達郎
津田 賢治郎
一仁 木村
Original Assignee
Panasonic Corporation (パナソニック株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Panasonic Corporation
Publication of WO2011102131A1

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 - Motion estimation or motion compensation
    • H04N19/513 - Processing of motion vectors
    • H04N19/517 - Processing of motion vectors by encoding
    • H04N19/52 - Processing of motion vectors by encoding by predictive encoding
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 - Motion estimation or motion compensation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • The present invention relates to a high-efficiency image encoding device, and more particularly to a method for high-efficiency encoding of stereoscopic video data captured from a plurality of viewpoints using motion-compensated prediction.
  • Stereoscopic video display devices have been developed that produce stereoscopic viewing using the parallax between images observed with both eyes.
  • As a method for encoding such stereoscopic video, it is known to exploit the high correlation between the left-eye video and the right-eye video. Specifically, when one of the two images is encoded, a motion vector is obtained using the other image as a reference image, and motion compensation is performed. Encoding methods that realize highly efficient compression in this way have been proposed.
  • Similarly, H.264 MVC (Multiview Video Coding) has been standardized as an image encoding scheme that realizes multi-view encoding using motion compensation between viewpoints.
  • FIG. 6 shows a reference relationship between frames (pictures) in MVC.
  • When ordinary single-viewpoint video is encoded, a motion vector is detected by reference in the time-axis direction, that is, using another frame with a different imaging time as the reference image, and motion-compensated prediction is performed.
  • In MVC, in addition to temporal reference, reference between the viewpoints (V0 to V4) is possible: a motion vector is detected using a frame of a different viewpoint captured at the same time as the reference image, and motion-compensated prediction is performed.
  • As a conventional solution, Patent Document 1 discloses a technique that improves coding efficiency by detecting an accurate motion vector for use in encoding, using the motion vectors of the blocks around the current block and the motion vector at the same position in a past frame.
  • FIG. 7 shows a block diagram of the conventional apparatus of Patent Document 1. The conventional image coding apparatus 10 mainly includes a block matching unit 1, a parallax compensation vector detection unit 2, memories 3 and 6, a correction vector detection unit 4, and a variable delay unit 5.
  • Of two images captured by a pair of synchronized cameras (that is, images captured from different viewpoints), one is input to the image encoding apparatus 10 as the encoding target image and the other as the reference image.
  • The block matching unit 1 performs block matching between the reference image and each block (encoding target block) constituting the encoding target image.
  • The block matching result output from the block matching unit 1 is input to the parallax compensation vector detection unit 2.
  • The parallax compensation vector detection unit 2 detects the motion vector of the encoding target block based on the block matching result. The motion vector detected in this way is stored in the memory 3.
  • The correction vector detection unit 4 reads from the memory 3 the motion vectors of the blocks surrounding the encoding target block, and from the memory 6 the motion vectors of the co-located block in a past frame and its surrounding blocks. It then, for example, averages these motion vectors to detect an accurate motion vector for the encoding target block, as sketched after FIG. 8 below.
  • FIG. 8 is a schematic diagram showing a motion vector for each block constituting the encoding target image.
  • In FIG. 8, the motion vectors of the encoding target image, detected using the image of the other viewpoint as the reference image, are shown for each block.
  • To correct the motion vector of a given block, the correction vector detection unit 4 uses the motion vectors of the eight surrounding blocks. Coding efficiency is improved by performing motion-compensated predictive encoding using the motion vector corrected in this way.
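  • This neighbor-averaging correction can be illustrated by the following minimal Python sketch. It is an illustration only, assuming the per-block vectors are held in a 2-D grid of (x, y) tuples; the function and variable names are hypothetical, not taken from Patent Document 1.

      # Minimal sketch of the neighbor-averaging correction (hypothetical names).
      # mv_field is a 2-D grid of per-block motion vectors as (x, y) tuples.
      def corrected_vector(mv_field, row, col):
          """Average the motion vectors of the blocks around (row, col)."""
          acc_x = acc_y = n = 0
          for dr in (-1, 0, 1):
              for dc in (-1, 0, 1):
                  if dr == 0 and dc == 0:
                      continue  # skip the block being corrected
                  r, c = row + dr, col + dc
                  if 0 <= r < len(mv_field) and 0 <= c < len(mv_field[0]):
                      acc_x += mv_field[r][c][0]
                      acc_y += mv_field[r][c][1]
                      n += 1
          return (acc_x / n, acc_y / n) if n else mv_field[row][col]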
  • The videos of the viewpoints constituting a stereoscopic video are often captured by combining several standalone cameras, or by fixing multiple cameras into an integrated unit. For this reason, when one camera is taken as the reference, the other camera often exhibits a tilt (rotation) distinct from parallax, a vertical or horizontal shift, or a different size of the imaging target (imaging magnification).
  • In such cases, the motion-compensation residual signal (prediction error) becomes large, and high-efficiency encoding cannot be achieved.
  • FIGS. 9A to 9C show an example in which a first viewpoint image is used as a reference image, a second viewpoint image is used as an encoding target image, and a block including a part of a subject (star) is encoded.
  • a block in the first viewpoint image is a reference block obtained by motion vector detection, and a block in the second viewpoint image is an encoding target block.
  • FIG. 9A shows the case where there is no shift other than parallax between the images of the two viewpoints; the residual signal (prediction error) between the encoding target block and the reference block is small, so encoding can be performed with high efficiency.
  • However, if there is a tilt shift between the two viewpoint images as shown in FIG. 9B, or a size difference between them as shown in FIG. 9C, the residual signal (prediction error) becomes large, and high-efficiency encoding is not possible.
  • Similarly, FIG. 10A shows the case where there is no shift other than parallax between the images of the two viewpoints; the residual signal (prediction error) between the encoding target block and the reference block is small, so encoding can be performed with high efficiency.
  • As shown in FIG. 10B, however, when there is a vertical shift between the images of the two viewpoints, the reference block can fall at a position extending outside the image. As a result, the residual signal (prediction error) becomes large and high-efficiency encoding is not possible.
  • The present invention has been made in view of the above problems, and has as its object to provide an image encoding apparatus that easily and appropriately corrects shifts between two images caused by factors other than parallax.
  • The image encoding device according to one aspect of the present invention encodes stereoscopic video composed of at least two viewpoint videos. Specifically, it comprises: an acquisition unit that acquires the stereoscopic video; a correction unit that executes a correction process for correcting a shift in the size or position of the subject shown in the stereoscopic video acquired by the acquisition unit; a motion vector detection unit that detects a motion vector between the two viewpoint videos constituting the stereoscopic video corrected by the correction unit; and an encoding unit that compression-encodes the stereoscopic video corrected by the correction unit based on the motion vector detected by the motion vector detection unit. The correction unit executes the current correction process based on a motion vector detected by the motion vector detection unit before that correction process.
  • The correction unit may correct, based on the motion vector, at least one of a shift due to rotation of the displayed subject between the two viewpoints, a shift due to enlargement, and a shift due to parallel movement.
  • Further, the correction unit may detect, based on the direction of the motion vector, at least one of a shift due to rotation, a shift due to enlargement, and a shift due to parallel movement, and correct the shift indicated by the detection result.
  • Further, the correction unit may detect a shift due to parallel movement based on the vertical component of the motion vector.
  • Note that a shift due to parallel movement need not be detected from a plurality of per-block motion vectors; it can be detected from a single motion vector.
  • Further, the motion vector detection unit may detect a motion vector for each region smaller than the entire region of the stereoscopic video corrected by the correction unit.
  • In this case, the correction unit may detect the shift based on the tendency exhibited by the plurality of motion vectors detected for the respective regions.
  • For example, when the plurality of motion vectors tend to converge toward a predetermined position in the stereoscopic video, or tend to diverge from that position, the correction unit may detect a shift due to enlargement.
  • When the plurality of motion vectors tend to trace a circle in the stereoscopic video, the correction unit may detect a shift due to rotation.
  • Further, the encoding unit may start outputting the compression-encoded stereoscopic video when a predetermined period has elapsed after the start of encoding is instructed. Since images containing a shift are then not output, stereoscopic viewing becomes easier.
  • Further, the motion vector detection unit may start detecting motion vectors between the viewpoint videos of the stereoscopic video acquired by the acquisition unit even before the start of encoding is instructed. The correction unit may then execute the correction process on the first stereoscopic image acquired by the acquisition unit immediately after the start of encoding is instructed, using the latest motion vector detected by the motion vector detection unit.
  • In this way, the correction process can be executed even on the first image to be encoded, so that only images with substantially no shift are encoded.
  • For example, the acquisition unit may include a first imaging unit that images the subject from a first viewpoint and a second imaging unit that images the subject from a second viewpoint.
  • The motion vector detection unit may detect a motion vector for each block of the encoding target image, where one of the images captured at a first time at the first and second viewpoints is the encoding target image and the other is the reference image.
  • Based on the tendency of the plurality of motion vectors corresponding to the blocks of the encoding target image, the correction unit may then correct the images captured at the first and second viewpoints at a second time later than the first time.
  • The image encoding method according to one aspect of the present invention encodes stereoscopic video composed of at least two viewpoint videos. Specifically, it comprises: an acquisition step of acquiring the stereoscopic video; a correction step of executing a correction process for correcting a shift in the size or position of the subject shown in the stereoscopic video acquired in the acquisition step; a motion vector detection step of detecting a motion vector between the two viewpoint videos constituting the stereoscopic video corrected in the correction step; and an encoding step of compression-encoding the stereoscopic video corrected in the correction step based on the motion vector detected in the motion vector detection step. In the correction step, the correction process is executed based on a motion vector detected in the motion vector detection step before the current correction process.
  • The program according to one aspect of the present invention causes a computer to encode stereoscopic video composed of at least two viewpoint videos. Specifically, it causes the computer to execute: an acquisition step of acquiring the stereoscopic video; a correction step of executing a correction process for correcting a shift in the size or position of the subject shown in the stereoscopic video acquired in the acquisition step; a motion vector detection step of detecting a motion vector between the two viewpoint videos constituting the stereoscopic video corrected in the correction step; and an encoding step of compression-encoding the stereoscopic video corrected in the correction step based on the motion vector detected in the motion vector detection step. In the correction step, the correction process is executed based on a motion vector detected in the motion vector detection step before the current correction process.
  • The integrated circuit according to one aspect of the present invention encodes stereoscopic video composed of at least two viewpoint videos. Specifically, it comprises: an acquisition unit that acquires the stereoscopic video; a correction unit that executes a correction process for correcting a shift in the size or position of the subject shown in the acquired stereoscopic video; a motion vector detection unit that detects a motion vector between the two viewpoint videos constituting the stereoscopic video corrected by the correction unit; and an encoding unit that compression-encodes the stereoscopic video corrected by the correction unit based on the motion vector detected by the motion vector detection unit. The correction unit executes the correction process based on a motion vector detected by the motion vector detection unit before the current correction process.
  • According to the present invention, a shift caused by factors other than parallax can be corrected from the result of motion vector detection by inter-view reference.
  • As a result, highly efficient image coding by inter-view reference becomes possible.
  • Moreover, the corrected video is easy to view stereoscopically and is unlikely to cause eye fatigue.
  • Furthermore, since no new component such as a dedicated image-shift detection unit needs to be provided, an increase in circuit scale can be suppressed and power consumption reduced.
  • FIG. 1 is a block diagram of an image coding apparatus according to Embodiment 1 of the present invention.
  • FIG. 2A is a diagram illustrating a state in which images captured at the first and second viewpoints are arranged in the order of imaging.
  • FIG. 2B is a diagram illustrating a state in which images captured at the first and second viewpoints are arranged in the encoding order.
  • FIG. 3A is a flowchart showing main processing of the image coding apparatus according to Embodiment 1.
  • FIG. 3B is a flowchart illustrating an encoding process of the image encoding device according to Embodiment 1.
  • FIG. 3C is a flowchart showing a correction process of the image coding apparatus according to Embodiment 1.
  • FIG. 4A is a diagram illustrating an example of images at the first and second viewpoints when there is no imaging deviation.
  • FIG. 4B is a diagram illustrating an example of the first and second viewpoint images when the first viewpoint image is rotated with respect to the second viewpoint image.
  • FIG. 4C is a diagram illustrating an example of first and second viewpoint images in a case where the first viewpoint image is reduced with respect to the second viewpoint image.
  • FIG. 4D is a diagram illustrating an example of the first and second viewpoint images when the first viewpoint image is translated with respect to the second viewpoint image.
  • FIG. 5A is a block diagram of an image coding apparatus according to Embodiment 2 of the present invention.
  • FIG. 5B is a flowchart showing preprocessing of the image coding apparatus according to Embodiment 2 of the present invention.
  • FIG. 6 is a diagram illustrating the reference relationships of H.264 MVC (Multiview Video Coding).
  • FIG. 7 is a block diagram of a conventional image encoding device.
  • FIG. 8 is a diagram showing the motion vectors of the blocks constituting the encoding target image.
  • FIG. 9A is a diagram illustrating encoding efficiency when there is no deviation between the encoding target image and the reference image.
  • FIG. 9B is a diagram illustrating encoding efficiency when the encoding target image is rotated with respect to the reference image.
  • FIG. 9C is a diagram illustrating encoding efficiency when the encoding target image is enlarged with respect to the reference image.
  • FIG. 10A is a diagram illustrating encoding efficiency when there is no deviation between the encoding target image and the reference image.
  • FIG. 10B is a diagram illustrating encoding efficiency when the encoding target image is translated with respect to the reference image.
  • FIG. 1 is a block diagram of an image coding apparatus 100 according to Embodiment 1 of the present invention.
  • The image encoding apparatus 100 is an encoder compliant with the H.264 standard.
  • The correction unit 102 includes a correction value calculation unit 111 and an image correction unit 112.
  • The encoding unit 106 includes an intra-frame encoding unit 114 and an inter-frame encoding unit 115.
  • The left-eye imaging unit 101a outputs an image (video or still image) obtained by imaging the subject from a first viewpoint to the correction unit 102.
  • The right-eye imaging unit 101b outputs an image obtained by imaging the subject from a second viewpoint different from the first viewpoint to the correction unit 102.
  • The image output from the left-eye imaging unit 101a and the image output from the right-eye imaging unit 101b therefore have parallax.
  • Together, the images captured by the left-eye imaging unit 101a and the right-eye imaging unit 101b constitute a stereoscopic video composed of two viewpoint videos.
  • FIG. 2A is a schematic diagram illustrating an image (video) captured by the left-eye imaging unit 101a and the right-eye imaging unit 101b.
  • The left-eye imaging unit 101a and the right-eye imaging unit 101b operate in synchronization and, as shown in FIG. 2A, each outputs one frame (picture) at each common time (t0, t1, ..., t6).
  • In other words, the left-eye imaging unit 101a and the right-eye imaging unit 101b constitute an acquisition unit that acquires images.
  • Note that the left-eye imaging unit 101a and the right-eye imaging unit 101b are not essential components and may be omitted; that is, an acquisition unit may acquire and process images captured by an external imaging device.
  • For example, the acquisition unit (not shown) may acquire the stereoscopic video from a broadcast wave.
  • The format of the stereoscopic video acquired from the broadcast wave is not particularly limited.
  • For example, a side-by-side format, in which the left half of one picture is the first-viewpoint image and the right half is the second-viewpoint image, or a top-and-bottom format, in which the upper half is the first-viewpoint image and the lower half is the second-viewpoint image, may be used. With these formats, transmission and reception can be performed in the same manner as conventional planar (2D) video.
  • Alternatively, the first-viewpoint image and the second-viewpoint image may be transmitted and received alternately, picture by picture. In this case, high-definition stereoscopic video can be transmitted and received, although the frame rate is twice the conventional rate. A sketch of unpacking these formats follows.
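  • As a concrete illustration of these transport formats, the following sketch splits a packed frame into its two viewpoint images. It is a simplification (it ignores the half-resolution upscaling a real decoder would perform), and the function names are assumptions of this illustration.

      # Unpack frame-packed stereoscopic formats (illustrative only).
      def split_side_by_side(frame):
          """frame: 2-D list of pixels; left half = view 1, right half = view 2."""
          w = len(frame[0]) // 2
          return [row[:w] for row in frame], [row[w:] for row in frame]

      def split_top_and_bottom(frame):
          """Upper half = view 1, lower half = view 2."""
          h = len(frame) // 2
          return frame[:h], frame[h:]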
  • The correction unit 102 performs a correction process for correcting an imaging shift on at least one of the images input from the left-eye imaging unit 101a and the right-eye imaging unit 101b. More specifically, the correction unit 102 corrects at least one of a shift due to rotation of the displayed subject between the two viewpoints, a shift due to enlargement, and a shift due to parallel movement, based on the magnitude and/or direction of the motion vectors. The correction unit 102 then outputs the corrected image to the multiplexing unit 103.
  • Here, the imaging shift can be defined, for example, as a shift in the size or position at which the subject is displayed. Specifically, cases are considered in which the first-viewpoint image is enlarged or reduced (a size shift), or rotated or translated (a position shift), with respect to the second-viewpoint image captured at the same time.
  • Alternatively, the imaging shift can be defined as a shift (vertical shift, size shift, tilt shift, etc.) arising from a cause other than parallax.
  • A "shift arising from a cause other than parallax" refers, for example, to a shift caused by an installation error of the left-eye imaging unit 101a and the right-eye imaging unit 101b, a mismatch in imaging magnification, or the like.
  • The correction value calculation unit 111 determines the type of imaging shift based on the tendency of the directions of the plurality of motion vectors, and calculates the magnitude of the imaging shift based on the tendency of their magnitudes.
  • For example, the correction value calculation unit 111 detects a shift due to parallel movement based on the vertical components of the motion vectors. Specifically, if the motion vectors of the blocks point in substantially the same direction (for example, upward or downward) and have substantially the same magnitude, it can be determined that the shift is due to parallel movement.
  • The correction value calculation unit 111 may also detect a shift due to rotation or a shift due to enlargement based on the tendency of the directions of the plurality of motion vectors. Specifically, when the motion vectors tend to converge toward a predetermined position in the stereoscopic video, or tend to diverge from that position, it can be determined that the shift is due to enlargement; when the motion vectors tend to trace a circle, it can be determined that the shift is due to rotation. These decision rules are sketched in code below.
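  • The following Python sketch illustrates these decision rules. The thresholds and the center-based dot/cross-product test are assumptions of this illustration, not values from the specification.

      # Classify the dominant imaging shift from a per-block motion vector field.
      def classify_shift(mv_field, eps=0.5, agree=0.8):
          rows, cols = len(mv_field), len(mv_field[0])
          cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
          flat = [v for row in mv_field for v in row]
          n = len(flat)
          mean_x = sum(v[0] for v in flat) / n
          mean_y = sum(v[1] for v in flat) / n
          # Parallel movement: vectors share direction and magnitude (FIG. 4D).
          same = sum(1 for v in flat
                     if abs(v[0] - mean_x) <= eps and abs(v[1] - mean_y) <= eps)
          if same >= agree * n:
              if abs(mean_x) <= eps and abs(mean_y) <= eps:
                  return None          # vectors ~ (0, 0): no imaging shift (FIG. 4A)
              return "translation"
          radial = tangential = 0.0
          for r in range(rows):
              for c in range(cols):
                  px, py = c - cx, r - cy          # block position w.r.t. centre
                  vx, vy = mv_field[r][c]
                  radial += px * vx + py * vy      # dot product: converge/diverge
                  tangential += px * vy - py * vx  # cross product: circular motion
          if abs(radial) > abs(tangential):
              return "enlargement/reduction"       # vectors converge or diverge
          return "rotation"                        # vectors trace a circle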
  • The image correction unit 112 executes the correction process on at least one of the images captured at the same time from the first and second viewpoints, according to the type and magnitude of the imaging shift calculated by the correction value calculation unit 111. Details of the operation of the correction unit 102 are described later.
  • The multiplexing unit 103 reorders the images acquired from the correction unit 102 into encoding order and outputs them to the switching unit 104.
  • FIG. 2B is a diagram illustrating the order of images (coding order) after the images in FIG. 2A are input to the multiplexing unit 103 and multiplexed.
  • “I”, “P”, and “B” represent the encoding type of each frame.
  • I is an intra-frame prediction frame (I picture)
  • P is a unidirectional inter-frame prediction frame (P picture)
  • B is a bi-directional inter-frame prediction frame (B picture).
  • An “arrow” indicates a reference destination when performing inter-viewpoint reference.
  • Each block constituting a first-viewpoint image is encoded using only first-viewpoint (same viewpoint) images captured at different times as reference images.
  • For example, each block of the frame F2 is encoded using only the frame F0 as a reference image.
  • Each block of the frame F4 is encoded using the frame F0 or the frame F2 as a reference image.
  • Each block of the frame F6 is encoded using the frame F2 or the frame F4 as a reference image.
  • Each block constituting a second-viewpoint image is encoded using as a reference image either a first-viewpoint (other viewpoint) image captured at the same time or a second-viewpoint (same viewpoint) image captured at a different time.
  • For example, each block of the frame F1 is encoded using only the frame F0 as a reference image.
  • Each block of the frame F3 is encoded using the frame F1 or the frame F2 as a reference image.
  • Each block of the frame F5 is encoded using the frame F1, the frame F3, or the frame F4 as a reference image.
  • Each block of the frame F7 is encoded using the frame F3, the frame F5, or the frame F6 as a reference image.
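  • Written out as a lookup table, the reference structure of FIG. 2B described above is:

      # Allowed reference frames per encoding target frame, from FIG. 2B.
      REFS = {
          "F2": ["F0"], "F4": ["F0", "F2"], "F6": ["F2", "F4"],   # 1st viewpoint
          "F1": ["F0"],                                           # 2nd viewpoint
          "F3": ["F1", "F2"],
          "F5": ["F1", "F3", "F4"],
          "F7": ["F3", "F5", "F6"],
      }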
  • FIG. 3A is a flowchart showing a processing procedure of the main processing. With reference to FIG. 3A, the flow of the operation of the image coding apparatus 100 will be briefly described.
  • First, the left-eye imaging unit 101a acquires the first-viewpoint image and the right-eye imaging unit 101b acquires the second-viewpoint image (S201). Note that an image encoding device that does not include the left-eye imaging unit 101a and the right-eye imaging unit 101b may acquire the images from an external device.
  • Next, the correction unit 102 performs correction processing on the images acquired from the left-eye imaging unit 101a and the right-eye imaging unit 101b. A specific processing procedure of the correction process is described later with reference to FIG. 3C.
  • Next, the switching units 104 and 108, the motion vector detection unit 105, the encoding unit 106, the reference image memory 107, the variable-length encoding unit 109, and the encoding mode control unit 110 encode the image that has been corrected by the correction unit 102 and multiplexed by the multiplexing unit 103 (S203). A specific processing procedure of the encoding process is described later with reference to FIG. 3B.
  • FIG. 3B is a flowchart showing the procedure of the encoding process. With reference to FIG. 3B, the operation of the components after the switching unit 104 will be described in detail.
  • First, the switching unit 104 obtains from the encoding mode control unit 110 the encoding type of the encoding target image acquired from the multiplexing unit 103.
  • For an intra-frame prediction frame, the switching unit 104 outputs the encoding target image to the intra-frame encoding unit 114 of the encoding unit 106.
  • For an inter-frame prediction frame, the switching unit 104 outputs the encoding target image to the motion vector detection unit 105 at the same time as to the intra-frame encoding unit 114.
  • That is, the encoding target image is always intra-frame encoded by the intra-frame encoding unit 114 (S301). Furthermore, when the encoding mode control unit 110 determines that the frame is an inter-frame prediction frame (Yes in S302), the motion vector detection unit 105 additionally detects a motion vector (S303).
  • The intra-frame encoding unit 114 performs intra-frame encoding on the input encoding target image (S301). Specifically, the intra-frame encoding unit 114 performs intra-frame prediction for each block (encoding target block) constituting the encoding target image to generate a prediction block. Next, it calculates a prediction error (residual signal) by subtracting the prediction block from the encoding target block, and then calculates quantized coefficients by orthogonally transforming and quantizing the prediction error. The quantized coefficients and encoding information are output to the switching unit 108. Further, the intra-frame encoding unit 114 inversely quantizes and inversely orthogonally transforms the quantized coefficients and adds the prediction block to create a locally decoded image, which is stored in the reference image memory 107 as a reference image for subsequent inter-frame prediction frames.
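  • The per-block intra path described above can be sketched as follows. This is a deliberately simplified illustration: it uses a flat DC prediction and a uniform quantizer and omits the orthogonal transform, so it is not the actual H.264 toolchain.

      def intra_encode_block(block, qstep=8):
          """block: 2-D list of pixels. Returns (quantized residual, local decode)."""
          h, w = len(block), len(block[0])
          dc = sum(sum(row) for row in block) / (h * w)
          pred = [[dc] * w for _ in range(h)]                     # prediction block
          resid = [[block[r][c] - pred[r][c] for c in range(w)] for r in range(h)]
          q = [[round(x / qstep) for x in row] for row in resid]  # quantized coeffs
          # Local decode: dequantize and add the prediction back; this is what is
          # stored in the reference image memory for later frames.
          recon = [[pred[r][c] + q[r][c] * qstep for c in range(w)] for r in range(h)]
          return q, recon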
  • The motion vector detection unit 105 detects a motion vector between the two viewpoint videos constituting the stereoscopic video corrected by the correction unit 102 by performing block matching between the encoding target block and the reference image. Specifically, the motion vector detection unit 105 detects a motion vector for each region (block) smaller than the entire region of the stereoscopic video corrected by the correction unit 102.
  • More specifically, the motion vector detection unit 105 acquires from the reference image memory 107 the locally decoded image designated by the encoding mode control unit 110, performs block matching of the encoding target block using the acquired locally decoded image as the reference image, and detects a motion vector for each block (S303). Note that the encoding mode control unit 110 may designate one or more reference images.
  • The motion vector detection unit 105 outputs the detected motion vector to the inter-frame encoding unit 115 of the encoding unit 106. Furthermore, when the reference image and the encoding target image are images of different viewpoints (Yes in S304), the motion vector detection unit 105 also outputs the obtained motion vector to the correction value calculation unit 111 of the correction unit 102 (S305).
  • In the example of FIG. 2B, the motion vector detection unit 105 outputs to the correction value calculation unit 111, for example, the motion vectors of the frame F1 detected using the frame F0 as the reference image, the motion vectors of the frame F5 detected using the frame F4 as the reference image, and the motion vectors of the frame F7 detected using the frame F6 as the reference image.
  • The inter-frame encoding unit 115 inter-frame encodes the input encoding target image. Specifically, for each block (encoding target block) constituting the encoding target image, the inter-frame encoding unit 115 performs motion compensation using the motion vector acquired from the motion vector detection unit 105 to generate a prediction block. Next, it calculates a prediction error (residual signal) by subtracting the prediction block from the encoding target block, and then calculates quantized coefficients by orthogonally transforming and quantizing the prediction error.
  • The inter-frame encoding unit 115 outputs the quantized coefficients and encoding information to the switching unit 108. Further, it inversely quantizes and inversely orthogonally transforms the quantized coefficients and adds the prediction block to create a locally decoded image, which is stored in the reference image memory 107 as a reference image for subsequent inter-frame prediction frames.
  • Based on the quantized coefficients and other encoding information output from the encoding unit 106 (the intra-frame encoding unit 114 and the inter-frame encoding unit 115), the encoding mode control unit 110 determines, for each encoding target block, whether to encode it by intra-frame predictive encoding or inter-frame predictive encoding using a known evaluation formula, and controls the switching unit 108 accordingly.
  • Under this control, the switching unit 108 outputs one of the sets of quantized coefficients obtained from the intra-frame encoding unit 114 and the inter-frame encoding unit 115 to the variable-length encoding unit 109 (S306).
  • The variable-length encoding unit 109 performs variable-length encoding on the quantized coefficients and encoding information acquired from the switching unit 108, and outputs the result as encoded data (S307). The image coding apparatus 100 executes the above processing (S301 to S307) for all the blocks constituting the encoding target image (S308).
  • FIG. 3C is a flowchart illustrating a processing procedure of the correction unit 102.
  • FIGS. 4A to 4D are diagrams showing the relationship between the imaging shift between the first- and second-viewpoint images and the tendency of the motion vectors.
  • First, the correction value calculation unit 111 aggregates the motion vectors, detects the type and magnitude of the imaging shift between the first- and second-viewpoint images, and calculates a correction value (S311).
  • Note that the correction value calculation unit 111 excludes the influence of parallax from the motion vectors detected by the motion vector detection unit 105. Since parallax is a shift in the horizontal component, the vertical component can be attributed to the imaging shift. Furthermore, an object at the convergence point, where the lens optical axes of the left- and right-eye imaging units intersect as set at the time of shooting, has a parallax of almost zero; a motion vector detected for such an object is therefore essentially the imaging shift itself. Accordingly, if an object at the convergence point is photographed from the two different viewpoints, the motion vector detected between the two captured images can be regarded as the imaging shift.
  • Also, objects at the same distance from the imaging units have parallax of the same magnitude in the same direction. Therefore, for example, a motion vector (direction and magnitude) corresponding to the parallax may be set in the correction value calculation unit 111 in advance and subtracted, as sketched below.
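  • A minimal sketch of this parallax exclusion, under the assumptions just described (parallax is horizontal; a preset parallax vector may be configured), might look like this; the names are illustrative.

      def exclude_parallax(mv, preset_parallax=None):
          """mv: (x, y) motion vector between same-time viewpoint images."""
          if preset_parallax is not None:
              # Subtract a preconfigured parallax vector (direction and magnitude).
              return (mv[0] - preset_parallax[0], mv[1] - preset_parallax[1])
          # Parallax is a horizontal shift, so keep only the vertical component;
          # at the convergence point the whole vector is already imaging shift.
          return (0.0, mv[1])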
  • FIGS. 4A to 4D plot, for each block, the motion vectors detected by the motion vector detection unit 105 with the second-viewpoint image as the encoding target image and the first-viewpoint image captured at the same time as the reference image, after the influence of parallax has been excluded.
  • FIG. 4A shows the tendency of the motion vector when there is no deviation other than the parallax between the encoding target image and the reference image. In this case, since the encoding target image matches the reference image, the motion vector of each block tends to be (0, 0).
  • FIG. 4B shows the tendency of the motion vector when the reference image is rotated with respect to the encoding target image.
  • In this case, the motion vectors of the blocks tend to be arranged so as to trace a circle as a whole. Such a situation occurs, for example, when the left-eye imaging unit 101a is installed at a tilt.
  • In the example of FIG. 4B, the motion vectors of the blocks are arranged so as to trace a counterclockwise circle; that is, it can be determined that the reference image is rotated counterclockwise with respect to the encoding target image. The center of rotation can be estimated as the position where the magnitude of the motion vector is smallest (in this example, the image center). Furthermore, the degree of rotation (rotation angle) can be estimated from the magnitudes of the motion vectors and their distances from the center of rotation.
  • Next, a method for determining that the type of imaging shift is rotation, and for calculating the direction of rotation and the correction value, is described. The method shown below is one example; other methods can also be used.
  • First, preprocessing is executed prior to calculating the rotation direction and the correction value. This preprocessing is also executed in common when the type of imaging shift is enlargement (reduction) or parallel movement.
  • In the preprocessing, the frame-average motion vector MVave, the average motion vector MVaveH[j] of each horizontal macroblock line, and the average motion vector MVaveV[i] of each vertical macroblock line are calculated.
  • Here, the number of horizontal macroblock lines is denoted mby, and the number of vertical macroblock lines mbx.
  • For motion vectors in image coding, the horizontal component (x component) is positive toward the right of the image and negative toward the left, while the vertical component (y component) is positive toward the bottom of the image and negative toward the top.
  • The frame-average motion vector MVave can be calculated using Equation 1. In the examples of FIGS. 4B to 4D, it is the average of 12 motion vectors.
  • The average motion vector MVaveH[j] of each horizontal macroblock line can be calculated using Equation 2. In the examples of FIGS. 4B to 4D, the average motion vector of each of the three horizontal macroblock lines (rows) is calculated.
  • The average motion vector MVaveV[i] of each vertical macroblock line can be calculated using Equation 3. In the examples of FIGS. 4B to 4D, the average motion vector of each of the four vertical macroblock lines (columns) is calculated.
  • Note that here the average motion vectors are calculated in units of one macroblock line in both the horizontal and vertical directions, but the present invention is not limited to this; the average motion vectors may also be calculated in units of several macroblock lines.
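  • The published Equations 1 to 3 are not reproduced in this text, so the following reconstruction is an assumption consistent with the description above: a frame average, a per-row average, and a per-column average of the block motion vectors.

      def mv_averages(mv_field):
          """mv_field[j][i]: (x, y) vector of the macroblock in row j, column i."""
          mby, mbx = len(mv_field), len(mv_field[0])  # horizontal/vertical MB lines
          def avg(vectors):
              n = len(vectors)
              return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)
          mvave = avg([v for row in mv_field for v in row])       # Eq. 1 (frame)
          mvave_h = [avg(mv_field[j]) for j in range(mby)]        # Eq. 2 (rows)
          mvave_v = [avg([mv_field[j][i] for j in range(mby)])    # Eq. 3 (columns)
                     for i in range(mbx)]
          return mvave, mvave_h, mvave_v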
  • Vflag in Equation 4 is true when the vertical component y of MVaveV monotonically decreases and the horizontal component x of MVaveV is at most the threshold VTh. Since the threshold VTh is set to a value close to 0, the latter half of Equation 4 can be read as the horizontal component x of MVaveV being approximately 0.
  • Hflag in Equation 5 is true when the horizontal component x of MVaveH monotonically increases and the vertical component y of MVaveH is at most the threshold HTh. Since the threshold HTh is set to a value close to 0, the latter half of Equation 5 can be read as the vertical component y of MVaveH being approximately 0.
  • Equation 6 is true when the average motion vector MVave of the frame is at most the threshold FTh. Since the threshold FTh is set to a value close to 0, Equation 6 can be read as MVave being approximately 0. When Equations 4 to 6 are all true, the imaging shift can be judged to be a counterclockwise rotation.
  • Conversely, Vflag in Equation 7 is true when the vertical component y of MVaveV monotonically increases and the horizontal component x of MVaveV is at most the threshold VTh; since VTh is close to 0, the latter half of Equation 7 can be read as the horizontal component x of MVaveV being approximately 0.
  • Hflag in Equation 8 is true when the horizontal component x of MVaveH monotonically decreases and the vertical component y of MVaveH is at most the threshold HTh; since HTh is close to 0, the latter half of Equation 8 can be read as the vertical component y of MVaveH being approximately 0. When these conditions and Equation 6 hold, the rotation is judged to be clockwise.
  • Note that when the number of horizontal macroblock lines mby is odd, the motion vector average of the central horizontal macroblock line is excluded from the calculation and the calculation is performed with mby decremented by 1. Likewise, when the number of vertical macroblock lines mbx is odd, the motion vector average of the central vertical macroblock line is excluded and the calculation is performed with mbx decremented by 1.
  • The rotation angle X of Equation 9 is obtained as a positive value for counterclockwise rotation, and the rotation angle Y of Equation 10 as a positive value for clockwise rotation.
  • The correction value (rotation angle) for correcting the frame for which the motion vectors were calculated is the rotation angle of the entire frame; conversely, the correction value (rotation angle) for correcting the reference frame is the rotation in the opposite direction.
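  • The rotation test can be sketched as follows. The flag conditions mirror Equations 4 to 8 as described above; Equations 9 and 10 themselves are not reproduced in the text, so the angle estimate at the end is only an illustrative stand-in.

      import math

      def _mono(seq, decreasing):
          pairs = list(zip(seq, seq[1:]))
          return all(a > b for a, b in pairs) if decreasing else all(a < b for a, b in pairs)

      def detect_rotation(mvave, mvave_h, mvave_v, vth=0.5, hth=0.5, fth=0.5, mb=16):
          v_y = [v[1] for v in mvave_v]                  # y components per column
          h_x = [h[0] for h in mvave_h]                  # x components per row
          v_ok = all(abs(v[0]) <= vth for v in mvave_v)  # x of MVaveV ~ 0
          h_ok = all(abs(h[1]) <= hth for h in mvave_h)  # y of MVaveH ~ 0
          aflag = abs(mvave[0]) <= fth and abs(mvave[1]) <= fth          # Eq. 6
          ccw = _mono(v_y, True) and v_ok and _mono(h_x, False) and h_ok and aflag
          cw = _mono(v_y, False) and v_ok and _mono(h_x, True) and h_ok and aflag
          if not (ccw or cw):
              return None
          # Illustrative angle from the spread of the column averages over the
          # frame width (mb-pixel macroblocks); not the patented Equations 9/10.
          span = v_y[0] - v_y[-1] if ccw else v_y[-1] - v_y[0]
          angle = math.degrees(math.atan2(span, mb * (len(mvave_v) - 1)))
          return ("counterclockwise" if ccw else "clockwise"), angle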
  • FIG. 4C shows the tendency of the motion vectors when the sizes of the reference image and the encoding target image do not match.
  • In this case, the motion vectors of the blocks tend to be arranged radially as a whole. Such a situation occurs, for example, when the imaging magnifications of the left-eye imaging unit 101a and the right-eye imaging unit 101b differ.
  • In the example of FIG. 4C, the motion vector of each block points toward the image center of the encoding target image; that is, it can be determined that the reference image is reduced with respect to the encoding target image.
  • The reduction ratio can be estimated, for example, from the average magnitude of the motion vectors.
  • Next, a method for determining the direction of the size shift (whether it is enlargement or reduction) and for calculating the correction value is described. The method shown below is one example; other methods can also be used.
  • Vflag in Equation 11 is true when the horizontal component x of MVaveV monotonically decreases and the vertical component y of MVaveV is at most the threshold VTh. Since the threshold VTh is set to a value close to 0, the latter half of Equation 11 can be read as the vertical component y of MVaveV being approximately 0.
  • Hflag in Equation 12 is true when the vertical component y of MVaveH monotonically decreases and the horizontal component x of MVaveH is at most the threshold HTh. Since the threshold HTh is set to a value close to 0, the latter half of Equation 12 can be read as the horizontal component x of MVaveH being approximately 0.
  • Aflag in Equation 13 is true when the average motion vector MVave of the frame is at most the threshold FTh. Since the threshold FTh is set to a value close to 0, Equation 13 can be read as MVave being approximately 0. When Vflag, Hflag, and Aflag are all true, the motion vectors converge toward the image center, and it can be judged that the reference image is reduced with respect to the encoding target image.
  • Conversely, Vflag in Equation 14 is true when the horizontal component x of MVaveV monotonically increases and the vertical component y of MVaveV is at most the threshold VTh; since VTh is close to 0, the latter half of Equation 14 can be read as the vertical component y of MVaveV being approximately 0.
  • Hflag in Equation 15 is true when the vertical component y of MVaveH monotonically increases and the horizontal component x of MVaveH is at most the threshold HTh; since HTh is close to 0, the latter half of Equation 15 can be read as the horizontal component x of MVaveH being approximately 0. When these conditions and Equation 13 hold, the motion vectors diverge, and the reference image can be judged to be enlarged.
  • As before, when the number of horizontal macroblock lines mby is odd, the motion vector average of the central horizontal macroblock line is excluded from the calculation and the calculation is performed with mby decremented by 1; likewise, when the number of vertical macroblock lines mbx is odd, the motion vector average of the central vertical macroblock line is excluded and the calculation is performed with mbx decremented by 1.
  • The reduction ratio of the entire frame is obtained as (XR + YR) / 2, the average of the vertical reduction ratio XR and the horizontal reduction ratio YR.
  • The correction value for correcting the frame for which the motion vectors were calculated is this enlargement (reduction) ratio; conversely, the correction value for correcting the reference frame is its reciprocal.
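  • By analogy with the rotation case, a sketch of the enlargement/reduction test follows. The flag conditions mirror Equations 11 to 15 as described; the per-axis ratios XR and YR are illustrative stand-ins (assuming at least a 2 x 2 macroblock grid), because the published formulas are not reproduced in the text.

      def detect_scaling(mvave, mvave_h, mvave_v, vth=0.5, hth=0.5, fth=0.5, mb=16):
          v_x = [v[0] for v in mvave_v]                  # x components per column
          h_y = [h[1] for h in mvave_h]                  # y components per row
          v_ok = all(abs(v[1]) <= vth for v in mvave_v)  # y of MVaveV ~ 0
          h_ok = all(abs(h[0]) <= hth for h in mvave_h)  # x of MVaveH ~ 0
          aflag = abs(mvave[0]) <= fth and abs(mvave[1]) <= fth          # Eq. 13
          dec = lambda s: all(a > b for a, b in zip(s, s[1:]))
          inc = lambda s: all(a < b for a, b in zip(s, s[1:]))
          converge = dec(v_x) and v_ok and dec(h_y) and h_ok and aflag  # Eqs. 11-12
          diverge = inc(v_x) and v_ok and inc(h_y) and h_ok and aflag   # Eqs. 14-15
          if not (converge or diverge):
              return None
          width, height = mb * (len(mvave_v) - 1), mb * (len(mvave_h) - 1)
          yr = 1 - (v_x[0] - v_x[-1]) / width   # horizontal ratio YR (stand-in)
          xr = 1 - (h_y[0] - h_y[-1]) / height  # vertical ratio XR (stand-in)
          ratio = (xr + yr) / 2                 # overall ratio, as in the text
          return ("reduced" if converge else "enlarged"), ratio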
  • FIG. 4D shows the tendency of the motion vectors when the reference image is displaced in one direction (translated) with respect to the encoding target image.
  • In this case, the motion vectors of the blocks tend to point in the same direction as a whole. Such a situation occurs, for example, when the left-eye imaging unit 101a does not face the subject accurately.
  • In the example of FIG. 4D, the motion vector of each block points upward; that is, it can be determined that the reference image is shifted upward with respect to the encoding target image. The magnitude of the shift can be estimated from the magnitudes of the motion vectors.
  • Specifically, the average motion vector of the frame and the motion vector of each block are compared component by component, and the number of blocks cnt whose difference from the average motion vector is within the threshold mvTh is counted; when cnt is sufficiently large, the shift can be judged to be a parallel translation.
  • The shift amount of the entire frame is given by the frame-average motion vector MVave of Equation 1.
  • The correction value for correcting the frame for which the motion vectors were calculated is the frame motion vector MVave; the correction value for correcting the reference frame is MVave multiplied by -1.
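  • A sketch of the translation test and its correction values, following the description above (the names cnt and mvTh are from the text; the acceptance fraction is an assumption of this illustration):

      def detect_translation(mv_field, mvave, mv_th=1.0, min_fraction=0.8):
          flat = [v for row in mv_field for v in row]
          # cnt: blocks whose vector differs from the frame average by <= mvTh.
          cnt = sum(1 for v in flat
                    if abs(v[0] - mvave[0]) <= mv_th and abs(v[1] - mvave[1]) <= mv_th)
          if cnt < min_fraction * len(flat):
              return None
          # Correct the analysed frame by MVave, or the reference frame by -MVave.
          return {"current": mvave, "reference": (-mvave[0], -mvave[1])}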
  • In this way, the correction value calculation unit 111 calculates a correction value for correcting the detected imaging shift and outputs it to the image correction unit 112 (S311).
  • Note that the types of imaging shift shown in FIGS. 4A to 4D are examples, and the correction value calculation unit 111 can also detect other types of imaging shift. The imaging shifts described with reference to FIGS. 4A to 4D may also occur in combination.
  • Next, the image correction unit 112 corrects at least one of the first- and second-viewpoint images captured by the left-eye imaging unit 101a and the right-eye imaging unit 101b, based on the input correction value.
  • Note that the images corrected here are those captured at a time (a second time) after the time (a first time) at which the encoding target image and the reference image shown in FIGS. 4A to 4D were captured.
  • the image correction unit 112 performs correction by extracting a common part from each image captured at the same time from each of the first and second viewpoints when the imaging shift between the images is vertical movement, horizontal translation, or inclination. May be. In addition, when the displacement between the images does not match, correction may be performed by enlarging or reducing one of the images captured at the same time from the first and second viewpoints.
  • the imaging deviation may be corrected.
  • Note that when the switching unit 108 acquires the quantized coefficients and encoding information from the encoding unit 106, it need not output them to the variable-length encoding unit 109 immediately; it may discard them instead.
  • For example, the switching unit 108 may discard the encoded information acquired from the start of processing (that is, from when the first encoded information is acquired) until a predetermined period elapses, and output to the variable-length encoding unit 109 only the encoded information obtained after that period.
  • The predetermined period may be, for example, the period until N (N is an integer of 1 or more) frames' worth of encoded information has been acquired.
  • Alternatively, the correction unit 102 may notify the encoding control unit 116 of whether an imaging shift remains between the first- and second-viewpoint images (that is, whether the correction value is 0).
  • When the encoding control unit 116 determines that there is no longer an imaging shift between the images (that the correction value has become 0), it inputs an encoding start signal to the switching unit 108. After receiving the encoding start signal from the encoding control unit 116, the switching unit 108 starts outputting the quantized coefficients and encoding information input from the encoding unit 106 to the variable-length encoding unit 109. A sketch of this gating behaviour follows.
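  • Both variants of this output gating can be sketched together as follows; the class and method names are assumptions for illustration.

      class OutputGate:
          """Discards encoded data until N frames have passed or the
          correction value has reached zero (encoding start signal)."""
          def __init__(self, warmup_frames):
              self.warmup_frames = warmup_frames  # the predetermined period N
              self.frames_seen = 0
              self.started = False                # set by the start signal

          def on_correction_value(self, value):
              if value == 0:                      # no remaining imaging shift
                  self.started = True             # encoding start signal

          def forward(self, frame_data):
              """Return data for the variable-length coder, or None to discard."""
              self.frames_seen += 1
              if self.started or self.frames_seen > self.warmup_frames:
                  return frame_data
              return None                         # still within the settling period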
  • FIG. 5A is a block diagram of an image coding apparatus 200 according to Embodiment 2 of the present invention.
  • FIG. 5B is a flowchart illustrating a processing procedure of preprocessing of the image encoding device 200.
  • The image encoding device 200 according to Embodiment 2 has a configuration in which an encoding control unit 116 is added to the configuration of the image encoding device 100 according to Embodiment 1.
  • The operations of the parts other than the correction unit 102 and the switching unit 108 are the same as in Embodiment 1; the description below focuses on the differences.
  • First, the left-eye imaging unit 101a and the right-eye imaging unit 101b of the image encoding device 200 start imaging in response to the power being turned on (Yes in S401).
  • Note that the timing of starting imaging is not limited to this example; it suffices that imaging is started before the start of the encoding process is instructed.
  • Next, the motion vector detection unit 105 acquires the images captured by the left-eye imaging unit 101a and the right-eye imaging unit 101b via the correction unit 102, the multiplexing unit 103, and the switching unit 104 (S402). It then detects the motion vector of each acquired image (S403) and stores only the latest motion vector.
  • This processing (S402, S403) is executed repeatedly until the start of the encoding process is instructed. During this period, the correction unit 102 does not execute the correction process.
  • Note that the latest motion vector may be stored in the correction unit 102 instead of the motion vector detection unit 105.
  • When the start of the encoding process is instructed, the image encoding apparatus 200 executes the main processing shown in FIG. 3A (S405).
  • However, this embodiment differs from Embodiment 1 in that, for the image acquired immediately after the start of the encoding process is instructed (the first image), the correction process in FIG. 3A is executed using the latest motion vector already detected by the motion vector detection unit 105.
  • As a result, the correction process can be applied even to the image immediately after the start of encoding, so that only images from which the imaging shift has been removed are encoded.
  • Embodiment 3: The configuration of the third embodiment of the present invention is the same as that of the second embodiment, but the operations of the left-eye imaging unit 101a and the right-eye imaging unit 101b differ. Specifically, the left-eye imaging unit 101a and the right-eye imaging unit 101b according to Embodiment 3 continuously capture still images and sequentially output the captured images (still images) to the correction unit 102, which is the point of difference from Embodiment 2.
  • The correction unit 102 does not perform the correction process on the images acquired from the left-eye imaging unit 101a and the right-eye imaging unit 101b until the start of the encoding process is instructed, and outputs them to the multiplexing unit 103 as they are. The motion vector detection unit 105 acquires the images captured by the left-eye imaging unit 101a and the right-eye imaging unit 101b via the correction unit 102, the multiplexing unit 103, and the switching unit 104, detects the motion vector of each acquired image, and stores only the latest motion vector. The correction unit 102 then performs the correction process on the image acquired immediately after the start of the correction process is instructed, using the latest motion vector detected by the motion vector detection unit 105 immediately beforehand.
  • The above configuration enables high-efficiency encoding of still stereoscopic images. It also makes it possible to encode only still stereoscopic image data with no imaging shift between the images, yielding encoded still-image data that is easy to view stereoscopically and unlikely to cause eye fatigue.
  • Note that the left-eye imaging unit 101a and the right-eye imaging unit 101b may each have an internal image memory, capture a single still image, store the captured image in that memory, and continuously output the same image (still image) to the correction unit 102.
  • Each of the above devices is, specifically, a computer system including a microprocessor, a ROM, a RAM, a hard disk unit, a display unit, a keyboard, a mouse, and the like.
  • A computer program is stored in the RAM or the hard disk unit.
  • Each device achieves its functions by the microprocessor operating according to the computer program.
  • Here, the computer program is configured by combining a plurality of instruction codes indicating instructions to the computer in order to achieve a predetermined function.
  • Some or all of the constituent elements of each of the above devices may be realized as a single system LSI. The system LSI is an ultra-multifunctional LSI manufactured by integrating a plurality of components on a single chip; specifically, it is a computer system including a microprocessor, a ROM, a RAM, and the like. A computer program is stored in the RAM, and the system LSI achieves its functions by the microprocessor operating according to the computer program.
  • Some or all of the constituent elements of each of the above devices may be constituted by an IC card or a single module that can be attached to and detached from the device.
  • The IC card or module is a computer system including a microprocessor, a ROM, a RAM, and the like.
  • The IC card or module may include the super-multifunctional LSI described above.
  • The IC card or module achieves its functions by the microprocessor operating according to the computer program. The IC card or module may be tamper resistant.
  • The present invention may also be the methods described above, a computer program that realizes these methods on a computer, or a digital signal composed of such a computer program.
  • The present invention may also be a computer-readable recording medium on which the computer program or the digital signal is recorded, such as a flexible disk, a hard disk, a CD-ROM, an MO, a DVD, a DVD-ROM, a DVD-RAM, a BD (Blu-ray Disc), or a semiconductor memory. It may also be the digital signal recorded on such a recording medium.
  • The present invention may also transmit the computer program or the digital signal via an electric communication line, a wireless or wired communication line, a network represented by the Internet, data broadcasting, or the like.
  • The present invention may also be a computer system including a microprocessor and a memory, where the memory stores the computer program and the microprocessor operates according to the computer program.
  • The program or the digital signal may be recorded on a recording medium and transferred, or transferred via a network or the like, and executed by another independent computer system.
  • The stereoscopic video encoding apparatus and method of the present invention are applicable to, for example, digital cameras, digital video cameras, camera-equipped mobile phones, DVD/BD recorders, televisions that record programs, web cameras, and program distribution servers, and are useful for encoding, compressing, recording, storing, and transferring stereoscopic image data.

Abstract

An image encoding device (100) comprises: capture units (101a, 101b) which capture three-dimensional footage formed of images taken from at least two viewpoints; a correction unit (102) which corrects any discrepancy in the size or position of the subject of the three-dimensional images captured by the capture units (101a, 101b) when said images are displayed; a motion vector detection unit (105) which detects vectors of motion occurring between the images from the two viewpoints which comprise the three-dimensional images corrected by the correction unit (102); and an encoding unit (106) which compresses and encodes the images corrected by the correction unit (102) based on the motion vectors detected by the motion vector detection unit (105). The correction unit (102) bases its correction processing on the vectors previously detected by the motion vector detection unit (105).

Description

画像符号化装置、画像符号化方法、プログラム、及び集積回路Image coding apparatus, image coding method, program, and integrated circuit
 本発明は、高効率な画像符号化装置に関し、特に、複数視点で撮像された立体映像データを、動き補償予測を用いて高効率符号化する方式に関するものである。 The present invention relates to a high-efficiency image encoding device, and more particularly to a method for high-efficiency encoding of stereoscopic video data captured from a plurality of viewpoints using motion compensation prediction.
 左右両眼で観測する画像の視差を利用して立体視させる立体映像表示装置が開発されている。このような立体映像を符号化する方式として、左眼用の映像と右眼用の映像との相関が高いことを利用することが知られている。具体的には、2つの画像のうちの一方を符号化する場合に、他方の画像を参照画像として動きベクトルを求め、動き補償を行う。これにより、高効率の圧縮を実現する符号化方式が提案されている。 A stereoscopic video display device has been developed for stereoscopic viewing using parallax of images observed with both eyes. As a method for encoding such a stereoscopic video, it is known to use the fact that the correlation between the left-eye video and the right-eye video is high. Specifically, when one of the two images is encoded, a motion vector is obtained using the other image as a reference image, and motion compensation is performed. Thus, an encoding method that realizes highly efficient compression has been proposed.
As an image encoding method that likewise realizes multi-view image encoding using motion compensation between viewpoints, H.264 MVC (Multiview Video Coding) has been standardized.
FIG. 6 shows the reference relationships between frames (pictures) in MVC. When encoding ordinary single-viewpoint video, motion vectors are detected by referring in the time axis direction, that is, by using other frames captured at different times as reference images, and motion compensation prediction is performed. In MVC, on the other hand, multi-view video can be encoded using, in addition to temporal references, references between the viewpoints (V0 to V4); that is, motion vectors can be detected using frames of other viewpoints captured at the same time as reference images, and motion compensation prediction can be performed.
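The two kinds of reference candidates can be pictured with a small sketch. The following Python fragment is illustrative only (the function name and frame bookkeeping are assumptions, not part of the MVC specification): it enumerates temporal references within a view and inter-view references at the same capture time.

```python
def candidate_references(view, time, decoded_frames):
    """decoded_frames maps (view, time) -> decoded frame.

    Returns the reference candidates for the frame at (view, time):
    temporal references within the same view, plus inter-view
    references captured at the same time in other views.
    """
    temporal = [f for (v, t), f in decoded_frames.items() if v == view and t < time]
    inter_view = [f for (v, t), f in decoded_frames.items() if v != view and t == time]
    return temporal + inter_view
```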
In order to perform highly efficient encoding using motion compensation prediction, it is necessary to obtain an accurate motion vector between the image to be encoded and the reference image. However, the multi-view video that makes up stereoscopic video is often produced by combining several individual cameras, or by fixing the cameras into an integrated unit. In this case, since the individual cameras differ in noise and in luminance and chrominance characteristics, conventional motion vector detection using block matching for each coding block cannot accurately detect motion vectors between viewpoints.
As a conventional approach to this problem, Patent Literature 1 discloses detecting an accurate motion vector for use in encoding from the motion vectors of the blocks surrounding the block in question and the motion vector at the same position in a past frame, thereby improving coding efficiency.
FIG. 7 shows a block diagram of the conventional device of Patent Literature 1. The conventional image encoding device 10 mainly includes a block matching unit 1, a parallax compensation vector detection unit 2, memories 3 and 6, a correction vector detection unit 4, and a variable delay unit 5.
Of two images captured by a pair of synchronized cameras (that is, images captured from mutually different viewpoints), one image is input to the image encoding device 10 as the image to be encoded and the other as the reference image.
The block matching unit 1 performs block matching against the reference image for each block making up the image to be encoded (the target block). The block matching result output from the block matching unit 1 is input to the parallax compensation vector detection unit 2, which detects the motion vector of the target block based on that result. The motion vector detected in this way is stored in the memory 3.
The correction vector detection unit 4 obtains from the memory 3 the motion vectors of the target block and of the blocks surrounding it, and from the memory 6 the motion vectors of the block at the same position in a past frame and of its surrounding blocks, and detects an accurate motion vector for the target block by, for example, averaging these motion vectors.
FIG. 8 is a schematic diagram showing the motion vector of each block making up the image to be encoded. As shown in FIG. 8, the motion vectors of the target image detected using the image of the other viewpoint as the reference image are the labeled vectors ア through タ. To obtain a corrected vector for vector カ, the correction vector detection unit 4 uses the surrounding vectors ア, イ, ウ, オ, キ, ケ, コ, and サ. Performing motion compensation predictive encoding with the corrected vector obtained in this way improves the coding efficiency. (In the figure, each label carries a vector arrow written above the character.)
JP-A-6-113335 (Patent Literature 1)
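As a rough sketch of the prior-art correction just described (the array layout and the NumPy implementation are assumptions for illustration, not the circuitry of Patent Literature 1), the corrected vector of a block can be computed as the average of the vectors of its eight surrounding blocks, optionally together with the co-located vector from a past frame:

```python
import numpy as np

def corrected_vector(mv_field, i, j, past_field=None):
    """mv_field: (rows, cols, 2) array of per-block motion vectors (x, y)."""
    rows, cols, _ = mv_field.shape
    samples = []
    for dj in (-1, 0, 1):
        for di in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # average the surrounding blocks, not the block itself
            y, x = j + dj, i + di
            if 0 <= y < rows and 0 <= x < cols:
                samples.append(mv_field[y, x])
    if past_field is not None:
        samples.append(past_field[j, i])  # co-located vector in the past frame
    return np.mean(samples, axis=0)
```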
However, the video of each viewpoint making up the stereoscopic video is captured by combining several individual cameras, or by fixing several cameras into an integrated unit. Consequently, taking one camera as the reference, the other camera often has a tilt (rotation) unrelated to parallax, a vertical (or horizontal) offset, or a different subject size (imaging magnification). When stereoscopic video with image misalignment other than parallax is encoded using motion compensation prediction with inter-view references, even if an accurate motion vector can be detected, the motion compensation residual signal (prediction error) is large and the coding efficiency does not improve. This is the first problem.
The first problem in encoding images captured from two viewpoints will be explained concretely with reference to FIGS. 9A to 9C, 10A, and 10B. FIGS. 9A to 9C show an example in which the first-viewpoint image is the reference image, the second-viewpoint image is the image to be encoded, and a block containing part of the subject (a star) is encoded. The block in the first-viewpoint image is the reference block obtained by motion vector detection, and the block in the second-viewpoint image is the target block.
FIG. 9A shows the case where the two viewpoint images have no misalignment other than parallax; the residual signal (prediction error) between the target block and the reference block is small, so encoding is highly efficient. However, when the two viewpoint images are misaligned in tilt as shown in FIG. 9B, or in size as shown in FIG. 9C, the residual signal (prediction error) becomes large and efficient encoding is not possible.
Similarly, FIG. 10A shows the case with no misalignment other than parallax, where the residual signal (prediction error) between the target block and the reference block is small and encoding is highly efficient. However, when the two viewpoint images are vertically offset as shown in FIG. 10B, the reference block lies at a position extending outside the image. As a result, the residual signal (prediction error) becomes large and efficient encoding is not possible.
In addition, when the images of the two viewpoints are misaligned in tilt, size, or vertical position beyond parallax, not only does stereoscopic viewing become difficult when the encoded stereoscopic image data is played back and viewed, but eye fatigue also occurs easily. This is the second problem.
Furthermore, correcting image misalignment other than parallax between the viewpoints in advance requires newly providing an image misalignment detection means, which increases the power consumption and circuit scale of the image encoding device 10. This is the third problem.
The present invention has been made in view of the above first to third problems, and aims to provide an image encoding device that simply and appropriately corrects misalignment caused by factors other than the parallax between two images.
An image encoding device according to one aspect of the present invention encodes stereoscopic video composed of video from at least two viewpoints. Specifically, it includes: an acquisition unit that acquires the stereoscopic video; a correction unit that executes correction processing for correcting misalignment in the displayed size or position of the subject shown in the stereoscopic video acquired by the acquisition unit; a motion vector detection unit that detects motion vectors between the two viewpoint videos making up the stereoscopic video corrected by the correction unit; and an encoding unit that compresses and encodes, based on the motion vectors detected by the motion vector detection unit, the stereoscopic video corrected by the correction unit. The correction unit executes the correction processing based on motion vectors detected by the motion vector detection unit before the current correction processing.
This makes highly efficient image encoding with inter-view references possible. Moreover, when the encoded images are played back and viewed as stereoscopic images, stereoscopic viewing is easy and eye fatigue is unlikely to occur. Furthermore, since there is no need to provide new components such as an image misalignment detection means, the increase in circuit scale can be suppressed and power consumption can be reduced.
The correction unit may correct, based on the motion vectors, at least one of misalignment due to rotation between the two viewpoints of the subject shown in the stereoscopic video, misalignment due to enlargement, and misalignment due to translation.
Further, the correction unit may detect at least one of the misalignment due to rotation, the misalignment due to enlargement, and the misalignment due to translation based on the directions of the motion vectors, and correct the misalignment indicated by the detection result.
As one example, the correction unit may detect the misalignment due to translation based on the vertical components of the motion vectors. Note that translation does not necessarily have to be detected from multiple motion vectors detected per block; it can be detected even from a single motion vector.
For example, the motion vector detection unit may detect a motion vector for each region smaller than the entire area of the stereoscopic video corrected by the correction unit. The correction unit may then detect the misalignment due to rotation or the misalignment due to enlargement based on the tendency shown by the directions of the multiple motion vectors detected by the motion vector detection unit.
As one example, the correction unit may detect the misalignment due to enlargement when the multiple motion vectors tend to converge toward a predetermined position in the stereoscopic video, or tend to diverge from that position.
As another example, the correction unit may detect the misalignment due to rotation when the multiple motion vectors tend to describe a circle within the stereoscopic video.
The encoding unit may also start outputting the compression-encoded stereoscopic video once a predetermined period has elapsed after the start of encoding is instructed. Since images containing misalignment are then not output, stereoscopic viewing becomes easier.
The motion vector detection unit may further start detecting motion vectors between the stereoscopic videos acquired by the acquisition unit before the start of encoding is instructed. The correction unit may then execute correction processing on the first stereoscopic image acquired by the acquisition unit immediately after the start of encoding is instructed, using the latest motion vectors detected by the motion vector detection unit.
With the above configuration, correction processing can be executed even on the first image to be encoded, so only images substantially free of misalignment are encoded.
As one example, the acquisition unit may include a first imaging unit that images the subject from a first viewpoint and a second imaging unit that images the subject from a second viewpoint.
The motion vector detection unit may detect a motion vector for each block of the image to be encoded, taking one of the images captured at a first time from the first and second viewpoints as the image to be encoded and the other as the reference image. Based on the tendency of the multiple motion vectors corresponding to the blocks of the image to be encoded, the correction unit may then correct the misalignment in the displayed size or position of the subject for at least one of the images captured from the first and second viewpoints at a second time later than the first time.
An image encoding method according to one aspect of the present invention is a method for encoding stereoscopic video composed of video from at least two viewpoints. Specifically, it includes: an acquisition step of acquiring the stereoscopic video; a correction step of executing correction processing for correcting misalignment in the displayed size or position of the subject shown in the stereoscopic video acquired in the acquisition step; a motion vector detection step of detecting motion vectors between the two viewpoint videos making up the stereoscopic video corrected in the correction step; and an encoding step of compressing and encoding, based on the motion vectors detected in the motion vector detection step, the stereoscopic video corrected in the correction step. In the correction step, the correction processing is executed based on motion vectors detected in the motion vector detection step before the current correction processing.
A program according to one aspect of the present invention causes a computer to encode stereoscopic video composed of video from at least two viewpoints. Specifically, it causes the computer to execute: an acquisition step of acquiring the stereoscopic video; a correction step of executing correction processing for correcting misalignment in the displayed size or position of the subject shown in the stereoscopic video acquired in the acquisition step; a motion vector detection step of detecting motion vectors between the two viewpoint videos making up the stereoscopic video corrected in the correction step; and an encoding step of compressing and encoding, based on the motion vectors detected in the motion vector detection step, the stereoscopic video corrected in the correction step. In the correction step, the correction processing is executed based on motion vectors detected in the motion vector detection step before the current correction processing.
An integrated circuit according to one aspect of the present invention encodes stereoscopic video composed of video from at least two viewpoints. Specifically, it includes: an acquisition unit that acquires the stereoscopic video; a correction unit that executes correction processing for correcting misalignment in the displayed size or position of the subject shown in the acquired stereoscopic video; a motion vector detection unit that detects motion vectors between the two viewpoint videos making up the stereoscopic video corrected by the correction unit; and an encoding unit that compresses and encodes, based on the motion vectors detected by the motion vector detection unit, the stereoscopic video corrected by the correction unit. The correction unit executes the correction processing based on motion vectors detected by the motion vector detection unit before the current correction processing.
According to the present invention, misalignment caused by factors other than parallax can be corrected from the results of motion vector detection with inter-view references. As a result, highly efficient image encoding with inter-view references becomes possible. Moreover, when the encoded images are played back and viewed as stereoscopic images, stereoscopic viewing is easy and eye fatigue is unlikely to occur. Furthermore, since there is no need to provide new components such as an image misalignment detection means, the increase in circuit scale can be suppressed and power consumption can be reduced.
FIG. 1 is a block diagram of an image encoding device according to Embodiment 1 of the present invention.
FIG. 2A is a diagram showing images captured from the first and second viewpoints arranged in imaging order.
FIG. 2B is a diagram showing images captured from the first and second viewpoints arranged in encoding order.
FIG. 3A is a flowchart showing the main processing of the image encoding device according to Embodiment 1.
FIG. 3B is a flowchart showing the encoding processing of the image encoding device according to Embodiment 1.
FIG. 3C is a flowchart showing the correction processing of the image encoding device according to Embodiment 1.
FIG. 4A is a diagram showing an example of the first- and second-viewpoint images when there is no imaging misalignment.
FIG. 4B is a diagram showing an example of the first- and second-viewpoint images when the first-viewpoint image is rotated with respect to the second-viewpoint image.
FIG. 4C is a diagram showing an example of the first- and second-viewpoint images when the first-viewpoint image is reduced with respect to the second-viewpoint image.
FIG. 4D is a diagram showing an example of the first- and second-viewpoint images when the first-viewpoint image is translated with respect to the second-viewpoint image.
FIG. 5A is a block diagram of an image encoding device according to Embodiment 2 of the present invention.
FIG. 5B is a flowchart showing the preprocessing of the image encoding device according to Embodiment 2 of the present invention.
FIG. 6 is a diagram explaining the reference relationships of H.264 MVC (Multiview Video Coding).
FIG. 7 is a block diagram of a conventional image encoding device.
FIG. 8 is a diagram showing the motion vector of each block making up the image to be encoded.
FIG. 9A is a diagram showing the coding efficiency when there is no misalignment between the image to be encoded and the reference image.
FIG. 9B is a diagram showing the coding efficiency when the image to be encoded is rotated with respect to the reference image.
FIG. 9C is a diagram showing the coding efficiency when the image to be encoded is enlarged with respect to the reference image.
FIG. 10A is a diagram showing the coding efficiency when there is no misalignment between the image to be encoded and the reference image.
FIG. 10B is a diagram showing the coding efficiency when the image to be encoded is translated with respect to the reference image.
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
(Embodiment 1)
FIG. 1 is a block diagram of an image encoding device 100 according to Embodiment 1 of the present invention. The image encoding device 100 conforms to H.264 MVC and, as shown in FIG. 1, includes a left-eye imaging unit (first imaging unit) 101a, a right-eye imaging unit (second imaging unit) 101b, a correction unit 102, a multiplexing unit 103, switching units 104 and 108, a motion vector detection unit 105, an encoding unit 106, a reference image memory 107, a variable-length encoding unit 109, and an encoding mode control unit 110. The correction unit 102 further includes a correction value calculation unit 111 and an image correction unit 112. The encoding unit 106 further includes an intra-frame encoding unit 114 and an inter-frame encoding unit 115.
The left-eye imaging unit 101a outputs images (video or still images) obtained by imaging the subject from a first viewpoint to the correction unit 102. The right-eye imaging unit 101b outputs images obtained by imaging the subject from a second viewpoint different from the first viewpoint to the correction unit 102.
That is, the images output from the left-eye imaging unit 101a and the images output from the right-eye imaging unit 101b have parallax relative to each other. In other words, the images captured by the left-eye imaging unit 101a and the right-eye imaging unit 101b constitute stereoscopic video composed of video from two viewpoints.
FIG. 2A is a schematic diagram showing the images (video) captured by the left-eye imaging unit 101a and the right-eye imaging unit 101b. The left-eye imaging unit 101a and the right-eye imaging unit 101b operate in synchronization with each other and, as shown in FIG. 2A, each outputs one frame (picture) at each of the same times (t0, t1, ..., t6).
In Embodiment 1, the left-eye imaging unit 101a and the right-eye imaging unit 101b constitute an acquisition unit that acquires images. However, in the present invention, the left-eye imaging unit 101a and the right-eye imaging unit 101b are not essential components and may be omitted; images captured by an external imaging device may be acquired by the acquisition unit and processed. Specifically, the acquisition unit (not shown) may acquire the stereoscopic video from a broadcast wave, and the format of the stereoscopic video that can be acquired from the broadcast wave is not particularly limited.
For example, a side-by-side format may be used, in which the left half of one image is the first-viewpoint image and the right half is the second-viewpoint image, or the upper half of one image is the first-viewpoint image and the lower half is the second-viewpoint image. With this format, transmission and reception can be performed in the same way as with conventional two-dimensional video.
Alternatively, a format may be used in which first-viewpoint images and second-viewpoint images are transmitted and received alternately in units of pictures. With this format, although the frame rate is double the conventional rate, high-definition stereoscopic video can be transmitted and received.
The correction unit 102 executes correction processing for correcting imaging misalignment on at least one of the images input from the left-eye imaging unit 101a and the right-eye imaging unit 101b. More specifically, based on the magnitude and/or direction of the motion vectors, the correction unit 102 corrects at least one of misalignment due to rotation between the two viewpoints of the subject shown in the stereoscopic video, misalignment due to enlargement, and misalignment due to translation. The correction unit 102 then outputs the corrected images to the multiplexing unit 103.
Imaging misalignment can be defined, for example, as misalignment in the displayed size or position of the subject. Concretely, the first-viewpoint image may be enlarged or reduced (size misalignment), or rotated or translated (position misalignment), relative to the second-viewpoint image captured at the same time.
Imaging misalignment can also be defined as misalignment caused by factors other than parallax (vertical offset, size mismatch, tilt, and the like). "Misalignment caused by factors other than parallax" refers, for example, to misalignment arising from installation errors of the left-eye imaging unit 101a and the right-eye imaging unit 101b, mismatched imaging magnifications, and so on.
The correction value calculation unit 111 determines the type of imaging misalignment based on the tendency of the directions of the multiple motion vectors, and calculates the magnitude of the imaging misalignment based on the tendency of their magnitudes.
For example, the correction value calculation unit 111 detects misalignment due to translation based on the vertical components of the motion vectors. Specifically, when the motion vectors of the blocks point in substantially the same direction (upward or downward) and have substantially the same magnitude, the misalignment can be judged to be due to translation.
The correction value calculation unit 111 may also detect misalignment due to rotation or enlargement based on the tendency shown by the directions of the multiple motion vectors. Specifically, when the multiple motion vectors tend to converge toward a predetermined position in the stereoscopic video, or tend to diverge from that position, the misalignment can be judged to be due to enlargement. When the multiple motion vectors tend to describe a circle within the stereoscopic video, the misalignment can be judged to be due to rotation.
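A minimal sketch of this classification follows, assuming the aggregated vectors are held in a (rows, cols, 2) NumPy array; the function and its tolerance value are illustrative assumptions, not taken from the embodiment. Uniform vectors suggest translation, radially aligned vectors suggest enlargement or reduction, and tangentially aligned vectors suggest rotation.

```python
import numpy as np

def classify_shift(mv_field, tol=0.5):
    rows, cols, _ = mv_field.shape
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    vecs = mv_field.reshape(-1, 2)
    mean = vecs.mean(axis=0)
    spread = np.linalg.norm(vecs - mean, axis=1).mean()
    if spread < tol:
        # All blocks move together: either no misalignment or a translation.
        return "none" if np.linalg.norm(mean) < tol else "translation"
    radial = tangential = 0.0
    for j in range(rows):
        for i in range(cols):
            r = np.array([i - cx, j - cy], dtype=float)
            n = np.linalg.norm(r)
            if n < 1e-9:
                continue  # the center block has no radial direction
            r /= n
            v = mv_field[j, i]
            radial += abs(float(v @ r))                           # toward/away from center
            tangential += abs(float(v[0] * r[1] - v[1] * r[0]))   # around the center
    return "enlargement/reduction" if radial > tangential else "rotation"
```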
According to the type and magnitude of the imaging misalignment calculated by the correction value calculation unit 111, the image correction unit 112 executes correction processing on at least one of the images captured at the same time from the first and second viewpoints. The operation of the correction unit 102 is described in detail later.
The multiplexing unit 103 rearranges the images acquired from the correction unit 102 into encoding order and outputs them to the switching unit 104.
FIG. 2B is a diagram showing the order of the images (encoding order) after the images of FIG. 2A have been input to and multiplexed by the multiplexing unit 103. In FIGS. 2A and 2B, "I", "P", and "B" denote the encoding type of each frame: "I" denotes an intra-frame prediction frame (I picture), "P" a unidirectional inter-frame prediction frame (P picture), and "B" a bidirectional inter-frame prediction frame (B picture). The arrows indicate the reference destinations when inter-view references are used.
In the example shown in FIGS. 2A and 2B, each block making up a first-viewpoint image is encoded using only first-viewpoint (same-viewpoint) images captured at different times as reference images. For example, each block of frame F2 is encoded using only frame F0 as the reference image, each block of frame F4 using frame F0 or frame F2, and each block of frame F6 using frame F2 or frame F4.
On the other hand, each block making up a second-viewpoint image is encoded using either the first-viewpoint (other-viewpoint) image captured at the same time or a second-viewpoint (same-viewpoint) image captured at a different time as the reference image. For example, each block of frame F1 is encoded using only frame F0 as the reference image, each block of frame F3 using frame F1 or frame F2, each block of frame F5 using frame F1, F3, or F4, and each block of frame F7 using frame F3, F5, or F6.
FIG. 3A is a flowchart showing the procedure of the main processing. The flow of operation of the image encoding device 100 will be briefly described with reference to FIG. 3A.
First, the left-eye imaging unit 101a acquires a first-viewpoint image and the right-eye imaging unit 101b acquires a second-viewpoint image (S201). In an image encoding device without the left-eye imaging unit 101a and the right-eye imaging unit 101b, the images may be acquired from an external device.
Next, the correction unit 102 executes correction processing on the images acquired from the left-eye imaging unit 101a and the right-eye imaging unit 101b. The specific procedure of the correction processing is described later with reference to FIG. 3C.
Next, the switching units 104 and 108, the motion vector detection unit 105, the encoding unit 106, the reference image memory 107, the variable-length encoding unit 109, and the encoding mode control unit 110 encode the images corrected by the correction unit 102 and multiplexed by the multiplexing unit 103 (S203). The specific procedure of the encoding processing is described later with reference to FIG. 3B.
FIG. 3B is a flowchart showing the procedure of the encoding processing. The operation of the components from the switching unit 104 onward will be described in detail with reference to FIG. 3B.
The switching unit 104 obtains from the encoding mode control unit 110 the encoding type of the image to be encoded that it has acquired from the multiplexing unit 103. When the encoding type is an intra-frame prediction frame (I picture), the switching unit 104 outputs the image to the intra-frame encoding unit 114 of the encoding unit 106. When the encoding type is an inter-frame prediction frame (P picture or B picture), the switching unit 104 outputs the image to the motion vector detection unit 105 at the same time as to the intra-frame encoding unit 114.
That is, the image to be encoded is always intra-frame encoded by the intra-frame encoding unit 114 (S301). In addition, when the encoding mode control unit 110 judges the frame to be an inter-frame prediction frame (Yes in S302), motion vectors are detected by the motion vector detection unit 105 in addition to the intra-frame encoding (S303).
The intra-frame encoding unit 114 intra-frame encodes the input image (S301). Specifically, the intra-frame encoding unit 114 performs intra-frame prediction for each block making up the image (the target block) to generate a prediction block, subtracts the prediction block from the target block to calculate the prediction error (residual signal), and orthogonally transforms and quantizes the prediction error to calculate quantized coefficients. The quantized coefficients and encoding information obtained are output to the switching unit 108. Furthermore, the intra-frame encoding unit 114 inverse-quantizes and inverse-orthogonally transforms the quantized coefficients and adds the prediction block to create a locally decoded image. This locally decoded image is stored in the reference image memory 107 as a reference image for subsequent inter-frame prediction frames.
The motion vector detection unit 105 detects motion vectors between the two viewpoint videos making up the stereoscopic video corrected by the correction unit 102 by block matching the target block against the reference image. Specifically, the motion vector detection unit 105 detects a motion vector for each region (block) smaller than the entire area of the stereoscopic video corrected by the correction unit 102.
More specifically, the motion vector detection unit 105 obtains the locally decoded image designated by the encoding mode control unit 110 from the reference image memory 107, performs block matching of the target block using the obtained locally decoded image as the reference image, and detects a motion vector for each block (S303). There may be one reference image designated by the encoding mode control unit 110, or several.
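A minimal full-search block matching sketch follows; the 16x16 block size and the +-16 search range are illustrative assumptions, and the sum of absolute differences (SAD) stands in for whatever matching cost the device actually uses.

```python
import numpy as np

def find_motion_vector(target, reference, bx, by, bs=16, sr=16):
    """Returns (dx, dy) minimizing SAD for the block at (bx, by) in 'target'."""
    h, w = reference.shape
    block = target[by:by + bs, bx:bx + bs].astype(np.int32)
    best, best_mv = None, (0, 0)
    for dy in range(-sr, sr + 1):
        for dx in range(-sr, sr + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + bs > w or y + bs > h:
                continue  # candidate block would fall outside the reference image
            cand = reference[y:y + bs, x:x + bs].astype(np.int32)
            sad = np.abs(block - cand).sum()
            if best is None or sad < best:
                best, best_mv = sad, (dx, dy)
    return best_mv
```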
The motion vector detection unit 105 then outputs the detected motion vectors to the inter-frame encoding unit 115 of the encoding unit 106. Furthermore, when the reference image and the image being encoded are images of different viewpoints (Yes in S304), the motion vector detection unit 105 outputs the obtained motion vectors to the correction value calculation unit 111 of the correction unit 102 (S305).
Using FIG. 2A as an example, the motion vector detection unit 105 outputs to the correction value calculation unit 111 the motion vectors of frame F1 detected using frame F0 as the reference image, the motion vectors of frame F5 detected using frame F4 as the reference image, the motion vectors of frame F7 detected using frame F6 as the reference image, and so on.
The inter-frame encoding unit 115 inter-frame encodes the input image. Specifically, for each block making up the image (the target block), the inter-frame encoding unit 115 performs motion compensation using the motion vector obtained from the motion vector detection unit 105 to generate a prediction block, subtracts the prediction block from the target block to calculate the prediction error (residual signal), and orthogonally transforms and quantizes the prediction error to calculate quantized coefficients.
The inter-frame encoding unit 115 then outputs the quantized coefficients and encoding information to the switching unit 108. Furthermore, it inverse-quantizes and inverse-orthogonally transforms the quantized coefficients and adds the prediction block to create a locally decoded image, which is stored in the reference image memory 107 as a reference image for subsequent inter-frame prediction frames.
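The prediction/residual round trip shared by the intra-frame and inter-frame encoding units can be sketched as below. This is a simplification under stated assumptions: a plain uniform quantizer on the residual stands in for the orthogonal transform plus quantization, and the local decode mirrors it so that later frames predict from the same pixels a decoder would reconstruct.

```python
import numpy as np

def encode_block(target_block, prediction_block, qstep=8):
    residual = target_block.astype(np.int32) - prediction_block.astype(np.int32)
    quantized = np.round(residual / qstep).astype(np.int32)  # quantized coefficients
    # Local decode: reconstruct the block exactly as a decoder would,
    # so that subsequent frames predict from identical reference pixels.
    local_decode = prediction_block.astype(np.int32) + quantized * qstep
    return quantized, np.clip(local_decode, 0, 255).astype(np.uint8)
```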
When the image being encoded is an inter-frame prediction frame (Yes in S302), the encoding mode control unit 110 judges, for each target block and using a known evaluation formula, whether to encode it with intra-frame prediction encoding or inter-frame prediction encoding, based on the quantized coefficients and other encoding information output from the encoding unit 106 (the intra-frame encoding unit 114 and the inter-frame encoding unit 115), and controls the switching unit 108 accordingly.
Following the control (judgment result) of the encoding mode control unit 110, the switching unit 108 outputs to the variable-length encoding unit 109 one of the two sets of quantized coefficients obtained from the intra-frame encoding unit 114 and the inter-frame encoding unit 115 (S306).
The variable-length encoding unit 109 variable-length encodes the quantized coefficients and encoding information obtained from the switching unit 108 and outputs the result as encoded data (S307). The image encoding device 100 executes the above processing (S301 to S307) for all blocks making up the image to be encoded (S308).
Next, the processing of the correction unit 102 will be described in detail with reference to FIG. 3C and FIGS. 4A to 4D. FIG. 3C is a flowchart showing the processing procedure of the correction unit 102. FIGS. 4A to 4D are diagrams showing the relationship between the imaging misalignment between the first- and second-viewpoint images and the tendency of the motion vectors.
First, when motion vectors have been detected using inter-view references (S310), the correction value calculation unit 111 aggregates those motion vectors, detects the type and magnitude of the imaging misalignment between the first- and second-viewpoint images, and calculates the corresponding correction value (S311).
Specifically, the correction value calculation unit 111 first removes the influence of parallax from the motion vectors detected by the motion vector detection unit 105. Since parallax is a horizontal displacement, any vertical component is imaging misalignment. In addition, an object at the convergence point, the point where the lens optical axes of the left- and right-eye imaging units converge as set at the time of shooting, has parallax approximately equal to zero, so a motion vector detected for an object at the convergence point represents imaging misalignment. Accordingly, an object at the convergence point can be photographed from the two different viewpoints, and the motion vector detected between the two captured images can be regarded as the imaging misalignment.
Also, objects at the same distance from the imaging units have parallax of the same direction and the same magnitude. Therefore, for example, a motion vector (direction and magnitude) corresponding to the parallax may be set in the correction value calculation unit 111 in advance.
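A minimal sketch of this parallax removal, assuming the per-block vectors sit in a (rows, cols, 2) array and that a single preset horizontal disparity value is available (both are assumptions for illustration):

```python
import numpy as np

def remove_parallax(mv_field, preset_disparity_x=0.0):
    """Subtract the preset horizontal parallax; keep vertical components as-is."""
    shift = np.asarray(mv_field, dtype=np.float64).copy()
    shift[..., 0] -= preset_disparity_x  # parallax is purely horizontal
    # Whatever remains, including every vertical component, is treated as
    # imaging misalignment rather than parallax.
    return shift
```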
FIGS. 4A to 4D illustrate the motion vectors detected by the motion vector detection unit 105, with the second-viewpoint image as the image to be encoded and the first-viewpoint image captured at the same time as the reference image, after the influence of parallax has been removed.
FIG. 4A shows the tendency of the motion vectors when there is no misalignment other than parallax between the image to be encoded and the reference image. In this case the two images coincide, so the motion vector of each block tends to be (0, 0).
Next, FIG. 4B shows the tendency of the motion vectors when the reference image is rotated with respect to the image to be encoded. In this case, the motion vectors of the blocks tend to be arranged so as to describe a circle as a whole. Such a situation arises, for example, when the left-eye imaging unit 101a is installed at a tilt.
Observing FIG. 4B more closely, the motion vectors of the blocks are arranged so as to describe a counterclockwise circle; that is, it can be judged that the reference image is rotated counterclockwise with respect to the image to be encoded. The center of rotation can be estimated as the position where the magnitude of the motion vectors is smallest (in this example, the image center). Furthermore, the degree of rotation (rotation angle) can be estimated from the magnitudes of the motion vectors and their distances from the center of rotation.
Here, the method of calculating the direction of rotation and the correction value when the type of imaging misalignment is rotation will be described. The method shown below is one example, and other calculation methods may also be used.
First, preprocessing is executed prior to calculating the direction of rotation and the correction value. This preprocessing is also executed in common when the type of imaging misalignment is enlargement (reduction) or translation.
Specifically, using the motion vector MV(i, j) of each macroblock, the average motion vector MVave within the frame, the average motion vector MVaveH[j] of each horizontal macroblock line, and the average motion vector MVaveV[i] of each vertical macroblock line are calculated. Let mby be the number of horizontal macroblock lines, mbx the number of vertical macroblock lines, and MB the number of macroblocks in one frame. In the examples of FIGS. 4B to 4D, mbx = 4, mby = 3, and MB = 12.
Normally, for motion vectors in image coding, the horizontal component (x component) is positive toward the right of the image and negative toward the left, while the vertical component (y component) is positive toward the bottom of the image and negative toward the top.
The average motion vector MVave within the frame can be calculated using Equation 1. In the examples of FIGS. 4B to 4D, it is the average of the 12 motion vectors.
$$\mathrm{MVave} = \frac{1}{MB}\sum_{j=0}^{mby-1}\sum_{i=0}^{mbx-1} MV(i,j) \qquad \text{(Equation 1)}$$
The average motion vector MVaveH[j] of each horizontal macroblock line can be calculated using Equation 2. In the examples of FIGS. 4B to 4D, the average motion vector of each of the three horizontal macroblock lines (rows) is calculated.
$$\mathrm{MVaveH}[j] = \frac{1}{mbx}\sum_{i=0}^{mbx-1} MV(i,j) \qquad \text{(Equation 2)}$$
The average motion vector MVaveV[i] of each vertical macroblock line can be calculated using Equation 3. In the examples of FIGS. 4B to 4D, the average motion vector of each of the four vertical macroblock lines (columns) is calculated.
$$\mathrm{MVaveV}[i] = \frac{1}{mby}\sum_{j=0}^{mby-1} MV(i,j) \qquad \text{(Equation 3)}$$
In the above example, the average motion vectors are calculated in units of one macroblock line in both the horizontal and vertical directions; however, the invention is not limited to this, and the average motion vectors may be calculated in units of several macroblock lines.
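A minimal sketch of this preprocessing (Equations 1 to 3), assuming the per-macroblock vectors are held in an array of shape (mby, mbx, 2):

```python
import numpy as np

def line_averages(mv):
    mby, mbx, _ = mv.shape
    mv_ave = mv.reshape(mbx * mby, 2).mean(axis=0)  # Equation 1: frame average
    mv_ave_h = mv.mean(axis=1)                      # Equation 2: one average per row j
    mv_ave_v = mv.mean(axis=0)                      # Equation 3: one average per column i
    return mv_ave, mv_ave_h, mv_ave_v
```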
Next, the method of judging the direction of rotation using the above average motion vectors and Equations 4 to 8 below will be described. First, when Vflag, Hflag, and Aflag in Equations 4 to 6 below are all true, the rotation direction of the reference image with respect to the image to be encoded is judged to be counterclockwise.
$$\mathrm{Vflag}:\quad \mathrm{MVaveV}[0]_y > \mathrm{MVaveV}[1]_y > \cdots > \mathrm{MVaveV}[mbx-1]_y \;\text{ and }\; \left|\mathrm{MVaveV}[i]_x\right| \le VTh \text{ for all } i \qquad \text{(Equation 4)}$$
Vflag in Equation 4 is true when the vertical component y of MVaveV decreases monotonically and the horizontal component x of MVaveV is at or below the threshold VTh. Since a value close to 0 is set as the threshold VTh, the latter condition of Equation 4 can also be read as the horizontal component x of MVaveV being approximately 0.
$$\mathrm{Hflag}:\quad \mathrm{MVaveH}[0]_x < \mathrm{MVaveH}[1]_x < \cdots < \mathrm{MVaveH}[mby-1]_x \;\text{ and }\; \left|\mathrm{MVaveH}[j]_y\right| \le HTh \text{ for all } j \qquad \text{(Equation 5)}$$
Hflag in Equation 5 is true when the horizontal component x of MVaveH increases monotonically and the vertical component y of MVaveH is at or below the threshold HTh. Since a value close to 0 is set as the threshold HTh, the latter condition of Equation 5 can also be read as the vertical component y of MVaveH being approximately 0.
$$\mathrm{Aflag}:\quad \left|\mathrm{MVave}\right| \le FTh \qquad \text{(Equation 6)}$$
Aflag in Equation 6 is true when the average motion vector MVave of the frame is at or below the threshold FTh. Since a value close to 0 is set as the threshold FTh, Equation 6 can also be read as MVave being approximately 0.
On the other hand, when Aflag in Equation 6 above and Vflag and Hflag in Equations 7 and 8 below are all true, the rotation direction of the reference image with respect to the image to be encoded is judged to be clockwise.
$$\mathrm{Vflag}:\quad \mathrm{MVaveV}[0]_y < \mathrm{MVaveV}[1]_y < \cdots < \mathrm{MVaveV}[mbx-1]_y \;\text{ and }\; \left|\mathrm{MVaveV}[i]_x\right| \le VTh \text{ for all } i \qquad \text{(Equation 7)}$$
Vflag in Equation 7 is true when the vertical component y of MVaveV increases monotonically and the horizontal component x of MVaveV is at or below the threshold VTh; as above, the latter condition can be read as x being approximately 0.
$$\mathrm{Hflag}:\quad \mathrm{MVaveH}[0]_x > \mathrm{MVaveH}[1]_x > \cdots > \mathrm{MVaveH}[mby-1]_x \;\text{ and }\; \left|\mathrm{MVaveH}[j]_y\right| \le HTh \text{ for all } j \qquad \text{(Equation 8)}$$
Hflag in Equation 8 is true when the horizontal component x of MVaveH decreases monotonically and the vertical component y of MVaveH is at or below the threshold HTh; as above, the latter condition can be read as y being approximately 0.
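A minimal sketch of the direction decision of Equations 4 to 8 follows; the threshold values and the monotonicity helper are illustrative assumptions, and the inputs are the averages computed in the preprocessing sketch above.

```python
import numpy as np

def monotonic(a, increasing):
    d = np.diff(a)
    return bool(np.all(d > 0)) if increasing else bool(np.all(d < 0))

def rotation_direction(mv_ave, mv_ave_h, mv_ave_v, VTh=1.0, HTh=1.0, FTh=1.0):
    a_flag = np.linalg.norm(mv_ave) <= FTh                   # Equation 6
    v_x_small = np.all(np.abs(mv_ave_v[:, 0]) <= VTh)
    h_y_small = np.all(np.abs(mv_ave_h[:, 1]) <= HTh)
    # Equations 4 and 5: counterclockwise pattern.
    if a_flag and monotonic(mv_ave_v[:, 1], False) and v_x_small \
            and monotonic(mv_ave_h[:, 0], True) and h_y_small:
        return "counterclockwise"
    # Equations 7 and 8: clockwise pattern.
    if a_flag and monotonic(mv_ave_v[:, 1], True) and v_x_small \
            and monotonic(mv_ave_h[:, 0], False) and h_y_small:
        return "clockwise"
    return None
```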
 Next, a method of calculating the correction value (rotation angle) using the average motion vectors above and Equations 9 and 10 below will be described. Specifically, the rotation angle X is obtained from the motion vector averages of the horizontal macroblock lines using Equation 9, and the rotation angle Y is obtained from the motion vector averages of the vertical macroblock lines using Equation 10. The number of horizontal and vertical pixels in each macroblock is px = 16.
[Equation 9]
 However, when the number of horizontal macroblock lines mby is odd, the motion vector average of the central horizontal macroblock line is excluded from the calculation, and mby is reduced by 1 before the calculation is performed.
[Equation 10]
 However, when the number of vertical macroblock lines mbx is odd, the motion vector average of the central vertical macroblock line is excluded from the calculation, and mbx is reduced by 1 before the calculation is performed.
 The rotation angle X of Equation 9 is positive for counterclockwise rotation, while the rotation angle Y of Equation 10 is positive for clockwise rotation.
 Using Equations 9 and 10, the rotation angle (counterclockwise) of the entire frame is obtained as the average of rotation angles X and Y, namely (X - Y) / 2.
 When the frame for which the motion vectors were calculated is to be corrected, the correction value (rotation angle) is this whole-frame rotation angle. Conversely, when the reference frame is to be corrected, the correction value (rotation angle) has the opposite direction.
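 Equations 9 and 10 themselves are reproduced only as images, so their exact form cannot be recovered from this text. The surrounding description (per-line averages, px = 16 pixels per macroblock, exclusion of the centre line when the line count is odd, opposite sign conventions for X and Y, and the frame angle (X - Y) / 2) is, however, consistent with a small-angle model in which a line at distance d pixels from the frame centre is displaced by roughly d times the rotation angle. The sketch below rests on that assumption; it is an illustrative reconstruction, not the patent's literal formula.

    def rotation_angles(mv_ave_h, mv_ave_v, px=16):
        # Assumed small-angle reconstruction of Equations 9 and 10: each
        # line's average displacement divided by its pixel distance from the
        # frame centre approximates the rotation angle in radians.
        def line_angle(avgs, component):
            n = len(avgs)
            offsets = [(i - (n - 1) / 2.0) * px for i in range(n)]
            # Odd line count: the centre line (offset 0) is excluded and the
            # line count is effectively reduced by one, as the text requires.
            pairs = [(o, v) for o, v in zip(offsets, avgs) if o != 0.0]
            return sum(v[component] / o for o, v in pairs) / len(pairs)

        x = line_angle(mv_ave_h, 0)   # Eq. 9: rows -> angle X, CCW-positive
        y = line_angle(mv_ave_v, 1)   # Eq. 10: columns -> angle Y, CW-positive
        return x, y, (x - y) / 2.0    # whole-frame rotation (CCW), per the text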
 Next, FIG. 4C shows the tendency of the motion vectors when the sizes of the reference image and the encoding target image do not match. In this case, the motion vectors of the blocks tend to be arranged radially as a whole. Such a situation occurs, for example, when the imaging magnifications of the left-eye imaging unit 101a and the right-eye imaging unit 101b differ.
 Observing FIG. 4C more closely, the motion vector of each block points toward the image center of the encoding target image. It can therefore be determined that the reference image is reduced relative to the encoding target image. The reduction ratio can be estimated from, for example, the average magnitude of the motion vectors.
 Here, for the case where the type of imaging deviation is enlargement or reduction, the method of determining its direction (enlargement or reduction) and calculating the correction value will be described. The method shown below is only an example, and other calculation methods may also be used.
 First, a method of determining whether the deviation is an enlargement or a reduction using the average motion vectors above and Equations 11 to 15 below will be described. When Vflag, Hflag, and Aflag in Equations 11 to 13 below are all true, the encoding target image is determined to be larger than the reference image (enlargement).
[Equation 11]
 Vflag in Equation 11 is true when the horizontal component x of MVaveV decreases monotonically and the vertical component y of MVaveV is at most the threshold VTh. Since VTh is close to 0, the second condition can be read as the vertical component y of MVaveV being approximately 0.
[Equation 12]
 Hflag in Equation 12 is true when the vertical component y of MVaveH decreases monotonically and the horizontal component x of MVaveH is at most the threshold HTh. Since HTh is close to 0, the second condition can be read as x ≈ 0.
[Equation 13]
 Aflag in Equation 13 is true when the average motion vector MVave of the frame is at most the threshold FTh. Since FTh is close to 0, Equation 13 can also be read as MVave ≈ 0.
 Conversely, when Vflag, Hflag, and Aflag in Equation 13 above and Equations 14 and 15 below are all true, the encoding target image is determined to be smaller than the reference image (reduction).
[Equation 14]
 Vflag in Equation 14 is true when the horizontal component x of MVaveV increases monotonically and the vertical component y of MVaveV is at most the threshold VTh; as above, the second condition can be read as y ≈ 0.
[Equation 15]
 Hflag in Equation 15 is true when the vertical component y of MVaveH increases monotonically and the horizontal component x of MVaveH is at most the threshold HTh; as above, the second condition can be read as x ≈ 0.
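 Equations 11 to 15 mirror the rotation tests above with the roles of the components swapped: the per-column horizontal motion and the per-row vertical motion must both be monotonic while the orthogonal components and the frame average stay near zero. A sketch reusing monotonic, the import of math, and the per-line averages from the earlier blocks; threshold defaults remain placeholders.

    def scale_direction(mv_ave, mv_ave_h, mv_ave_v, v_th=0.5, h_th=0.5, f_th=0.5):
        # Returns "enlarged" (target larger than reference), "reduced", or None.
        a_flag = math.hypot(mv_ave[0], mv_ave[1]) <= f_th      # Eq. 13: MVave ~ 0
        v_y_small = all(abs(v[1]) <= v_th for v in mv_ave_v)   # MVaveV: y ~ 0
        h_x_small = all(abs(v[0]) <= h_th for v in mv_ave_h)   # MVaveH: x ~ 0
        if a_flag and v_y_small and h_x_small:
            xs = [v[0] for v in mv_ave_v]   # per-column horizontal motion
            ys = [v[1] for v in mv_ave_h]   # per-row vertical motion
            if monotonic(xs, decreasing=True) and monotonic(ys, decreasing=True):
                return "enlarged"           # Equations 11 to 13
            if monotonic(xs, decreasing=False) and monotonic(ys, decreasing=False):
                return "reduced"            # Equations 13 to 15
        return None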
 Next, a method of calculating the correction value (reduction or enlargement ratio) using the average motion vectors above and Equations 16 and 17 below will be described. First, the vertical reduction ratio XR is obtained from the motion vector averages of the horizontal macroblock lines using Equation 16, and the horizontal reduction ratio YR is obtained from the motion vector averages of the vertical macroblock lines using Equation 17. The number of pixels in each block is px = 16.
[Equation 16]
 However, when the number of horizontal macroblock lines mby is odd, the motion vector average of the central horizontal macroblock line is excluded from the calculation, and mby is reduced by 1 before the calculation is performed.
[Equation 17]
 However, when the number of vertical macroblock lines mbx is odd, the motion vector average of the central vertical macroblock line is excluded from the calculation, and mbx is reduced by 1 before the calculation is performed.
 Using Equations 16 and 17, the reduction ratio of the entire frame is obtained as the average of the vertical reduction ratio XR and the horizontal reduction ratio YR, namely (XR + YR) / 2.
 When the frame for which the motion vectors were calculated is to be corrected, the correction value is the enlargement ratio. Conversely, when the reference frame is to be corrected, the correction value is its reciprocal.
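 As with Equations 9 and 10, Equations 16 and 17 appear only as images. Under the assumption that a line at offset d from the frame centre whose average displacement is v implies a local scale factor of (d + v) / d, the ratios can be sketched as follows; this too is an illustrative reconstruction rather than the patent's literal formula.

    def scale_ratios(mv_ave_h, mv_ave_v, px=16):
        # Assumed reconstruction of Equations 16 and 17.
        def line_ratio(avgs, component):
            n = len(avgs)
            offsets = [(i - (n - 1) / 2.0) * px for i in range(n)]
            pairs = [(o, v) for o, v in zip(offsets, avgs) if o != 0.0]  # drop centre line if n is odd
            return sum((o + v[component]) / o for o, v in pairs) / len(pairs)

        xr = line_ratio(mv_ave_h, 1)    # Eq. 16: vertical ratio XR from rows
        yr = line_ratio(mv_ave_v, 0)    # Eq. 17: horizontal ratio YR from columns
        frame_ratio = (xr + yr) / 2.0   # whole-frame ratio, per the text
        # Correcting the measured frame uses frame_ratio; correcting the
        # reference frame uses its reciprocal, 1.0 / frame_ratio.
        return xr, yr, frame_ratio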
 Next, FIG. 4D shows the tendency of the motion vectors when the reference image is shifted in one direction (translated) relative to the encoding target image. In this case, the motion vectors of the blocks tend to point in the same direction as a whole. Such a situation occurs, for example, when the left-eye imaging unit 101a does not face precisely toward the subject.
 Observing FIG. 4D more closely, the motion vector of each block points upward. It can therefore be determined that the reference image is shifted upward relative to the encoding target image. The magnitude of the shift can be estimated from, for example, the average magnitude of the motion vectors.
 Here, the method of calculating the correction value for the case where the type of imaging deviation is translation will be described. The method shown below is only an example, and other calculation methods may also be used.
 First, as shown in Equation 18, the average motion vector of the frame is compared with the motion vector of each block, component by component, and the number of blocks cnt whose difference from the average motion vector is within the threshold mvTh is counted.
[Equation 18]
 If cnt is then larger than the threshold MBTh, the type of imaging deviation is determined to be translation. The shift amount of the entire frame is given by the in-frame average motion vector MVave of Equation 1. When the frame for which the motion vectors were calculated is to be corrected, the correction value is the frame motion vector MVave. Conversely, when the reference frame is to be corrected, the correction value is MVave multiplied by -1.
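 Apart from the numeric thresholds, which the text leaves open, this test is fully specified. A minimal sketch with placeholder defaults for mvTh and MBTh:

    def translation_correction(mv, mv_ave, mv_th=1.0, mb_th=100):
        # Equation 18: count blocks whose motion vector is within mv_th of
        # the frame average, component by component.
        cnt = sum(1 for row in mv for v in row
                  if abs(v[0] - mv_ave[0]) <= mv_th and abs(v[1] - mv_ave[1]) <= mv_th)
        if cnt > mb_th:                 # deviation judged to be a translation
            return {"current_frame": mv_ave,
                    "reference_frame": (-mv_ave[0], -mv_ave[1])}  # MVave times -1
        return None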
 As described above, by aggregating how the motion vector magnitude varies with the horizontal or vertical position of each block in the frame, the type and magnitude of the imaging deviation can be determined. The correction value calculation unit 111 calculates a correction value for correcting the detected imaging deviation and outputs it to the image correction unit 112 (S311).
 The types of imaging deviation shown in FIGS. 4A to 4D are examples, and the correction value calculation unit 111 can also detect other types of imaging deviation. The imaging deviations described with reference to FIGS. 4A to 4D may also occur in combination.
 Based on the input correction value, the image correction unit 112 corrects at least one of the first and second viewpoint images captured by the left-eye imaging unit 101a and the right-eye imaging unit 101b so as to eliminate the detected imaging deviation (S312). The images corrected here are images captured at a time (a second time) later than the time (a first time) at which the encoding target image and the reference image shown in FIGS. 4A to 4D were captured.
 When the imaging deviation between the images is a vertical or horizontal translation or a tilt, the image correction unit 112 may perform the correction by extracting the common portion from the images captured at the same time from the first and second viewpoints. When the deviation between the images is a size mismatch, the correction may be performed by enlarging or reducing one of the images captured at the same time from the first and second viewpoints.
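 As one concrete reading of this paragraph, a translation could be corrected by cropping both images to their common region, and a size mismatch by resampling one of them. The patent leaves the exact procedure open, so the NumPy sketch below, including the sign convention for the shift (dx, dy) in pixels, is illustrative only.

    import numpy as np

    def crop_common(img_a, img_b, dx, dy):
        # Keep only the region present in both same-sized images, assuming
        # the content of img_b appears shifted by (dx, dy) relative to img_a.
        h, w = img_a.shape[:2]
        x0, y0 = max(dx, 0), max(dy, 0)
        x1, y1 = w + min(dx, 0), h + min(dy, 0)
        return img_a[y0:y1, x0:x1], img_b[y0 - dy:y1 - dy, x0 - dx:x1 - dx]

    def rescale(img, ratio):
        # Size-mismatch correction by nearest-neighbour resampling.
        h, w = img.shape[:2]
        rows = (np.arange(int(h * ratio)) / ratio).astype(int)
        cols = (np.arange(int(w * ratio)) / ratio).astype(int)
        return img[rows][:, cols]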
 Although an example in which the imaging deviation between the images is corrected by image processing is described here, the deviation may instead be corrected by changing the imaging settings of the left-eye imaging unit 101a and the right-eye imaging unit 101b, such as their imaging positions and zoom.
 With the configuration described above, correcting the imaging deviation between the images based on the tendency of the detected motion vectors makes it possible to improve the encoding efficiency of the stereoscopic video data. Moreover, since the imaging deviation between the viewpoint images is eliminated, stereoscopic image data that is easy to view stereoscopically and does not cause eye fatigue can be obtained when the encoded stereoscopic image data is reproduced and viewed. Furthermore, no additional image deviation detection means is needed to detect deviations other than the parallax between the viewpoint images, so the image encoding device can achieve low power consumption while avoiding an increase in circuit scale.
 (Modification of Embodiment 1)
 In the processing above, the images within a predetermined period after the start of processing have not yet undergone sufficient correction. As a result, images with residual imaging deviation would be encoded and output as the output stream.
 Therefore, even when the switching unit 108 acquires quantization coefficients and encoding information from the encoding unit 106, it may discard them rather than immediately outputting them to the variable length encoding unit 109.
 Specifically, the switching unit 108 may discard the encoding information acquired between the start of processing (that is, the point at which the first encoding information is acquired) and the elapse of a predetermined period, and output only the encoding information acquired after the predetermined period to the variable length encoding unit 109. The predetermined period may last, for example, until N frames of encoding information have been acquired (N being an integer of 1 or more).
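 A minimal sketch of this discard behaviour; the class layout and the callable standing in for the variable length encoding unit 109 are illustrative choices, not structures defined by the patent.

    class SwitchingUnit:
        # Encoded data for the first n_skip frames is discarded; forwarding
        # to the variable length encoder starts only afterwards.
        def __init__(self, vlc_output, n_skip):
            self.vlc_output = vlc_output   # callable forwarding to unit 109
            self.remaining = n_skip

        def on_frame_encoded(self, coeffs, info):
            if self.remaining > 0:
                self.remaining -= 1        # still settling: discard the frame
                return
            self.vlc_output(coeffs, info)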
 As another example, the correction unit 102 notifies the encoding control unit 116 of whether an imaging deviation exists between the first and second viewpoint images (that is, whether the correction value is 0). When the encoding control unit 116 determines that the imaging deviation between the images has disappeared (the correction value has become 0), it inputs an encoding start signal to the switching unit 108. The switching unit 108 may then begin outputting the quantization coefficients and encoding information received from the encoding unit 106 to the variable length encoding unit 109 only after receiving the encoding start signal from the encoding control unit 116.
 With the above configuration, only stereoscopic image data free of imaging deviation is encoded, making it possible to obtain encoded image data that is easy to view stereoscopically and does not cause eye fatigue.
 (Embodiment 2)
 Next, an image encoding device according to Embodiment 2 of the present invention will be described with reference to FIGS. 5A and 5B. FIG. 5A is a block diagram of an image encoding device 200 according to Embodiment 2 of the present invention. FIG. 5B is a flowchart showing the preprocessing procedure of the image encoding device 200.
 The image encoding device 200 according to Embodiment 2 has the configuration of the image encoding device 100 according to Embodiment 1 with an encoding control unit 116 added. The operation of all parts other than the correction unit 102 and the switching unit 108 is the same as in Embodiment 1, so the following description focuses on the parts that operate differently.
 First, the left-eye imaging unit 101a and the right-eye imaging unit 101b of the image encoding device 200 start imaging in response to the power being turned on (Yes in S401). The timing at which imaging starts is not limited to this example, but imaging is assumed to start before the start of the encoding process is instructed.
 Next, the motion vector detection unit 105 acquires the images captured by the left-eye imaging unit 101a and the right-eye imaging unit 101b via the correction unit 102, the multiplexing unit 103, and the switching unit 104 (S402), detects the motion vectors of each acquired image (S403), and stores only the latest motion vectors.
 This processing (S402, S403) is repeated until the start of the encoding process is instructed. At this stage, however, the correction unit 102 does not execute the correction process. The latest motion vectors may also be stored in the correction unit 102 instead of the motion vector detection unit 105.
 Then, in response to an instruction to start the encoding process (Yes in S404), the image encoding device 200 executes the main processing shown in FIG. 3A (S405). Embodiment 2 differs from Embodiment 1, however, in that the correction process is executed on the first image acquired immediately after the start of the encoding process is instructed, using the latest motion vectors detected by the motion vector detection unit 105 (S202 in FIG. 3A).
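 The S401 to S405 flow can be summarized in a short sketch; every callable here is an illustrative stand-in for the corresponding unit in FIG. 5A rather than an interface defined by the patent.

    def preprocess_then_encode(capture_pair, start_requested, detect_mv, correct, encode):
        # Track motion vectors from power-on, keep only the latest set, and
        # correct the first frame after the start instruction with that set.
        latest_mv = None
        while not start_requested():             # S402/S403 loop, no correction yet
            left, right = capture_pair()
            latest_mv = detect_mv(left, right)   # overwrite: keep newest only
        left, right = capture_pair()             # first frame after the instruction
        left, right = correct(left, right, latest_mv)   # S202 in FIG. 3A
        return encode(left, right)               # hand off to the main processing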
 With this configuration, the correction process can be executed even on the images immediately after the start of encoding, so only images from which the imaging deviation has been removed are encoded. As a result, encoded image data that is easy to view stereoscopically and does not cause eye fatigue can be obtained.
 (Embodiment 3)
 Embodiment 3 of the present invention has the same configuration as Embodiment 2, but the operation of the left-eye imaging unit 101a and the right-eye imaging unit 101b differs. Specifically, the left-eye imaging unit 101a and the right-eye imaging unit 101b according to Embodiment 3 each capture still images continuously and output the captured images (still images) in order to the correction unit 102, which differs from Embodiment 2.
 Until the start of the encoding process is instructed, the correction unit 102 outputs the images acquired from the left-eye imaging unit 101a and the right-eye imaging unit 101b to the multiplexing unit 103 without executing the correction process. The motion vector detection unit 105 acquires the images captured by the left-eye imaging unit 101a and the right-eye imaging unit 101b via the correction unit 102, the multiplexing unit 103, and the switching unit 104, detects the motion vectors of each acquired image, and stores only the latest motion vectors. The correction unit 102 then executes the correction process on the image acquired immediately after the start of the correction process is instructed, using the latest motion vectors detected by the motion vector detection unit 105 immediately beforehand.
 This configuration enables highly efficient encoding of still stereoscopic images. It also makes it possible to encode only still stereoscopic image data free of imaging deviation between the images, so encoded still image data that is easy to view stereoscopically and unlikely to cause eye fatigue can be obtained.
 The left-eye imaging unit 101a and the right-eye imaging unit 101b may each have an internal image memory, capture one still image each, store it in that image memory, and repeatedly output the same image (still image) to the correction unit 102.
 (Other Modifications)
 Although the present invention has been described based on the above embodiments, it is of course not limited to those embodiments. The following cases are also included in the present invention.
 Each of the above devices is, specifically, a computer system including a microprocessor, a ROM, a RAM, a hard disk unit, a display unit, a keyboard, a mouse, and the like. A computer program is stored in the RAM or the hard disk unit. Each device achieves its functions by the microprocessor operating in accordance with the computer program. Here, the computer program is composed of a combination of multiple instruction codes indicating instructions to the computer in order to achieve a predetermined function.
 Some or all of the components constituting each of the above devices may be configured as a single system LSI (Large Scale Integration). A system LSI is an ultra-multifunctional LSI manufactured by integrating multiple components on a single chip, and is specifically a computer system including a microprocessor, a ROM, a RAM, and the like. A computer program is stored in the RAM. The system LSI achieves its functions by the microprocessor operating in accordance with the computer program.
 Some or all of the components constituting each of the above devices may be configured as an IC card or a single module attachable to and detachable from each device. The IC card or module is a computer system including a microprocessor, a ROM, a RAM, and the like, and may include the ultra-multifunctional LSI described above. The IC card or module achieves its functions by the microprocessor operating in accordance with the computer program. The IC card or module may be tamper resistant.
 The present invention may be the methods described above. It may also be a computer program that realizes these methods on a computer, or a digital signal composed of such a computer program.
 The present invention may also be the computer program or the digital signal recorded on a computer-readable recording medium, such as a flexible disk, a hard disk, a CD-ROM, an MO, a DVD, a DVD-ROM, a DVD-RAM, a BD (Blu-ray Disc), or a semiconductor memory. It may also be the digital signal recorded on such a recording medium.
 The present invention may also transmit the computer program or the digital signal via an electric telecommunication line, a wireless or wired communication line, a network typified by the Internet, data broadcasting, or the like.
 The present invention may also be a computer system including a microprocessor and a memory, in which the memory stores the computer program and the microprocessor operates in accordance with the computer program.
 The invention may also be implemented by another independent computer system by recording the program or the digital signal on a recording medium and transporting it, or by transferring the program or the digital signal via a network or the like.
 The above embodiments and the above modifications may be combined with one another.
 Although embodiments of the present invention have been described above with reference to the drawings, the present invention is not limited to the illustrated embodiments. Various modifications and variations can be made to the illustrated embodiments within a scope identical or equivalent to that of the present invention.
 The stereoscopic video encoding device and method of the present invention are useful for encoding, compressing, recording, storing, and transferring stereoscopic image data in digital cameras supporting stereoscopic video or stereoscopic still images, digital video cameras, camera-equipped mobile phones, DVD/BD recorders, televisions with program recording, web cameras, program distribution servers, and the like.
 1 Block matching unit
 2 Parallax compensation vector detection unit
 3, 6 Memory
 4 Correction vector detection unit
 5 Variable delay unit
 10, 100, 200 Image encoding device
 101a Left-eye imaging unit
 101b Right-eye imaging unit
 102 Correction unit
 103 Multiplexing unit
 104, 108 Switching unit
 105 Motion vector detection unit
 106 Encoding unit
 107 Reference image memory
 109 Variable length encoding unit
 110 Encoding mode control unit
 111 Correction value calculation unit
 112 Image correction unit
 114 Intra-screen encoding unit
 115 Inter-screen encoding unit
 116 Encoding control unit

Claims (12)

  1.  An image encoding device that encodes stereoscopic video composed of video from at least two viewpoints, the device comprising:
     an acquisition unit that acquires the stereoscopic video;
     a correction unit that executes a correction process for correcting a deviation in the size or position, at the time of display, of a subject shown in the stereoscopic video acquired by the acquisition unit;
     a motion vector detection unit that detects motion vectors between the videos of the two viewpoints constituting the stereoscopic video corrected by the correction unit; and
     an encoding unit that compresses and encodes the stereoscopic video corrected by the correction unit, based on the motion vectors detected by the motion vector detection unit,
     wherein the correction unit executes the current correction process based on motion vectors detected by the motion vector detection unit before that correction process.
  2.  The image encoding device according to claim 1, wherein the correction unit corrects, based on the motion vectors, at least one of a deviation due to rotation between the two viewpoints of the subject shown in the stereoscopic video, a deviation due to enlargement, and a deviation due to translation.
  3.  The image encoding device according to claim 2, wherein the correction unit detects, based on the directions of the motion vectors, at least one of the deviation due to rotation, the deviation due to enlargement, and the deviation due to translation, and corrects the deviation indicated by the detection result.
  4.  The image encoding device according to claim 3, wherein the correction unit detects the deviation due to translation based on the vertical components of the motion vectors.
  5.  The image encoding device according to any one of claims 2 to 4, wherein the motion vector detection unit detects the motion vectors for each region smaller than the entire region of the stereoscopic video corrected by the correction unit, and the correction unit detects the deviation due to rotation or the deviation due to enlargement based on the tendency indicated by the directions of the plurality of motion vectors detected by the motion vector detection unit.
  6.  The image encoding device according to claim 5, wherein the correction unit detects the deviation due to enlargement when the plurality of motion vectors show a tendency to converge toward a predetermined position in the stereoscopic video or to spread out from that position.
  7.  The image encoding device according to claim 5, wherein the correction unit detects the deviation due to rotation when the plurality of motion vectors show a tendency to describe a circle in the stereoscopic video.
  8.  The image encoding device according to any one of claims 1 to 7, wherein the encoding unit starts outputting the compression-encoded stereoscopic video in response to the elapse of a predetermined period after the start of encoding is instructed.
  9.  The image encoding device according to any one of claims 1 to 7, wherein the motion vector detection unit further starts detecting motion vectors between the stereoscopic videos acquired by the acquisition unit before the start of encoding is instructed, and the correction unit executes the correction process on the first stereoscopic image acquired by the acquisition unit immediately after the start of the encoding process is instructed, using the latest motion vectors detected by the motion vector detection unit.
  10.  The image encoding device according to any one of claims 1 to 9, wherein the acquisition unit includes:
     a first imaging unit that images a subject from a first viewpoint; and
     a second imaging unit that images the subject from a second viewpoint.
  11.  The image encoding device according to claim 10, wherein the motion vector detection unit detects a motion vector for each block of an encoding target image, taking one of the images captured at a first time from the first and second viewpoints as the encoding target image and the other as a reference image, and the correction unit corrects, based on the tendency of the plurality of motion vectors corresponding to the blocks of the encoding target image, a deviation in the size or position of the subject at the time of display for at least one of the images captured at a second time, later than the first time, from the first and second viewpoints.
  12.  An image encoding method for encoding stereoscopic video composed of video from at least two viewpoints, the method comprising:
     an acquisition step of acquiring the stereoscopic video;
     a correction step of executing a correction process for correcting a deviation in the size or position, at the time of display, of a subject shown in the stereoscopic video acquired in the acquisition step;
     a motion vector detection step of detecting motion vectors between the videos of the two viewpoints constituting the stereoscopic video corrected in the correction step; and
     an encoding step of compressing and encoding the stereoscopic video corrected in the correction step, based on the motion vectors detected in the motion vector detection step,
     wherein in the correction step, the current correction process is executed based on motion vectors detected in the motion vector detection step before that correction process.
PCT/JP2011/000875 2010-02-18 2011-02-17 Image encoding device, image encoding method, program and integrated circuit WO2011102131A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2010033952 2010-02-18
JP2010-033952 2010-02-18

Publications (1)

Publication Number Publication Date
WO2011102131A1 true WO2011102131A1 (en) 2011-08-25

Family

ID=44482736

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2011/000875 WO2011102131A1 (en) 2010-02-18 2011-02-17 Image encoding device, image encoding method, program and integrated circuit

Country Status (1)

Country Link
WO (1) WO2011102131A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06269025A (en) * 1993-03-16 1994-09-22 Fujitsu Ltd Coding system for multi-eye stereoscopic video image
JPH10233958A (en) * 1997-02-20 1998-09-02 Nippon Telegr & Teleph Corp <Ntt> Method for estimating camera parameter



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 11744422; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 11744422; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)