WO2012014695A1 - Three-dimensional imaging device and imaging method for same - Google Patents

Three-dimensional imaging device and imaging method for same

Info

Publication number
WO2012014695A1
WO2012014695A1 (PCT/JP2011/066089, JP2011066089W)
Authority
WO
WIPO (PCT)
Prior art keywords
imaging
image
unit
units
video
Prior art date
Application number
PCT/JP2011/066089
Other languages
French (fr)
Japanese (ja)
Inventor
田中 誠一 (Seiichi Tanaka)
Original Assignee
シャープ株式会社 (Sharp Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by シャープ株式会社 (Sharp Corporation)
Publication of WO2012014695A1 publication Critical patent/WO2012014695A1/en

Classifications

    • G PHYSICS
    • G03 PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03B APPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B 35/00 Stereoscopic photography
    • G03B 35/08 Stereoscopic photography by simultaneous recording
    • G03B 35/10 Stereoscopic photography by simultaneous recording having single camera with stereoscopic-base-defining system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G06T 7/55 Depth or shape recovery from multiple images
    • G06T 7/593 Depth or shape recovery from multiple images from stereo images
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/20 Image signal generators
    • H04N 13/204 Image signal generators using stereoscopic image cameras
    • H04N 13/239 Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 Processing image signals
    • H04N 13/111 Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 2013/0074 Stereoscopic image analysis
    • H04N 2013/0081 Depth or disparity estimation from stereoscopic image signals

Definitions

  • the present invention relates to a stereoscopic imaging device and an imaging method thereof.
  • In recent years, stereoscopic video devices have been actively developed in order to enhance the impact and sense of presence of video.
  • As a technique for generating a stereoscopic image, there is a known technique in which two imaging devices, one for the left channel (L) and one for the right channel (R), are arranged side by side and photograph the subject simultaneously.
  • As a technique for displaying a stereoscopic image, the left-channel (L) image and the right-channel (R) image are displayed alternately, pixel by pixel, on a single display screen, and a special optical system adjusts the regions visible to the viewer's left and right eyes: for example, a lenticular lens (semi-cylindrical lenses arranged at a predetermined pitch), a parallax barrier (fine slits arranged at a predetermined pitch), or a patterned retarder (fine polarizing elements arranged regularly). In this way, only the left-channel (L) image is visible to the viewer's left eye and only the right-channel (R) image to the right eye.
  • In recent years, techniques for generating high-definition video with a plurality of cameras have also been developed. For example, a thin color camera with sub-pixel resolution has been proposed that combines four sub-cameras, each consisting of an imaging lens, a color filter, and a detector array (see Patent Document 1). As shown in FIG. 18, this thin color camera includes four lenses 22a to 22d, four color filters 25a to 25d, and a detector array 24. The color filter 25 consists of a filter 25a that transmits red light (R), filters 25b and 25c that transmit green light (G), and a filter 25d that transmits blue light (B), so the detector array 24 captures red, green, and blue images. With this configuration, a high-resolution composite image is formed from the two green images, to which the human visual system is most sensitive, and a full-color image can be obtained by combining it with the red and blue images.
  • However, the conventional technology for capturing stereoscopic images requires two imaging devices, one for the left channel (L) and one for the right channel (R), to obtain a single stereoscopic image, so the apparatus becomes large in scale. For example, if the imaging device described in Patent Document 1 is used for each of the left and right channels, two such devices must be arranged side by side; since the imaging device of Patent Document 1 includes four sub-cameras, twice that number, i.e., eight sub-cameras, are required.
  • The present invention has been made in view of such circumstances, and its object is to provide a stereoscopic imaging device, and an imaging method for it, that suppress the increase in apparatus scale.
  • A stereoscopic imaging device according to one embodiment of the present invention includes: two imaging units that capture the same subject; a parallax calculation unit that detects corresponding points between the videos captured by the two imaging units and calculates parallax information for those captured videos; and a synthesis processing unit that, taking the viewpoint of each of the two imaging units as a reference, synthesizes, from the parallax information and the two captured videos, a video with more pixels than the captured videos, generating two systems of such high-pixel-count video.
  • In the above stereoscopic imaging device, each imaging unit may include an optical system that forms an image of the subject on an imaging surface and an imaging element that generates a signal of the captured video of the subject formed on the imaging surface, and in one imaging unit, compared with the other, the position of the imaging element relative to the optical system may be shifted up or down by half of an imaging pixel of the imaging element.
  • The above stereoscopic imaging device may include three or more imaging units; between horizontally adjacent imaging units, the position of the imaging element relative to the optical system is shifted up or down by half of an imaging pixel of the imaging element, and between vertically adjacent imaging units, the position of the imaging element relative to the optical system is shifted left or right by half of an imaging pixel of the imaging element.
  • In the above stereoscopic imaging device, four imaging units may be arranged at the vertices of a square whose sides each lie along the horizontal or vertical direction; the parallax calculation unit calculates parallax information for the videos captured by two of the four imaging units placed at adjacent vertices of the square, and the synthesis processing unit uses that parallax information for the horizontal and vertical parallax correction performed when synthesizing the videos.
  • The above stereoscopic imaging device may include at least three imaging units, and the synthesis processing unit may, taking the viewpoint of each of the at least three imaging units as a reference, synthesize, from the parallax information and the videos captured by at least two of the imaging units, a video with more pixels than the captured videos, generating at least three systems of such high-pixel-count video.
  • An imaging method according to another embodiment of the present invention includes: detecting corresponding points between the videos captured by two imaging units that capture the same subject and calculating parallax information for the two captured videos; and, taking the viewpoint of each of the two imaging units as a reference, synthesizing, from the parallax information and the two captured videos, a video with more pixels than the captured videos, generating two systems of such high-pixel-count video.
  • The step of generating the high-pixel-count video may include performing parallax correction of the captured videos using the parallax information when synthesizing the video.
  • The step of generating the high-pixel-count video may include, taking the viewpoint of each of at least three imaging units that capture the same subject as a reference, synthesizing, from the parallax information and the videos captured by at least two of the imaging units, a video with more pixels than the captured videos, generating at least three systems of such high-pixel-count video.
  • FIG. 1 is an overview diagram showing a stereoscopic imaging apparatus 10 according to a first embodiment of the present invention. FIG. 2 is a schematic block diagram showing the configuration of the stereoscopic imaging device 10 in the same embodiment. FIG. 3 is a diagram showing an arrangement example of the imaging lens and the imaging element in the same embodiment.
  • FIG. 1 shows an overview of the stereoscopic imaging apparatus 10 according to an embodiment of the present invention, and FIG. 2 shows a schematic block diagram of its functional configuration.
  • The stereoscopic imaging device 10 includes an imaging unit 101 and an imaging unit 102, arranged along the x-axis direction in the drawings, together with a parallax calculation unit 21 and a high-resolution synthesis processing unit 20.
  • the imaging unit 101 includes an imaging lens 11-1 and an imaging element 12-1.
  • the imaging unit 102 includes an imaging lens 11-2 and an imaging element 12-2. Note that the imaging unit 101 and the imaging unit 102 are arranged so that their optical axes are parallel so as to capture the same subject.
  • The imaging lens 11 is the optical system of each imaging unit; it forms an image of the light from the subject on the imaging element 12. The imaging element 12 is a CMOS image sensor or the like; it photoelectrically converts the formed image and outputs the result as a video signal.
  • the video signal output from the imaging device 12-1 of the imaging unit 101 is referred to as a video signal R
  • the video signal output from the imaging device 12-2 of the imaging unit 102 is referred to as a video signal L.
  • Two systems of video signals (video signal R and video signal L) output by the imaging unit 101 and the imaging unit 102 are input to the parallax calculation unit 21 and the high-resolution synthesis processing unit 20.
  • The parallax calculation unit 21 searches for corresponding points between the two input video signals and, based on the search result, calculates R-reference parallax data RS, based on the viewpoint of the imaging unit 101, and L-reference parallax data LS, based on the viewpoint of the imaging unit 102, and outputs them to the high-resolution synthesis processing unit 20.
  • The high-resolution synthesis processing unit 20 synthesizes the two input video signals based on the parallax data (disparity information), and outputs a right-eye video signal RC and a left-eye video signal LC.
  • FIG. 3 is a diagram illustrating an arrangement example of the imaging lens and the imaging element.
  • the x axis is taken in the horizontal direction (lateral direction)
  • the y axis is taken in the vertical direction (up and down direction)
  • the z axis is taken in the depth direction. That is, FIG. 3 shows the arrangement of the imaging lens and the imaging element when the stereoscopic imaging device 10 is viewed from the front.
  • the imaging lens 11-1 and the imaging lens 11-2 are arranged at the same position in the y-axis direction.
  • the image pickup device 12-1 is shifted from the image pickup device 12-2 by py / 2 in the y-axis direction (vertical direction).
  • py is the length of the pixel in the image sensor 12 in the y-axis direction. That is, the image pickup device 12-1 and the image pickup device 12-2 are arranged so as to be shifted in the y-axis direction (vertical direction) by half the pixel height of the image pickup device 12.
  • That is, in the imaging unit 101, compared with the imaging unit 102, the position of the imaging element relative to the imaging lens is shifted upward by half of an imaging pixel of the imaging element.
  • Conversely, the imaging element 12-1 may be arranged shifted downward by py/2 from the imaging element 12-2 in the y-axis direction (vertical direction). In that case, the arrangement order of the pixels in the synthesis processing performed by the high-resolution synthesis processing unit 20, described later, is reversed.
  • FIG. 4 is a diagram illustrating another arrangement example of the imaging lens and the imaging element.
  • the x axis is taken in the horizontal direction (lateral direction)
  • the y axis is taken in the vertical direction (up and down direction)
  • the z axis is taken in the depth direction.
  • the imaging lens 11-1 is arranged to be shifted downward by py / 2 in the y-axis direction (vertical direction) from the imaging lens 11-2.
  • the image sensor 12-1 and the image sensor 12-2 are disposed at the same position in the y-axis direction. That is, the image pickup lens 11-1 and the image pickup lens 11-2 are arranged so as to be shifted in the y-axis direction (vertical direction) by half the pixel height of the image pickup element 12.
  • With this arrangement as well, in the imaging unit 101, compared with the imaging unit 102, the position of the imaging element 12 relative to the imaging lens 11 is shifted upward by half of an imaging pixel of the imaging element 12.
  • Conversely, the imaging lens 11-1 may be arranged shifted upward by py/2 from the imaging lens 11-2 in the y-axis direction (vertical direction). In that case, the arrangement order of the pixels in the synthesis processing performed by the high-resolution synthesis processing unit 20, described later, is reversed.
  • FIG. 5 is a schematic block diagram illustrating the configuration of the parallax calculation unit 21.
  • The parallax calculation unit 21 calculates parallax data from the video signal R output by the imaging unit 101 and the video signal L output by the imaging unit 102 in FIG. 1.
  • the parallax calculation unit 21 includes coordinate conversion units 31 and 32, a right camera parameter storage unit 30R, a left camera parameter storage unit 30L, and a corresponding point search unit 33.
  • The right camera parameter storage unit 30R holds camera parameters including internal parameters specific to the imaging unit 101, such as the focal length and lens distortion parameters, and external parameters representing the positional relationship between the two imaging units 101 and 102.
  • the left camera parameter storage unit 30L holds camera parameters specific to the imaging unit 102.
  • The coordinate conversion unit 31 geometrically transforms (coordinate conversion) the video represented by the video signal R output by the imaging unit 101, using a known method, so that the videos of the imaging unit 101 and the imaging unit 102 lie on the same plane, thereby making the epipolar lines parallel. In doing so, the coordinate conversion unit 31 uses the camera parameters stored in the right camera parameter storage unit 30R.
  • Similarly, the coordinate conversion unit 32 geometrically transforms the video represented by the video signal L output by the imaging unit 102, using a known method, so that the videos of the imaging unit 101 and the imaging unit 102 lie on the same plane, thereby parallelizing the epipolar lines. In doing so, the coordinate conversion unit 32 uses the camera parameters stored in the left camera parameter storage unit 30L.
  • The corresponding point search unit 33 searches for corresponding pixels between the videos whose epipolar lines have been parallelized by the coordinate conversion unit 31 and the coordinate conversion unit 32, and obtains parallax data representing the parallax between the viewpoint of the imaging unit 101 and the viewpoint of the imaging unit 102.
  • The corresponding point search unit 33 consists of two blocks that calculate two kinds of parallax data: an R-reference parallax calculation unit 34 and an L-reference parallax calculation unit 35.
  • The R-reference parallax calculation unit 34 takes the epipolar-rectified video of the imaging unit 101 as the standard image and the epipolar-rectified video of the imaging unit 102 as the reference image, searches for the reference-image pixel corresponding to each pixel of the standard image, and calculates the R-reference parallax data RS.
  • The L-reference parallax calculation unit 35 takes the epipolar-rectified video of the imaging unit 102 as the standard image and the epipolar-rectified video of the imaging unit 101 as the reference image, searches for the reference-image pixel corresponding to each pixel of the standard image, and calculates the L-reference parallax data LS.
  • The R-reference parallax calculation unit 34 and the L-reference parallax calculation unit 35 perform the same corresponding point search, except that the standard video and the reference video are swapped.
  • FIG. 6 is a diagram illustrating the reference image RG.
  • FIG. 7 is a diagram illustrating the standard image BG.
  • the epipolar lines are parallelized in both the reference image RG and the standard image BG.
  • First, the method of moving the pixel of interest on the standard image BG will be described with reference to FIG. 7.
  • The R-reference parallax calculation unit 34 sets a block centered on the pixel of interest on the standard image BG (hereinafter, the standard block of interest BB) and moves it pixel by pixel to the right along the line, starting from the upper-left end of the standard image BG (search start block BS). When the standard block of interest BB reaches the right end of a line, it is moved pixel by pixel to the right along the next line down, starting from its left end. This is repeated until the block at the lower-right corner of the standard image BG (search end block BE) is reached.
  • Next, the method of moving the block on the reference image RG will be described with reference to FIG. 6. The R-reference parallax calculation unit 34 first sets the reference block of interest RB at the block on the reference image RG (search start block RS) that has the same coordinates (x, y) as the standard block of interest BB on the standard image BG shown in FIG. 7, and thereafter moves the reference block of interest pixel by pixel to the right along the line, within the search range shown in FIG. 6. The search range is set to a value corresponding to the maximum parallax of the photographed subject, and the set search range determines the shortest subject distance for which parallax data can be calculated.
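  • For orientation, the relation between the search range and the shortest measurable subject distance can be made explicit using the standard parallel-camera stereo model (an assumption for illustration; the formula is not stated in this document): the parallax in pixels is d = f·B / (Z·px), so the shortest distance for which parallax data can be calculated is Zmin = f·B / (px·dmax), where f is the focal length, B the baseline between the two imaging units, Z the subject distance, px the horizontal pixel pitch, and dmax the search range in pixels. A larger search range therefore admits nearer subjects, at the cost of more computation per pixel.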
  • The R-reference parallax calculation unit 34 performs the above reference-block search for each standard block of interest BB on the standard image BG illustrated in FIG. 7.
  • FIG. 9 is a diagram illustrating the configuration of the reference block of interest BB.
  • The standard block of interest BB is a block of size M horizontal × N vertical pixels centered on the pixel of interest on the standard image BG.
  • FIG. 8 is a diagram illustrating a configuration of the reference attention block RB.
  • The reference block of interest RB is a block of size M horizontal × N vertical pixels on the reference image RG.
  • Let the pixel values at coordinates (i, j) within each block, with i in the horizontal direction and j in the vertical direction, be R(i, j) for the reference block of interest RB and T(i, j) for the standard block of interest BB, respectively.
  • The R-reference parallax calculation unit 34 calculates a similarity for each combination of the standard block of interest BB and the reference block of interest RB, and determines the reference block of interest RB most similar to each standard block of interest BB.
  • SAD (Sum of Absolute Differences) is used as the similarity measure. As the similarity evaluation formula of equation (1) shows, SAD takes the absolute difference between R(i, j) and T(i, j) for every pixel of the block and sums them, giving the value SSAD:

    SSAD = Σ_{j=0}^{N−1} Σ_{i=0}^{M−1} | R(i, j) − T(i, j) |    (1)
  • The R-reference parallax calculation unit 34 determines that, among the reference blocks of interest RB within the search range on the reference image RG for a given standard block of interest BB, the one with the smallest SSAD value of equation (1) is the similar reference block. The R-reference parallax calculation unit 34 then takes the pixel at the center of that reference block of interest RB as the pixel corresponding to the pixel of interest at the center of the standard block of interest BB.
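  • As a concrete illustration of equation (1), the following is a minimal sketch of the SSAD computation, assuming 8-bit grayscale blocks held as numpy arrays (the function name and signature are illustrative and not taken from the patent):

```python
import numpy as np

def ssad(t_block: np.ndarray, r_block: np.ndarray) -> int:
    """Equation (1): sum over an M x N block of |R(i, j) - T(i, j)|.

    t_block -- T(i, j), the standard block of interest BB on the standard image BG
    r_block -- R(i, j), a candidate reference block of interest RB on the reference image RG
    """
    assert t_block.shape == r_block.shape
    # Widen to int32 so that subtracting uint8 values cannot wrap around.
    return int(np.abs(r_block.astype(np.int32) - t_block.astype(np.int32)).sum())
```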
  • The processing by which the L-reference parallax calculation unit 35 searches for corresponding pixels is substantially the same as that of the R-reference parallax calculation unit 34, except that the search range differs.
  • For the L-reference parallax calculation unit 35, the search start block RS is the block offset to the left of the coordinates of the standard block of interest BB by the search range, and the search end block RE is the block at the same coordinates as the standard block of interest BB.
  • Next, the procedure for calculating the parallax data is described step by step with reference to the flowchart of FIG. 10.
  • The R-reference parallax calculation unit 34 first sets the standard block of interest at the head of the standard image BG (FIG. 7), i.e., the search start block BS (step S900). It then reads all pixel values of the standard block of interest BB from the standard image BG (step S901). Next, the R-reference parallax calculation unit 34 sets the reference block of interest RB at the block of the reference image RG (FIG. 6) that has the same coordinates as the standard block of interest BB, i.e., the head of the reference image RG (search start block RS) (step S902).
  • The R-reference parallax calculation unit 34 reads all pixel values of the reference block of interest RB from the reference image RG (step S903), then calculates the SSAD value of the read standard block of interest BB and reference block of interest RB according to equation (1) and stores it (step S904).
  • The R-reference parallax calculation unit 34 then determines whether the search range has been exhausted (step S905). If not, it moves the reference block of interest one pixel to the right along the line direction (step S906) and performs steps S903 and S904 again. Steps S903 to S906 are repeated while the reference block of interest RB remains within the search range, and all SSAD values within the search range are calculated. From these results, the R-reference parallax calculation unit 34 detects the reference block of interest RB with the smallest SSAD value (step S907). Note that the block with the minimum SSAD value detected in step S907 is not necessarily a correctly matching block.
  • The parallax cannot be detected correctly when the standard block of interest BB contains no pattern (texture) or contour usable as a feature, or when the searched area of the reference image RG is an occlusion region. Whether the parallax has been detected correctly can be judged from how small the minimum SSAD value is.
  • The R-reference parallax calculation unit 34 therefore compares the minimum SSAD value with a threshold (step S908). If the SSAD value is at or below the threshold (i.e., the similarity is high), the difference between the x coordinate of the center of the standard block of interest BB (the pixel of interest on the standard image BG) and the x coordinate of the center of the detected reference block of interest RB (the corresponding pixel on the reference image RG) is output as the parallax data of the pixel of interest (step S909).
  • If the minimum SSAD value exceeds the threshold, the R-reference parallax calculation unit 34 determines that the parallax could not be detected, and sets and outputs 0 or a distinctive value as the parallax data to serve as an error flag (step S910).
  • The R-reference parallax calculation unit 34 then determines whether the standard block of interest BB has reached the search end block BE, i.e., whether the processing is complete (step S911). If not, the standard block of interest BB is moved one pixel to the right along the line direction (step S912), and steps S901 to S910 are performed again. If step S911 determines that the processing is complete, the process ends. In this way, steps S901 to S910 are repeated until the standard block of interest BB reaches the search end block of the standard image BG, and parallax data is obtained for each pixel of the standard image BG.
  • In the above, the pixel on the reference image similar to the pixel of interest on the standard image is searched for using the SAD similarity evaluation function. However, the method is not limited to SAD; the parallax data may be obtained with any technique that searches for similar pixels between the standard image and the reference image.
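  • Steps S900 to S912 can be rendered as the following minimal single-threaded sketch; the block size, search range, and threshold values are illustrative assumptions, and a real implementation would be vectorized rather than written as nested loops:

```python
import numpy as np

def r_reference_disparity(bg: np.ndarray, rg: np.ndarray,
                          m: int = 9, n: int = 9,
                          search_range: int = 64,
                          threshold: int = 2000,
                          error_value: int = 0) -> np.ndarray:
    """Block-matching parallax for each pixel of the standard image BG.

    For every standard block of interest BB (steps S900-S901, S911-S912),
    reference blocks RB are scanned to the right within the search range
    (steps S902-S906), the block with the smallest SSAD is detected
    (step S907), and its offset is output as parallax only if the SSAD is
    at or below the threshold (steps S908-S910).
    """
    h, w = bg.shape
    disparity = np.full((h, w), error_value, dtype=np.int32)
    for y in range(n // 2, h - n // 2):
        for x in range(m // 2, w - m // 2):
            bb = bg[y - n//2:y + n//2 + 1, x - m//2:x + m//2 + 1].astype(np.int32)
            best_d, best_ssad = error_value, None
            for d in range(search_range + 1):   # move RB right along the line
                if x + d + m // 2 >= w:
                    break                       # RB would leave the reference image
                rb = rg[y - n//2:y + n//2 + 1,
                        x + d - m//2:x + d + m//2 + 1].astype(np.int32)
                s = int(np.abs(rb - bb).sum())  # SSAD, equation (1)
                if best_ssad is None or s < best_ssad:
                    best_ssad, best_d = s, d
            if best_ssad is not None and best_ssad <= threshold:
                disparity[y, x] = best_d        # step S909; else error flag remains
    return disparity
```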
  • FIG. 11 is a schematic block diagram showing the functional configuration of the high-resolution synthesis processing unit 20.
  • The high-resolution synthesis processing unit 20 includes a left-eye synthesis unit 908 that generates the left-eye video signal LC, a right-eye synthesis unit 909 that generates the right-eye video signal RC, a right camera parameter storage unit 902R, and a left camera parameter storage unit 902L.
  • The left-eye synthesis unit 908 and the right-eye synthesis unit 909 each include an alignment correction processing unit 901, a correction processing unit 903, and a synthesis processing unit 906.
  • Since the basic operation of the left-eye synthesis unit 908 and the right-eye synthesis unit 909 is the same except for the combination of inputs and parallax data, only the operation of the left-eye synthesis unit 908 is described here, and the description of the right-eye synthesis unit 909 is omitted.
  • In the left-eye synthesis unit 908, the video signal R of the imaging unit 101 is input to the alignment correction processing unit 901, and the video signal L of the imaging unit 102 is input to the correction processing unit 903.
  • The alignment correction processing unit 901 first corrects the lens distortion of the video represented by the video signal R, based on the camera parameters, stored in the right camera parameter storage unit 902R, that describe the lens distortion of the imaging unit 101. Then, based on the L-reference parallax data LS input from the parallax calculation unit 21 and the camera parameters describing the position and orientation of the imaging unit 101 stored in the right camera parameter storage unit 902R, it aligns the distortion-corrected video so that each of its pixels captures the same subject position as the pixel with the same coordinates in the video of the imaging unit 102 (the video corrected by the correction processing unit 903).
  • However, no correction is applied for the fact that the position of the imaging element 12-1 relative to the imaging lens 11-1 is shifted upward by half of an imaging pixel of the imaging element 12-1. That is, the pixel at coordinates (x, y) in the output of the alignment correction processing unit 901 captures a subject position between the pixel at coordinates (x, y) and the pixel at coordinates (x, y − 1) in the output of the correction processing unit 903.
  • the correction processing unit 903 corrects the lens distortion of the video represented by the video signal L based on the camera parameters indicating the lens distortion state stored in the left camera parameter storage unit 902L.
  • More precisely, the alignment correction processing unit 901 and the correction processing unit 903 perform parallelization of the epipolar lines using the camera parameters, and the alignment correction processing unit 901 additionally corrects the parallax using the L-reference parallax data LS. The parallelization of the epipolar lines can be performed by a known method. The alignment correction processing unit 901 shifts each pixel of the epipolar-rectified video of the video signal R by the amount of parallax indicated by the L-reference parallax data LS. For example, if the L-reference parallax data LS at coordinates (x, y) is d, the pixel at coordinates (x + d, y) is moved to coordinates (x, y).
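  • A minimal sketch of this parallax shift follows (names are illustrative; occlusions leave holes that this sketch simply leaves as zeros, whereas a real implementation would fill them):

```python
import numpy as np

def shift_by_parallax(rectified_r: np.ndarray, ls: np.ndarray) -> np.ndarray:
    """Move the pixel at (x + d, y) of the rectified video R to (x, y).

    ls holds the L-reference parallax data LS, one value d per output pixel,
    so the result approximates the video of the imaging unit 101 re-projected
    to the viewpoint of the imaging unit 102.
    """
    h, w = rectified_r.shape[:2]
    aligned = np.zeros_like(rectified_r)
    for y in range(h):
        for x in range(w):
            xs = x + int(ls[y, x])
            if 0 <= xs < w:
                aligned[y, x] = rectified_r[y, xs]
    return aligned
```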
  • In the right-eye synthesis unit 909, conversely, the video signal R and the camera parameters of the imaging unit 101 are input to its correction processing unit 903, while the video signal L, the camera parameters of the imaging unit 102, and the R-reference parallax data RS are input to its alignment correction processing unit 901. In this case, each pixel is moved to the right by the amount of parallax indicated by the R-reference parallax data RS.
  • FIG. 12 illustrates the operation of the synthesis processing unit 906. In each graph, the horizontal axis indicates the extent of space in the y-axis direction and the vertical axis indicates the light amplitude (light intensity). The graph denoted by reference numeral 40a shows the distribution of light that is imaged by the imaging lenses 11-1 and 11-2 and is incident on a certain vertical column of pixels of the imaging elements 12-1 and 12-2 of the imaging units 101 and 102.
  • the graph denoted by reference numeral 40e indicates the distribution of the output of the alignment correction processing unit 901 corresponding to the pixel on which the light of the graph 40a is incident among the pixels of the imaging element 12-1 of the imaging unit 101.
  • a graph denoted by reference numeral 40f indicates an output distribution of the correction processing unit 903 corresponding to the pixel on which the light of the graph 40a is incident among the pixels of the imaging element 12-2 of the imaging unit 102.
  • the graph of reference numeral 40g shows the distribution of the output of the synthesis processing unit 906 with respect to the distribution of reference numerals 40e and 40f.
  • Below, the relationship between the graphs is described ignoring the effects of the corrections by the alignment correction processing unit 901 and the correction processing unit 903.
  • In the figure, the solid lines are the pixel boundaries of the imaging element 12-1 of the imaging unit 101, and the broken lines are the pixel boundaries of the imaging element 12-2 of the imaging unit 102. Reference numerals 40b and 40c denote pixels of the imaging unit 101 and the imaging unit 102, respectively, whose relative positions are shifted by the offset indicated by the arrow 40d.
  • the offset is preferably set to be half the size of the pixels (reference numerals 40b and 40c) of the imaging elements 12-1 and 12-2. A half-pixel offset makes it possible to generate the highest definition image.
  • Since the imaging elements 12-1 and 12-2 integrate the light intensity in units of pixels, photographing the subject image indicated by reference numeral 40a with the imaging element 12-1 yields a video signal with the light intensity distribution indicated by reference numeral 40e, and photographing it with the imaging element 12-2 yields a video signal with the light intensity distribution indicated by reference numeral 40f.
  • The synthesis processing unit 906 synthesizes a video by arranging the output of the alignment correction processing unit 901 and the output of the correction processing unit 903 alternately in the y-axis direction, generating a left-eye video signal LC whose resolution in the y-axis direction is double that of the outputs of the imaging units 101 and 102.
  • Taking into account that, in the imaging unit 101, the position of the imaging element 12-1 relative to the imaging lens 11-1 is shifted upward by half of an imaging pixel compared with the imaging unit 102, the pixel b of the correction processing unit 903 (derived from the imaging unit 102) that has the same coordinates as a pixel a of the alignment correction processing unit 901 (derived from the imaging unit 101) is placed below the pixel a. That is, taking the origin at the upper left of the video, the x-axis pointing right and the y-axis pointing down, the pixel at coordinates (x, y) output by the alignment correction processing unit 901 becomes the pixel at coordinates (x, 2y) output by the synthesis processing unit 906, and the pixel at coordinates (x, y) output by the correction processing unit 903 becomes the pixel at coordinates (x, 2y + 1) output by the synthesis processing unit 906.
  • By synthesizing the two videos in this way, the synthesis processing unit 906 can reproduce a high-definition video, shown by the graph of reference numeral 40g, that is close to the original distribution of graph 40a.
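  • The interleaving itself is simple; the following is a minimal numpy sketch of the synthesis performed by the synthesis processing unit 906 (function and argument names are illustrative):

```python
import numpy as np

def interleave_rows(aligned_r: np.ndarray, corrected_l: np.ndarray) -> np.ndarray:
    """Double the vertical resolution by alternating rows of the two inputs.

    aligned_r   -- output of the alignment correction processing unit 901;
                   its row y supplies output row 2y (pixels such as a)
    corrected_l -- output of the correction processing unit 903;
                   its row y supplies output row 2y + 1 (pixels such as b)
    """
    assert aligned_r.shape == corrected_l.shape
    h, w = aligned_r.shape[:2]
    out = np.empty((2 * h, w) + aligned_r.shape[2:], dtype=aligned_r.dtype)
    out[0::2] = aligned_r
    out[1::2] = corrected_l
    return out
```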
  • The synthesis processing unit 906 of the right-eye synthesis unit 909 operates in the same manner as that of the left-eye synthesis unit 908, except that the pixel d of the alignment correction processing unit 901 (derived from the imaging unit 102) is placed below the pixel c of the correction processing unit 903 (derived from the imaging unit 101) having the same coordinates.
  • The above synthesis operation is executed by each of the left-eye synthesis unit 908 and the right-eye synthesis unit 909. As a result, the left-eye synthesis unit 908 outputs the left-eye video signal LC, a high-definition video signal as captured from the position of the imaging unit 102 (i.e., as seen from the left eye), and the right-eye synthesis unit 909 outputs the right-eye video signal RC, a high-definition video signal as captured from the position of the imaging unit 101 (i.e., as seen from the right eye).
  • The left-eye video signal LC and the right-eye video signal RC output by the stereoscopic imaging device 10 have twice the resolution of the video signals R and L output by the imaging units 101 and 102. The stereoscopic imaging device 10 can therefore generate stereoscopic video with twice the resolution, at a device scale equivalent to that of a stereoscopic imaging device whose resolution equals that of the video signal output by the imaging unit 101.
  • Next, a second embodiment will be described. FIG. 13 is an overview diagram of the stereoscopic imaging device 111 according to the present embodiment.
  • The stereoscopic imaging device 111 according to the present embodiment differs from the first embodiment in that the number of imaging units is increased from two (reference numerals 101 and 102) to four (reference numerals 101, 102, 103, and 104). That is, the stereoscopic imaging device 111 includes an imaging unit 101, an imaging unit 102, an imaging unit 103, and an imaging unit 104.
  • the imaging unit 101 includes an imaging lens 11-1 and an imaging element 12-1.
  • The imaging unit 102 includes an imaging lens 11-2 and an imaging element 12-2, the imaging unit 103 includes an imaging lens 11-3 and an imaging element 12-3, and the imaging unit 104 includes an imaging lens 11-4 and an imaging element 12-4.
  • FIG. 14 is a schematic block diagram illustrating a functional configuration of the stereoscopic imaging device 111 according to the present embodiment.
  • The stereoscopic imaging device 111 includes the imaging units 101, 102, 103, and 104, a parallax calculation unit 21, and a multi-view high-resolution synthesis processing unit 121.
  • The video signals R and L output by the imaging unit 101 and the imaging unit 102 are input to the multi-view high-resolution synthesis processing unit 121 and the parallax calculation unit 21.
  • The video signal R′ output by the imaging unit 103 and the video signal L′ output by the imaging unit 104 are input to the multi-view high-resolution synthesis processing unit 121.
  • the processing of the parallax calculation unit 21 is the same as that of the first embodiment, and calculates the R reference parallax data and the L reference parallax data, and outputs them to the multi-view high-resolution synthesis processing unit 121.
  • The multi-view high-resolution synthesis processing unit 121 synthesizes the four input video signals based on the two sets of parallax data, and outputs a right-eye video signal RC′ and a left-eye video signal LC′.
  • FIG. 15 is a diagram illustrating the arrangement of the imaging lens and the imaging element in the present embodiment.
  • the x axis is taken in the horizontal direction (lateral direction)
  • the y axis is taken in the vertical direction (up and down direction)
  • the z axis is taken in the depth direction. That is, FIG. 15 shows the arrangement of the imaging lens and the imaging element when the stereoscopic imaging device 111 is viewed from the front.
  • the imaging lens 11-1 and the imaging lens 11-2 are arranged at the same position in the y-axis direction.
  • the imaging lens 11-3 and the imaging lens 11-4 are disposed at the same position in the y-axis direction.
  • the imaging lens 11-1 and the imaging lens 11-3 are disposed at the same position in the x-axis direction.
  • the imaging lens 11-2 and the imaging lens 11-4 are disposed at the same position in the x-axis direction.
  • the distance Dx from the center of the imaging lens 11-1 to the center of the imaging lens 11-2 is equal to the distance Dy from the center of the imaging lens 11-1 to the center of the imaging lens 11-3.
  • the imaging units 101 to 104 are arranged at the vertices of a square in which each side is along either the horizontal or vertical direction.
  • the image sensor 12-1 is arranged to be shifted by py / 2 in the y-axis direction (vertical direction) from the image sensor 12-2. Further, the image sensor 12-3 is arranged so as to be shifted by py / 2 in the y-axis direction (vertical direction) from the image sensor 12-4.
  • py is the length of the pixel in the image sensor 12 in the y-axis direction.
  • the image sensor 12-1 is arranged to be shifted to the left by px / 2 in the x-axis direction (lateral direction) from the image sensor 12-3.
  • the image sensor 12-2 is arranged to be shifted to the left by px / 2 in the x-axis direction (lateral direction) from the image sensor 12-4.
  • px is the length of the pixel in the x-axis direction in the image sensor.
  • That is, in the imaging unit 101, compared with the imaging unit 102, the position of the imaging element relative to the imaging lens is shifted upward by half of an imaging pixel of the imaging element. Likewise, in the imaging unit 103, compared with the imaging unit 104, the position of the imaging element relative to the imaging lens is shifted upward by half of an imaging pixel. Furthermore, in the imaging unit 101, compared with the imaging unit 103, the position of the imaging element relative to the imaging lens is shifted to the left by half of an imaging pixel, and in the imaging unit 102, compared with the imaging unit 104, the position of the imaging element relative to the imaging lens is shifted to the left by half of an imaging pixel.
  • Here, an example in which the imaging elements are displaced has been shown, but the imaging lenses may be displaced instead, as in FIG. 4 described in the first embodiment.
  • FIG. 16 shows the functional configuration of the multi-view high-resolution synthesis processing unit 121. The multi-view high-resolution synthesis processing unit 121 includes a left-eye multi-view synthesis unit 130 that generates a left-eye high-definition video, a right-eye multi-view synthesis unit 132 that generates a right-eye high-definition video, and a camera parameter storage unit 902 that holds the camera parameters of the imaging units 101 to 104. The left-eye synthesis unit 130 and the right-eye synthesis unit 132 each include an alignment correction processing unit 901, a correction processing unit 903, a vertical/horizontal alignment correction processing unit 904, a vertical alignment correction processing unit 905, and a multi-view synthesis processing unit 131. Since the basic operation of the left-eye synthesis unit 130 and the right-eye synthesis unit 132 is the same except for the combination of input video signals and parallax data, only the operation of the left-eye synthesis unit 130 is described here.
  • the video signal R output from the imaging unit 101 is input to the alignment correction processing unit 901. Similar to the first embodiment, the alignment correction processing unit 901 performs the correction processing and alignment of the video represented by the video signal R, the camera parameters of the imaging unit 101 stored in the camera parameter storage unit 902, and the L reference parallax. Based on the data LS, an image from the viewpoint of the imaging unit 102 is generated.
  • the video signal L output from the imaging unit 102 is input to the correction processing unit 903. Similar to the first embodiment, the correction processing unit 903 performs correction processing of the video represented by the video signal L based on the camera parameters of the imaging unit 102 stored in the camera parameter storage unit 902.
  • The video signal R′ output by the imaging unit 103 is input to the vertical/horizontal alignment correction processing unit 904. The vertical/horizontal alignment correction processing unit 904 performs correction and alignment of the video represented by the video signal R′, based on the camera parameters of the imaging unit 103 stored in the camera parameter storage unit 902 and the L-reference parallax data LS, and generates a video from the viewpoint of the imaging unit 102.
  • Here, because the distance Dx between the center of the imaging lens 11-1 of the imaging unit 101 and the center of the imaging lens 11-2 of the imaging unit 102 is equal to the distance Dy between the center of the imaging lens 11-1 of the imaging unit 101 and the center of the imaging lens 11-3 of the imaging unit 103, the L-reference parallax data LS can also be used as vertical parallax data. That is, the vertical/horizontal alignment correction processing unit 904 performs alignment by applying the L-reference parallax data LS in both the vertical direction and the horizontal direction.
  • For example, when the L-reference parallax data LS at given coordinates is d, the pixel value at coordinates (x, y) in the video aligned by the vertical/horizontal alignment correction processing unit 904 is the pixel value at coordinates (x + d, y − d) in the video obtained by correcting, with the camera parameters, the video represented by the video signal output by the imaging unit 103.
  • the video signal L ′ output from the imaging unit 104 is input to the vertical alignment correction processing unit 905.
  • The vertical alignment correction processing unit 905 performs correction and alignment of the video represented by the video signal L′, based on the camera parameters of the imaging unit 104 stored in the camera parameter storage unit 902 and the L-reference parallax data LS, and generates a video from the viewpoint of the imaging unit 102. That is, the vertical alignment correction processing unit 905 performs alignment by applying the L-reference parallax data LS in the vertical direction.
  • For example, when the L-reference parallax data LS at given coordinates is d, the pixel value at coordinates (x, y) in the video aligned by the vertical alignment correction processing unit 905 is the pixel value at coordinates (x, y − d) in the video obtained by correcting, with the camera parameters, the video represented by the video signal output by the imaging unit 104.
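  • The two alignments can be sketched as follows, a minimal rendering of the mappings output(x, y) = input(x + d, y − d) for the vertical/horizontal alignment correction processing unit 904 and output(x, y) = input(x, y − d) for the vertical alignment correction processing unit 905 (names are illustrative, and hole filling is omitted):

```python
import numpy as np

def align_vertical_horizontal(corrected_r2: np.ndarray, ls: np.ndarray) -> np.ndarray:
    """Unit 904: apply the parallax d both horizontally and vertically."""
    h, w = corrected_r2.shape[:2]
    out = np.zeros_like(corrected_r2)
    for y in range(h):
        for x in range(w):
            d = int(ls[y, x])
            if 0 <= x + d < w and 0 <= y - d < h:
                out[y, x] = corrected_r2[y - d, x + d]
    return out

def align_vertical(corrected_l2: np.ndarray, ls: np.ndarray) -> np.ndarray:
    """Unit 905: apply the parallax d vertically only."""
    h, w = corrected_l2.shape[:2]
    out = np.zeros_like(corrected_l2)
    for y in range(h):
        for x in range(w):
            d = int(ls[y, x])
            if 0 <= y - d < h:
                out[y, x] = corrected_l2[y - d, x]
    return out
```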
  • Next, the multi-view synthesis processing unit 131 in the left-eye synthesis unit 130 and the right-eye synthesis unit 132 will be described with reference to FIG. 17. The multi-view synthesis processing unit 131 performs high-resolution synthesis using the four systems of video signals obtained from the four imaging units 101, 102, 103, and 104. The multi-view synthesis processing unit 131 of the left-eye synthesis unit 130 generates and outputs the left-eye video signal LC′, a signal representing video from the viewpoint of the imaging unit 102.
  • the multi-view synthesis processing unit 131 of the right-eye synthesis unit 132 generates and outputs a right-eye video signal RC ′ that is a signal representing video from the viewpoint of the imaging unit 101.
  • The four-system high-resolution synthesis follows the same principle as described for the light intensity distributions of FIG. 12. More concretely, the following description assumes that the resolution of each of the four imaging units 101, 102, 103, and 104 is VGA (640 × 480) and that the high-resolution synthesis processing produces a Quad-VGA video (1280 × 960 pixels) with four times the number of pixels.
  • As described above, the imaging element 12-1 of the imaging unit 101 is arranged shifted upward by half a pixel with respect to the imaging element 12-2 of the imaging unit 102, and the imaging element 12-4 of the imaging unit 104 is arranged shifted to the left by half a pixel with respect to the imaging element 12-2 of the imaging unit 102.
  • As shown in FIG. 17, the multi-view synthesis processing unit 131 arranges the pixels G11, G21, G31, and G41, which are the pixels having the same coordinates in the corrected video MR derived from the imaging unit 101, the corrected video ML derived from the imaging unit 102, the corrected video MR′ derived from the imaging unit 103, and the corrected video ML′ derived from the imaging unit 104, respectively, into a 2 × 2 pattern. That is, the pixel G31 is placed to the right of the pixel G11, the pixel G21 below the pixel G11, and the pixel G41 to the right of the pixel G21.
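  • A minimal numpy sketch of this 2 × 2 arrangement follows (names are illustrative; the assignment of G11, G21, G31, and G41 to MR, ML, MR′, and ML′ follows the description above):

```python
import numpy as np

def interleave_2x2(mr: np.ndarray, ml: np.ndarray,
                   mr_p: np.ndarray, ml_p: np.ndarray) -> np.ndarray:
    """Tile same-coordinate pixels of the four corrected videos.

    mr, ml, mr_p, ml_p are the corrected videos MR, ML, MR' and ML';
    four VGA (640 x 480) inputs yield one Quad-VGA (1280 x 960) output.
    """
    assert mr.shape == ml.shape == mr_p.shape == ml_p.shape
    h, w = mr.shape[:2]
    out = np.empty((2 * h, 2 * w) + mr.shape[2:], dtype=mr.dtype)
    out[0::2, 0::2] = mr      # G11
    out[0::2, 1::2] = mr_p    # G31, to the right of G11
    out[1::2, 0::2] = ml      # G21, below G11
    out[1::2, 1::2] = ml_p    # G41, to the right of G21
    return out
```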
  • Here, in the left-eye synthesis unit 130, the corrected video MR derived from the imaging unit 101 is the video generated by the alignment correction processing unit 901, the corrected video ML derived from the imaging unit 102 is the video generated by the correction processing unit 903, the corrected video MR′ derived from the imaging unit 103 is the video generated by the vertical/horizontal alignment correction processing unit 904, and the corrected video ML′ derived from the imaging unit 104 is the video generated by the vertical alignment correction processing unit 905.
  • In the right-eye synthesis unit 132, the corrected video MR derived from the imaging unit 101 is the video generated by the correction processing unit 903, the corrected video ML derived from the imaging unit 102 is the video generated by the alignment correction processing unit 901, the corrected video MR′ derived from the imaging unit 103 is the video generated by the vertical alignment correction processing unit 905, and the corrected video ML′ derived from the imaging unit 104 is the video generated by the vertical/horizontal alignment correction processing unit 904.
  • The above synthesis operation is executed by each of the left-eye synthesis unit 130 and the right-eye synthesis unit 132. As a result, the left-eye synthesis unit 130 outputs a high-definition video synthesized from the four videos as captured from the position of the imaging unit 102 (i.e., as seen from the left eye), and the right-eye synthesis unit 132 outputs a high-definition video synthesized from the four videos as captured from the position of the imaging unit 101 (i.e., as seen from the right eye). That is, video with four times the resolution of the output of each imaging unit is output.
  • the left-eye video signal LC ′ output from the stereoscopic imaging device 111 and the right-eye video signal RC ′ have four times higher resolution than the video signals output from the imaging units 101, 102, 103, and 104.
  • Moreover, compared with a video signal obtained by combining the video signal output by the imaging unit 101 and the video signal output by the imaging unit 103, the left-eye video signal LC′ and the right-eye video signal RC′ output by the stereoscopic imaging device 111 have twice the resolution. In other words, at a device scale equivalent to that of a stereoscopic imaging device whose resolution equals that of a video signal combining the outputs of the imaging units 101 and 103, the stereoscopic imaging device 111 can generate stereoscopic video with twice the resolution.
  • In the embodiments above, the example described uses the two video signals output by the imaging units 101 and 102 as the input to the parallax calculation unit 21. However, the number of input video signals may be increased. In that case, high-definition video is output for a plurality of viewpoints beyond the right-eye and left-eye viewpoints, and as a result multi-view stereoscopic video can be generated without loss of resolution.
  • A program realizing the functions of the parallax calculation unit 21 and the high-resolution synthesis processing unit 20 in FIG. 2, or of the parallax calculation unit 21 and the multi-view high-resolution synthesis processing unit 121 in FIG. 14, may be recorded on a computer-readable recording medium, and the processing of each unit may be performed by reading the program recorded on that recording medium into a computer system and executing it.
  • the “computer system” includes an OS and hardware such as peripheral devices.
  • The "computer-readable recording medium" means a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into the computer system. Furthermore, the "computer-readable recording medium" also includes media that hold the program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or a communication line such as a telephone line, and media that hold the program for a certain period, such as the volatile memory inside a computer system serving as the server or client in that case.
  • The program may realize only part of the functions described above, or may realize the functions described above in combination with a program already recorded in the computer system.
  • the present invention can be applied to a thin color camera that generates a high-definition image using a plurality of cameras.

Abstract

Disclosed is a three-dimensional imaging device comprising: two imaging units that capture images of the same subject; a parallax calculation unit that detects points of correspondence between the videos captured by the two imaging units and calculates parallax information for the captured videos from the two imaging units; and a compositing processor that, using the viewpoint of each of the two imaging units as a reference and based on the parallax information and the videos captured by the two imaging units, synthesizes videos with more pixels than the captured videos and generates two systems of video with the larger number of pixels.

Description

Stereoscopic imaging device and imaging method thereof
The present invention relates to a stereoscopic imaging device and an imaging method thereof.
This application claims priority based on Japanese Patent Application No. 2010-168145, filed in Japan on July 27, 2010, the contents of which are incorporated herein by reference.
In recent years, stereoscopic video devices have been actively developed in order to enhance the impact and sense of presence of video. As a technique for generating a stereoscopic image, there is a known technique in which two imaging devices, one for the left channel (L) and one for the right channel (R), are arranged side by side and photograph the subject simultaneously. As a technique for displaying a stereoscopic image, the left-channel (L) image and the right-channel (R) image are displayed alternately, pixel by pixel, on a single display screen, and a special optical system, such as a lenticular lens (semi-cylindrical lenses arranged at a predetermined pitch), a parallax barrier (fine slits arranged at a predetermined pitch), or a patterned retarder (fine polarizing elements arranged regularly), adjusts the regions visible to the viewer's left and right eyes so that only the left-channel (L) image is visible to the left eye and only the right-channel (R) image to the right eye.
In recent years, techniques for generating high-definition video using a plurality of cameras have also been developed. For example, a thin color camera with sub-pixel resolution has been proposed that combines four sub-cameras, each consisting of an imaging lens, a color filter, and a detector array (see, for example, Patent Document 1). As shown in FIG. 18, this thin color camera includes four lenses 22a to 22d, four color filters 25a to 25d, and a detector array 24. The color filter 25 consists of a filter 25a that transmits red light (R), filters 25b and 25c that transmit green light (G), and a filter 25d that transmits blue light (B), and the detector array 24 captures red, green, and blue images. With this configuration, a high-resolution composite image is formed from the two green images, to which the human visual system is most sensitive, and a full-color image can be obtained by combining it with the red and blue images.
Patent Document 1: Published Japanese Translation of PCT International Application No. 2007-520166
However, the conventional technology for capturing stereoscopic images requires two imaging devices, one for the left channel (L) and one for the right channel (R), to obtain a single stereoscopic image, so the apparatus becomes large in scale. For example, if the imaging device described in Patent Document 1 is used for each of the left and right channels, two such devices must be arranged side by side; since the imaging device of Patent Document 1 includes four sub-cameras, twice that number, i.e., eight sub-cameras, are required.
The present invention has been made in view of such circumstances, and its object is to provide a stereoscopic imaging device, and an imaging method for it, that suppress the increase in apparatus scale.
(1) The present invention has been made to solve the above problem. A stereoscopic imaging device according to one embodiment of the present invention includes: two imaging units that capture the same subject; a parallax calculation unit that detects corresponding points between the videos captured by the two imaging units and calculates parallax information for those captured videos; and a synthesis processing unit that, taking the viewpoint of each of the two imaging units as a reference, synthesizes, from the parallax information and the two captured videos, a video with more pixels than the captured videos, generating two systems of such high-pixel-count video.
(2) In the above stereoscopic imaging device, each imaging unit may include an optical system that forms an image of the subject on an imaging surface and an imaging element that generates a signal of the captured video of the subject formed on the imaging surface, and in one imaging unit, compared with the other, the position of the imaging element relative to the optical system may be shifted up or down by half of an imaging pixel of the imaging element.
(3) The above stereoscopic imaging device may include three or more imaging units; between horizontally adjacent imaging units, the position of the imaging element relative to the optical system is shifted up or down by half of an imaging pixel of the imaging element, and between vertically adjacent imaging units, the position of the imaging element relative to the optical system is shifted left or right by half of an imaging pixel of the imaging element.
(4) In the above stereoscopic imaging device, the four imaging units may be arranged at the vertices of a square whose sides each run either horizontally or vertically; the parallax calculation unit may calculate parallax information for the videos captured by two imaging units located at adjacent vertices of the square; and the composition processing unit may use that parallax information for the horizontal and vertical parallax correction of the captured videos performed when synthesizing the video.
(5) The above stereoscopic imaging device may include at least three of the imaging units, and the composition processing unit may, taking the viewpoint of each of the at least three imaging units in turn as a reference, synthesize, based on the parallax information and the captured videos of at least two of the imaging units, a video with more pixels than the captured videos, thereby generating at least three streams of such higher-pixel-count video.
(6) An imaging method according to another embodiment of the present invention includes: a step of detecting corresponding points between the videos captured by two imaging units that image the same subject and calculating parallax information for those captured videos; and a step of, taking the viewpoint of each of the two imaging units in turn as a reference, synthesizing, based on the parallax information and the captured videos of the two imaging units, a video with more pixels than the captured videos, thereby generating two streams of such higher-pixel-count video.
(7) The step of generating the higher-pixel-count video may include a step of performing parallax correction of the captured videos using the parallax information when synthesizing the video.
(8) The step of generating the higher-pixel-count video may include a step of, taking the viewpoint of each of at least three imaging units that image the same subject in turn as a reference, synthesizing, based on the parallax information and the captured videos of at least two of the imaging units, a video with more pixels than the captured videos, thereby generating at least three streams of such higher-pixel-count video.
According to the present invention, an increase in device scale can be suppressed.
FIG. 1 is an overview diagram showing an overview of the stereoscopic imaging device 10 according to a first embodiment of the present invention.
FIG. 2 is a schematic block diagram showing the configuration of the stereoscopic imaging device 10 in the same embodiment.
FIG. 3 is a diagram showing an arrangement example of the imaging lenses and imaging elements in the same embodiment.
FIG. 4 is a diagram showing another arrangement example of the imaging lenses and imaging elements in the same embodiment.
FIG. 5 is a schematic block diagram showing the configuration of the parallax calculation unit 21 in the same embodiment.
FIG. 6 is a diagram showing the reference image RG in the same embodiment.
FIG. 7 is a diagram showing the base image BG in the same embodiment.
FIG. 8 is a diagram showing the configuration of the reference block of interest RB in the same embodiment.
FIG. 9 is a diagram showing the configuration of the base block of interest BB in the same embodiment.
FIG. 10 is a flowchart explaining the parallax data calculation processing in the same embodiment.
FIG. 11 is a schematic block diagram showing the functional configuration of the high-resolution composition processing unit 20 in the same embodiment.
FIG. 12 is a diagram explaining the operation of the composition processing unit 906 in the same embodiment.
FIG. 13 is an overview diagram showing an overview of the stereoscopic imaging device 111 according to a second embodiment of the present invention.
FIG. 14 is a schematic block diagram showing the functional configuration of the stereoscopic imaging device 111 in the same embodiment.
FIG. 15 is a diagram explaining the arrangement of the imaging lenses and imaging elements in the same embodiment.
FIG. 16 is a schematic block diagram showing the functional configuration of the multi-view high-resolution composition processing unit 121 in the same embodiment.
FIG. 17 is a diagram explaining the operation of the multi-view composition processing unit 131 in the same embodiment.
FIG. 18 is a schematic configuration diagram showing the configuration of a conventional camera that generates high-definition video.
(First Embodiment)
A first embodiment of the present invention will be described in detail with reference to the drawings. FIG. 1 shows an overview of the stereoscopic imaging device 10 according to this embodiment, and FIG. 2 is a schematic block diagram showing its functional configuration. As shown in FIGS. 1 and 2, the stereoscopic imaging device 10 includes an imaging unit 101 and an imaging unit 102 arranged along the x-axis direction in the figures, a parallax calculation unit 21, and a high-resolution composition processing unit 20. The imaging unit 101 includes an imaging lens 11-1 and an imaging element 12-1, and the imaging unit 102 includes an imaging lens 11-2 and an imaging element 12-2. The imaging units 101 and 102 are arranged with their optical axes parallel so that they image the same subject.
Since the imaging lenses 11-1 and 11-2 have the same configuration and differ only in where they are installed, they are referred to collectively as the imaging lens 11 when describing their common features. Likewise, since the imaging elements 12-1 and 12-2 have the same configuration and differ only in where they are installed, they are referred to collectively as the imaging element 12. The imaging lens 11 focuses light from the subject onto the imaging element 12. Note that the imaging lens 11 (the optical system) may be a single convex lens, as shown in FIG. 2, or an optical system composed of a plurality of lenses. The imaging element 12 is, for example, a CMOS image sensor; it photoelectrically converts the formed image and outputs it as a video signal. Hereinafter, the video signal output by the imaging element 12-1 of the imaging unit 101 is referred to as the video signal R, and the video signal output by the imaging element 12-2 of the imaging unit 102 is referred to as the video signal L.
The two video signals (video signal R and video signal L) output by the imaging units 101 and 102 are input to the parallax calculation unit 21 and the high-resolution composition processing unit 20. The parallax calculation unit 21 searches for corresponding points between the two input video signals and, from the search results, calculates R-reference parallax data RS referenced to the viewpoint of the imaging unit 101 and L-reference parallax data LS referenced to the viewpoint of the imaging unit 102, and outputs them to the high-resolution composition processing unit 20. The high-resolution composition processing unit 20 synthesizes the two input video signals based on this parallax data (parallax information) and outputs a right-eye video signal RC and a left-eye video signal LC.
FIG. 3 is a diagram showing an arrangement example of the imaging lenses and imaging elements. In FIG. 3, the x-axis runs horizontally (left-right), the y-axis runs vertically (up-down), and the z-axis runs in the depth direction; that is, FIG. 3 shows the arrangement of the imaging lenses and imaging elements when the stereoscopic imaging device 10 is viewed from the front. As shown in FIG. 3, the imaging lenses 11-1 and 11-2 are at the same position in the y-axis direction. The imaging element 12-1, on the other hand, is shifted upward by py/2 in the y-axis direction relative to the imaging element 12-2, where py is the length of a pixel of the imaging element 12 in the y-axis direction. That is, the imaging elements 12-1 and 12-2 are offset from each other in the y-axis direction by half the pixel height of the imaging element 12.
As a result, in the imaging unit 101, the position of the imaging element relative to the imaging lens is shifted upward by half an imaging pixel of the imaging element compared with the imaging unit 102. Conversely, the imaging element 12-1 may instead be shifted downward by py/2 in the y-axis direction relative to the imaging element 12-2; in that case, the order in which pixels are arranged during composition in the high-resolution composition processing unit 20, described later, is reversed.
FIG. 4 is a diagram showing another arrangement example of the imaging lenses and imaging elements, with the axes taken as in FIG. 3. In this example, the imaging lens 11-1 is shifted downward by py/2 in the y-axis direction relative to the imaging lens 11-2, while the imaging elements 12-1 and 12-2 are at the same position in the y-axis direction. That is, the imaging lenses 11-1 and 11-2 are offset from each other in the y-axis direction by half the pixel height of the imaging element 12.
Here too, as a result, in the imaging unit 101 the position of the imaging element 12 relative to the imaging lens 11 is shifted upward by half an imaging pixel of the imaging element 12 compared with the imaging unit 102. Conversely, the shift may be made in the opposite direction; in that case, the order in which pixels are arranged during composition in the high-resolution composition processing unit 20, described later, is reversed.
Next, the parallax calculation unit 21 will be described in detail with reference to FIG. 5, a schematic block diagram showing its configuration. The parallax calculation unit 21 calculates parallax data from the video signal R output by the imaging unit 101 in FIG. 1 and the video signal L output by the imaging unit 102. The parallax calculation unit 21 includes coordinate conversion units 31 and 32, a right camera parameter storage unit 30R, a left camera parameter storage unit 30L, and a corresponding point search unit 33.
The right camera parameter storage unit 30R holds camera parameters including internal parameters specific to the imaging unit 101, such as the focal length and lens distortion parameters, and external parameters representing the positional relationship between the two imaging units 101 and 102. Similarly, the left camera parameter storage unit 30L holds camera parameters specific to the imaging unit 102.
The coordinate conversion unit 31 geometrically transforms (coordinate-converts) the video represented by the video signal R output by the imaging unit 101 using a known method so as to place the videos of the imaging units 101 and 102 on the same plane, thereby parallelizing the epipolar lines; in doing so, it uses the camera parameters stored in the right camera parameter storage unit 30R. The coordinate conversion unit 32 likewise geometrically transforms the video represented by the video signal L output by the imaging unit 102 to parallelize the epipolar lines, using the camera parameters stored in the left camera parameter storage unit 30L.
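As an illustrative sketch only (the patent leaves the geometric transformation to "a known method"), the epipolar-line parallelization performed by the coordinate conversion units 31 and 32 can be realized with standard stereo rectification, for example via OpenCV; the intrinsic matrices, distortion coefficients, and inter-camera rotation/translation below stand in for the internal and external parameters held in the storage units 30R and 30L:

```python
import cv2

def rectify_pair(img_r, img_l, K_r, dist_r, K_l, dist_l, R, T):
    """Parallelize the epipolar lines of the two captured videos
    (coordinate conversion units 31/32). K_*: 3x3 intrinsic matrices,
    dist_*: distortion coefficients (internal parameters); R, T:
    rotation/translation between the two imaging units (external
    parameters)."""
    h, w = img_r.shape[:2]
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(
        K_r, dist_r, K_l, dist_l, (w, h), R, T)
    map_rx, map_ry = cv2.initUndistortRectifyMap(
        K_r, dist_r, R1, P1, (w, h), cv2.CV_32FC1)
    map_lx, map_ly = cv2.initUndistortRectifyMap(
        K_l, dist_l, R2, P2, (w, h), cv2.CV_32FC1)
    rect_r = cv2.remap(img_r, map_rx, map_ry, cv2.INTER_LINEAR)
    rect_l = cv2.remap(img_l, map_lx, map_ly, cv2.INTER_LINEAR)
    return rect_r, rect_l
```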
The corresponding point search unit 33 searches for corresponding pixels between the videos whose epipolar lines have been parallelized by the coordinate conversion units 31 and 32, and obtains parallax data representing the parallax between the viewpoint of the imaging unit 101 and the viewpoint of the imaging unit 102. The corresponding point search unit 33 consists of two blocks that each calculate one kind of parallax data: an R-reference parallax calculation unit 34 and an L-reference parallax calculation unit 35. The R-reference parallax calculation unit 34 takes the epipolar-parallelized video of the imaging unit 101 as the base video and the epipolar-parallelized video of the imaging unit 102 as the reference video, searches for the reference-video pixel corresponding to each pixel of the base video, and calculates the R-reference parallax data RS. The L-reference parallax calculation unit 35 takes the epipolar-parallelized video of the imaging unit 102 as the base video and the epipolar-parallelized video of the imaging unit 101 as the reference video, and calculates the L-reference parallax data in the same way. The R-reference parallax calculation unit 34 and the L-reference parallax calculation unit 35 perform the same corresponding point search; only the roles of base and reference video are swapped.
Next, the processing by which the R-reference parallax calculation unit 34 searches for corresponding pixels will be described with reference to FIGS. 6 to 9. FIG. 6 shows the reference image RG and FIG. 7 shows the base image BG; as described above, the epipolar lines of both images have been parallelized. First, the way the pixel of interest is moved over the base image BG in order to find, for each of its pixels, the corresponding pixel on the reference image RG is described with reference to FIG. 7. The R-reference parallax calculation unit 34 moves a block centered on the pixel of interest on the base image BG (hereinafter, the base block of interest BB) pixel by pixel to the right along a line, starting from the upper-left end of the base image BG (base start block BS); when the base block of interest BB reaches the right end of a line, it moves to the left end of the next line down and again proceeds pixel by pixel to the right. This is repeated until the block at the lower-right end of the base image BG (base end block BE) is reached.
Next, the processing by which the R-reference parallax calculation unit 34 searches the reference image RG for a block similar to one particular base block of interest BB on the base image BG of FIG. 7 will be described with reference to FIG. 6. The R-reference parallax calculation unit 34 first takes as the reference block of interest RB the block on the reference image RG at the same coordinates (x, y) as the base block of interest BB (the reference start block RS), and thereafter moves the reference block of interest pixel by pixel to the right along the line, up to the reference end block RE, which, as shown in FIG. 6, lies the search range to the right of the reference start block RS. The search range is a value corresponding to the maximum parallax of the photographed subject; the search range that is set determines the shortest subject distance for which parallax data can be calculated. The R-reference parallax calculation unit 34 performs this search over the reference blocks of interest for every base block of interest BB on the base image BG shown in FIG. 7.
Next, how the R-reference parallax calculation unit 34 determines which reference block of interest is similar to a base block of interest will be described with reference to FIGS. 8 and 9. FIG. 9 shows the configuration of the base block of interest BB, a block of size M horizontally by N vertically centered on the pixel of interest on the base image BG. FIG. 8 shows the configuration of the reference block of interest RB, a block of the same size M × N centered on the pixel of interest on the reference image RG. To denote an arbitrary pixel within the reference block of interest RB of FIG. 8 and the base block of interest BB of FIG. 9, the pixel values at coordinates (i, j), with i in the horizontal direction and j in the vertical direction, are written R(i, j) and T(i, j), respectively.
The R-reference parallax calculation unit 34 calculates a similarity for each combination of a base block of interest BB and a reference block of interest RB, and determines the reference block of interest RB that is similar to each base block of interest BB. The widely used SAD (Sum of Absolute Differences) is used as the similarity measure: as in the similarity determination formula shown in equation (1), the absolute value of the difference between R(i, j) and T(i, j) is computed for every pixel of the block, and the total is the value SSAD. Among the reference blocks of interest within the search range on the reference image RG for a given base block of interest BB, the R-reference parallax calculation unit 34 determines the one with the smallest SSAD value of equation (1) to be similar to that base block of interest. It then takes the pixel at the center of that reference block of interest RB as the pixel corresponding to the pixel of interest at the center of the base block of interest BB.
$$\mathrm{SSAD} = \sum_{j=1}^{N} \sum_{i=1}^{M} \bigl|\,R(i,j) - T(i,j)\,\bigr| \tag{1}$$
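In code, equation (1) is a direct sum over the M × N block (a minimal sketch in Python; R and T are the reference and base blocks defined above):

```python
import numpy as np

def ssad(block_ref: np.ndarray, block_base: np.ndarray) -> int:
    """Equation (1): sum of absolute differences between the M x N
    reference block R(i, j) and base block T(i, j)."""
    diff = block_ref.astype(np.int32) - block_base.astype(np.int32)
    return int(np.abs(diff).sum())
```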
The processing by which the L-reference parallax calculation unit 35 searches for corresponding pixels is almost the same as that of the R-reference parallax calculation unit 34, but the search range differs: for the L-reference parallax calculation unit 35, the reference start block RS is the block displaced by the search range to the left of the coordinates of the base block of interest BB, and the reference end block RE is the block at the same coordinates as the base block of interest BB.
The above describes the parallax data calculation method one processing unit at a time. Next, the sequence over the input pixels is described with reference to FIGS. 6 to 9 and the processing flow shown in FIG. 10. The R-reference parallax calculation unit 34 first sets the base block of interest at the head of the base image BG (FIG. 7), i.e., at the base start block BS (step S900), and reads all pixel values of the base block of interest BB from the base image BG (step S901). Next, it sets as the reference block of interest RB the block in the reference image RG (FIG. 6) at the same coordinates as the base block of interest BB, i.e., the head of the reference image RG (reference start block RS) (step S902), and reads all pixel values of the reference block of interest RB from the reference image RG (step S903). The R-reference parallax calculation unit 34 then calculates the SSAD value of the read pixel values of the base block of interest BB and the reference block of interest RB according to equation (1), and stores it (step S904).
Next, the R-reference parallax calculation unit 34 determines whether the search range has been exhausted (step S905); if not, it moves the reference block of interest one pixel to the right along the line (step S906) and performs steps S903 and S904 again. Steps S903 to S906 are repeated while the reference block of interest RB is within the search range, so that all SSAD values within the search range are calculated. Based on these results, the R-reference parallax calculation unit 34 detects the reference block of interest RB with the smallest SSAD value (step S907). Note that the block with the smallest SSAD value found in step S907 is not necessarily a correct similar block: the parallax cannot be detected correctly when, for example, the base block of interest BB contains no pattern (texture) or contour to serve as a feature, or when the search area on the reference image RG is an occlusion region. Whether the parallax was detected correctly can be judged from how small the minimum SSAD value is.
Therefore, the R-reference parallax calculation unit 34 compares the minimum SSAD value with a threshold (step S908). When the SSAD value is at or below the threshold (i.e., the similarity is high), it outputs, as the parallax data of the pixel of interest, the difference between the x-coordinate of the center of the base block of interest BB (the pixel of interest on the base image BG) and the x-coordinate of the center of the detected reference block of interest RB (the corresponding pixel on the reference image RG) (step S909). When the SSAD value exceeds the threshold (i.e., the similarity is low), the R-reference parallax calculation unit 34 judges that the parallax could not be detected, and outputs parallax data of 0, or some unique value, as an error flag (step S910).
Finally, the R-reference parallax calculation unit 34 determines whether the base block of interest BB has reached the base end block BE, i.e., whether the processing is finished (step S911); if not, it moves the base block of interest BB one pixel to the right along the line (step S912) and performs steps S901 to S910 again. When step S911 determines that the processing is finished, the processing ends. In this way, steps S901 to S910 are repeated until the base block of interest BB reaches the end block of the base image BG, yielding parallax data for every pixel of the base image BG.
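The flow of FIG. 10 (steps S900 to S912) can be sketched as follows, reusing the ssad function above; the block size, search range, and threshold are free parameters of this sketch, and a disparity of 0 doubles as the error flag of step S910:

```python
import numpy as np

def r_reference_disparity(base: np.ndarray, ref: np.ndarray,
                          m: int = 9, n: int = 9,
                          search_range: int = 64,
                          threshold: float = 1500.0) -> np.ndarray:
    """R-reference parallax calculation (unit 34) on grayscale images:
    for each pixel of the base image BG, scan the same row of the
    reference image RG rightwards over `search_range` pixels and keep
    the offset of the block with the smallest SSAD value."""
    h, w = base.shape
    hm, hn = m // 2, n // 2
    disparity = np.zeros((h, w), dtype=np.int32)
    for y in range(hn, h - hn):                               # S900/S912
        for x in range(hm, w - hm):
            bb = base[y - hn:y + hn + 1, x - hm:x + hm + 1]   # S901
            best_d, best_cost = 0, float("inf")
            for d in range(search_range + 1):                 # S902/S906
                if x + d + hm >= w:                           # S905
                    break
                rb = ref[y - hn:y + hn + 1,
                         x + d - hm:x + d + hm + 1]           # S903
                cost = ssad(rb, bb)                           # S904
                if cost < best_cost:
                    best_cost, best_d = cost, d               # S907
            if best_cost <= threshold:                        # S908
                disparity[y, x] = best_d                      # S909
            # else: leave 0 as the error flag                 # S910
    return disparity
```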
In the above description, as an example of the processing method of the corresponding point search unit 33 in FIG. 5, the pixel on the reference image similar to the pixel of interest on the base image was searched for using the SAD similarity evaluation function; however, the method is not limited to this, and any method that searches for similar pixels between the base image and the reference image may be used to obtain the parallax data.
Next, the detailed configuration and operation of the high-resolution composition processing unit 20 shown in FIG. 2 will be described with reference to FIGS. 11 and 12. FIG. 11 is a schematic block diagram showing the functional configuration of the high-resolution composition processing unit 20. As shown in FIG. 11, the high-resolution composition processing unit 20 includes a left-eye composition unit 908 that generates the left-eye video signal LC, a right-eye composition unit 909 that generates the right-eye video signal RC, a right camera parameter storage unit 902R, and a left camera parameter storage unit 902L. Each of the left-eye composition unit 908 and the right-eye composition unit 909 includes an alignment correction processing unit 901, a correction processing unit 903, and a composition processing unit 906. The left-eye composition unit 908 and the right-eye composition unit 909 differ only in the combination of parallax data input to them, and their basic operation is the same; therefore, only the operation of the left-eye composition unit 908 is described here, and the description of the right-eye composition unit 909 is omitted.
In the left-eye composition unit 908, the video signal R of the imaging unit 101 is input to the alignment correction processing unit 901, and the video signal L of the imaging unit 102 is input to the correction processing unit 903. The alignment correction processing unit 901 first corrects the lens distortion of the video represented by the video signal R, based on the camera parameters indicating the lens distortion of the imaging unit 101 stored in the right camera parameter storage unit 902R. It then aligns the distortion-corrected video, based on the L-reference parallax data LS input from the parallax calculation unit 21 and the camera parameters indicating the orientation and attitude of the imaging unit 101 stored in the right camera parameter storage unit 902R, so that each of its pixels captures the same subject position as the pixel at the same coordinates in the video of the imaging unit 102 (the video corrected by the correction processing unit 903).
In this alignment, however, no correction is made for the offset described with reference to FIGS. 3 and 4, whereby in the imaging unit 101, compared with the imaging unit 102, the position of the imaging element 12-1 relative to the imaging lens 11-1 is shifted upward by half an imaging pixel of the imaging element 12-1. As a result, the pixel at coordinates (x, y) in the output of the alignment correction processing unit 901 captures a subject position lying between those of the pixels at coordinates (x, y) and (x, y−1) in the output of the correction processing unit 903. The correction processing unit 903, for its part, corrects the lens distortion of the video represented by the video signal L based on the camera parameters indicating the lens distortion stored in the left camera parameter storage unit 902L.
That is, the alignment correction processing unit 901 and the correction processing unit 903 perform parallelization of the epipolar lines using the camera parameters, and the alignment correction processing unit 901 additionally performs parallax correction using the L-reference parallax data LS. The epipolar lines can be parallelized by a known method. For the parallax correction, the alignment correction processing unit 901 moves each pixel of the epipolar-parallelized video of the video signal R to the left by the parallax indicated by the L-reference parallax data LS: for example, if the L-reference parallax data LS at coordinates (x, y) is d, the pixel at coordinates (x+d, y) is moved to coordinates (x, y). In the right-eye composition unit 909, the correction processing unit 903 receives the video signal R and the camera parameters of the imaging unit 101, while the alignment correction processing unit 901 receives the video signal L, the camera parameters of the imaging unit 102, and the R-reference parallax data RS; in its parallax correction, the alignment correction processing unit 901 of the right-eye composition unit 909 moves each pixel to the right by the parallax indicated by the parallax data RS.
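A minimal sketch of this parallax correction (the epipolar-line parallelization is assumed to have been applied already): each output pixel is read from the position the parallax map points to, so that in the left-eye case the pixel at (x + d, y) lands at (x, y), and in the right-eye case the shift is mirrored:

```python
import numpy as np

def parallax_correct(img: np.ndarray, disp: np.ndarray,
                     direction: int = 1) -> np.ndarray:
    """Shift pixels by the per-pixel parallax. direction=+1 moves the
    pixel at (x + d, y) to (x, y), as in the left-eye composition unit
    908 with LS; direction=-1 is the mirrored shift of the right-eye
    composition unit 909 with RS."""
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            d = int(disp[y, x])
            src = x + direction * d
            if 0 <= src < w:
                out[y, x] = img[y, src]
    return out
```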
Next, the operation of the composition processing unit 906 of the left-eye composition unit 908 will be described with reference to FIG. 12, in which the horizontal axis represents spatial extent in the y-axis direction and the vertical axis represents light amplitude (light intensity). Graph 40a shows the distribution of the light that the imaging lenses 11-1 and 11-2 focus onto a certain vertical column of pixels of the imaging elements 12-1 and 12-2 of the imaging units 101 and 102. Graph 40e shows the distribution of the output of the alignment correction processing unit 901 corresponding to the pixels of the imaging element 12-1 of the imaging unit 101 on which the light of graph 40a is incident. Graph 40f shows the distribution of the output of the correction processing unit 903 corresponding to the pixels of the imaging element 12-2 of the imaging unit 102 on which the light of graph 40a is incident. Graph 40g shows the distribution of the output of the composition processing unit 906 for the distributions 40e and 40f. Here, for simplicity, the relationships between the graphs are described ignoring the effects of the corrections by the alignment correction processing unit 901 and the correction processing unit 903.
Among the vertical lines in the figure, the solid lines are the pixel boundaries of the imaging element 12-1 of the imaging unit 101, and the broken lines are the pixel boundaries of the imaging element 12-2 of the imaging unit 102. That is, 40b and 40c in the figure are pixels of the imaging units 101 and 102, respectively, and their relative positions are displaced by the offset indicated by arrow 40d. This offset is preferably set to half the size of the pixels (40b, 40c) of the imaging elements 12-1 and 12-2; a half-pixel offset makes it possible to generate the highest-definition image. Since the imaging elements 12-1 and 12-2 integrate the light intensity over each pixel, photographing the subject image 40a with the imaging element 12-1 yields a video signal with the light intensity distribution 40e, and photographing it with the imaging element 12-2 yields a video signal with the light intensity distribution 40f.
The composition processing unit 906 synthesizes an image by alternately arranging, in the y-axis direction, the output of the alignment correction processing unit 901 and the output of the correction processing unit 903, generating a left-eye video signal LC whose resolution in the y-axis direction is double that of the outputs of the imaging units 101 and 102. In doing so, taking into account that in the imaging unit 101 the position of the imaging element 12-1 relative to the imaging lens 11-1 is shifted upward by half an imaging pixel compared with the imaging unit 102, a pixel b of the correction processing unit 903 (derived from the imaging unit 102) having the same coordinates as a pixel a of the alignment correction processing unit 901 (derived from the imaging unit 101) is placed below that pixel a. That is, taking the origin at the upper left of the image, with the x-axis pointing right and the y-axis pointing down, the pixel at coordinates (x, y) in the output of the alignment correction processing unit 901 becomes the pixel at coordinates (x, 2y) in the output of the composition processing unit 906, and the pixel at coordinates (x, y) in the output of the correction processing unit 903 becomes the pixel at coordinates (x, 2y+1) in that output.
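A sketch of this interleaving for the left-eye case: the output of the alignment correction processing unit 901 supplies the even output rows and the output of the correction processing unit 903 the odd rows, doubling the vertical pixel count:

```python
import numpy as np

def interleave_rows(upper: np.ndarray, lower: np.ndarray) -> np.ndarray:
    """Composition processing unit 906: the pixel at (x, y) of `upper`
    goes to output row 2y and the pixel at (x, y) of `lower` to output
    row 2y + 1, doubling the y-axis resolution."""
    h, w = upper.shape[:2]
    out = np.empty((2 * h, w) + upper.shape[2:], dtype=upper.dtype)
    out[0::2] = upper   # pixels a from the alignment correction unit 901
    out[1::2] = lower   # pixels b from the correction processing unit 903
    return out
```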
By combining the two images in this way, the composition processing unit 906 can reproduce a high-definition image close to graph 40a, as shown by graph 40g.
The composition processing unit 906 of the right-eye composition unit 909 operates in the same way as that of the left-eye composition unit 908, except that a pixel d of the alignment correction processing unit 901 (derived from the imaging unit 102) having the same coordinates as a pixel c of the correction processing unit 903 (derived from the imaging unit 101) is placed below that pixel c.
The above composition operation is executed by the left-eye composition unit 908 and the right-eye composition unit 909. As a result, the left-eye composition unit 908 outputs the left-eye video signal LC, a high-definition video signal as photographed from the position of the imaging unit 102 (i.e., as seen from the left eye), and the right-eye composition unit 909 outputs the right-eye video signal RC, a high-definition video signal as photographed from the position of the imaging unit 101 (i.e., as seen from the right eye). By displaying these left-eye and right-eye videos on a stereoscopic display device, a stereoscopic video with twice the resolution of the imaging units 101 and 102 can be displayed.
Thus, the left-eye video signal LC and the right-eye video signal RC output by the stereoscopic imaging device 10 have twice the resolution of the video signals R and L output by the imaging units 101 and 102. The stereoscopic imaging device 10 can therefore generate a stereoscopic video of twice that resolution at a device scale equivalent to that of a stereoscopic imaging device whose output resolution equals that of the video signal output by the imaging unit 101.
(Second Embodiment)
A second embodiment of the present invention will be described in detail with reference to the drawings. Parts with the same functions as in the first embodiment are given the same reference numerals, and their description is omitted. FIG. 13 is an overview diagram of the stereoscopic imaging device according to this embodiment. As shown in FIG. 13, the stereoscopic imaging device 111 of this embodiment differs from the first embodiment in the number of imaging units, which increases from two (101, 102) to four (101, 102, 103, 104). That is, the stereoscopic imaging device 111 includes imaging units 101, 102, 103, and 104. The imaging unit 101 includes an imaging lens 11-1 and an imaging element 12-1; likewise, the imaging unit 102 includes an imaging lens 11-2 and an imaging element 12-2, the imaging unit 103 includes an imaging lens 11-3 and an imaging element 12-3, and the imaging unit 104 includes an imaging lens 11-4 and an imaging element 12-4.
FIG. 14 is a schematic block diagram showing the functional configuration of the stereoscopic imaging device 111 of this embodiment. The stereoscopic imaging device 111 includes the imaging units 101, 102, 103, and 104, the parallax calculation unit 21, and a multi-view high-resolution composition processing unit 121. The video signals R and L output by the imaging units 101 and 102 are input to the multi-view high-resolution composition processing unit 121 and the parallax calculation unit 21. The video signal R' output by the imaging unit 103 and the video signal L' output by the imaging unit 104 are input to the multi-view high-resolution composition processing unit 121. The processing of the parallax calculation unit 21 is the same as in the first embodiment: it calculates the R-reference parallax data and the L-reference parallax data and outputs them to the multi-view high-resolution composition processing unit 121. The multi-view high-resolution composition processing unit 121 synthesizes the four input video signals based on the two sets of parallax data, and outputs a right-eye video signal RC and a left-eye video signal LC.
FIG. 15 is a diagram explaining the arrangement of the imaging lenses and imaging elements in this embodiment. In FIG. 15, the x-axis runs horizontally, the y-axis vertically, and the z-axis in the depth direction; that is, FIG. 15 shows the arrangement when the stereoscopic imaging device 111 is viewed from the front. As shown in FIG. 15, the imaging lenses 11-1 and 11-2 are at the same position in the y-axis direction, as are the imaging lenses 11-3 and 11-4. The imaging lenses 11-1 and 11-3 are at the same position in the x-axis direction, as are the imaging lenses 11-2 and 11-4. Furthermore, the distance Dx from the center of the imaging lens 11-1 to the center of the imaging lens 11-2 is equal to the distance Dy from the center of the imaging lens 11-1 to the center of the imaging lens 11-3. In this way, in this embodiment, the imaging units 101 to 104 are arranged at the vertices of a square whose sides each run either horizontally or vertically.
The imaging elements, on the other hand, are offset: the imaging element 12-1 is shifted upward by py/2 in the y-axis direction relative to the imaging element 12-2, and the imaging element 12-3 is shifted upward by py/2 relative to the imaging element 12-4, where py is the length of a pixel of the imaging element 12 in the y-axis direction. In addition, the imaging element 12-1 is shifted to the left by px/2 in the x-axis direction relative to the imaging element 12-3, and the imaging element 12-2 is shifted to the left by px/2 relative to the imaging element 12-4, where px is the length of a pixel of the imaging element in the x-axis direction.
As a result, in the imaging unit 101 the position of the imaging element relative to the imaging lens is shifted upward by half an imaging pixel compared with the imaging unit 102, and likewise in the imaging unit 103 compared with the imaging unit 104. Similarly, in the imaging unit 101 the position of the imaging element relative to the imaging lens is shifted to the left by half an imaging pixel compared with the imaging unit 103, and likewise in the imaging unit 102 compared with the imaging unit 104.
Although an example in which the imaging elements are offset is shown here, the imaging lenses may instead be offset, as in FIG. 4 of the first embodiment.
Next, the detailed configuration and operation of the multi-view high-resolution composition processing unit 121 will be described with reference to FIGS. 16 and 17. The multi-view high-resolution composition processing unit 121 includes a left-eye multi-view composition unit 130 that generates a high-definition video for the left eye, a right-eye multi-view composition unit 132 that generates a high-definition video for the right eye, and a camera parameter storage unit 902 that stores the camera parameters of the imaging units 101 to 104. Each of the left-eye composition unit 130 and the right-eye composition unit 132 includes an alignment correction processing unit 901, a correction processing unit 903, a vertical/horizontal alignment correction processing unit 904, a vertical alignment correction processing unit 905, and a multi-view composition processing unit 131. The left-eye composition unit 130 and the right-eye composition unit 132 differ only in the combination of video signals and parallax data input to them, and their basic operation is the same; therefore, the operation is described here for the left-eye composition unit 130.
The video signal R output by the imaging unit 101 is input to the alignment correction processing unit 901. As in the first embodiment, the alignment correction processing unit 901 performs correction and alignment of the video represented by the video signal R based on the camera parameters of the imaging unit 101 stored in the camera parameter storage unit 902 and the L-reference parallax data LS, generating a video from the viewpoint of the imaging unit 102. The video signal L output by the imaging unit 102 is input to the correction processing unit 903, which, as in the first embodiment, corrects the video represented by the video signal L based on the camera parameters of the imaging unit 102 stored in the camera parameter storage unit 902.
The video signal R' output by the imaging unit 103 is input to the vertical/horizontal alignment correction processing unit 904, which performs correction and alignment of the video represented by the video signal R' based on the camera parameters of the imaging unit 103 stored in the camera parameter storage unit 902 and the L-reference parallax data LS, generating a video from the viewpoint of the imaging unit 102. Because the distance Dx between the center of the imaging lens 11-1 of the imaging unit 101 and the center of the imaging lens 11-2 of the imaging unit 102 is equal to the distance Dy between the center of the imaging lens 11-1 and the center of the imaging lens 11-3 of the imaging unit 103, the L-reference parallax data LS is also used as the vertical parallax data. That is, the vertical/horizontal alignment correction processing unit 904 applies the L-reference parallax data LS in both the vertical and the horizontal direction to perform the alignment. For example, when the L-reference parallax data LS at coordinates (x, y) is d, the pixel value at (x, y) of the image aligned by the vertical/horizontal alignment correction processing unit 904 is the pixel value at coordinates (x+d, y−d) of the image obtained by correcting, with the camera parameters, the image represented by the video signal output by the imaging unit 103.
The video signal L' output by the imaging unit 104 is input to the vertical alignment correction processing unit 905, which performs correction and alignment of the video represented by the video signal L' based on the camera parameters of the imaging unit 104 stored in the camera parameter storage unit 902 and the L-reference parallax data LS, generating a video from the viewpoint of the imaging unit 102. That is, the vertical alignment correction processing unit 905 applies the L-reference parallax data LS in the vertical direction to perform the alignment. For example, when the L-reference parallax data LS at coordinates (x, y) is d, the pixel value at (x, y) of the image aligned by the vertical alignment correction processing unit 905 is the pixel value at coordinates (x, y−d) of the image obtained by correcting, with the camera parameters, the image represented by the video signal output by the imaging unit 104.
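Both alignment variants reduce to reading a pixel displaced by the parallax d (a sketch; the lens distortion correction by the camera parameters is assumed already applied): the vertical/horizontal unit 904 reads (x + d, y − d) and the vertical unit 905 reads (x, y − d):

```python
import numpy as np

def align_with_disparity(img: np.ndarray, disp: np.ndarray,
                         dx: int, dy: int) -> np.ndarray:
    """Read the pixel at (x + dx*d, y + dy*d) into (x, y), where d is
    the L-reference parallax at (x, y). (dx, dy) = (1, -1) gives the
    vertical/horizontal alignment correction unit 904; (0, -1) the
    vertical alignment correction unit 905; (1, 0) the horizontal
    alignment of unit 901."""
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            d = int(disp[y, x])
            sx, sy = x + dx * d, y + dy * d
            if 0 <= sx < w and 0 <= sy < h:
                out[y, x] = img[sy, sx]
    return out
```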
 Next, the operation of the multi-eye synthesis processing unit 131 in the left-eye synthesis unit 130 and the right-eye synthesis unit 132 will be described with reference to FIG. 17. In the first embodiment, high-resolution synthesis was performed from two images; in this embodiment, the multi-eye synthesis processing unit 131 performs high-resolution synthesis using the four video signals obtained by the four imaging units 101, 102, 103, and 104. The multi-eye synthesis processing unit 131 of the left-eye synthesis unit 130 generates and outputs the left-eye video signal LC′, a signal representing video from the viewpoint of the imaging unit 102. Likewise, the multi-eye synthesis processing unit 131 of the right-eye synthesis unit 132 generates and outputs the right-eye video signal RC′, a signal representing video from the viewpoint of the imaging unit 101. High-resolution synthesis from four signals follows the same principle as explained with the light intensity distribution of FIG. 12; here, more concretely, the case is described in which the resolution of the four imaging units 101, 102, 103, and 104 is VGA (640 × 480 pixels) and high-resolution synthesis is performed to Quad-VGA (1280 × 960 pixels), four times the number of pixels.
 As shown in FIG. 17, a high-resolution image can be obtained by assigning pixels captured by different imaging units to the four adjacent pixels of each 2 × 2 cell of the Quad-VGA image (1280 × 960 pixels). Here, the image sensor 12-1 of the imaging unit 101 is arranged shifted up by half a pixel relative to the image sensor 12-2 of the imaging unit 102 and shifted to the left by half a pixel relative to the image sensor 12-3 of the imaging unit 103, and the image sensor 12-4 of the imaging unit 104 is arranged shifted to the left by half a pixel relative to the image sensor 12-2 of the imaging unit 102. Accordingly, in accordance with the directions in which the image sensors are shifted, the multi-eye synthesis processing unit 131 arranges the pixels G11, G21, G31, and G41, which are the pixels at the same coordinates in the corrected video MR derived from the imaging unit 101, the corrected video ML derived from the imaging unit 102, the corrected video MR′ derived from the imaging unit 103, and the corrected video ML′ derived from the imaging unit 104, respectively, as shown in FIG. 17. That is, the pixel G31 is placed to the right of the pixel G11, the pixel G21 directly below the pixel G11, and the pixel G41 to the right of the pixel G21.
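 As a concrete sketch of this placement, the following Python/NumPy function interleaves the four aligned VGA frames into one Quad-VGA frame. The function name interleave_quad and the row-major array layout are assumptions made for illustration, and the four inputs are assumed to have already been corrected and aligned to a common viewpoint as described above.

    import numpy as np

    def interleave_quad(mr, ml, mr_dash, ml_dash):
        # mr      : corrected video MR  (from unit 101) -> pixel G11
        # ml      : corrected video ML  (from unit 102) -> pixel G21, below G11
        # mr_dash : corrected video MR' (from unit 103) -> pixel G31, right of G11
        # ml_dash : corrected video ML' (from unit 104) -> pixel G41, right of G21
        h, w = mr.shape                                 # e.g. (480, 640) for VGA
        out = np.empty((2 * h, 2 * w), dtype=mr.dtype)  # (960, 1280) = Quad-VGA
        out[0::2, 0::2] = mr
        out[0::2, 1::2] = mr_dash
        out[1::2, 0::2] = ml
        out[1::2, 1::2] = ml_dash
        return out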
 In the left-eye synthesis unit 130, the corrected video MR derived from the imaging unit 101 is the image generated by the alignment correction processing unit 901; the corrected video ML derived from the imaging unit 102 is the image generated by the correction processing unit 903; the corrected video MR′ derived from the imaging unit 103 is the image generated by the vertical/horizontal alignment correction processing unit 904; and the corrected video ML′ derived from the imaging unit 104 is the image generated by the vertical alignment correction processing unit 905. In the right-eye synthesis unit 132, the corrected video MR derived from the imaging unit 101 is the image generated by the correction processing unit 903; the corrected video ML derived from the imaging unit 102 is the image generated by the alignment correction processing unit 901; the corrected video MR′ derived from the imaging unit 103 is the image generated by the vertical alignment correction processing unit 905; and the corrected video ML′ derived from the imaging unit 104 is the image generated by the vertical/horizontal alignment correction processing unit 904.
 The above synthesis operation is executed by the left-eye synthesis unit 130 and the right-eye synthesis unit 132. As a result, the left-eye synthesis unit 130 outputs high-definition video obtained by synthesizing the four video signals as captured from the position of the imaging unit 102 (that is, as seen from the left eye), and the right-eye synthesis unit 132 outputs high-definition video obtained by synthesizing the four video signals as captured from the position of the imaging unit 101 (that is, as seen from the right eye). In other words, video having four times the resolution of the output of each imaging unit is output. By displaying this high-definition left-eye and right-eye video on a stereoscopic display device, high-definition stereoscopic video can be displayed.
 Thus, the left-eye video signal LC′ and the right-eye video signal RC′ output by the stereoscopic imaging device 111 have four times the resolution of the video signals output by the imaging units 101, 102, 103, and 104. Compared with a video signal obtained by combining the video signal output by the imaging unit 101 and the video signal output by the imaging unit 103, they have twice the resolution. Therefore, with a device scale equivalent to that of a stereoscopic imaging device whose resolution equals that of such a combined video signal, the stereoscopic imaging device 111 can generate stereoscopic video of twice that resolution.
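 For the pixel counts behind these ratios: a VGA frame has 640 × 480 = 307,200 pixels; a Quad-VGA frame has 1280 × 960 = 1,228,800 pixels, i.e. 4 × 307,200, four times a single imaging unit; and a signal combining two VGA imaging units carries 2 × 307,200 = 614,400 pixels, exactly half of Quad-VGA, which is where the factor of two in the last comparison comes from.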
 In this embodiment, an example was described in which the video signals input to the parallax calculation unit 21 are the two video signals output by the imaging units 101 and 102, but the number of input video signals can also be increased. For example, it is possible to input the four video signals output by the imaging units 101, 102, 103, and 104, calculate four sets of parallax data, each taking one of the signals as its reference, and perform multi-eye high-resolution synthesis with each set of parallax data. In that case, high-definition video is output for multiple viewpoints other than the right-eye and left-eye viewpoints, and as a result, multi-view stereoscopic video can be generated without resolution degradation.
 A program for realizing the functions of the parallax calculation unit 21 and the high-resolution synthesis processing unit 20 in FIG. 2, or of the parallax calculation unit 21 and the multi-eye high-resolution synthesis processing unit 121 in FIG. 14, may be recorded on a computer-readable recording medium, and the processing of each unit may be performed by loading the program recorded on this recording medium into a computer system and executing it. The term "computer system" here includes an OS and hardware such as peripheral devices.
 The term "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into computer systems. It further includes media that hold the program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line, as well as media that hold the program for a certain period, such as volatile memory inside a computer system serving as the server or client in that case. The program may be one for realizing part of the functions described above, or one that realizes those functions in combination with a program already recorded in the computer system.
 Although embodiments of the present invention have been described in detail with reference to the drawings, the specific configuration is not limited to these embodiments, and design changes and the like within a scope not departing from the gist of the present invention are also included.
 The present invention can be applied to, for example, a thin color camera that generates high-definition video using a plurality of cameras.
DESCRIPTION OF SYMBOLS
10, 111  Stereoscopic imaging device
11-1, 11-2, 11-3, 11-4  Imaging lens
12-1, 12-2, 12-3, 12-4  Image sensor
20  High-resolution synthesis processing unit
21  Parallax calculation unit
30R  Right camera parameter storage unit
30L  Left camera parameter storage unit
31, 32  Coordinate conversion unit
33  Corresponding point search unit
34  R reference parallax calculation unit
35  L reference parallax calculation unit
101, 102, 103, 104  Imaging unit
121  Multi-eye high-resolution synthesis processing unit
131  Multi-eye synthesis processing unit
901  Alignment correction processing unit
902  Camera parameter storage unit
902R  Right camera parameter storage unit
902L  Left camera parameter storage unit
903  Correction processing unit
904  Vertical/horizontal alignment correction processing unit
905  Vertical alignment correction processing unit
906  Synthesis processing unit
908, 130  Left-eye synthesis unit
909, 132  Right-eye synthesis unit

Claims (8)

  1.  A stereoscopic imaging device comprising:
     two imaging units that image the same subject;
     a parallax calculation unit that detects corresponding points between the captured videos of the two imaging units and calculates parallax information of the captured videos of the two imaging units; and
     a synthesis processing unit that, taking the viewpoint of each of the two imaging units as a reference, synthesizes, on the basis of the parallax information and the captured videos of the two imaging units, video having a larger number of pixels than the captured videos, thereby generating two streams of the video having the larger number of pixels.
  2.  The stereoscopic imaging device according to claim 1, wherein each imaging unit comprises:
     an optical system that forms an image of the subject on an imaging surface; and
     an image sensor that generates a signal of the captured video of the subject imaged on the imaging surface,
     and wherein, in one of the imaging units, the position of the image sensor with respect to the optical system is shifted either up or down by half an imaging pixel of the image sensor relative to the other imaging unit.
  3.  The stereoscopic imaging device according to claim 2, comprising three or more of the imaging units, wherein, between horizontally adjacent imaging units, the position of the image sensor with respect to the optical system is shifted either up or down by half an imaging pixel of the image sensor, and, between vertically adjacent imaging units, the position of the image sensor with respect to the optical system is shifted either left or right by half an imaging pixel of the image sensor.
  4.  The stereoscopic imaging device according to claim 1, wherein four of the imaging units are arranged at the vertices of a square each of whose sides runs along either the horizontal or the vertical direction,
     the parallax calculation unit calculates parallax information of the captured videos of two of the four imaging units arranged at adjacent vertices of the square, and
     the synthesis processing unit uses the parallax information for the horizontal and vertical parallax correction of the captured videos performed when synthesizing the video.
  5.  The stereoscopic imaging device according to claim 1, comprising at least three of the imaging units, wherein the synthesis processing unit, taking the viewpoint of each of the at least three imaging units as a reference, synthesizes, on the basis of the parallax information and the captured videos of at least two of the imaging units, video having a larger number of pixels than the captured videos, thereby generating at least three streams of the video having the larger number of pixels.
  6.  An imaging method comprising:
     a step of detecting corresponding points between the captured videos of two imaging units that image the same subject and calculating parallax information of the captured videos of the two imaging units; and
     a step of, taking the viewpoint of each of the two imaging units as a reference, synthesizing, on the basis of the parallax information and the captured videos of the two imaging units, video having a larger number of pixels than the captured videos, thereby generating two streams of the video having the larger number of pixels.
  7.  The imaging method according to claim 6, wherein the step of generating the video having the larger number of pixels includes a step of performing parallax correction of the captured videos using the parallax information when synthesizing the video.
  8.  The imaging method according to claim 6, wherein the step of generating the video having the larger number of pixels includes a step of, taking the viewpoint of each of at least three imaging units that image the same subject as a reference, synthesizing, on the basis of the parallax information and the captured videos of at least two of the imaging units, video having a larger number of pixels than the captured videos, thereby generating at least three streams of the video having the larger number of pixels.
PCT/JP2011/066089 2010-07-27 2011-07-14 Three-dimensional imaging device and imaging method for same WO2012014695A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2010168145A JP5088973B2 (en) 2010-07-27 2010-07-27 Stereo imaging device and imaging method thereof
JP2010-168145 2010-07-27

Publications (1)

Publication Number Publication Date
WO2012014695A1 true WO2012014695A1 (en) 2012-02-02

Family

ID=45529911

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2011/066089 WO2012014695A1 (en) 2010-07-27 2011-07-14 Three-dimensional imaging device and imaging method for same

Country Status (2)

Country Link
JP (1) JP5088973B2 (en)
WO (1) WO2012014695A1 (en)




Kind code of ref document: A1