WO2013099169A1 - Stereo imaging device - Google Patents
Stereo imaging device
- Publication number
- WO2013099169A1 (PCT/JP2012/008155)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unit
- image
- video
- parallax
- information
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
-
- G—PHYSICS
- G03—PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
- G03B—APPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
- G03B35/00—Stereoscopic photography
- G03B35/08—Stereoscopic photography by simultaneous recording
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/593—Depth or shape recovery from multiple images from stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/97—Determining parameters from multiple pictures
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
- H04N13/25—Image signal generators using stereoscopic image cameras using two or more image sensors with different characteristics other than in their location or field of view, e.g. having different resolutions or colour pickup characteristics; using image signals from one sensor to control the characteristics of another sensor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
- G06T2207/10021—Stereoscopic video; Stereoscopic image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20228—Disparity calculation for image-based rendering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N2013/0074—Stereoscopic image analysis
- H04N2013/0081—Depth or disparity estimation from stereoscopic image signals
Definitions
- The present disclosure relates to a stereo imaging apparatus including a first imaging unit having an optical zoom function and a second imaging unit that acquires an image having a wider imaging angle of view than the image acquired by the first imaging unit.
- Patent Document 1 discloses a digital camera including two imaging units, a main imaging unit and a secondary imaging unit.
- In that camera, a technique is disclosed in which parallax is detected from the videos captured by the main imaging unit and the sub imaging unit, and a stereoscopic video is generated from the main image captured by the main imaging unit and a sub image generated based on the main image and the detected parallax.
- Patent Document 2 discloses a technique that enables shooting of a stereoscopic video even when the shooting magnifications of the two imaging systems differ, in a stereo camera provided with two imaging systems.
- The present disclosure provides a technology that can generate a safe stereoscopic image with little discomfort between the left and right images when a stereoscopic image is generated from images captured by a plurality of imaging systems.
- A stereo imaging device according to the present disclosure includes: a first imaging unit having an optical zoom function and configured to acquire a first image by imaging a subject; a second imaging unit configured to acquire a second image by imaging the subject; and an image signal processing unit that processes the first image and the second image.
- The image signal processing unit includes: an angle-of-view matching unit that extracts, from each of the first image and the second image, image portions estimated to have the same angle of view; a parallax information generation unit that generates parallax information indicating the parallax between the two extracted image portions; a reliability information generation unit that generates, based on at least one of the first image, the second image, and the parallax information, reliability information indicating the reliability of the parallax information; and an image generation unit that generates, based on the parallax information and the first image, a third image that forms a stereoscopic image together with the first image.
- the parallax information generation unit corrects the parallax information based on the reliability information.
- A playback device according to the present disclosure generates a stereoscopic image based on the first image, the parallax information, and the reliability information generated by the stereo imaging device, and includes an image processing unit that generates the image paired with the first image in the stereoscopic image using the parallax information corrected based on the reliability information.
- According to the present disclosure, reliability information indicating the reliability of the parallax information is generated. Therefore, by adjusting the parallax based on the reliability information, a safe stereoscopic image can be generated.
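The document does not spell out how the parallax information is corrected from the reliability information. The following is a minimal illustrative sketch in Python, under the assumption (ours, not the patent's) that low-reliability disparity values are attenuated toward zero, so that the resulting stereoscopic image errs on the side of showing less depth rather than wrong depth.

```python
import numpy as np

def correct_parallax(disparity, reliability, threshold=0.5):
    """Hypothetical correction: keep disparity values whose reliability is
    at or above the threshold, and attenuate the rest toward zero (flat),
    which yields a safer stereoscopic image than trusting bad matches."""
    disparity = np.asarray(disparity, dtype=float)
    reliability = np.asarray(reliability, dtype=float)
    return np.where(reliability >= threshold, disparity, disparity * reliability)
```

For example, with disparities [10, 10] and reliabilities [1.0, 0.2], the unreliable second value is attenuated to 2.0 while the reliable first value is kept.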
- FIG. 1 is a perspective view showing the appearance of a conventional video imaging apparatus and the video imaging apparatus according to Embodiment 1.
- FIG. 2 is a hardware configuration diagram of the video imaging apparatus according to Embodiment 1.
- FIG. 3 is a functional configuration diagram of the video imaging apparatus according to Embodiment 1.
- FIG. 4 is a diagram explaining the processing performed by the angle-of-view matching unit.
- FIG. 5 is a diagram showing the changes in the data processed by the image signal processing unit.
- FIG. 6 is a diagram showing the difference between the left and right images in Embodiment 1.
- FIG. 8 is an external view of a video imaging apparatus according to Embodiment 2.
- FIG. 9 is a hardware configuration diagram of the video imaging apparatus according to Embodiment 2.
- FIG. 10 is a functional configuration diagram of the video imaging apparatus according to Embodiment 2.
- FIG. 11 is a diagram showing the matching of the angles of view of the images in Embodiment 2.
- FIG. 12 is a diagram illustrating an example of a method of recording the generated stereoscopic video and related data in Embodiment 2.
- FIG. 13 is an external view of a video imaging apparatus according to a modification of Embodiment 1 and Embodiment 2.
- (Embodiment 1) First, Embodiment 1 will be described with reference to the accompanying drawings.
- In the following description, “image” refers to a concept including both moving images (video) and still images.
- A signal or information representing an image or a video may be referred to simply as an “image” or a “video”.
- FIG. 1 is a perspective view showing an external appearance of a conventional video imaging apparatus and the video imaging apparatus according to the present embodiment.
- FIG. 1A shows a conventional video imaging apparatus 100 that captures a moving image or a still image.
- FIG. 1B shows a video photographing apparatus 101 according to the present embodiment.
- The video imaging apparatus 100 and the video imaging apparatus 101 differ in appearance in that the video imaging apparatus 101 includes not only the first lens unit 102 but also the second lens unit 103.
- In the conventional video imaging apparatus 100, only the first lens unit 102 collects light when shooting a video.
- In contrast, the video imaging apparatus 101 captures two images having parallax (a stereoscopic image) by collecting light through two optical systems, the first lens unit 102 and the second lens unit 103, respectively.
- The second lens unit 103 is smaller in volume than the first lens unit 102.
- Here, “volume” means the size determined by the diameter and thickness of each lens unit.
- The distance between the first lens unit 102 and the second lens unit 103 affects the magnitude of the parallax in the captured stereoscopic video. Therefore, if this distance is approximately the same as the distance between a person's left and right eyes, the stereoscopic image captured by the video imaging apparatus 101 is considered to look more natural.
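The effect of the lens-unit spacing on parallax can be made concrete with the standard parallel-camera relation (an illustration from general stereo geometry, not a formula taken from this document):

```python
def disparity_px(baseline_m, focal_px, depth_m):
    """Standard parallel-camera relation d = f * B / Z: on-screen disparity
    (in pixels) is proportional to the baseline B (metres) and the focal
    length f (pixels), and inversely proportional to subject depth Z (metres)."""
    return focal_px * baseline_m / depth_m
```

For example, a human-eye-like baseline of 5 cm with a 1000-pixel focal length gives a disparity of 10 pixels for a subject 5 m away; doubling the baseline doubles the disparity, which is why an unnaturally wide spacing exaggerates depth.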
- The first lens unit 102 and the second lens unit 103 are typically on substantially the same horizontal plane when the video imaging apparatus 101 is placed parallel to the ground. This is because humans generally view objects with their left and right eyes aligned almost horizontally, and are therefore accustomed to parallax in the horizontal direction but not in the vertical direction. For this reason, stereoscopic images are usually shot so that parallax occurs in the horizontal direction rather than the vertical direction. If the positional relationship between the first lens unit 102 and the second lens unit 103 is shifted in the vertical direction, the stereoscopic video generated by the video imaging apparatus 101 may look unnatural.
- The optical center of the first lens unit 102 and the optical center of the second lens unit 103 in the present embodiment are located on a single plane parallel to the imaging surface of the image sensor in the video imaging apparatus 101. In other words, an arrangement in which the optical center of one lens unit protrudes toward the subject (front) side while that of the other lies on the opposite (rear) side is avoided. If the first lens unit 102 and the second lens unit 103 are not located on one plane parallel to the imaging surface, their distances to the subject differ, and in such a case it is generally difficult to obtain accurate parallax information.
- Therefore, the first lens unit 102 and the second lens unit 103 in the present embodiment are positioned at substantially the same distance from the subject. Strictly speaking, the positional relationship between each lens unit and the image sensor disposed behind it must also be considered.
- With this arrangement, the amount of signal processing required to generate a stereoscopic video from the videos captured through the lens units can be reduced. More specifically, when the first lens unit 102 and the second lens unit 103 lie on the same plane parallel to the imaging surface, the positions of the same subject on the left and right image frames (hereinafter “video planes”) constituting the stereoscopic image satisfy the epipolar constraint. Therefore, in the signal processing for generating a stereoscopic video described later, once the position of the subject on one video plane is determined, its position on the other video plane can also be calculated relatively easily.
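What the epipolar constraint buys can be sketched as follows (illustrative code, not the patent's implementation): with both lens units on one plane parallel to the imaging surface, the match for a pixel in one video plane lies on the same row of the other, so correspondence search collapses to a one-dimensional scan along that row.

```python
import numpy as np

def match_along_row(left_row, right_row, x, block=5, max_disp=32):
    """For rectified geometry, find the disparity of the pixel at column x
    of `left_row` by a 1-D sum-of-absolute-differences search along the
    corresponding row `right_row` of the other image."""
    half = block // 2
    ref = left_row[x - half:x + half + 1].astype(float)
    best_d, best_cost = 0, float("inf")
    for d in range(0, max_disp + 1):
        xr = x - d                      # candidate column in the right row
        if xr - half < 0:
            break
        cand = right_row[xr - half:xr + half + 1].astype(float)
        cost = np.abs(ref - cand).sum()  # block-matching cost (SAD)
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d
```

Without the constraint, the same search would have to cover a two-dimensional neighbourhood, multiplying the processing load.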
- In the present embodiment, the first lens unit 102 is provided on the front of the main body of the video imaging apparatus 101, as usual, and the second lens unit 103 is provided on the back surface of the monitor unit 104 used to check the captured video.
- The monitor unit 104 displays the captured video toward the side opposite to the subject (the rear side of the video imaging apparatus 101).
- The video imaging apparatus 101 uses the video captured with the first lens unit 102 as the right-eye viewpoint video and the video captured with the second lens unit 103 as the left-eye viewpoint video.
- The second lens unit 103 can be provided on the back surface of the monitor unit 104 so that its distance from the first lens unit 102 is about the same as the distance between a person's left and right eyes (4 cm to 6 cm).
- the second lens unit 103 and the first lens unit 102 may be provided so as to be located on the same plane parallel to the imaging surface.
- FIG. 2 is a diagram showing an outline of the internal hardware configuration of the video photographing apparatus 101 shown in FIG.
- the hardware configuration of the video photographing apparatus 101 includes a main photographing unit 250, a sub photographing unit 251, a CPU 208, a RAM 209, a ROM 210, an acceleration sensor 211, a display 212, an encoder 213, a storage device 214, and an input device 215.
- the main photographing unit 250 includes a first lens group 200, a CCD 201, an A / D conversion IC 202, and an actuator 203.
- the sub photographing unit 251 includes a second lens group 204, a CCD 205, an A / D conversion IC 206, and an actuator 207.
- the first lens group 200 is an optical system composed of a plurality of lenses included in the first lens unit 102 in FIG.
- the second lens group 204 is an optical system composed of a plurality of lenses included in the second lens unit 103 in FIG.
- the first lens group 200 optically adjusts light incident from a subject using a plurality of lenses.
- the first lens group 200 has a zoom function for photographing a subject to be photographed large or small, and a focus function for adjusting the sharpness of the contour of the subject image on the imaging surface. Have.
- The CCD (Charge Coupled Device) 201 is an image sensor that converts the light incident from the subject through the first lens group 200 into an electrical signal.
- the A / D conversion IC 202 is an integrated circuit that converts an analog electric signal generated by the CCD 201 into a digital electric signal.
- the actuator 203 has a motor, and adjusts the distance between a plurality of lenses included in the first lens group 200 and adjusts the position of the zoom lens under the control of the CPU 208 described later.
- the second lens group 204, the CCD 205, the A / D conversion IC 206, and the actuator 207 of the sub photographing unit 251 correspond to the first lens group 200, the CCD 201, the A / D conversion IC 202, and the actuator 203 of the main photographing unit 250, respectively.
- description of the same parts as the main photographing unit 250 will be omitted, and only different parts will be described.
- The second lens group 204 is smaller in volume than the first lens group 200. Specifically, the aperture of the objective lens of the second lens group is smaller than that of the first lens group. This is because making the sub imaging unit 251 smaller than the main imaging unit 250 also miniaturizes the video imaging apparatus 101 as a whole. In the present embodiment, to keep the second lens group 204 small, it is not provided with a zoom function; that is, the second lens group 204 is a fixed-focal-length (single focus) lens.
- The CCD 205 has a resolution equal to or higher than that of the CCD 201 (that is, a larger number of pixels in the horizontal and vertical directions).
- The reason the CCD 205 of the sub imaging unit 251 has a resolution equal to or higher than that of the CCD 201 of the main imaging unit 250 is to suppress degradation of image quality when the video captured by the sub imaging unit 251 is electronically zoomed (for angle-of-view matching) by the signal processing described later.
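The resolution requirement can be quantified. Assuming, for illustration, that the sub unit's fixed field of view equals the main unit's at 1× zoom (an assumption, not stated here), the region of the sub video that matches the main video at zoom factor z spans only 1/z of the sub video's width, so avoiding upscaling requires:

```python
def min_sub_sensor_width(main_width_px, max_zoom):
    """When the main unit zooms in by a factor z, the matching region of
    the wide-angle sub video shrinks to 1/z of its width. To electronically
    zoom that region without upscaling (and losing quality), the sub
    sensor needs at least main_width * z pixels horizontally."""
    return main_width_px * max_zoom
```

With a 1920-pixel-wide main sensor, supporting 2× zoom without quality loss would call for a sub sensor at least 3840 pixels wide, consistent with the 3840 × 2160 figure used later in this description.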
- The actuator 207 has a motor and, under the control of the CPU 208 described later, adjusts the distance between the lenses included in the second lens group 204. Since the second lens group 204 does not have a zoom function, the actuator 207 performs lens adjustment only for focusing.
- a CPU (Central Processing Unit) 208 controls the entire video photographing apparatus 101.
- The CPU 208 performs processing for generating a stereoscopic video from the videos shot by the main imaging unit 250 and the sub imaging unit 251. Note that the same processing may be realized using an FPGA (Field Programmable Gate Array) instead of the CPU 208.
- a RAM (Random Access Memory) 209 temporarily stores various variables at the time of executing a program for operating the CPU 208 according to instructions from the CPU 208.
- ROM (Read Only Memory) 210 records data such as program data and control parameters for operating the CPU 208.
- the acceleration sensor 211 detects the shooting state (posture, orientation, etc.) of the video shooting device 101.
- In the present embodiment the acceleration sensor 211 is used, but the present disclosure is not limited to this; for example, a three-axis gyroscope may be used instead. In short, any sensor that detects the shooting state of the video imaging apparatus 101 may be employed.
- the display 212 displays a stereoscopic video imaged by the video imaging device 101 and processed by the CPU 208 or the like. Note that the display 212 may include a touch panel as an input function.
- The encoder 213 encodes the stereoscopic video information generated by the CPU 208, or the data necessary for displaying the stereoscopic video, according to a predetermined method.
- the storage device 214 records and holds the data encoded by the encoder 213.
- the storage device 214 may be realized by any system as long as it can record data, such as a magnetic recording disk, an optical recording disk, and a semiconductor memory.
- the input device 215 is an input device that receives an instruction from the outside of the video photographing device 101 such as a user.
- Next, each of the above-described hardware components of the video imaging apparatus 101 is described in terms of the functional unit corresponding to it.
- FIG. 3 is a functional configuration diagram of the video photographing apparatus 101.
- the video imaging apparatus 101 includes a main imaging unit 350, a sub imaging unit 351, an image signal processing unit 308, a horizontal direction detection unit 318, a display unit 314, a video compression unit 315, a storage unit 316, and an input unit 317.
- the main imaging unit 350 includes a first optical unit 300, an imaging unit 301, an A / D conversion unit 302, and an optical control unit 303.
- the sub photographing unit 351 includes a second optical unit 304, an imaging unit 305, an A / D conversion unit 306, and an optical control unit 307.
- the main photographing unit 350 corresponds to a “first photographing unit”
- the sub photographing unit 351 corresponds to a “second photographing unit”.
- the main photographing unit 350 corresponds to the main photographing unit 250 in FIG.
- the first optical unit 300 corresponds to the first lens group 200 in FIG. 2 and adjusts light incident from the subject.
- The first optical unit 300 includes an optical diaphragm that controls the amount of light incident on the imaging unit 301.
- the imaging unit 301 corresponds to the CCD 201 in FIG. 2 and converts the light incident from the first optical unit 300 into an electrical signal.
- the A / D conversion unit 302 corresponds to the A / D conversion IC 202 in FIG. 2 and converts the analog electrical signal output from the imaging unit 301 into a digital signal.
- the optical control unit 303 corresponds to the actuator 203 in FIG. 2 and controls the first optical unit 300 by control from the image signal processing unit 308 described later.
- the sub photographing unit 351 corresponds to the sub photographing unit 251 in FIG.
- The second optical unit 304, the imaging unit 305, the A/D conversion unit 306, and the optical control unit 307 in the sub imaging unit 351 correspond to the first optical unit 300, the imaging unit 301, the A/D conversion unit 302, and the optical control unit 303, respectively. Since their functions are the same as those of the corresponding functional units in the main imaging unit 350, their description is omitted here.
- the second optical unit 304, the imaging unit 305, the A / D conversion unit 306, and the optical control unit 307 respectively correspond to the second lens group 204, the CCD 205, the A / D conversion IC 206, and the actuator 207 in FIG.
- the image signal processing unit 308 corresponds to the CPU 208 in FIG. 2, receives the video signals from the main shooting unit 350 and the sub shooting unit 351 as input, generates a stereoscopic video signal, and outputs it. A specific method by which the image signal processing unit 308 generates a stereoscopic video signal will be described later.
- the horizontal direction detection unit 318 corresponds to the acceleration sensor 211 in FIG. 2 and detects the horizontal direction during video shooting.
- the display unit 314 corresponds to the video display function of the display 212 in FIG. 2 and displays the stereoscopic video signal generated by the image signal processing unit 308.
- the display unit 314 alternately displays the left and right videos included in the input stereoscopic video on the time axis.
- the viewer uses, for example, video viewing glasses (active shutter glasses) that alternately block light incident on the viewer's left eye and light incident on the right eye in synchronization with the display on the display unit 314.
- the video compression unit 315 corresponds to the encoder 213 in FIG. 2 and encodes the stereoscopic image signal generated by the image signal processing unit 308 according to a predetermined method.
- the storage unit 316 corresponds to the storage device 214 in FIG. 2 and records and holds the stereoscopic video signal encoded by the video compression unit 315. Note that the storage unit 316 is not limited to the above-described stereoscopic video signal, and may record a stereoscopic video signal expressed in another format.
- the input unit 317 corresponds to the touch panel function of the input device 215 and the display 212 in FIG. 2 and accepts input from the outside of the video shooting device.
- The image signal processing unit 308 includes an angle-of-view matching unit 309, a pixel number matching unit 310, a parallax information generation unit 311, an image generation unit 312, a shooting control unit 313, and a reliability information generation unit 319.
- the angle-of-view matching unit 309 matches the angle of view of the video signal input from both the main photographing unit 350 and the sub photographing unit 351.
- “Angle of view” means the shooting range (usually expressed as an angle) of the video shot by each of the main imaging unit 350 and the sub imaging unit 351. That is, the angle-of-view matching unit 309 extracts image portions estimated to have the same angle of view from the video signal input from the main imaging unit 350 and the video signal input from the sub imaging unit 351.
- FIG. 4 is a diagram in which two images generated based on video signals at a certain time point input from the main photographing unit 350 and the sub photographing unit 351 are arranged.
- The video magnification differs between the video from the main imaging unit 350 (right video R) and the video from the sub imaging unit 351 (left video L). This is because the first optical unit 300 (first lens group 200) has an optical zoom function, while the second optical unit 304 (second lens group 204) does not.
- the angle-of-view matching unit 309 performs processing for matching videos with different angles of view photographed by the photographing units.
- the second optical unit 304 of the sub photographing unit 351 does not have an optical zoom function, the second optical unit 304 (second lens group 204) can be downsized.
- the angle-of-view matching unit 309 extracts a portion corresponding to the right image captured by the main image capturing unit 350 from the left image captured by the sub image capturing unit 351.
- The image signal processing unit 308 can process the captured video and can also acquire, through the optical control unit 303, the current state of the first optical unit 300.
- Since the image signal processing unit 308 controls the zoom function of the first optical unit 300 through the imaging control unit 313 and the optical control unit 303, it can acquire the zoom magnification of the video captured by the main imaging unit 350 as supplementary information.
- the second optical unit 304 does not have a zoom function, its magnification is known in advance.
- The angle-of-view matching unit 309 calculates the difference in magnification between the main imaging unit 350 and the sub imaging unit 351 based on these pieces of information, and based on that difference can identify the portion of the left video L that corresponds to the right video R. In this process, if a range about 10% larger than the expected corresponding portion is first cut out and a known pattern-matching process is applied within that range, the angle of view can be matched with simple processing.
- any known method may be used as a method for specifying the portion corresponding to the right image R in the left image L.
- FIG. 4 shows that the portion of the left video L surrounded by the dotted line corresponds to the right video R. Since the left video L is acquired through the second optical unit 304, a fixed-focal-length lens without a zoom function, it covers a wider range (wider angle) than the right video R taken with the zoom lens zoomed in.
- The angle-of-view matching unit 309 outputs, as the angle-matched left and right videos, the right video R and the portion of the left video L surrounded by the dotted line.
- In the present embodiment, the right video R is used as-is without extracting a partial region, but the technology of the present disclosure is not limited to this example; the method of matching the angle of view is arbitrary, and, for example, a part of the right video R may be extracted and used.
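The procedure described above (predict the corresponding region from the known zoom magnification, cut out a window about 10% larger, then refine by pattern matching) can be sketched as follows. This is an illustrative brute-force sum-of-absolute-differences matcher, not the patent's actual implementation, and it assumes the right video has already been resized to the predicted crop size.

```python
import numpy as np

def extract_matching_view(left, right, zoom_ratio, margin=0.10):
    """Find the region of the wide-angle left video that matches the
    zoomed right video. `right` is assumed pre-resized to
    (left_h / zoom_ratio, left_w / zoom_ratio). Returns (y, x, h, w)."""
    lh, lw = left.shape
    ch, cw = int(lh / zoom_ratio), int(lw / zoom_ratio)  # predicted crop size
    # cut out a search window ~10% larger than the crop, centred in `left`
    mh, mw = min(int(ch * (1 + margin)), lh), min(int(cw * (1 + margin)), lw)
    y0, x0 = (lh - mh) // 2, (lw - mw) // 2
    window = left[y0:y0 + mh, x0:x0 + mw]
    # refine with a brute-force SAD pattern match inside the window
    best = None
    for y in range(window.shape[0] - ch + 1):
        for x in range(window.shape[1] - cw + 1):
            cand = window[y:y + ch, x:x + cw].astype(float)
            sad = np.abs(cand - right.astype(float)).sum()
            if best is None or sad < best[0]:
                best = (sad, y0 + y, x0 + x)
    return best[1], best[2], ch, cw
```

Restricting the pattern match to the small predicted window is what keeps the processing simple, as noted above.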
- the pixel number matching unit 310 performs processing to match the number of pixels of both the left and right videos whose field angles are matched by the field angle matching unit 309.
- the imaging unit 301 of the main imaging unit 350 and the imaging unit 305 of the sub imaging unit 351 have different numbers of pixels.
- In addition, since the main imaging unit 350 performs zoom shooting, the number of pixels of the image portion extracted from the angle-matched left video L by the angle-of-view matching unit 309 increases or decreases according to the zoom magnification. For this reason, the left and right videos whose angles of view have been matched still differ in the number of pixels at this point and are difficult to handle as they are.
- The pixel number matching unit 310 therefore matches the numbers of pixels of the videos extracted by the angle-of-view matching unit 309.
- When there is a large difference in the luminance signal level or the color signal level between the angle-matched left and right videos, the pixel number matching unit 310 may simultaneously perform processing to match (bring closer) the luminance and color signal levels of the two videos.
- The pixel number matching unit 310 may also reduce the number of pixels when the imaging unit 301 (CCD 201) and the imaging unit 305 (CCD 205) have large pixel counts. For example, as shown in FIG. 4, when the video shot by the main imaging unit 350 has 1920 × 1080 pixels, corresponding to the high-definition television system, the amount of information to be handled is large. A large amount of information raises the processing capability required of the video imaging apparatus 101 as a whole and makes data processing harder, for example by lengthening the time required to process the captured video. Therefore, while matching the numbers of pixels, the pixel number matching unit 310 may reduce the number of pixels of both videos as necessary.
- In the present embodiment, the pixel number matching unit 310 reduces, for example, the 1920 × 1080 right video R captured by the main imaging unit 350 to a size of 288 × 162, i.e., by a factor of 3/20 in both the vertical and horizontal directions. Note that the method by which the pixel number matching unit 310 reduces or enlarges an image is not limited to the one shown here; any known method may be used.
- On the other hand, the imaging unit 305 of the sub imaging unit 351 has more pixels than the imaging unit 301 of the main imaging unit 350; for example, as illustrated in FIG. 4, the imaging unit 305 has a resolution of 3840 × 2160, and the portion extracted from the left video L by the angle-of-view matching unit 309 has a size of 1280 × 720.
- In that case, the pixel number matching unit 310 reduces the extracted 1280 × 720 video by a factor of 9/40 in the vertical and horizontal directions, so that the left video L also becomes a 288 × 162 video.
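The scale factors quoted above follow directly from the sizes involved; the sketch below simply reproduces that arithmetic:

```python
from fractions import Fraction

def scale_factor(src_w, src_h, dst_w, dst_h):
    """Uniform factor that maps (src_w, src_h) onto (dst_w, dst_h);
    the horizontal and vertical factors must agree, otherwise the
    aspect ratio would change."""
    f = Fraction(dst_w, src_w)
    if f != Fraction(dst_h, src_h):
        raise ValueError("aspect ratio not preserved")
    return f

# Main video:          1920 x 1080 -> 288 x 162 is a 3/20 reduction.
# Extracted sub video: 1280 x 720  -> 288 x 162 is a 9/40 reduction.
```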
- FIG. 5 is a diagram illustrating an example of processing results of video data by the angle-of-view matching unit 309 and the pixel number matching unit 310. Note that FIG. 5 also shows the processing results by the disparity information generation unit 311 and the image generation unit 312 described later.
- the angle of view matching unit 309 extracts a portion (image having a size of 1280 ⁇ 720) corresponding to the right image R from the left image L.
- The pixel number matching unit 310 matches the numbers of pixels of the angle-matched left and right videos and reduces both to a size suitable for the subsequent processing, generating the 288 × 162 videos Rs and Ls.
- The right video R shown in FIG. 5 corresponds to a “first image”, and the left video L corresponds to a “second image”.
- That is, the “first image” is the image acquired by the imaging unit having the optical zoom function (the main imaging unit 350), and the “second image” is the image acquired by the sub imaging unit 351.
- The right video R and the left video L have numbers of pixels equal to the numbers of photosensitive cells in the main imaging unit 350 and the sub imaging unit 351, respectively.
- The parallax information generation unit 311 detects the parallax between the left and right videos that have undergone the angle-of-view matching and pixel number matching by the angle-of-view matching unit 309 and the pixel number matching unit 310. Even when the same subject is shot, the video shot by the main imaging unit 350 and the video shot by the sub imaging unit 351 differ by the parallax resulting from the difference in their positions. For example, when the two images shown in FIG. 6 are shot, the position of the building 600 captured as the subject differs between the left video L and the right video R.
- the right video R shot by the main shooting unit 350 is shot from a position further to the right than the left video L shot by the sub shooting unit 351.
- in the right image R, the building 600 therefore appears to the left of its position in the left image L.
- conversely, in the left image L, the building 600 appears to the right of its position in the right image R.
- the parallax information generation unit 311 calculates the parallax of the subject being projected based on these different videos.
- FIG. 7 is a flowchart showing a flow of processing executed by the parallax information generation unit 311.
- the disparity information generation unit 311 calculates the disparity between the left and right images according to the flowchart of FIG. Hereinafter, each step shown in FIG. 7 will be described.
- Step S701 The parallax information generation unit 311 creates an image in which only the luminance signal (Y) is extracted from each of the input left and right images.
- here, Y denotes the luminance signal and CbCr the color difference signals of the YCbCr format, in which the video is represented by the luminance signal Y and the color difference signals Cb and Cr.
- alternatively, the video may be represented and processed by the three colors RGB.
- Step S702 The parallax information generation unit 311 calculates the difference Δ(Ls/Rs) between the luminance signals of the left and right videos generated in step S701. At this time, the parallax information generation unit 311 obtains the difference by comparing pixels at the same position in each video. For example, if the luminance value (pixel value) Ls of a certain pixel in the left image is 103 and the value Rs of the corresponding pixel in the right image is 101, the difference value Δ(Ls/Rs) at that pixel is 2.
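- the per-pixel difference of step S702 can be illustrated as follows (a minimal Python sketch; the function name and list-of-lists representation are assumptions):

```python
def luminance_diff(ls_plane, rs_plane):
    """Absolute per-pixel difference of two Y (luminance) planes,
    compared at identical positions (step S702)."""
    return [[abs(l - r) for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(ls_plane, rs_plane)]
```

For the example in the text, a pixel with Ls = 103 and Rs = 101 yields a difference value of 2.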
- Step S703 Based on the difference value between the pixels calculated in step S702, the parallax information generation unit 311 changes the content of the following processing in units of pixels.
- the difference value is 0 (when the pixel values are exactly the same between the left and right videos)
- the process of step S704 is performed.
- the difference value is other than 0 (when the pixel values are different between the left and right images)
- the process of step S705 is performed.
- Step S704 When the left and right pixel values are exactly the same in the process of step S703, the parallax information generation unit 311 sets the parallax amount at the pixel to 0.
- in this embodiment, the case where the left and right pixels are exactly the same is determined to have a parallax amount of 0, but the calculation method in an actual product is not limited to this example. Even if the left and right pixel values are not exactly the same, when the values of the pixels located around the pixel are exactly the same between the left and right images and the difference between the pixel values themselves is small, the pixel may also be regarded as the same between the left and right images.
- when determining the amount of parallax, not only the left-right difference of the pixel of interest but also the left-right differences of the surrounding pixels may be taken into consideration. This makes it possible to remove the influence of calculation errors caused by edges, textures, and the like existing in the vicinity of the pixel. Even if the pixel values of the pixel of interest or the surrounding pixels are not exactly the same, the parallax amount may be determined to be 0 if the difference is less than a preset threshold value.
- Step S705 When the parallax information generation unit 311 detects a difference between the two images, it uses the video from the main shooting unit 350 (the right video Rs in the present embodiment) as the reference video, and detects (searches) which pixel of the video from the sub shooting unit 351 (the left video Ls in this embodiment) corresponds to each pixel of the reference video.
- the search for the corresponding pixel can be performed, for example, by obtaining a difference while shifting one pixel at a time in the horizontal direction and the vertical direction from the pixel of interest in the left video Ls as a starting point, and specifying a pixel that minimizes the difference.
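- the search described above can be sketched as a one-dimensional block match (illustrative Python only; the block size, search range, and one-direction search are assumptions consistent with the parallel method discussed below):

```python
def find_disparity(rs_row, ls_row, x, block=3, max_search=30):
    """Search along the corresponding horizontal line of Ls for the pixel
    whose neighbourhood best matches the block around x in the reference
    row of Rs; returns the offset (inter-pixel distance) minimizing the
    sum of absolute differences."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_search):          # one horizontal direction only
        cost = 0
        for k in range(-block, block + 1):
            xr = min(max(x + k, 0), len(rs_row) - 1)      # clamp at edges
            xl = min(max(x + d + k, 0), len(ls_row) - 1)
            cost += abs(rs_row[xr] - ls_row[xl])
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d
```

A feature shifted five pixels to the right in Ls is recovered as a disparity of 5 at the feature position in Rs.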
- when a luminance signal pattern is similar between a certain line and its neighboring lines, the most likely corresponding pixel may be searched for using information on those patterns.
- when there is a point at infinity in the video, no parallax occurs there, so a corresponding pixel can be searched for based on the point at infinity.
- in addition to the luminance signal, the similarity of the color difference signal patterns may be considered. Which part of the image is the point at infinity can be determined in consideration of, for example, the autofocus operation.
- in the case of shooting by the parallel method, parallax occurs only in the horizontal direction, so the search for corresponding pixels between the right video and the left video needs to be performed only in the horizontal direction. Furthermore, with the parallel method the parallax of an object at infinity is zero and the parallax of any closer object occurs only in one horizontal direction, so the horizontal search may be performed in only one direction.
- Step S706 The parallax information generation unit 311 calculates the inter-pixel distance on the video plane between the corresponding pixel searched in the left video Ls and the pixel of the reference video Rs.
- the inter-pixel distance is calculated based on the position of each pixel, and is represented by, for example, the number of pixels. Based on this calculation result, the amount of parallax is determined. It can be considered that the greater the inter-pixel distance, the greater the amount of parallax. Conversely, it can be considered that the smaller the inter-pixel distance, the smaller the amount of parallax.
- since the main photographing unit 350 and the sub photographing unit 351 are configured to perform photographing by the parallel method, the parallax amount becomes 0 at infinity as described above. Therefore, the shorter the distance (shooting distance) from the video imaging apparatus 101 to the subject, the larger the amount of parallax on the image plane tends to be; conversely, the longer the distance, the smaller the amount of parallax.
- when the main image capturing unit 350 and the sub image capturing unit 351 are configured to perform image capturing by the intersection method, their optical axes intersect at one point.
- the position where the two optical axes intersect is called the “cross point”.
- when the subject is in front of the cross point (on the video shooting apparatus 101 side), the closer the subject is to the video shooting apparatus 101, the larger the amount of parallax.
- conversely, when the subject is beyond the cross point, the parallax amount tends to increase as the subject moves further away.
- Step S707 When the parallax information generation unit 311 determines the amount of parallax for all the pixels, the process proceeds to the next step S708. If there is a pixel whose parallax amount has not yet been determined, the process returns to step S703 for the pixel for which the parallax amount has not yet been determined, and the above processing is repeated.
- Step S708 When the amount of parallax has been determined for all pixels, the amount of parallax is known over the entire video plane, so the parallax information generation unit 311 generates a depth map (DepthMap). The depth map is information indicating the depth of each subject, or each part, of the video screen. In the depth map, a portion with a small amount of parallax has a value close to 0, and a portion with a large amount of parallax has a large value. There is a one-to-one relationship between the depth information shown in the depth map and the amount of parallax, and they can be converted into each other by giving geometrical imaging conditions such as the convergence angle and the stereo base distance. Therefore, a stereoscopic video can be expressed either by the right video R from the main photographing unit 350 together with the left-right parallax amounts, or by the right video R together with the depth map.
- FIG. 8 is a diagram showing an example of a depth map generated when the video shown in FIG. 6 is acquired.
- a portion with parallax has a finite value according to the amount of parallax, and a portion without parallax has a value of zero.
- note that in FIG. 8, for simplicity, the parallax amount is expressed with a coarser accuracy than in actuality; in practice, the amount of parallax is calculated for each of the 288 × 162 pixels shown in FIG. 5, for example.
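- steps S703 to S708 can be condensed into a toy depth-map builder (hypothetical Python; the single-row search and exact-equality test mirror the simplified description above, not a product implementation):

```python
def build_depth_map(rs, ls, max_search=16):
    """Per-pixel parallax over the whole plane: 0 where left/right values
    agree exactly (S704), otherwise the horizontal distance to the best
    match in Ls (S705/S706)."""
    depth = []
    for rrow, lrow in zip(rs, ls):
        drow = []
        for x, (r, l) in enumerate(zip(rrow, lrow)):
            if r == l:
                drow.append(0)
            else:
                cand = range(x, min(x + max_search, len(lrow)))
                best = min(cand, key=lambda i: abs(lrow[i] - r))
                drow.append(best - x)    # inter-pixel distance in pixels
        depth.append(drow)
    return depth
```

Under the parallel method, larger values in the result correspond to nearer subjects.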
- the parallax information generation unit 311 may generate the depth map in consideration of the positional relationship between the first optical unit 300 and the second optical unit 304. For example, when the first optical unit 300 and the second optical unit 304 are arranged close to each other, the calculated individual parallax amounts may be converted so as to increase when the depth map is generated.
- the reliability of the depth map generated by the parallax information generation unit 311 is determined by the reliability information generation unit 319, and information indicating the reliability of the depth map (reliability information) is generated.
- the reliability of the depth map is determined from various viewpoints such as shooting conditions, image characteristics, and depth map contents.
- the generated reliability information is sent to the parallax information generation unit 311.
- the disparity information generation unit 311 refers to the generated reliability information, and corrects the previously generated depth map when the reliability is lower than a predetermined level. Details of the process of generating and correcting the reliability information will be described later.
- the image generation unit 312 generates, from the video shot by the main shooting unit 350, the video that forms a stereoscopic pair with it, based on the depth map, i.e., the information indicating the parallax amount for each pixel calculated or corrected by the parallax information generation unit 311.
- the pair of stereoscopic images refers to a left image having the same number of pixels as the right image R captured by the main image capturing unit 350 and having a parallax with respect to the right image R.
- the image generation unit 312 according to the present embodiment generates a left video L ′ that is a pair of the right video R and the stereoscopic video, based on the right video R and the depth map.
- the image generation unit 312 identifies the portions where parallax occurs in the 1920 × 1080 right video R output from the main imaging unit 350 by referring to the depth map.
- the image generation unit 312 then generates a video L′ having appropriate parallax as the left video by correcting the positions of those portions. That is, portions of the right video R are moved to the right according to the amount of parallax indicated by the depth map so as to form an appropriate left video, and the resulting video is output as the left video L′.
- the reason why the part having parallax is moved to the right is that the part having parallax in the left image is located on the right side of the corresponding part in the right image.
- since the depth map has fewer pixels than the right video R, the image generation unit 312 supplements the lacking information before performing the above processing. For example, regarding the depth map as an image of 288 × 162 pixels, it is enlarged 20/3 times in the vertical and horizontal directions, the pixel values representing the parallax amounts are likewise increased 20/3 times, and the values of the pixels added by the enlargement are filled in from the values of surrounding pixels. By such processing, the image generation unit 312 converts the depth map into 1920 × 1080 pixel information and then generates the left video L′ from the right video R.
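- the rightward shift used to synthesize L′ can be sketched as follows (illustrative Python; hole filling is deliberately omitted here, whereas the text above fills added pixels from surrounding values):

```python
def synthesize_left(right, depth):
    """Move each parallax-bearing pixel of the right video R rightward by
    its depth-map value to obtain the paired left video L'."""
    h, w = len(right), len(right[0])
    left = [row[:] for row in right]          # start from the right video
    for y in range(h):
        for x in range(w):
            d = depth[y][x]
            if d and x + d < w:
                left[y][x + d] = right[y][x]  # shift rightward by parallax
    return left
```

A pixel with parallax 2 in the depth map reappears two positions further right in the synthesized left video.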
- the image generation unit 312 outputs the generated left video L ′ and the right video R input to the image signal processing unit 308 as a stereoscopic video signal. Accordingly, the image signal processing unit 308 can output a stereoscopic video signal based on the video signals captured by the main imaging unit 350 and the sub imaging unit 351, respectively.
- the video imaging apparatus 101 generates the other video that is a pair of the stereoscopic video from one captured video by the signal processing even if the main imaging unit 350 and the sub-imaging unit 351 have different configurations. It becomes possible.
- since the reliability information generation unit 319 generates information indicating the reliability of the depth map and the depth map is corrected based on that information, a more accurate and safer stereoscopic image can be generated.
- an example of the overall processing flow of the video imaging apparatus 101, including the angle-of-view matching unit 309, the pixel number matching unit 310, the parallax information generation unit 311, the image generation unit 312, and the reliability information generation unit 319, will now be described with reference to the flowchart.
- Step S801 The image signal processing unit 308 receives an input of the shooting mode from the input unit 317.
- the shooting mode can be selected by the user from, for example, a stereoscopic video (3D) shooting mode and a non-stereoscopic video (2D) shooting mode.
- Step S802 The image signal processing unit 308 determines whether the input shooting mode is a stereoscopic video shooting mode or a non-stereoscopic video shooting mode. If the stereoscopic video shooting mode is selected, the process proceeds to step S804. If the non-stereoscopic video shooting mode is selected, the process proceeds to step S803.
- Step S803 When the input shooting mode is the non-stereoscopic video shooting mode, the image signal processing unit 308 captures and records the video shot by the main shooting unit 350 in the conventional manner.
- Step S804 When the input shooting mode is the stereoscopic video shooting mode, the image signal processing unit 308 captures the right video R and the left video L by the main shooting unit 350 and the sub shooting unit 351, respectively.
- Step S805 The angle-of-view matching unit 309 performs angle-of-view adjustment processing of the input right video R and left video L by the above-described method.
- Step S806 The pixel number matching unit 310 performs the pixel number matching process on both the left and right images whose angles of view are matched by the above-described method.
- Step S807 The parallax information generation unit 311 detects the amount of parallax for the right video Rs and the left video Ls on which the pixel number matching processing has been performed. The detection of the amount of parallax is performed by the above-described processing described with reference to FIG.
- Step S808 The reliability information generation unit 319 generates reliability information indicating the reliability of the depth map generated by the parallax information generation unit 311. Details of the reliability information will be described later.
- Step S809 The disparity information generation unit 311 determines the reliability of the depth map based on the reliability information generated by the reliability information generation unit 319. For example, if the value of the reliability information is higher than a predetermined threshold value, the process proceeds to step S811, and if not, the process proceeds to step S810.
- Step S810 If the parallax information generation unit 311 determines that the reliability of the previously generated depth map is low, the parallax information generation unit 311 corrects the depth map. The correction is performed so as to reduce the amount of parallax, for example. Details of the depth map correction processing will be described later.
- Step S811 The image generation unit 312 generates a left video L ′ that is a pair of stereoscopic video with respect to the right video R from the right video R and the calculated or corrected depth map by the method described above.
- Step S812 The video imaging apparatus 101 displays a stereoscopic video based on the generated right video R and left video L ′ on the display unit 314. Instead of displaying the stereoscopic video, a process of recording the right video R and the left video L ′ or the right video R and the parallax information may be performed. If these pieces of information are recorded, it is possible to reproduce the stereoscopic video by causing the other reproduction apparatus to read the information.
- Step S813 The video imaging apparatus 101 determines whether video imaging can be continued. If shooting continues, the process returns to step S804 and the process is repeated. If the shooting cannot be continued, the video shooting apparatus 101 ends the shooting.
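- the flow of steps S801 to S813 can be condensed into a runnable miniature (all class and method names are hypothetical stand-ins; the depth-map, reliability, and correction steps are reduced to toy one-line versions):

```python
class TinyCamera:
    """Hypothetical stand-in for the video imaging apparatus 101."""
    def __init__(self, mode, frames):
        self.mode, self.frames, self.log = mode, list(frames), []
    def can_continue(self):                    # S813
        return bool(self.frames)
    def capture_pair(self):                    # S804
        return self.frames.pop(0)

def shooting_loop(cam, threshold=10):
    if cam.mode != "3D":                       # S801, S802
        cam.log.append("record-2D")            # S803
        return
    while cam.can_continue():                  # S813
        R, L = cam.capture_pair()              # S804
        depth = [abs(r - l) for r, l in zip(R, L)]  # S805-S807 (toy)
        if max(depth) >= threshold:            # S808, S809 (toy reliability)
            depth = [d // 2 for d in depth]    # S810: reduce parallax
        cam.log.append(("frame", depth))       # S811, S812
```

Frames whose toy depth map exceeds the threshold have their parallax halved before being recorded, mirroring the correction of step S810.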
- the method for generating a stereoscopic video from the captured video is not limited to the above method.
- one such method generates a high-definition image by combining the contour of the rough image on one of the left and right sides with the texture of the high-definition image on the other side.
- as in texture mapping in computer graphics, where a texture is pasted (like wallpaper) onto the surface of a 3D model (3D object) represented by polygons with vertex, edge, and surface connection information (topology information), a high-definition image can be generated.
- the texture of the “occlusion portion” (hidden portion), i.e., a portion (information missing region) that appears in one video but not in the other, can be covered by enlarging a portion that is not an occlusion portion.
- a method for extending a portion that is not an occlusion portion for example, there is a method using a smoothing filter such as a known Gaussian filter.
- an image having an occlusion portion can be corrected by using a new depth map obtained by passing a relatively low-resolution depth map through a smoothing filter having a predetermined attenuation characteristic.
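- one way to attenuate abrupt depth jumps around occlusion regions is a small smoothing kernel applied along each depth-map row (illustrative Python; the 3-tap kernel is a stand-in for the Gaussian filter mentioned above):

```python
def smooth_depth_row(row, kernel=(0.25, 0.5, 0.25)):
    """Convolve one depth-map row with a small smoothing kernel,
    clamping at the borders, so sharp discontinuities are spread out."""
    n = len(row)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in zip((-1, 0, 1), kernel):
            acc += w * row[min(max(x + k, 0), n - 1)]
        out.append(acc)
    return out
```

An isolated depth spike is spread over its neighbours, which is the attenuation behaviour the text relies on.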
- still another method uses 2D-3D conversion. For example, by comparing a high-definition left image (estimated L-ch image) generated by 2D-3D conversion of the high-definition right image (R-ch image) with the actually captured left image (L-ch image), a high-definition left image free of contour errors can be generated.
- the following method may be used.
- first, the parallax information generation unit 311 estimates and generates depth information (depth information 1) from image features of the high-definition right image (for example, an image of 1920 horizontal × 1080 vertical pixels), such as composition, contour, color, texture, sharpness, and spatial frequency distribution.
- the resolution of the depth information 1 can be set to be equal to or lower than the resolution of the right image.
- the depth information 1 can be set to, for example, horizontal 288 pixels and vertical 162 pixels as in the above example.
- next, depth information (depth information 2) is generated by stereo matching between the two reduced images (for example, 288 horizontal × 162 vertical pixels).
- the depth information 2 is also horizontal 288 pixels and vertical 162 pixels.
- the processing in this example is equivalent to using depth information 2 as a constraint condition for increasing the accuracy of depth information (depth information 1) generated by 2D-3D conversion by image analysis.
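- one plausible reading of using depth information 2 as a “constraint condition” is to pull the image-analysis estimate toward the measured values and clamp large disagreements (hypothetical Python; the blending weight and clamp limit are illustrative choices, not specified in the text):

```python
def constrain_depth(depth1, depth2, weight=0.5, limit=2):
    """Blend the estimated depth (depth information 1) with the measured
    depth (depth information 2), then clamp the result to stay within
    `limit` of the measurement."""
    out = []
    for d1, d2 in zip(depth1, depth2):
        blended = (1 - weight) * d1 + weight * d2
        out.append(min(max(blended, d2 - limit), d2 + limit))
    return out
```

An estimate that strays far from the measurement is snapped back to the measured value plus the allowed margin.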
- the above operations are effective even when the sub photographing unit 351 uses the optical zoom.
- when the sub photographing unit 351 uses the optical zoom, using the high-definition left image as the reference image and referring to the right image as the sub image is more robust against the occurrence of image distortion (errors).
- the first reason is that the stereo matching process between the left and right images is simplified when the zoom magnification changes slightly.
- the second reason is that, since the optical zoom magnification of the main photographing unit 350 changes continuously, making the electronic zoom magnification of the sub photographing unit 351 follow it for the depth-information calculation increases the calculation time, and image distortion therefore tends to occur in the stereo matching process.
- alternatively, parallax information may be obtained by performing geometric calculation on the right image using depth information actually measured from the two lens systems; using this parallax information, the left image can be calculated from the right image by geometric calculation.
- Another method is super-resolution.
- in this method, when a high-definition left image is generated by super-resolution from the rough left image, the high-definition right image is referred to.
- for example, a depth map smoothed by a Gaussian filter or the like is converted into disparity information based on the geometric positional relationship of the imaging system, and a high-definition left image can be calculated from the high-definition right image using this disparity information.
- the shooting control unit 313 controls shooting conditions of the main shooting unit 350 and the sub shooting unit 351 based on the parallax information calculated by the parallax information generation unit 311.
- the left and right videos that make up the stereoscopic video are generated and used based on the video shot by the main shooting unit 350.
- the video shot by the sub shooting unit 351 is used to detect parallax information for the video shot by the main shooting unit 350. Therefore, the sub photographing unit 351 may capture, in cooperation with the main photographing unit 350, video from which parallax information is easily obtained.
- the shooting control unit 313 controls the main shooting unit 350 and the sub shooting unit 351 based on the parallax information calculated by the parallax information generation unit 311. For example, control such as exposure, white balance, and autofocus is performed during shooting.
- the imaging control unit 313 controls the optical control unit 303 and/or the optical control unit 307 based on the parallax detection result of the parallax information generation unit 311, thereby changing the shooting conditions of the main imaging unit 350 and/or the sub imaging unit 351.
- for example, the image from the sub photographing unit 351 may become almost entirely white (the pixel values of the captured image data become values close to the upper limit), so that the contour of the subject cannot be identified.
- in such a case, the photographing control unit 313 performs control to correct the exposure of the sub photographing unit 351 via the optical control unit 307.
- the exposure is corrected by adjusting a diaphragm (not shown), for example. Accordingly, the parallax information generation unit 311 can detect the parallax using the corrected video from the sub photographing unit 351.
- the following method may be adopted.
- for example, by comparing the two images, the parallax information generation unit 311 can determine that the sharpness of the contour of the subject differs between them.
- when the photographing control unit 313 detects such a difference in the sharpness of the contour of the same subject between the two images, it makes the focus of the main photographing unit 350 and the sub photographing unit 351 the same via the optical control unit 303 and the optical control unit 307.
- the imaging control unit 313 performs control to adjust the focus of the sub imaging unit 351 to the focus of the main imaging unit 350.
- the shooting control unit 313 controls the shooting conditions of the main shooting unit 350 and the sub shooting unit 351 based on the parallax information calculated by the parallax information generation unit 311.
- the parallax information generation unit 311 can more easily extract the parallax information from the videos shot by the main shooting unit 350 and the sub shooting unit 351, respectively.
- the shooting control unit 313 controls the shooting conditions of the main shooting unit 350 and the shooting conditions of the sub shooting unit 351 to match, but such control is not essential.
- when the shooting conditions of the two units are not matched, however, the reliability of the depth map is lowered, and therefore the reliability information described later may be set low.
- the depth map generated by the disparity information generation unit 311 does not always reflect the disparity information correctly. For example, when the contrast of the images acquired by each imaging unit is low (clouds, the sea, etc.), or when similar brightness-change patterns continue (stripe patterns, etc.), the reliability of the generated depth map becomes lower. The reliability of the depth map is also low when the ratio of occlusion portions in the image is large or when individual occlusion portions are large. Further, when the zoom magnification of the main image capturing unit 350 is increased, the number of pixels in the corresponding portion of the image acquired by the sub image capturing unit 351 decreases, and the resolution of the depth map tends to decrease.
- when the depth map is inappropriate, the stereoscopic video generated based on it is also inappropriate.
- an inappropriate stereoscopic video is, for example, an image that gives a stereoscopic effect that cannot occur in real life, an image containing parallax that greatly exceeds the appropriate amount, or an image in which the degree of stereoscopic effect changes rapidly from scene to scene. Viewing such inappropriate images may cause eyestrain and headaches, which is problematic in terms of safety. If the reliability of the depth map is low, there is a high possibility that such a dangerous image will be generated.
- the imaging apparatus 101 therefore evaluates the reliability of the depth map and corrects the depth map according to its reliability, thereby generating a safer stereoscopic image.
- the reliability information generation unit 319 generates reliability information indicating the reliability of the parallax information generated by the parallax information generation unit 311.
- the reliability information can be represented by, for example, a numerical value that is scored according to the level of reliability, or binary information that indicates whether the depth map is reliable.
- the reliability information generation unit 319 generates the reliability information based on at least one of: the imaging conditions of the two imaging units, the image characteristics of at least one of the two image frames acquired by the two imaging units, and the content of the depth map.
- examples of the reliability information are given below.
- the reliability information may be, for example, a numerical value determined based on the number of pixels of the two image portions on which stereo matching is performed by the parallax information generation unit 311 (corresponding to the 288 × 162 pixel images Rs and Ls in the example shown in FIG. 5). In general, the smaller the number of pixels of the two matched image portions, the lower the accuracy of the depth map in the two-dimensional plane (that is, its horizontal and vertical resolution), so it is appropriate to determine the reliability information accordingly.
- for example, the reliability information may be set to “0: low reliability” when the number of pixels n of the two image portions is in the range 0 < n < N1, “1: medium reliability” when N1 ≤ n < N2, and “2: high reliability” when N2 ≤ n.
- here, N1 and N2 are natural numbers that satisfy N1 < N2.
- the reliability is evaluated in three stages, but may be evaluated in two stages or four or more stages. Note that when the number of pixels of the two image portions on which stereo matching is performed depends on the zoom magnification in the main photographing unit 350, the reliability information may be generated based on the zoom magnification instead of the number of pixels.
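- the three-level scheme above can be written directly (Python sketch; the threshold values N1 and N2 are illustrative placeholders, since the text leaves them unspecified):

```python
def reliability_from_pixel_count(n, n1=10_000, n2=40_000):
    """Score reliability from the pixel count n of the two matched image
    portions: 0 = low, 1 = medium, 2 = high (n1 < n2 assumed)."""
    if n < n1:
        return 0
    if n < n2:
        return 1
    return 2
```

For the 288 × 162 portions of FIG. 5, n = 46656, which falls in the high-reliability range under these example thresholds.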
- the “feature point” means a pixel or a set of pixels that characterizes an image, and typically refers to an edge or a corner.
- the reliability information generation unit 319 may evaluate, for each horizontal line or for each fixed block region, the degree to which the feature points detected from the two image portions by stereo matching coincide, based on cross-correlation, and generate reliability information quantified according to the degree of matching.
- the reliability information generation unit 319 selects a line including many feature points such as an edge region from one of two image parts to be matched, and determines which line of the other image part is closest to the line. Identify by calculating the cross-correlation function. Then, the amount of change in luminance values of feature points on those lines is compared, and if the difference in the amount of change is small, the reliability information is set high, and if the difference in the amount of change is large, the reliability information is set. Set low. As a result of such processing, the value of the reliability information is set lower as the change in the luminance value of the edge region is more dissimilar between the left and right images.
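- identifying the closest line by cross-correlation can be sketched as follows (illustrative Python; a plain normalized cross-correlation is assumed as the similarity measure):

```python
def best_matching_line(line, candidates):
    """Return the index of the candidate line most similar to `line`
    under normalized cross-correlation (scored 0 when a line is flat)."""
    def ncc(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        da = sum((x - ma) ** 2 for x in a) ** 0.5
        db = sum((y - mb) ** 2 for y in b) ** 0.5
        return num / (da * db) if da and db else 0.0
    return max(range(len(candidates)),
               key=lambda i: ncc(line, candidates[i]))
```

A line whose luminance pattern follows the reference line scores highest, while a flat line or a shifted pattern scores lower.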
- line scanning for stereo matching does not necessarily need to be performed for all horizontal lines. For example, if scanning is performed sequentially while skipping a certain number of lines, the amount of calculation processing can be reduced and efficient.
- the reliability information generation unit 319 may set the reliability higher as the change in the level of the edge region and the edge peripheral region is larger, and may set the reliability lower as the level is smaller.
- the reliability information generation unit 319 determines the similarity between the average luminance values of the two image signals to be compared, or the similarity between the average luminance values of specific areas constituting the image. Reliability information may be generated based on the information.
- the specific area is, for example, a partial area on the image including feature points. The greater the difference between the average luminance values of the two images or the specific area, the lower the reliability of the depth map generated by comparing the pixel values. Therefore, the reliability information generation unit 319 sets the reliability information lower as the difference in the average luminance value of the entire image or the specific area between the two images is larger, and sets the reliability information higher as the difference is smaller. Good.
- to keep the reliability high, it is also effective to match the average values of the luminance and color signals of the same-angle-of-view areas of the two images, or to keep the difference between those average values smaller than a preset value.
- the reliability information may be generated based on the similarity of the gamma characteristics of the main photographing unit 350 and the sub photographing unit 351.
- the gamma characteristic means a characteristic of correction (gamma correction) performed on the input signal so that the output image displayed on the display device looks natural to humans.
- the main photographing unit 350 and the sub photographing unit 351 each have their own gamma characteristic. If the two gamma characteristics differ, the levels of the output image signals also differ, lowering the reliability of the depth map. For this reason, the reliability information generation unit 319 may set the reliability higher as the gamma characteristics of the main imaging unit 350 and the sub imaging unit 351 are closer, and lower as they diverge.
- the reliability information generation unit 319 may generate reliability information based on the horizontal or vertical size of the occlusion area included in the two image portions. Since the parallax information cannot be obtained in the occlusion area, the reliability of the depth map decreases as the size of the occlusion area increases. Therefore, the reliability information generation unit 319 may set the reliability lower as the horizontal and / or vertical size of the occlusion area is larger, and set the reliability higher as the size is smaller.
- the reliability information generation unit 319 can also use, as reliability information, a value scored according to various indicators such as the amount of high-frequency components in the image, the proportion of the occlusion portion, the variance of the image's luminance values, and the zoom magnification of the main photographing unit 350.
- reliability information may be generated based on any one of the above indexes, or comprehensive reliability information may be generated by evaluating a plurality of indexes (pieces of reliability information), including the examples described above, together.
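One plausible way to combine several indices into comprehensive reliability information is a weighted average, sketched below. The index names, weights, and the averaging itself are assumptions for illustration; the patent does not prescribe a combination formula.

```python
# Illustrative sketch: comprehensive reliability as a weighted average
# of several per-frame indices, each already scored into [0, 1].

def comprehensive_reliability(indices, weights=None):
    """indices: dict of index name -> score in [0, 1].
    Returns a single weighted-average reliability in [0, 1]."""
    if weights is None:
        weights = {name: 1.0 for name in indices}  # equal weights by default
    total = sum(weights[n] for n in indices)
    return sum(indices[n] * weights[n] for n in indices) / total

r = comprehensive_reliability({
    "edge_similarity": 0.9,   # cross-correlation of matched edge lines
    "luminance_match": 0.8,   # average-luminance difference score
    "occlusion_size": 0.5,    # larger occlusion area -> lower score
    "gamma_match": 1.0,       # similarity of the two gamma characteristics
})
```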
- the resolution of the depth map is not limited to one type; depth maps of a plurality of resolutions may be calculated, and the reliability may be determined for each specific area of the image according to the feature points, which are detected differently depending on the pattern in the image.
- the reliability information in the present embodiment is generated for each frame of the video, but it may instead be generated for each pixel or for each block composed of a plurality of pixels. When the reliability information is generated for each image frame, the amount of calculation required for the processing is relatively small. When it is generated for each pixel or block, the amount of calculation is relatively large, but since the reliability can be evaluated for each pixel of the depth map, a higher-accuracy correction becomes possible.
- the disparity information generation unit 311 corrects the depth map for each image frame based on the reliability information.
- the depth map is corrected so as to reduce the degree of parallax in order to prevent a sudden jump.
- the disparity information generation unit 311 can correct the depth map by, for example, an adaptive filtering process in the horizontal direction or the vertical direction based on the reliability information.
- the “adaptive filtering process” means a process of converting the depth map using different filters depending on the characteristics of each part of the image.
- for example, a method can be used that smooths the depth map with a smoothing filter, such as an averaging filter based on a moving average, in regions of the image that contain no edges, while using an edge-preserving filter in regions around edges.
- a known k-nearest-neighbor smoothing filter, bilateral filter, or Gaussian filter can be used as the edge-preserving filter.
- by applying a median filter as preprocessing, it is possible to reduce noise errors and improve the accuracy of the depth map.
- the parallax information can be filtered in the horizontal direction or the vertical direction by the adaptive filtering process.
- adaptive filtering can be performed according to changes in the time axis direction such as frame differences.
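The adaptive filtering described above can be sketched in one dimension, on a single depth-map line. This is a simplification under stated assumptions: a 3-tap median prefilter stands in for the median preprocessing, a moving average stands in for the smoothing filter, and the edge test and its threshold are illustrative, not the patent's.

```python
# Sketch of 1-D adaptive filtering of one depth-map line: a median
# prefilter removes spike noise, then a moving average smooths only
# where no depth edge is detected (edge-preserving behaviour).

def median3(line):
    """3-tap median prefilter; endpoints are left unchanged."""
    out = list(line)
    for i in range(1, len(line) - 1):
        out[i] = sorted(line[i - 1:i + 2])[1]
    return out

def adaptive_smooth(line, edge_thresh=10):
    """Average each sample with its neighbours unless the local depth
    step exceeds edge_thresh, so edges are preserved."""
    pre = median3(line)
    out = list(pre)
    for i in range(1, len(pre) - 1):
        if abs(pre[i + 1] - pre[i - 1]) < edge_thresh:  # no edge here
            out[i] = (pre[i - 1] + pre[i] + pre[i + 1]) / 3
    return out
```

Running the same filter column-wise gives the vertical variant, and differencing consecutive frames before filtering gives the time-axis variant mentioned above.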
- by correcting the depth map with the adaptive filtering process described above, the disparity information generation unit 311 can suppress the amount of disparity in scenes where the reliability of the depth map is low.
- in addition to the above method, the disparity information generation unit 311 may correct the depth map by multiplying it by a constant less than 1 determined based on the values of the entire depth map.
- the image acquired by the main photographing unit 350 may be output as it is without generating a stereoscopic image. In this case, it can be considered that the value of the entire depth map is corrected to zero.
- a portion having a particularly large amount of parallax may be corrected.
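The constant-multiplier correction above can be sketched as follows. The mapping from frame reliability to the constant k is an illustrative assumption; the source only requires k < 1, with k = 0 corresponding to outputting the main image with no stereoscopic effect.

```python
# Minimal sketch: when frame reliability is low, shrink the whole depth
# map by a constant k <= 1 derived from the reliability value; at the
# extreme (k = 0) the entire depth map is corrected to zero.

def scale_depth_map(depth, reliability):
    """depth: 2-D list of disparity values; reliability in [0, 1]."""
    k = min(1.0, max(0.0, reliability))  # constant multiplier, at most 1
    return [[d * k for d in row] for row in depth]
```

A variant of the same idea is to apply the multiplier only to portions whose parallax is particularly large, as noted above.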
- the parallax information generation unit 311 outputs the corrected depth map to the image generation unit 312.
- the image generation unit 312 generates a left-eye image L ′ using the corrected depth map. As a result, a safe stereoscopic video with a moderate stereoscopic effect is generated.
- the parallax amount of the stereoscopic video can be maintained in an appropriate range, and thus, a sudden jump at the time of reproduction can be prevented. As a result, the safety of stereoscopic video can be improved.
- when the main photographing unit 350 shoots while changing the zoom magnification, depth map errors tend to become conspicuous, so the processing in the present embodiment is particularly effective.
- the image generation unit 312 generates a stereoscopic video.
- the image generation unit 312 is not limited to such a form.
- for example, the right video R and a depth map corrected as necessary may be recorded, and the imaging device 101 itself need not generate the stereoscopic image.
- in that case, the stereoscopic video can be reproduced by having another playback device read the recorded right video R and the depth map corrected as necessary, and generate the left video L′.
- the information to be recorded is not limited to the above example, and the right video R, the depth map before correction, and the reliability information may be recorded. In this case, a stereoscopic video is generated after the depth map is corrected on the playback device side.
- such a playback apparatus generates a stereoscopic image based on the two image signals acquired by the main imaging unit 350 and the sub imaging unit 351, the depth map, and the reliability information, all generated by the stereo imaging apparatus of the present embodiment.
- the reproduction apparatus includes an image processing unit (corresponding to the image signal processing unit 308 in FIG. 3) that uses the depth map corrected based on the reliability information to generate, from the image signal acquired by the main imaging unit 350, the image signal that pairs with it to form the stereoscopic image.
- the angle-of-view matching unit 309 acquires information regarding the horizontal direction of the video photographing apparatus 101 from the horizontal direction detection unit 318.
- left and right images included in a stereoscopic image have a parallax in the horizontal direction but no parallax in the vertical direction. This is because the left and right eyes of a human are positioned at a predetermined distance in the horizontal direction, while they are positioned on substantially the same horizontal plane in the vertical direction.
- human sensitivity to vertical parallax is generally considered to be relatively low, because it depends on a particular spatial perception pattern arising from vertical retinal image difference. Considering this point, it is preferable that parallax is generated only in the horizontal direction, and not in the vertical direction, in a stereoscopic image to be shot and generated as well.
- the horizontal direction detection unit 318 acquires information regarding the state of the video imaging apparatus 101 at the time of video imaging, in particular, the tilt with respect to the horizontal direction.
- the angle-of-view matching unit 309 corrects the horizontal direction of the video using the tilt information from the horizontal direction detection unit 318 when matching the angles of view of the left and right images. For example, suppose that because the video shooting apparatus 101 was tilted at the time of shooting, the shot video is also tilted as shown in FIG.
- the angle-of-view matching unit 309 performs angle-of-view adjustment of the images shot by the main shooting unit 350 and the sub-shooting unit 351, and corrects both images in the horizontal direction.
- the angle-of-view matching unit 309 changes the horizontal direction when performing angle-of-view matching based on the tilt information input from the horizontal direction detection unit 318, and outputs the range indicated by the dotted frame in FIG. as the result of the angle-of-view matching.
- FIG. 11B shows a result obtained by correcting the horizontal direction by the angle-of-view matching unit 309 and outputting the result.
- the horizontal direction is appropriately corrected at the stage of generating a stereoscopic video. Therefore, also in the generated stereoscopic video, parallax occurs mainly in the horizontal direction (horizontal direction) and hardly occurs in the vertical direction (vertical direction). Thereby, the viewer can view a natural stereoscopic video.
- in the above example, the angle-of-view matching unit 309 detects the shooting state of the video shooting device 101 based on the tilt information from the horizontal direction detection unit 318, but the technology in the present disclosure is not limited to this. Even without using the horizontal direction detection unit 318, the image signal processing unit 308 may detect the horizontal and vertical components of the video by other methods.
- the disparity information generated by the disparity information generating unit 311 is represented, for example, by a video such as that illustrated in FIG.
- in that representation, a part without parallax is indicated by a solid line and a part with parallax is indicated by a dotted line, based on the parallax information.
- the part with parallax is the part that is in focus in the captured video, while the part without parallax corresponds to subjects located farther away than the in-focus subject. Objects located far away form the background of the video.
- the horizontal direction can be detected by analyzing the video for these portions.
- the horizontal direction can be determined by logically analyzing the “mountain” portion of the background.
- the vertical direction and the horizontal direction can be determined from the shape of the mountain and the growth status of the trees constituting the mountain.
- the angle-of-view matching unit 309 and the parallax information generation unit 311 can thus detect the inclination of the captured video and generate a stereoscopic video with the horizontal direction corrected at the generation stage. Even when the video shooting device 101 shoots in a tilted state, the viewer can view a stereoscopic video whose horizontal direction is maintained within a predetermined range.
- the video shooting apparatus 101 generates a stereoscopic video from the video shot by the main shooting unit 350 and the sub shooting unit 351.
- the video shooting apparatus 101 does not always need to generate a stereoscopic video.
- stereoscopic video allows viewers to perceive the front-back relationships of subjects based on the parallax between the left and right images, making the video being viewed appear stereoscopic.
- a stereoscopic image may not be generated.
- the shooting of a stereoscopic video and the shooting of a non-stereoscopic video may be switched according to the shooting conditions and the content of the video.
- FIG. 13 is a graph showing, for each zoom magnification of the main photographing unit 350, the relationship between the distance from the photographing device to the subject (subject distance) and the extent to which a subject located at that distance can be seen stereoscopically (stereoscopic characteristics).
- the greater the subject distance, the smaller the stereoscopic characteristics, and the smaller the subject distance, the greater the stereoscopic characteristics.
- for the definition of “subject”, the following commonly used definitions apply.
- when the photographing apparatus is in manual focus mode, the subject is the photographing target on which the photographer has focused.
- when the photographing apparatus is in auto focus mode, the subject is the photographing target on which the apparatus has automatically focused.
- in general, a person, flora or fauna, or an object near the center of the imaging range, or a person's face or other conspicuous object (generally called a salient object) automatically detected within the imaging range, is the subject.
- when the captured video consists only of distant subjects, as in landscape footage, the subjects are concentrated in the distance.
- the farther the subject is from the photographing apparatus, the smaller the amount of parallax of that subject in the stereoscopic video, so it may be difficult for the viewer to recognize that the video is stereoscopic. The same applies when the zoom magnification increases and the angle of view decreases.
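The shrinking of parallax with distance follows from the standard parallel-stereo approximation, which the text implies but does not state: disparity ≈ focal length (in pixels) × baseline / subject distance. The formula and parameter values below are assumptions consistent with that model, not figures from the patent.

```python
# Standard parallel-camera stereo approximation: on-image disparity is
# inversely proportional to subject distance Z, so distant subjects
# produce little parallax, as described in the text.

def disparity_px(focal_px, baseline_m, distance_m):
    """Approximate disparity in pixels for a subject at distance_m."""
    return focal_px * baseline_m / distance_m
```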
- the video photographing apparatus 101 may switch the validity / invalidity of the function of generating a stereoscopic video according to the photographing condition, the characteristic of the photographed video, and the like using the above characteristics.
- the specific implementation method is described below.
- FIG. 14 is a diagram showing the relationship between the distance from the photographing apparatus to the subject and the number of effective pixels of the subject when the subject is photographed.
- the first optical unit 300 of the main photographing unit 350 has a zoom function. According to FIG. 14, as long as the subject distance is within the zoom range (the range in which the number of pixels constituting the subject image can be kept constant by using the zoom function even when the distance to the subject changes), the first optical unit 300 can maintain a constant number of effective pixels for the subject. However, when shooting a subject at a distance beyond the upper limit of the zoom range, the number of effective pixels of the subject decreases with distance. The second optical unit 304 of the sub photographing unit 351, on the other hand, has a fixed focal length, so the number of effective pixels of the subject decreases with the subject distance.
- the shooting control unit 313 in the image signal processing unit 308 enables the functions of the angle-of-view matching unit 309, the pixel number matching unit 310, the parallax information generation unit 311, and the image generation unit 312, and generates a stereoscopic video, only when the subject distance (the distance from the video shooting device 101 to the subject) is less than a predetermined value (threshold) (the A area in FIG. 14). Conversely, when the subject distance is equal to or greater than the predetermined value (the B area in FIG. 14), the shooting control unit 313 leaves at least one of the angle-of-view matching unit 309, the pixel number matching unit 310, the parallax information generation unit 311, and the image generation unit 312 inactive, and outputs the video captured by the main imaging unit 350 to the subsequent stage. The subject distance can be measured using the focal length when the first optical unit 300 or the second optical unit 304 is in focus.
- the imaging control unit 313 may be configured to enable the operation of the parallax information generation unit 311 only when the subject distance, determined based on the focal length of the first optical unit 300 or the second optical unit 304, is smaller than a predetermined threshold.
- in this way, the video imaging apparatus 101 can switch between outputting a stereoscopic video and not outputting one (outputting a non-stereoscopic video signal) according to the conditions of the captured subject, in particular the distance to the subject.
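The switching rule above reduces to a threshold test on the measured subject distance. The threshold value in this sketch is an illustrative assumption; the patent only specifies that some predetermined value separates the A and B areas of FIG. 14.

```python
# Sketch of the shooting control unit's switching rule: generate a
# stereoscopic video only while the subject distance is below the
# threshold (A area); otherwise pass the main video through (B area).

def select_output_mode(subject_distance_m, threshold_m=20.0):
    """Return "3D" in the A area, "2D" in the B area."""
    return "3D" if subject_distance_m < threshold_m else "2D"
```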
- the viewer can view a conventional captured video (non-stereoscopic video) for a video that is difficult to perceive as a stereoscopic video even when viewed.
- a stereoscopic video is generated only when necessary, so that the processing amount and the data amount can be reduced.
- the video imaging apparatus 101 can determine whether or not to generate a stereoscopic video based on the amount of parallax detected by the parallax information generation unit 311.
- the image generation unit 312 extracts the maximum amount of parallax included in the video from the depth map generated by the parallax information generation unit 311. When the maximum amount of parallax is equal to or greater than a predetermined value (threshold), the image generation unit 312 can determine that the video is a video that can obtain a stereoscopic effect of a predetermined level or higher.
- conversely, when the maximum parallax amount extracted from the depth map by the image generation unit 312 is less than the predetermined value (threshold), the image generation unit 312 can determine that, even if a stereoscopic video were generated, the viewer would find it difficult to perceive a stereoscopic effect.
- in the above, the maximum amount of parallax in the image plane has been described as an example, but the present invention is not limited to this. For example, the determination may be made based on the ratio of pixels within the video screen whose parallax amount is larger than a predetermined value.
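The two decision rules described above can be sketched together. All threshold values here are illustrative assumptions; the patent only requires comparison against predetermined values.

```python
# Sketch of the 3D-generation decision from the depth map: generate a
# stereoscopic video if the maximum parallax reaches a threshold, or if
# enough pixels exceed a per-pixel parallax threshold.

def should_generate_3d(depth_map, max_thresh=8, ratio_thresh=0.2, px_thresh=4):
    """depth_map: 2-D list of per-pixel parallax amounts."""
    flat = [d for row in depth_map for d in row]
    if max(flat) >= max_thresh:          # maximum-parallax criterion
        return True
    ratio = sum(1 for d in flat if d > px_thresh) / len(flat)
    return ratio >= ratio_thresh         # pixel-ratio criterion
```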
- when the image generation unit 312 determines by the above method that a stereoscopic video should be generated, the video imaging device 101 generates and outputs the stereoscopic video by the method described above. If the image generation unit 312 determines that the stereoscopic effect would be difficult to perceive, it does not generate a stereoscopic video and instead outputs the video input from the main photographing unit 350. In this way, the video shooting apparatus 101 can decide whether to generate and output a stereoscopic video based on the depth map of the shot video.
- the angle-of-view matching unit 309 or the parallax information generation unit 311 may determine the horizontal tilt of the captured video, using the detection result of the horizontal direction detection unit 318 or the amount of parallax detected by the parallax information generation unit 311, and decide whether to generate a stereoscopic video. For example, as shown in FIG. 15A, when the horizontal tilt is an angle within a predetermined range (in the example of FIG. 15A), the image signal processing unit 308 generates and outputs a stereoscopic video. Otherwise, the image signal processing unit 308 outputs the video captured by the main imaging unit 350. With such control, the video photographing apparatus 101 can decide whether a stereoscopic video should be generated and output according to the horizontal tilt.
- the video imaging apparatus 101 can automatically switch the generation and output of a stereoscopic video in consideration of the effect (stereoscopic characteristics) by several methods.
- here, the stereoscopic characteristics are evaluated based on indicators such as the zoom magnification, the maximum parallax amount, and the tilt of the camera.
- a stereoscopic video is output if the degree of stereoscopic characteristics is equal to or above a reference level, and a non-stereoscopic video is output if it is below the reference level.
- FIG. 15B is a flowchart showing a flow of processing of the image signal processing unit 308 relating to the above-described determination of whether or not to generate a stereoscopic video. Hereinafter, each step will be described.
- Step S1601 First, a video (image frame) is photographed by both the main photographing unit 350 and the sub photographing unit 351.
- Step S1602 It is determined whether or not the three-dimensional characteristics of the video being shot are large. The determination is performed, for example, by any one of the methods described above. If it is determined that the three-dimensional characteristic is less than the reference level, the process proceeds to step S1603, and if it is determined that the stereoscopic characteristic is equal to or higher than the reference level, the process proceeds to step S1604.
- Step S1603 The image signal processing unit 308 outputs the 2D video acquired by the main photographing unit 350.
- the processing from step S1604 to step S1609 is the same as the processing from step S805 to step S810 in FIG.
- in the present embodiment, a video photographing apparatus including the main photographing unit 350 having an optical zoom function and the relatively high-resolution sub photographing unit 351 having an electronic zoom function has been described as an example, but the apparatus is not limited to this.
- the main imaging unit 350 and the sub imaging unit 351 may have substantially equivalent configurations, and the photographing units may use a single common photographing method. In short, any video shooting device that generates a stereoscopic video from shot video may switch between stereoscopic and non-stereoscopic shooting according to shooting conditions such as the distance to the subject and the horizontal tilt, or according to the conditions of the shot subject. With such a configuration, the video apparatus can switch automatically according to the magnitude of the stereoscopic characteristics of the captured or generated stereoscopic video.
- as described above, the video imaging apparatus 101 can suitably switch between stereoscopic video shooting and conventional flat (non-stereoscopic) video shooting according to the shooting conditions at the time of shooting and the conditions of the captured video.
- FIG. 16A shows a method of recording the stereoscopic video generated by the image signal processing unit 308, that is, the video captured by the main imaging unit 350 (Main Video Stream) and the video generated by the image signal processing unit 308 to pair with it (Sub Video Stream).
- the right video and the left video are output from the image signal processing unit 308 as independent data.
- the video compression unit 315 encodes these left and right video data independently.
- the video compression unit 315 multiplexes the encoded left and right video data.
- the encoded and multiplexed data is recorded in the storage unit 316.
- the recorded data can be played back by connecting the storage unit 316 to another playback device.
- a playback device reads the data recorded in the storage unit 316, divides the multiplexed data, and decodes the encoded data, thereby playing back the left and right video data of the stereoscopic video.
- if the playback device has a function of playing back 3D video, it can play back the 3D video recorded in the storage unit 316.
- the recorded left video is a video generated using a depth map corrected based on the reliability information, and thus is a highly reliable video.
- the video compression unit 315 encodes the video imaged by the main imaging unit 350 and multiplexes the encoded video data and the depth map.
- the encoded and multiplexed data is recorded in the storage unit 316.
- the playback device has a relatively complicated configuration.
- the data amount of the depth map data can be made smaller than that of the video data paired with the stereoscopic video by compression encoding, according to this method, the data amount to be recorded in the storage unit 316 can be reduced.
- the playback apparatus can generate a highly reliable stereoscopic video.
- the corrected depth map is recorded, but the depth map before correction may be recorded together with the reliability information.
- the playback device can correct the stereoscopic video based on the reliability information, correction more suitable for the playback environment is possible.
- the video compression unit 315 encodes the video shot by the main shooting unit 350. Furthermore, the video compression unit 315 multiplexes the encoded video, difference data, and reliability information. The multiplexed data is recorded in the storage unit 316.
- a set of differences ⁇ (Ls / Rs) calculated for each pixel may be referred to as a “difference image”.
- the playback device side needs to calculate a parallax amount (depth map) based on the difference ⁇ (Ls / Rs) and the main-side video, and further generate a video that is a pair of stereoscopic video. Therefore, the playback apparatus needs to have a configuration that is relatively close to the image signal processing unit 308 of the video photographing apparatus 101.
- the data of the difference ⁇ (Ls / Rs) is included, it is possible to calculate a parallax amount (depth map) suitable for the playback device side. In particular, in this method, it is possible to calculate the most suitable amount of parallax for each image frame using reliability information.
- the playback device can generate and display a stereoscopic video in which the amount of parallax is adjusted according to the size of a display display of the device.
- the stereoscopic image has a different stereoscopic effect (a sense of depth in the front-rear direction with respect to the display surface) depending on the magnitude of the parallax between the left image and the right image. Therefore, the stereoscopic effect is different between viewing the same stereoscopic video on a large display and viewing it on a small display.
- the playback apparatus can adjust the parallax amount of the generated stereoscopic video according to the size of its own display.
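The display-size adjustment above can be sketched with a linear rescaling: the same pixel disparity subtends a larger physical separation on a larger display, so scaling the recorded parallax by a reference size over the actual size keeps the physical disparity roughly constant. The linear model and the reference size are illustrative assumptions, not the patent's formula.

```python
# Illustrative sketch: rescale the recorded parallax for the playback
# display so the physical on-screen disparity stays near what was
# intended for a reference display size.

def adjust_parallax(parallax_px, display_inches, reference_inches=42.0):
    """Shrink parallax on larger displays, enlarge it on smaller ones."""
    return parallax_px * reference_inches / display_inches
```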
- the video compression unit 315 encodes the video shot by the main shooting unit 350 and the video shot by the sub shooting unit 351. Further, the video compression unit 315 multiplexes the encoded video, difference data, and reliability information. The multiplexed data is recorded in the storage unit 316.
- the photographing apparatus 101 does not need to include the angle-of-view matching unit 309, the pixel number matching unit 310, the parallax information generation unit 311, and the image generation unit 312.
- the playback device includes an angle-of-view matching unit 2013, a pixel number matching unit 2014, a parallax information generation unit 2015, and an image generation unit 2016.
- with this configuration, the playback apparatus can generate the stereoscopic image by the same processing as that performed by the image signal processing unit 308 (angle-of-view matching, pixel number matching, difference image generation, depth map generation, and correction of the main image by the depth map).
- This method can be said to be a method in which the image signal processing unit 308 shown in FIG. 3 is configured as an image processing device independent of the photographing device, and the image processing device is provided in the reproduction device. Even in such a system, the same function as in the above embodiment can be realized.
- the playback device may adjust the amount of parallax of the video to be displayed according to the viewer of the stereoscopic video, for example depending on whether the viewer is an adult or a child.
- the depth feeling of the stereoscopic video can be changed according to the viewer. If the viewer is a child, it may be preferable to reduce the sense of depth.
- the stereoscopic effect may be changed according to the brightness of the room.
- the viewing conditions may be any information related to the viewers or the viewing environment other than the above, such as the brightness of the room or whether the viewer is a registered (authenticated) user.
- the playback device can perform adjustments such as reducing the stereoscopic effect in a scene with low reliability of the parallax information based on the reliability information.
- FIG. 17A shows a stereoscopic video composed of left and right videos shot by the video shooting device 101.
- FIG. 17B is a diagram illustrating a stereoscopic video with a reduced stereoscopic effect generated on the playback device side.
- in that video, the positions of the building shown as the subject are closer together between the left and right images than in the image shown in FIG. That is, the position of the building in the sub-side image is located further to the left than in the case of FIG.
- FIG. 17C is a diagram illustrating an example in the case where a 3D image with enhanced stereoscopic effect is generated on the playback device side. In the video shown in FIG.
- the playback device can uniquely set the size of the stereoscopic effect according to various conditions.
- since the video imaging apparatus of the present embodiment switches whether to generate a stereoscopic video according to various conditions, the following information can be added to any of the recording methods described above.
- the video imaging apparatus 101 switches between processing that generates a stereoscopic video (outputs a stereoscopic video) and processing that does not (outputs a non-stereoscopic video) according to the shooting conditions at the time of shooting, the conditions of the shot video, and so on. For this reason, the video shooting apparatus 101 may record, as auxiliary data together with the recorded video, identification information that allows a playback device to distinguish the portions for which a stereoscopic video was generated from the portions for which it was not.
- the “portion where a stereoscopic video is generated” means a range of frames generated as a stereoscopic image among a plurality of frames constituting the video, that is, a temporal portion.
- the auxiliary data may consist of time information indicating the start time and end time of a portion where a stereoscopic video was generated, or time information indicating the start time and the duration of generation. Instead of time information, it may be indicated by, for example, a frame number or an offset from the top of the video data. In other words, any method may be used as long as the auxiliary data includes information identifying, within the recorded video data, the portions where a stereoscopic video was generated and the portions where it was not.
- the video photographing apparatus 101 generates information, such as the time information described above or a 2D/3D identification flag, for identifying the portions where a stereoscopic video (3D video) is generated and the portions where it is not (2D video). This information is then recorded as auxiliary information, for example in the AV data (stream) or in a playlist.
- The playback device can distinguish 2D and 3D shooting sections based on the time information, 2D/3D identification flag, and the like included in the auxiliary information. Using this, the playback device can perform various playback controls, such as automatically switching between 2D and 3D playback or extracting and playing back only the 3D shooting sections (portions).
- The identification information may be ternary, indicating whether 3D output is necessary, for example "0: unnecessary, 1: required, 2: left to the imaging system".
- Information taking four values indicating the degree of stereoscopic effect, such as "0: low, 1: medium, 2: high, 3: too high (dangerous)", may also be used.
- The necessity of 3D display is not limited to the above examples, and may be expressed by binary information or by information with more than four values.
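- As a concrete sketch of such auxiliary data (the record layout and field names are hypothetical; only the flag values follow the examples above):

```python
# Hypothetical sketch of auxiliary data marking which frame ranges of a
# recording were generated as stereoscopic (3D) video. The record layout is
# an assumption; the flag values follow the example in the text
# ("0: unnecessary, 1: required, 2: left to the imaging system").
from dataclasses import dataclass

@dataclass
class Section3D:
    start_frame: int  # first frame of the section (inclusive)
    end_frame: int    # last frame of the section (inclusive)
    flag_3d: int      # 0: 3D unnecessary, 1: 3D required, 2: system decides

def playback_mode(sections, frame, system_default="2D"):
    """Return "3D" or "2D" for a frame, as a playback device might decide."""
    for s in sections:
        if s.start_frame <= frame <= s.end_frame:
            if s.flag_3d == 1:
                return "3D"
            if s.flag_3d == 0:
                return "2D"
            return system_default  # flag 2: left to the system
    return "2D"  # frames outside any marked section were not shot in 3D

sections = [Section3D(0, 299, 1), Section3D(300, 599, 0)]
print(playback_mode(sections, 150))  # 3D
print(playback_mode(sections, 450))  # 2D
```

A playback device could use such records to switch automatically between 2D and 3D playback, or to extract only the 3D sections for highlight playback, as described above.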
- the playback device may be configured to display a stereoscopic video only when parallax information is received, and to display a non-stereo video when no parallax information is received.
- the information indicating the amount of parallax is, for example, a depth map calculated by detecting the amount of parallax of the photographed subject.
- the depth value of each pixel constituting the depth map is represented by, for example, a 6-bit bit string.
- identification information as control information may be recorded as integrated data combined with a depth map.
- the integrated data can also be embedded in a specific position (for example, an additional information area or a user area) of the video stream.
- reliability information indicating the reliability of the depth value may be added to the integrated data.
- the reliability information may be information for each pixel instead of the information generated for each image frame described above.
- the reliability information can be expressed for each pixel, for example, “1: reliable, 2: slightly reliable, 3: unreliable”.
- the reliability information (for example, 2 bits) of the depth value can be combined with the depth value of each pixel constituting the depth map and can be handled as, for example, 8-bit depth comprehensive information.
- the total depth information may be recorded by being embedded in the video stream for each frame.
- The depth value reliability information (for example, 2 bits) can be combined with the depth value (for example, 6 bits) of each pixel constituting the depth map, handled as 8-bit comprehensive depth information, and embedded and recorded in the video stream for each frame. It is also possible to divide an image corresponding to one frame into a plurality of block areas and set depth value reliability information for each block area.
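- The combination of a 6-bit depth value with 2-bit reliability into 8-bit comprehensive depth information can be sketched as follows (the bit order, with depth in the upper 6 bits, is an assumption for illustration; the disclosure specifies only the bit widths):

```python
# Per-pixel packing of a 6-bit depth value (0..63) with 2-bit reliability
# (0..3) into one byte, as in the "8-bit depth comprehensive information"
# described in the text. Bit layout [d5..d0 | r1 r0] is illustrative.

def pack_depth(depth6, reliability2):
    assert 0 <= depth6 <= 63 and 0 <= reliability2 <= 3
    return (depth6 << 2) | reliability2

def unpack_depth(byte):
    return byte >> 2, byte & 0b11  # (depth, reliability)

packed = pack_depth(42, 1)
print(packed, unpack_depth(packed))  # 169 (42, 1)
```

A depth map frame would then be one such byte per pixel, embedded in the video stream frame by frame.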
- The integrated data combining the identification information (as control information) with the depth map can also be associated with the time code of the video stream, converted into a file, and recorded in a dedicated file storage area (a directory or folder in a so-called file system).
- A time code is attached to every video frame, for example at 30 or 60 frames per second.
- a particular scene is identified by a series of time codes from the time code of the first frame of the scene to the time code of the last frame of the scene.
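- As a concrete illustration of such per-frame time codes (a minimal sketch; the non-drop-frame HH:MM:SS:FF format and the 30 fps rate are assumptions, not requirements of this disclosure):

```python
# Mapping a frame index to a non-drop-frame HH:MM:SS:FF time code, one code
# per video frame. A scene is then the run of codes from its first frame's
# code to its last frame's code, as described in the text.

def frame_to_timecode(frame, fps=30):
    ff = frame % fps                 # frame number within the second
    total_s = frame // fps
    hh, rem = divmod(total_s, 3600)
    mm, ss = divmod(rem, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"

print(frame_to_timecode(0))     # 00:00:00:00
print(frame_to_timecode(1799))  # 00:00:59:29
```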
- the identification information as the control information and the depth map can be associated with the time code of the video stream, respectively, and the respective data can be recorded in a dedicated file storage area.
- In this way, it is possible to mark scenes in which the left and right videos have a suitable amount of parallax and a powerful stereoscopic effect, as well as scenes in which the amount of parallax between the left and right videos is too large and safety is a concern. Using this marking, high-speed search (cueing) of powerful scenes with a strong stereoscopic effect (3D feeling) and application to highlight playback can easily be realized. Also, using this marking, it becomes possible to skip playback of scenes that do not require 3D output or scenes with safety problems, or to reprocess them into safe images (convert them into safe images by signal processing).
- For example, the width of the depth range can be reduced to convert the video into a safe stereoscopic (3D) video that does not cause visual breakdown.
- Alternatively, the video can be converted into one that remains visually intact while still having a 3D feeling of popping out in front of, or receding behind, the display screen.
- the left and right images can be converted into exactly the same image and displayed as a 2D image.
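- The reduction of the depth range width mentioned above can be sketched as follows (signed parallax values in pixels, with positive meaning pop-out; the safe limit value is an illustrative assumption):

```python
# Sketch of "reducing the depth range width" for safety: signed per-pixel
# parallax values are scaled uniformly so the largest magnitude fits within
# a safe band around the screen plane. The limit of 10 px is illustrative.

def compress_parallax(parallax_map, safe_limit=10.0):
    peak = max(abs(p) for p in parallax_map)
    if peak <= safe_limit or peak == 0:
        return list(parallax_map)      # already within the safe range
    scale = safe_limit / peak          # shrink the whole depth range
    return [p * scale for p in parallax_map]

print(compress_parallax([4.0, -25.0, 10.0]))  # approximately [1.6, -10.0, 4.0]
```

Setting `safe_limit` to 0 is the degenerate case of the next point: the left and right videos become identical and the result is displayed as 2D.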
- As described above, in the present embodiment, the main photographing unit 350 that captures the video constituting one side of the stereoscopic video and the sub photographing unit 351 that captures the video for detecting the amount of parallax can have different configurations.
- Since the sub photographing unit 351 can be realized with a simpler configuration than the main photographing unit 350, the stereoscopic video photographing apparatus 101 as a whole can also have a simpler configuration.
- the sub-shooting unit 351 acquires the left video L by shooting the subject with a shooting field angle wider than the shooting field angle in the right video R acquired by the main shooting unit 350.
- However, the technology in the present disclosure is not limited to this form. That is, the shooting angle of view of the video acquired by the sub shooting unit 351 may be the same as that of the video acquired by the main shooting unit 350, or the latter may be wider than the former.
- In the above description, the video from the main photographing unit 350 is treated as the right-side video of the stereoscopic video and the video generated by the image generation unit 312 as the left-side video, but the present disclosure is not limited to this.
- the positional relationship between the main photographing unit 350 and the sub photographing unit 351 may be reversed, that is, the video by the main photographing unit 350 may be the left video, and the video generated by the image generating unit 312 may be the right video.
- The size (288 × 162) of the video output from the pixel number matching unit 310 is an example, and the technology in the present disclosure is not limited to this size; videos of other sizes may be handled.
- the lens of the main photographing unit 350 and the lens of the sub photographing unit 351 have different configurations, but these may be the same configuration.
- the main photographing unit 350 and the sub photographing unit 351 may be single focal lenses having the same focal length or single focal lenses having different focal lengths.
- In that case, the angle-of-view matching unit 309 can determine in advance the portions to be extracted when performing the angle-of-view matching processing on the videos shot by both shooting units.
- the resolution of the imaging unit can be determined to an optimum resolution from the design stage according to the lens characteristics of both.
- As described above, the stereo photographing apparatus in the present embodiment includes: the main photographing unit 350, which has an optical zoom function and obtains a first image by photographing a subject; the sub photographing unit 351, which obtains a second image by photographing the subject; the angle-of-view matching unit 309, which extracts from each of the first and second images the image portions estimated to have the same angle of view; the parallax information generation unit 311, which generates parallax information from the two image portions matched by the angle-of-view matching unit 309; and the reliability information generation unit 319, which generates reliability information indicating the reliability of the parallax information.
- the parallax information can be appropriately corrected using the reliability information, so that a more appropriate stereoscopic image can be generated. Further, for example, an operation such as correcting a scene with an inappropriate parallax can be performed in a playback device different from the photographing device. As a result, it is possible to improve the safety of stereoscopic video by appropriately correcting a dangerous scene including a sudden pop-out.
- the sub photographing unit 351 obtains the second image by photographing the subject with a photographing field angle wider than the photographing field angle in the first image.
- Therefore, even when the main photographing unit 350 performs zoom photographing, the resolution of the image portion extracted from the second image can be kept relatively high.
- the photographing apparatus further includes a pixel number matching unit 310 that matches the pixel numbers of the two image portions estimated to have the same field angle by the field angle matching unit 309.
- The parallax information generation unit 311 generates the parallax information by performing stereo matching processing between the two image portions whose numbers of pixels have been matched by the pixel number matching unit 310, and obtaining the amount of parallax for each pixel.
- the reliability information generation unit 319 generates reliability information based on the number of pixels of the two image portions on which the stereo matching process has been performed by the parallax information generation unit 311.
- Further, the reliability information generation unit 319 generates reliability information based on the degree of matching of feature points between the two image portions subjected to stereo matching processing by the parallax information generation unit 311.
- the reliability information generation unit 319 generates reliability information based on the similarity of the gamma characteristics of the main imaging unit 350 and the sub imaging unit 351.
- Further, the reliability information generation unit 319 generates reliability information based on the similarity between the average luminance values of the entire screens of the first and second images, or the similarity between the average luminance values of specific areas constituting those screens.
- The reliability information is set lower as the difference in the average luminance value (of the entire image or the specific area) between the two images becomes larger, and higher as the difference becomes smaller.
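- For example, a per-frame reliability level based on average-luminance similarity might be computed as follows (the thresholds and the three-level scale are hypothetical; the text specifies only that reliability decreases as the difference grows):

```python
# Reliability from luminance similarity between the two cameras' images:
# the larger the difference in average luminance, the lower the reliability.
# Levels follow the earlier example ("1: reliable, 2: slightly reliable,
# 3: unreliable"); the thresholds 5 and 20 are illustrative assumptions.

def luminance_reliability(img_a, img_b):
    mean_a = sum(img_a) / len(img_a)   # average luminance, image A
    mean_b = sum(img_b) / len(img_b)   # average luminance, image B
    diff = abs(mean_a - mean_b)
    if diff < 5:
        return 1   # reliable
    if diff < 20:
        return 2   # slightly reliable
    return 3       # unreliable

print(luminance_reliability([100, 110, 120], [102, 111, 118]))  # 1
```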
- Further, the reliability information generation unit 319 generates reliability information based on the degree of level change in an edge region, in the region surrounding an edge, or in both, in at least one of the first and second images.
- Further, the reliability information generation unit 319 generates reliability information based on the horizontal or vertical size of the occlusion areas included in the two image portions whose angles of view have been matched by the angle-of-view matching unit 309.
- the disparity information generation unit 311 corrects the disparity information by adaptive filtering processing in the horizontal direction, the vertical direction, or the time axis direction based on the reliability information.
- the parallax information can be appropriately corrected so as to keep the parallax amount of a scene with low reliability low.
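- A minimal sketch of such reliability-based correction (the three-level scale follows the example above; the attenuation factors are illustrative, and the disclosure's adaptive filtering in the horizontal, vertical, or time-axis directions is not reproduced here):

```python
# Sketch of "keeping the parallax amount of a scene with low reliability
# low": per-pixel parallax is attenuated according to its reliability level
# (1: reliable, 2: slightly reliable, 3: unreliable). The gain values are
# illustrative assumptions, not values specified by the disclosure.

def correct_disparity(disparity, reliability):
    gain = {1: 1.0, 2: 0.5, 3: 0.0}   # attenuation factor per level
    return [d * gain[r] for d, r in zip(disparity, reliability)]

print(correct_disparity([8.0, 6.0, 4.0], [1, 2, 3]))  # [8.0, 3.0, 0.0]
```

A real implementation would smooth rather than hard-attenuate, so that reliable and unreliable regions blend without visible seams.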
- The imaging apparatus generates, based on the first image and the corrected parallax information, the image that forms a stereoscopic pair with the first image.
- a video compression unit 315 and a storage unit 316 that record the first image, the parallax information, and the reliability information are further provided.
- the image processing apparatus further includes a video compression unit 315 and a storage unit 316 that record the first image and the parallax information corrected based on the reliability information.
- the playback device in the present embodiment generates a stereoscopic image based on the first image, parallax information, and reliability information generated by the above-described stereo imaging device.
- the reproduction apparatus includes an image processing unit 308 that generates an image that is a pair of the first image and the stereoscopic image using the parallax information corrected based on the reliability information.
- (Embodiment 2) Next, Embodiment 2 will be described. This embodiment differs from Embodiment 1 in that two sub photographing units are provided. The following description focuses on the differences from Embodiment 1; overlapping items are omitted.
- FIG. 18 is an external view showing a video photographing apparatus 1800 according to the present embodiment.
- The video photographing apparatus 1800 shown in FIG. 18 includes a center lens unit 1801, and a first sub lens unit 1802 and a second sub lens unit 1803 provided around it.
- the arrangement of the lenses is not limited to this example.
- these lenses may be arranged at a position where the distance between the first sub lens unit 1802 and the second sub lens unit 1803 is substantially equivalent to the distance between the left and right eyes of a person.
- In this case, the amount of parallax between the left and right videos of the stereoscopic video generated from the video captured via the center lens unit 1801 can be brought close to the amount of parallax seen when the subject is viewed with human eyes.
- the first sub lens unit 1802 and the second sub lens unit 1803 are arranged so that the centers of the respective lenses are located on substantially the same horizontal plane.
- The center lens unit 1801 is typically disposed at substantially the same distance from each of the first sub lens unit 1802 and the second sub lens unit 1803. This makes it easy to generate a bilaterally symmetric pair when generating the left and right videos forming the stereoscopic video from the video shot using the center lens unit 1801.
- a first sub-lens portion 1802 and a second sub-lens portion 1803 are arranged at positions adjacent to the lens barrel portion 1804 of the center lens portion 1801.
- Since the center lens unit 1801 has a substantially circular shape, it can be said that the first sub lens unit 1802 and the second sub lens unit 1803 are arranged substantially symmetrically with respect to the center lens unit 1801.
- FIG. 19 is a diagram showing an outline of the hardware configuration of the video photographing apparatus 1800.
- the video imaging apparatus 1800 has a center imaging unit 1950 including a lens group (center lens group 1900) of the center lens unit 1801 instead of the main imaging unit 250 in the first embodiment.
- a sub 1 photographing unit 1951 including a lens group (first sub lens group 1904) of the first sub lens unit 1802, and a lens group (second sub lens) of the second sub lens unit 1803.
- a sub 2 photographing unit 1952 having a group 1908).
- the center photographing unit 1950 includes a CCD 1901, an A / D conversion IC 1902, and an actuator 1903 in addition to the center lens group 1900.
- the sub 1 photographing unit 1951 also includes a CCD 1905, an A / D conversion IC 1906, and an actuator 1907.
- the sub 2 photographing unit 1952 also includes a CCD 1909, an A / D conversion IC 1910, and an actuator 1911.
- The center lens group 1900 of the center photographing unit 1950 in the present embodiment is configured by a lens group relatively larger than the first sub lens group 1904 of the sub 1 photographing unit 1951 and the second sub lens group 1908 of the sub 2 photographing unit 1952.
- the center photographing unit 1950 is equipped with a zoom function. This is because the image captured by the center lens group 1900 is the basis for generating a stereoscopic image, and therefore it is preferable that the light collecting ability is high and the imaging magnification can be arbitrarily changed.
- the first sub lens group 1904 of the sub 1 imaging unit 1951 and the second sub lens group 1908 of the sub 2 imaging unit may be smaller lenses than the center lens group 1900 of the center imaging unit 1950. Further, the sub 1 shooting unit 1951 and the sub 2 shooting unit 1952 may not have a zoom function.
- On the other hand, it is preferable that the CCD 1905 of the sub 1 photographing unit 1951 and the CCD 1909 of the sub 2 photographing unit 1952 have a resolution higher than that of the CCD 1901 of the center photographing unit. Since part of the video shot by the sub 1 photographing unit 1951 or the sub 2 photographing unit 1952 may be extracted by electronic zoom in the processing of the angle-of-view matching unit 2013 described later, these CCDs are given high definition so that the accuracy of the image can be maintained at that time.
- FIG. 20 is a functional configuration diagram of the video photographing apparatus 1800.
- The video imaging apparatus 1800 differs in that it includes a center imaging unit 2050 instead of the main imaging unit 350, and a first sub imaging unit 2051 and a second sub imaging unit 2052 instead of the sub imaging unit 351.
- the center photographing unit 2050 and the main photographing unit 350 are substantially functionally equivalent, and the first sub photographing unit 2051 and the second sub photographing unit 2052 are substantially functionally equivalent to the sub photographing unit 351.
- the configuration of the video photographing apparatus 1800 illustrated in FIG. 18 will be described as an example, but the technology in the present disclosure is not limited to this configuration.
- a configuration in which three or more sub photographing units are provided may be used.
- the sub photographing unit does not necessarily have to be arranged on substantially the same horizontal plane as the center photographing unit. It may be intentionally arranged at a position different from the center photographing unit or another sub photographing unit in the vertical direction. With such a configuration, it is possible to capture a video with a stereoscopic effect in the vertical direction.
- the video photographing apparatus 1800 can realize photographing (multi-view photographing) from various angles.
- the image signal processing unit 2012 includes an angle-of-view matching unit 2013, a pixel number matching unit 2014, a parallax information generation unit 2015, an image generation unit 2016, and a shooting control unit 2017, similarly to the image signal processing unit 308 in the first embodiment.
- the angle-of-view matching unit 2013 matches the angle of view of the video input from the center photographing unit 2050, the first sub photographing unit 2051, and the second sub photographing unit 2052. Unlike the first embodiment, the angle-of-view matching unit 2013 performs a process of matching the angles of view of videos taken from three different angles.
- the pixel number matching unit 2014 performs a process of matching the number of pixels between the three videos whose field angles are matched by the field angle matching unit 2013.
- the disparity information generation unit 2015 detects the amount of parallax of the photographed subject from the three images in which the angle of view and the number of pixels are matched by the angle of view matching unit 2013 and the pixel number matching unit 2014, and two types of depth maps Is generated.
- the image generation unit 2016 uses the right and left videos for stereoscopic video from the video shot by the center shooting unit 2050 based on the parallax amount (depth map) of the subject shot in the video generated by the parallax information generation unit 2015. Is generated.
- the reliability information generation unit 2023 generates reliability information indicating the reliability of the two types of depth maps generated by the parallax information generation unit 2015.
- the reliability information is referred to by the disparity information generation unit 2015, and two types of depth maps are corrected based on the reliability information.
- the imaging control unit 2017 controls the imaging conditions of the center imaging unit 2050, the first sub imaging unit 2051, and the second sub imaging unit 2052 based on the parallax amount calculated by the parallax information generation unit 2015.
- the horizontal direction detection unit 2022, the display unit 2018, the video compression unit 2019, the storage unit 2020, and the input unit 2021 are respectively the horizontal direction detection unit 318, display unit 314, video compression unit 315, storage unit 316, and input unit of the first embodiment. Since it is the same as 317, the description is omitted.
- the image signal processing unit 2012 receives video signals from the three systems of the center imaging unit 2050, the first sub imaging unit 2051, and the second sub imaging unit 2052, and 2 based on the input three video signals. Types of parallax information are calculated. Thereafter, left and right videos that newly form a stereoscopic video are generated from the video shot by the center shooting unit 2050 based on the calculated parallax information.
- FIG. 21 shows the relationship between the three images input to the angle-of-view matching unit 2013 and the angle-of-view matching processing performed by the angle-of-view matching unit 2013.
- With reference to the video (Center) shot by the center shooting unit 2050, the angle-of-view matching unit 2013 performs an operation of extracting, from the videos (Sub1, Sub2) shot by the first sub shooting unit 2051 and the second sub shooting unit 2052, the same region as the portion (angle of view) shot by the center shooting unit 2050.
- The angle-of-view matching unit 2013 may perform the angle-of-view matching operation based on the input videos, or may determine the angle of view from the control content of the imaging control unit 2017 at the time of shooting, in particular from the relationship between the zoom magnification of the center shooting unit 2050 and the single focal lengths of the first sub shooting unit 2051 and the second sub shooting unit 2052.
- FIG. 22 is a diagram illustrating processing results obtained by the angle-of-view matching unit 2013, the pixel number matching unit 2014, the parallax information generation unit 2015, and the image generation unit 2016.
- the pixel number matching unit 2014 performs a process of matching the number of pixels for the three images that have been subjected to the field angle matching.
- the image by the center photographing unit 2050 has a size of 1920 ⁇ 1080, and the images photographed and extracted by the first sub photographing unit 2051 and the second sub photographing unit 2052 are both 1280 ⁇ 720 pixels.
- In this embodiment, the pixel number matching unit 2014 matches the number of pixels to a size of, for example, 288 × 162, as in Embodiment 1.
- The three videos are adjusted to a predetermined target size in order to facilitate the image signal processing performed by the image signal processing unit 2012 as a whole. Therefore, instead of simply matching to the video having the smallest number of pixels among the three, the number of pixels may be matched among the three videos and, at the same time, the image size may be changed to one that is easy for the entire system to process.
- However, the technology in the present disclosure is not limited to performing the above processing.
- a process of matching the number of pixels of another video with the video having the minimum number of pixels may be performed.
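- The pixel-number matching described above can be sketched as follows (a 1-D nearest-neighbour resampler stands in for the 2-D resize; the resizing method itself is not specified by this disclosure):

```python
# Sketch of pixel-number matching: each video is resampled to a common
# working length before parallax detection (in the text, a common 2-D size
# such as 288 x 162). Nearest-neighbour sampling is an illustrative choice;
# a real system would typically filter before downsampling.

def resample(row, new_len):
    step = len(row) / new_len
    return [row[int(i * step)] for i in range(new_len)]

print(resample([0, 1, 2, 3, 4, 5, 6, 7], 4))  # [0, 2, 4, 6]
```

Applying the same resampler to all three videos gives image portions with identical pixel counts, ready for stereo matching.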
- The parallax information generation unit 2015 detects the amount of parallax among the three videos. Specifically, it calculates information indicating the difference Δ(Cs/S1s) between the center video (Cs) from the center shooting unit 2050 and the sub 1 video (S1s) from the first sub shooting unit 2051, both of which have had their numbers of pixels matched by the pixel number matching unit 2014. It likewise calculates the difference Δ(Cs/S2s) between the center video (Cs) and the sub 2 video (S2s) from the second sub shooting unit 2052. Based on this difference information, the parallax information generation unit 2015 determines the information (depth maps) indicating the left and right parallax amounts.
- At this time, the parallax information generation unit 2015 may take left-right symmetry into account when determining the left and right parallax amounts from the differences Δ(Cs/S1s) and Δ(Cs/S2s). For example, if an extremely large amount of parallax exists on the left side but not on the right side, the more reliable of the two values may be used when determining the parallax amount for such pixels. In this way, the final parallax amount can be determined in consideration of both the left and right parallax values.
- That is, the parallax information generation unit 2015 can evaluate the left-right symmetry and, based on it, reduce the influence of unreliable values on the calculation of the parallax amount.
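- The use of left-right symmetry described above can be sketched as follows (the outlier test and its threshold are illustrative assumptions; the disclosure states only that the more reliable value may be used when one side is extreme):

```python
# Sketch of merging the two per-pixel parallax estimates (versus sub 1 and
# sub 2). Where one side is far more extreme than the other, the smaller,
# less extreme estimate is trusted; otherwise the two are averaged. The
# ratio test with max_ratio=2.0 is an illustrative assumption.

def merge_parallax(d_left, d_right, max_ratio=2.0):
    merged = []
    for dl, dr in zip(d_left, d_right):
        lo, hi = sorted((abs(dl), abs(dr)))
        if lo > 0 and hi / lo > max_ratio:
            # asymmetric: take the less extreme (more plausible) estimate
            merged.append(dl if abs(dl) < abs(dr) else dr)
        else:
            merged.append((dl + dr) / 2.0)  # symmetric: average both sides
    return merged

print(merge_parallax([4.0, 12.0], [4.4, 3.0]))  # approximately [4.2, 3.0]
```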
- the image generation unit 2016 generates left and right images constituting a stereoscopic image from the depth map generated by the parallax information generation unit 2015 and the image captured by the center imaging unit 2050.
- Specifically, referring to the depth map, each subject or video portion in the video (Center) shot by the center shooting unit 2050 is moved to the left or right according to its parallax amount, thereby generating the left video (Left) and the right video (Right).
- the building on the left side of the left image is shifted to the right side by the amount of parallax from the position in the center image.
- the background portion is almost the same as the video by the center photographing unit 2050 because the amount of parallax is small.
- the building that is the subject is shifted to the left by the amount of parallax from the position in the center image.
- the background portion is almost the same as the image by the center photographing unit 2050 for the same reason.
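- The generation of the left and right videos by shifting pixels according to the depth map can be sketched on a single scanline as follows (integer parallax values and trivial hole handling are simplifying assumptions; real systems interpolate the uncovered regions):

```python
# Sketch of view generation as in FIG. 22: each pixel of a scanline from
# the center video is shifted horizontally by its parallax amount. Holes
# opened where a foreground object moves away simply keep a fill value
# here; a background portion with zero parallax stays in place.

def shift_row(row, parallax, fill=0):
    out = [fill] * len(row)
    for x, (pixel, d) in enumerate(zip(row, parallax)):
        nx = x + d                     # shift by the parallax amount
        if 0 <= nx < len(row):
            out[nx] = pixel            # pixels shifted off-screen are dropped
    return out

row = [10, 20, 30, 40]
parallax = [0, 0, 1, 1]                # pixels 2-3: a foreground object
print(shift_row(row, parallax))        # [10, 20, 0, 30]
```

Running the same shift with mirrored parallax signs on the other side yields the opposite view, giving the left-right pair described above.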
- The imaging control unit 2017 performs the same control as in Embodiment 1. That is, the center shooting unit 2050 mainly shoots the video that forms the basis of the stereoscopic video, while the first sub shooting unit 2051 and the second sub shooting unit 2052 shoot videos for acquiring parallax information relative to the video shot by the center shooting unit 2050. The imaging control unit 2017 therefore performs imaging control suited to each role on the optical unit 2000 of the center shooting unit, the sub 1 optical unit 2004, and the sub 2 optical unit 2008, through the optical control units 2003, 2007, and 2011, respectively. Examples include exposure control and autofocus, as in Embodiment 1.
- In addition, the imaging control unit 2017 controls the cooperation among these three shooting units.
- the first sub photographing unit 2051 and the second sub photographing unit 2052 shoot a video for acquiring left and right parallax information at the time of generating a stereoscopic video. Therefore, the first sub photographing unit 2051 and the second sub photographing unit 2052 may perform control to be symmetrical in cooperation.
- the imaging control unit 2017 performs control in consideration of these restrictions when controlling the first sub imaging unit 2051 and the second sub imaging unit 2052.
- The left and right videos (Left Video Stream, Right Video Stream) constituting the stereoscopic video generated by the image generation unit 2016 are encoded by the video compression unit 2019, and the encoded data are multiplexed and recorded in the storage unit 2020.
- the playback device can play back recorded stereoscopic video if it can divide the recorded data into left and right data, and then decode and play back each data.
- the advantage of this method is that the configuration of the playback device can be made relatively simple.
- FIG. 24(b) shows a method of recording the center video (Main Video Stream) from the center photographing unit 2050, which forms the basis of the stereoscopic video, together with the depth maps (parallax amounts) of the left and right videos relative to the center video.
- the video compression unit 2019 encodes the video by the center photographing unit 2050 as data and the left and right depth maps for the video.
- the video compression unit 2019 multiplexes each encoded data and records it in the storage unit 2020.
- the playback device reads data from the storage unit 2020, divides it for each data type, and decodes the divided data.
- the playback device further generates and displays left and right videos constituting the stereoscopic video based on the left and right depth maps from the decoded center video.
- The advantage of this method is that only one video stream, which accounts for most of the data volume, is recorded; by recording with it only the depth maps necessary for generating the left and right videos, the amount of recorded data can be suppressed.
- The method of FIG. 24(c) is the same as that of FIG. 24(b) in that the video from the center photographing unit 2050, which forms the basis of the stereoscopic video, is recorded; however, difference information (difference images) is recorded instead of depth maps.
- In this method, the video compression unit 2019 encodes the video from the center photographing unit 2050 and the left and right difference information Δ(Cs/Rs) and Δ(Cs/Ls) relative to it, then multiplexes the encoded data and records it in the storage unit 2020.
- On the playback side, the playback device divides the data recorded in the storage unit 2020 by data type and decodes each. It then calculates depth maps from the difference information Δ(Cs/Rs) and Δ(Cs/Ls), and generates and displays the left and right videos constituting the stereoscopic video from the center video.
- the advantage of this method is that the playback device can generate a depth map and generate a stereoscopic image according to the performance of its display. Therefore, it is possible to realize the reproduction of the stereoscopic video according to the individual reproduction conditions.
- reliability information may be recorded together. If the reliability information is recorded, it is possible to adjust the stereoscopic video based on the reliability of the depth map on the playback device side, and thus it is possible to reproduce the stereoscopic video more safely.
- As described above, the video imaging apparatus according to the present embodiment can generate both of the left and right videos constituting the stereoscopic video from the video shot by the center shooting unit 2050. If, as in the prior art, one video is the actually captured video and the other is generated from it, a large bias arises in the reliability of the left and right videos. In the present embodiment, by contrast, both the left and right videos are generated from the captured base video. Since the videos can thus be created with left-right symmetry in mind, a more natural, well-balanced stereoscopic video can be generated.
- As described above, in the present embodiment, the center photographing unit 2050 that shoots the video forming the basis of the stereoscopic video and the sub photographing units 2051 and 2052 that shoot videos for detecting the amount of parallax can have different configurations.
- In particular, since the sub photographing units 2051 and 2052 for detecting the amount of parallax can be realized with simpler configurations than the center photographing unit 2050, the stereoscopic video photographing apparatus 1800 can be given a simpler configuration overall.
- As in Embodiment 1, the size of the video output from the pixel number matching unit 2014 is an example, and the technology in the present disclosure is not limited to it; videos of other sizes may be handled.
- Embodiments 1 and 2 have been described as examples of the technology disclosed in the present application. However, the technology in the present disclosure is not limited to these, and is also applicable to embodiments in which changes, replacements, additions, omissions, and the like are made as appropriate. It is also possible to combine the components described in Embodiments 1 and 2 to form a new embodiment.
- The video shooting apparatuses illustrated in FIG. 1B and FIG. 18 have been described as examples, but the video shooting apparatus in the present disclosure is not limited to these configurations.
- As another configuration, the video shooting apparatus may have, for example, the configuration illustrated in FIG. 25.
- FIG. 25A shows a configuration example in which the sub shooting unit 2503 is arranged on the left side of the main shooting unit 2502 when viewed from the front of the video shooting apparatus.
- the sub photographing unit 2503 is supported by the sub lens support portion 2501 and disposed at a position away from the main body.
- The video shooting apparatus in this example can use the video from the main shooting unit as the left video.
- FIG. 25B shows a configuration example in which, contrary to the configuration shown in FIG. 25A, the sub shooting unit 2504 is arranged on the right side of the main shooting unit 2502 when viewed from the front of the video shooting apparatus.
- the sub photographing unit 2504 is supported by the sub lens support portion 2502 and disposed at a position away from the main body.
- the video shooting apparatus can take a video with a larger parallax.
- The main shooting unit (or center shooting unit) may have a zoom lens while the sub shooting unit has a single-focus (fixed focal length) lens, and the apparatus may be configured to shoot stereoscopic video in accordance with the focal length of that single-focus lens. In this case, stereoscopic video is shot with the optical magnification of the main shooting unit matched to the optical magnification of the sub shooting unit.
- The main shooting unit may also shoot without moving the zoom lens. With such a configuration, stereoscopic video is shot with the magnification of the main shooting unit equal to that of the sub shooting unit, and the image signal processing unit can execute processing such as angle-of-view matching relatively easily.
- A stereoscopic image may be generated only when the enlargement ratio (electronic zoom) applied by the angle-of-view matching unit of the image signal processing unit, when extracting the corresponding portion from the video shot by the sub shooting unit, is within a predetermined range (for example, an enlargement ratio of four times or less). When the enlargement ratio exceeds the predetermined range, generation of the stereoscopic video may be stopped, and the image signal processing unit may be configured to output the conventional non-stereoscopic video captured by the main shooting unit.
- By stopping generation of the stereoscopic video for shot portions where the reliability of the calculated depth information (depth map) is low, the quality of the generated stereoscopic video can be kept relatively high.
- The zoom optical system or the single-focus lens optical system may be configured without an optical aperture. For example, suppose the captured stereoscopic video is in focus over the entire screen for subjects 1 m or more away from the shooting apparatus. In this case, since the entire screen is in focus, a video with defocus (blur) can be generated by image processing.
- In an optical system, the depth range to be blurred is uniquely determined by the aperture amount due to the characteristics of the optics, but with image processing, the depth range to be sharpened and the depth range to be blurred can be controlled freely. For example, the depth range to be sharpened can be made wider than is possible optically, or subjects in a plurality of separate depth ranges can be sharpened.
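As an illustrative sketch only (not part of the disclosed apparatus), the free selection of sharpened and blurred depth regions described above can be expressed as a mask over a depth map; the depth map, the depth ranges, and the blur function used here are all hypothetical inputs:

```python
import numpy as np

def depth_selective_blur(image, depth, sharp_ranges, blur):
    """Keep pixels whose depth falls in any of `sharp_ranges` sharp and
    replace all other pixels with a blurred version -- the free control
    over sharpened/blurred depth regions described above.
    `blur` is any function mapping an image to its blurred version."""
    blurred = blur(image)
    sharp = np.zeros(depth.shape, dtype=bool)
    for near, far in sharp_ranges:          # multiple depth regions allowed
        sharp |= (depth >= near) & (depth <= far)
    return np.where(sharp, image, blurred)
```

Unlike an optical aperture, `sharp_ranges` may contain several disjoint intervals, which is exactly the flexibility the passage above attributes to image processing.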
- The optical axis direction of the main shooting unit 350 or the sub shooting unit 351 may be movable. That is, the video shooting apparatus may be able to switch between the parallel method and the crossing method in stereoscopic shooting. Specifically, the optical axis may be changed by driving, with a controlled motor or the like, a lens barrel including the lens and imaging unit constituting the sub shooting unit 351. With such a configuration, the video shooting apparatus can switch between the parallel method and the crossing method according to the subject and the shooting conditions, or perform control such as moving the position of the cross point in the crossing method. Note that this may be realized by electronic control instead of mechanical control by a motor or the like.
- A very wide-angle lens such as a fisheye lens may be used for the sub shooting unit 351, as compared with the lens of the main shooting unit 350. In that case, the video shot by the sub shooting unit 351 covers a wider range (wider angle) than video shot with a normal lens, and includes the range captured by the main shooting unit 350.
- The angle-of-view matching unit extracts, from the video shot by the sub shooting unit 351, the range that would be included if shooting were performed by the crossing method, based on the video shot by the main shooting unit 350.
- An image shot with a fisheye lens has the characteristic that its peripheral portion is easily distorted. Taking this into account, the angle-of-view matching unit also performs image distortion correction at the same time as the extraction.
- This enables the video shooting apparatus to realize the parallel method and the crossing method by electronic processing, without mechanically changing the optical axes of the main shooting unit 350 and the sub shooting unit 351.
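For illustration, the distortion correction performed together with the extraction can be sketched as a remapping from an equidistant-projection fisheye image to a rectilinear image. The equidistant projection model, the focal length value used in the test, and nearest-neighbour sampling are simplifying assumptions, not the specification's method:

```python
import numpy as np

def undistort_equidistant(fish_img, f):
    """Map an equidistant-projection fisheye image to a rectilinear image.

    For each rectilinear output pixel at radius r_rect from the optical
    centre, the viewing angle is theta = atan(r_rect / f); the same ray
    appears in the fisheye image at radius r_fish = f * theta.
    Nearest-neighbour sampling; input and output have the same size.
    """
    h, w = fish_img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    dx, dy = xs - cx, ys - cy
    r_rect = np.hypot(dx, dy)
    theta = np.arctan(r_rect / f)
    r_fish = f * theta
    with np.errstate(invalid="ignore", divide="ignore"):
        scale = np.where(r_rect > 0, r_fish / r_rect, 1.0)  # centre maps to itself
    src_x = np.clip(np.round(cx + dx * scale).astype(int), 0, w - 1)
    src_y = np.clip(np.round(cy + dy * scale).astype(int), 0, h - 1)
    return fish_img[src_y, src_x]
```

Near the optical centre (r_rect much smaller than f) the remap is close to the identity; the correction grows toward the periphery, matching the observation above that fisheye distortion is strongest there.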
- The resolution of the sub shooting unit 351 may be made sufficiently higher than that of the main shooting unit 350 (for example, twice as high or more). This is because the video shot by the sub shooting unit 351 is premised on being cropped by the angle-of-view matching process and the like, so the resolution of the extracted portion should be kept as high as possible.
- The method of using a wide-angle lens such as a fisheye lens has been described for the configuration of Embodiment 1, but when the configuration of Embodiment 2 (center lens, first sub lens, second sub lens) is adopted, the above method can be applied to the relationship between at least two of the three lenses.
- the parallax information generation units 311 and 2015 may change the calculation accuracy of the depth information (depth map) and the calculation step of the depth information according to the position and distribution of the subject within the shooting angle of view and the contour of the subject.
- For example, the parallax information generation units 311 and 2015 may set the step of the depth information coarsely outside a certain subject and finely inside the subject.
- the parallax information generation units 311 and 2015 may have the depth information in a hierarchical structure inside and outside the subject according to the angle of view and the content of the composition.
- As the subject distance increases, the parallax amount of a distant subject becomes small. For example, for an image with a horizontal resolution of 288 pixels, comparing the subject distance range (subject distance region) for a parallax amount of 3 pixels, the subject distance region for a parallax amount of 2 pixels, and the subject distance region for a parallax amount of 1 pixel, the subject distance region widens as the parallax amount decreases. That is, the sensitivity of the change in parallax to a change in subject distance decreases as the distance increases.
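This loss of sensitivity follows from the standard stereo relation Z = f·B/d (Z: subject distance, f: focal length in pixels, B: baseline, d: parallax in pixels). The focal length and baseline values below are illustrative assumptions, used only to show how the distance band covered by each integer parallax value widens with distance:

```python
def distance(f_px, baseline_m, disparity_px):
    # Standard stereo relation: subject distance Z = f * B / d
    return f_px * baseline_m / disparity_px

f_px, base = 300.0, 0.03   # assumed values: 300 px focal length, 3 cm baseline
# Each band is the interval of subject distances that map to parallax d pixels.
bands = {d: (distance(f_px, base, d + 1), distance(f_px, base, d))
         for d in (1, 2, 3)}
widths = {d: far - near for d, (near, far) in bands.items()}
# widths[1] (4.5 m) >> widths[3] (0.75 m): one pixel of parallax covers a far
# wider range of distances for distant subjects than for near ones.
```

In other words, a single integer step of parallax becomes a progressively coarser measure of distance, which is exactly why the flattening effect discussed next appears for distant subjects.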
- The cardboard (kakiwari) effect is a phenomenon in which a certain part of the image looks flat, like a painted stage backdrop.
- By using this depth change amount, a parallax amount of one pixel can be divided into, for example, two or four equal parts. The sensitivity of the parallax can thereby be doubled or quadrupled, so the cardboard effect can be reduced.
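One common way to obtain parallax at finer-than-one-pixel steps is parabolic interpolation of the matching cost around the best integer parallax. The following is a generic sketch of that idea under assumed cost values, not the exact method of the parallax information generation units:

```python
def subpixel_disparity(costs, d_best):
    """Refine an integer disparity by fitting a parabola through the
    matching costs at d_best - 1, d_best, d_best + 1.

    costs: mapping from disparity (pixels) to matching cost (lower = better)
    Returns d_best plus a sub-pixel offset in (-0.5, 0.5).
    """
    c_m, c_0, c_p = costs[d_best - 1], costs[d_best], costs[d_best + 1]
    denom = c_m - 2.0 * c_0 + c_p
    if denom == 0:                 # flat cost curve: no refinement possible
        return float(d_best)
    offset = 0.5 * (c_m - c_p) / denom   # vertex of the fitted parabola
    return d_best + offset
```

A quarter-pixel result such as 2.25 px effectively quadruples the parallax sensitivity in the distant range, which is the effect the passage above attributes to dividing one pixel of parallax into equal parts.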
- In this way, the parallax information generation units 311 and 2015 can calculate depth information with higher accuracy and express subtle depth within a subject.
- The video shooting apparatus can also give the generated stereoscopic video intentional variation, such as increasing or decreasing the depth of a characteristic portion.
- The video shooting apparatus can calculate and generate an image from an arbitrary viewpoint using the principle of triangulation, based on the depth information and the main image.
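The arbitrary-viewpoint calculation can be illustrated by a minimal depth-image-based rendering sketch that forward-warps each pixel horizontally in proportion to its parallax. The warping scheme, z-ordering, and hole handling here are simplifying assumptions rather than the disclosed algorithm:

```python
import numpy as np

def render_view(image, disparity, alpha):
    """Shift each pixel horizontally by alpha * disparity to synthesise a
    view between the original camera (alpha = 0) and a virtual camera one
    baseline away (alpha = 1). Row-wise forward warping with a simple
    z-buffer; unfilled holes keep the value 0."""
    h, w = image.shape
    out = np.zeros_like(image)
    depth_buf = np.full((h, w), -np.inf)
    for y in range(h):
        for x in range(w):
            nx = int(round(x + alpha * disparity[y, x]))
            if 0 <= nx < w and disparity[y, x] > depth_buf[y, nx]:
                depth_buf[y, nx] = disparity[y, x]   # nearer pixels win
                out[y, nx] = image[y, x]
    return out
```

Varying `alpha` between 0 and 1 (or beyond) yields intermediate or exaggerated viewpoints from a single base image plus its depth map, which is the basis for generating both left and right videos from one center video.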
- The video shooting device itself may further include storage means and learning means; by accumulating learning and storage about shot videos, the composition of a video made up of a subject and a background can be recognized.
- If the distance to a certain subject is known, it is possible to identify what the subject is from its size, outline, texture, color, and movement (including acceleration and angular velocity information). Therefore, it is possible not only to extract subjects of a specific color, as in chroma key processing, but also to extract people and objects at a specific distance, and further to extract specific people and objects from the recognition results.
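The distance-based extraction described above, a depth analogue of chroma keying, can be sketched as a simple mask over a depth map; the depth values and the selected range below are hypothetical:

```python
import numpy as np

def distance_key(depth_map, near, far):
    """Return a boolean mask selecting pixels whose depth lies in
    [near, far] -- a depth-based analogue of chroma keying."""
    return (depth_map >= near) & (depth_map <= far)

# hypothetical 2 x 3 depth map in metres
depth = np.array([[0.8, 2.0, 5.0],
                  [1.5, 2.5, 9.0]])
mask = distance_key(depth, 1.0, 3.0)   # keep subjects 1-3 m away
```

The mask can then be applied to the color image to cut out people or objects at the chosen distance, regardless of their color.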
- Since the video has 3D information, it can be carried over into CG (Computer Graphics) processing, enabling composition processing such as VR (Virtual Reality), AR (Augmented Reality), and MR (Mixed Reality).
- For example, the video shooting device can recognize that a blue region at an effectively infinite distance spreading across the upper part of the video is blue sky, and that a white region within that blue-sky region is a cloud.
- Likewise, the video shooting device can recognize that a gray region spreading from the center toward the bottom of the video is a road, and that an object on the road having transparent portions (glass window parts) and black, round, donut-shaped parts (tires) is a car.
- Even for something car-shaped, if its distance is known, the video shooting device can determine whether it is a real car or a toy car. In this way, when the distance to a person or object serving as the subject is known, the video shooting apparatus can recognize that person or object more accurately.
- Since the storage means and learning means of the video shooting device itself have limited capacity and processing power, these storage means and learning means may be placed on a network such as the Web and implemented as a more highly functional cloud service function with a larger recognition database. In this case, a configuration may be adopted in which shot video is sent from the video shooting device to a cloud server or the like on the network, which is then asked to recognize or look up the desired content.
- In response, the cloud server on the network returns semantic data about the subjects and background included in the shot video, and explanatory data, from the past to the present, about the places and people concerned.
- In this way, the video shooting device can be utilized as a more intelligent terminal.
- Embodiments 1 and 2 have been described using a video shooting device, but the invention described in this application is not limited to that aspect.
- The processing used in the above-described video shooting apparatus can also be realized as a software program. By executing such software on a computer having a processor, the various image processes described above can be realized.
- The above embodiments are premised on a video shooting apparatus that generates and records stereoscopic video, but the above shooting method and image processing method can also be applied to a shooting apparatus that generates only still images.
- the technology in the present disclosure can be used in an imaging device that captures a moving image or a still image.
Abstract
Description
First, Embodiment 1 will be described with reference to the attached drawings. In this specification, the term "image" refers to a concept that includes both moving images (video) and still images. In the following description, a signal or information representing an image or video may be referred to simply as an "image" or "video".
FIG. 1 is a perspective view showing the appearance of a conventional video shooting apparatus and of the video shooting apparatus according to the present embodiment. FIG. 1(a) shows a conventional video shooting apparatus 100 that shoots moving images or still images. FIG. 1(b) shows the video shooting apparatus 101 according to the present embodiment. The two differ in appearance in that the video shooting apparatus 101 includes not only a first lens unit 102 but also a second lens unit 103. To shoot video, the conventional video shooting apparatus 100 collects light only through the first lens unit 102. In contrast, the video shooting apparatus 101 according to the present embodiment collects light through two optical systems, the first lens unit 102 and the second lens unit 103, and shoots two videos with parallax (a stereoscopic video). The second lens unit 103 is a lens that is volumetrically smaller than the first lens unit 102. Here, "volumetric size" means the size expressed as the volume determined by the aperture and thickness of each lens unit. With this configuration, the video shooting apparatus 101 shoots stereoscopic video using two kinds of optical systems.
[1-2-1. Generation of the stereoscopic video signal]
Next, the stereoscopic video signal generation processing performed by the image signal processing unit 308 will be described. In the following description, the processing in the image signal processing unit 308 is assumed to be realized by software using the CPU 208, but the present embodiment is not limited to this. For example, the same processing may be realized by a hardware configuration such as an FPGA or another integrated circuit.
Next, the operation of the shooting control unit 313 in the image signal processing unit 308 shown in FIG. 3 will be described. The shooting control unit 313 controls the shooting conditions of the main shooting unit 350 and the sub shooting unit 351 based on the parallax information calculated by the parallax information generation unit 311.
Next, the generation of reliability information by the reliability information generation unit 319 shown in FIG. 3, and the depth map correction processing by the parallax information generation unit 311, will be described.
Next, an example of the processing performed by the angle-of-view matching unit 309 when shooting is done with the video shooting apparatus 101 tilted relative to the horizontal plane will be described. The angle-of-view matching unit 309 acquires information about the horizontal direction of the video shooting apparatus 101 from the horizontal direction detection unit 318. In general, the left and right videos constituting a stereoscopic video have parallax in the horizontal direction but not in the vertical direction. This is because the human left and right eyes are separated by a certain distance horizontally, while lying on approximately the same horizontal plane vertically. For this reason, human perceptual cells such as those of the retina are comparatively sensitive to horizontal retinal disparity; for example, a depth corresponding to a few seconds of visual angle, or about 0.5 mm at a viewing distance of 1 m, can be detected. While sensitivity to horizontal parallax is high, sensitivity to vertical parallax is generally considered relatively low, since it depends on specific spatial perception patterns based on vertical retinal disparity. In view of this, it is considered preferable that parallax in shot and generated stereoscopic video also be produced only in the horizontal direction, not in the vertical direction.
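For illustration, when the horizontal direction detection unit reports that the device is tilted, the parallax vector measured between matching points acquires a vertical component. The roll angle and the parallax along the true baseline can be recovered as sketched below; this is an assumption-laden simplification, not the disclosed angle-of-view matching algorithm:

```python
import math

def level_parallax(dx, dy):
    """Given the parallax vector (dx, dy) measured between matching points
    in a tilted stereo pair, return the roll angle of the device and the
    parallax magnitude along the true baseline. Rotating both images by
    -roll makes the remaining parallax purely horizontal."""
    roll = math.atan2(dy, dx)        # tilt of the stereo baseline
    magnitude = math.hypot(dx, dy)   # parallax along the true baseline
    return roll, magnitude
```

When the device is level (`dy == 0`) the roll is zero and the measured horizontal parallax is used unchanged; any nonzero roll indicates vertical parallax that should be removed before display.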
As described above, the video shooting apparatus 101 generates stereoscopic video from the videos shot by the main shooting unit 350 and the sub shooting unit 351. However, the video shooting apparatus 101 does not always need to generate stereoscopic video. Stereoscopic video makes the viewer feel that the displayed video is three-dimensional by letting the viewer perceive the front-rear relationship of subjects through the parallax between the left and right videos; for video from which no stereoscopic effect can be obtained, stereoscopic video need not be generated. For example, the apparatus may switch between shooting stereoscopic video and shooting non-stereoscopic video depending on the shooting conditions and the content of the video.
(Case 1) When the shooting apparatus is in manual focus mode, the subject is usually the shooting target that the photographer has brought into focus.
(Case 2) When the shooting apparatus is in autofocus mode, the subject is the shooting target that the apparatus has automatically brought into focus. In this case, the subject is typically a person, animal, plant, or object near the center of the frame, or a person's face or a conspicuous object (generally called a salient object) automatically detected within the shooting range.
Next, recording formats for the generated stereoscopic video and related data will be described with reference to FIG. 16. There are several formats for recording the stereoscopic video generated by the angle-of-view matching unit 309, the pixel number matching unit 310, the parallax information generation unit 311, and the image generation unit 312.
In the present embodiment, the sub shooting unit 351 acquires the left video L by shooting the subject at a wider shooting angle of view than that of the right video R acquired by the main shooting unit 350, but the technology in the present disclosure is not limited to this form. That is, the shooting angle of view of the image acquired by the sub shooting unit 351 may be the same as that of the image acquired by the main shooting unit 350, or the latter may be wider than the former.
As described above, the stereo shooting apparatus according to the present embodiment includes: a main shooting unit 350 that has an optical zoom function and acquires a first image by shooting a subject; a sub shooting unit 351 that acquires a second image by shooting the same subject; an angle-of-view matching unit 309 that extracts, from each of the first image and the second image, image portions estimated to have the same angle of view; a parallax information generation unit 311 that generates parallax information indicating the parallax between the two image portions estimated by the angle-of-view matching unit 309 to have the same angle of view; and a reliability information generation unit 319 that generates reliability information indicating the reliability of the parallax information based on at least one of the first image, the second image, and the parallax information.
Next, Embodiment 2 will be described. The present embodiment differs from Embodiment 1 in that two sub shooting units are provided. The following description focuses on the differences from Embodiment 1, and descriptions of overlapping matters are omitted.
FIG. 18 is an external view showing a video shooting apparatus 1800 according to the present embodiment. The video shooting apparatus 1800 of FIG. 18 includes a center lens unit 1801 and, arranged around it, a first sub lens unit 1802 and a second sub lens unit 1803. The arrangement of the lenses is not limited to this example. For example, these lenses may be arranged at positions such that the distance between the first sub lens unit 1802 and the second sub lens unit 1803 is approximately equal to the distance between a person's left and right eyes. In this case, as described below, the amount of parallax between the left and right videos of the stereoscopic video generated from the video shot through the center lens unit 1801 can be brought close to the amount of parallax experienced when a person views the object with his or her own eyes. In this case, the first sub lens unit 1802 and the second sub lens unit 1803 are arranged so that the centers of the respective lenses lie on approximately the same horizontal plane.
[2-2-1. Generation of the stereoscopic video signal]
The generation of the stereoscopic video signal in the present embodiment will now be described. It differs greatly from Embodiment 1 in the following respect: video signals from three systems, the center shooting unit 2050, the first sub shooting unit 2051, and the second sub shooting unit 2052, are input to the image signal processing unit 2012, and two kinds of parallax information are calculated based on these three input video signals. Then, based on the calculated parallax information, left and right videos constituting a new stereoscopic video are generated from the video shot by the center shooting unit 2050.
The shooting control unit 2017 performs control similar to that of Embodiment 1. That is, the center shooting unit 2050 mainly shoots the video that forms the basis of the stereoscopic video, while the first sub shooting unit 2051 and the second sub shooting unit 2052 shoot videos for acquiring parallax information with respect to the video shot by the center shooting unit 2050. The shooting control unit 2017 therefore performs shooting control suited to each role on the first optical unit 2000, the sub-1 optical unit 2004, and the sub-2 optical unit 2008, through the optical control units 2003, 2007, and 2011; examples include exposure control and autofocus, as in Embodiment 1.
The present embodiment also has a plurality of stereoscopic video recording formats, as in Embodiment 1. Each recording format will be described below with reference to FIG. 24.
With the above configuration, the video shooting apparatus according to the present embodiment can generate the left and right videos constituting the stereoscopic video from the video shot by the center shooting unit 2050. If, as in the prior art, one video is an actually shot video while the other is generated based on it, a large bias arises in the reliability of the left and right videos. In the present embodiment, by contrast, both the left and right videos are generated from the shot base video. Since the videos can be created with the left-right symmetry of the stereoscopic video in mind, a more natural, well-balanced video can be generated.
As above, Embodiments 1 and 2 have been described as examples of the technology disclosed in the present application. However, the technology in the present disclosure is not limited to these, and is also applicable to embodiments in which changes, replacements, additions, omissions, and the like are made as appropriate. It is also possible to combine the components described in Embodiments 1 and 2 above to form a new embodiment.
102, 200 First lens group
103, 204 Second lens group
104 Monitor unit
201, 205, 1901, 1905, 1909 CCD
202, 206, 1902, 1906, 1910 A/D IC
203, 207, 1903, 1907, 1911 Actuator
208, 1912 CPU
209, 1913 RAM
210, 1914 ROM
211, 1919 Acceleration sensor
212, 1915 Display
213, 1916 Encoder
214, 1917 Storage device
215, 1918 Input device
250 Main shooting unit
251 Sub shooting unit
300 First optical unit
301, 305, 2001, 2005, 2009 Imaging unit
302, 306, 2002, 2006, 2010 A/D conversion unit
303, 307, 2003, 2007, 2011 Optical control unit
304 Second optical unit
308, 2012 Image signal processing unit
309, 2013 Angle-of-view matching unit
310, 2014 Pixel number matching unit
311, 2015 Parallax information generation unit
312, 2016 Image generation unit
313, 2017 Shooting control unit
319, 2023 Reliability information generation unit
314, 2018 Display unit
315, 2019 Video compression unit
316, 2020 Storage unit
317, 2021 Input unit
318, 2022 Horizontal direction detection unit
350 Main shooting unit
351 Sub shooting unit
600 Building
1801, 1900 Center lens group
1802 First sub lens group
1803 Second sub lens group
1804 Lens barrel unit
1950 Center shooting unit
1951 Sub-1 shooting unit
1952 Sub-2 shooting unit
2000 Center optical unit
2004 Sub-1 optical unit
2008 Sub-2 optical unit
2050 Center shooting unit
2051 First sub shooting unit
2052 Second sub shooting unit
2501, 2502 Sub lens support portion
Claims (11)
- A first shooting unit configured to acquire a first image by shooting a subject, the first shooting unit having an optical zoom function;
a second shooting unit configured to acquire a second image by shooting the subject; and
an image signal processing unit that processes the first image and the second image,
wherein
the image signal processing unit includes:
an angle-of-view matching unit that extracts, from each of the first image and the second image, image portions estimated to have the same angle of view;
a parallax information generation unit that generates parallax information indicating the parallax between the two image portions estimated by the angle-of-view matching unit to have the same angle of view;
a reliability information generation unit that generates reliability information indicating the reliability of the parallax information based on at least one of the first image, the second image, and the parallax information; and
an image generation unit that generates, based on the parallax information and the first image, a third image that constitutes a stereoscopic image together with the first image,
the parallax information generation unit correcting the parallax information based on the reliability information:
a stereo shooting apparatus. - The stereo shooting apparatus according to claim 1, wherein the image signal processing unit further includes a control unit that enables the operation of the parallax information generation unit only when the distance to the subject, determined based on the focal length of the optical system of the first shooting unit or the second shooting unit, is smaller than a predetermined threshold.
- The stereo shooting apparatus according to claim 1 or 2, wherein the second shooting unit acquires the second image by shooting the subject at a shooting angle of view wider than the shooting angle of view of the first image.
- The stereo shooting apparatus according to any one of claims 1 to 3, further comprising a pixel number matching unit that matches the numbers of pixels of the two image portions estimated by the angle-of-view matching unit to have the same angle of view,
wherein the parallax information generation unit generates the parallax information by performing stereo matching between the two image portions whose numbers of pixels have been matched by the pixel number matching unit and obtaining a parallax amount for each pixel. - The stereo shooting apparatus according to any one of claims 1 to 4, wherein the parallax information generation unit corrects the parallax information by adaptive filtering in the horizontal direction, the vertical direction, or the time-axis direction, based on the reliability information.
- The stereo shooting apparatus according to any one of claims 1 to 5, wherein the reliability information generation unit generates the reliability information based on the number of pixels of the two image portions for which the parallax information was obtained by the parallax information generation unit.
- The stereo shooting apparatus according to any one of claims 1 to 5, wherein the reliability information generation unit generates the reliability information based on the degree of matching of feature points between the two image portions for which the parallax information was obtained by the parallax information generation unit.
- The stereo shooting apparatus according to any one of claims 1 to 5, wherein the reliability information generation unit generates the reliability information based on the similarity of the gamma characteristics of the first shooting unit and the second shooting unit.
- The stereo shooting apparatus according to any one of claims 1 to 5, wherein the reliability information generation unit generates the reliability information based on the similarity of the average luminance values of the entire screens of the first image and the second image, or the similarity of the average luminance values of specific regions constituting the screens of the first image and the second image.
- The stereo shooting apparatus according to any one of claims 1 to 5, wherein the reliability information generation unit generates the reliability information based on the magnitude of a level change in at least one of an edge region and a region surrounding an edge in at least one of the first image and the second image.
- The stereo shooting apparatus according to any one of claims 1 to 5, wherein the reliability information generation unit generates the reliability information based on the horizontal or vertical size of an occlusion region included in the two image portions whose angles of view have been matched by the angle-of-view matching unit.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2013525059A JP5414947B2 (ja) | 2011-12-27 | 2012-12-20 | ステレオ撮影装置 |
US14/017,650 US9204128B2 (en) | 2011-12-27 | 2013-09-04 | Stereoscopic shooting device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011-285213 | 2011-12-27 | ||
JP2011285213 | 2011-12-27 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/017,650 Continuation US9204128B2 (en) | 2011-12-27 | 2013-09-04 | Stereoscopic shooting device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013099169A1 true WO2013099169A1 (ja) | 2013-07-04 |
Family
ID=48696715
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2012/008155 WO2013099169A1 (ja) | 2011-12-27 | 2012-12-20 | ステレオ撮影装置 |
Country Status (3)
Country | Link |
---|---|
US (1) | US9204128B2 (ja) |
JP (1) | JP5414947B2 (ja) |
WO (1) | WO2013099169A1 (ja) |
Families Citing this family (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11792538B2 (en) | 2008-05-20 | 2023-10-17 | Adeia Imaging Llc | Capturing and processing of images including occlusions focused on an image sensor by a lens stack array |
US8866920B2 (en) * | 2008-05-20 | 2014-10-21 | Pelican Imaging Corporation | Capturing and processing of images using monolithic camera array with heterogeneous imagers |
EP2502115A4 (en) | 2009-11-20 | 2013-11-06 | Pelican Imaging Corp | RECORDING AND PROCESSING IMAGES THROUGH A MONOLITHIC CAMERA ARRAY WITH HETEROGENIC IMAGE CONVERTER |
US8878950B2 (en) | 2010-12-14 | 2014-11-04 | Pelican Imaging Corporation | Systems and methods for synthesizing high resolution images using super-resolution processes |
WO2013049699A1 (en) | 2011-09-28 | 2013-04-04 | Pelican Imaging Corporation | Systems and methods for encoding and decoding light field image files |
US20140002674A1 (en) | 2012-06-30 | 2014-01-02 | Pelican Imaging Corporation | Systems and Methods for Manufacturing Camera Modules Using Active Alignment of Lens Stack Arrays and Sensors |
SG11201500910RA (en) | 2012-08-21 | 2015-03-30 | Pelican Imaging Corp | Systems and methods for parallax detection and correction in images captured using array cameras |
US8866912B2 (en) | 2013-03-10 | 2014-10-21 | Pelican Imaging Corporation | System and methods for calibration of an array camera using a single captured image |
US10122993B2 (en) | 2013-03-15 | 2018-11-06 | Fotonation Limited | Autofocus system for a conventional camera that uses depth information from an array camera |
WO2014145856A1 (en) | 2013-03-15 | 2014-09-18 | Pelican Imaging Corporation | Systems and methods for stereo imaging with camera arrays |
US9497429B2 (en) | 2013-03-15 | 2016-11-15 | Pelican Imaging Corporation | Extended color processing on pelican array cameras |
JP6048574B2 (ja) * | 2013-03-29 | 2016-12-21 | 株式会社ニコン | 画像処理装置、撮像装置および画像処理プログラム |
JP6308748B2 (ja) * | 2013-10-29 | 2018-04-11 | キヤノン株式会社 | 画像処理装置、撮像装置及び画像処理方法 |
WO2015074078A1 (en) | 2013-11-18 | 2015-05-21 | Pelican Imaging Corporation | Estimating depth from projected texture using camera arrays |
US9426361B2 (en) | 2013-11-26 | 2016-08-23 | Pelican Imaging Corporation | Array camera configurations incorporating multiple constituent array cameras |
JP6561511B2 (ja) | 2014-03-20 | 2019-08-21 | 株式会社リコー | 視差値導出装置、移動体、ロボット、視差値生産導出方法、視差値の生産方法及びプログラム |
JP2016038886A (ja) * | 2014-08-11 | 2016-03-22 | ソニー株式会社 | 情報処理装置および情報処理方法 |
WO2016054089A1 (en) | 2014-09-29 | 2016-04-07 | Pelican Imaging Corporation | Systems and methods for dynamic calibration of array cameras |
CN104463890B (zh) * | 2014-12-19 | 2017-05-24 | 北京工业大学 | 一种立体图像显著性区域检测方法 |
US9686468B2 (en) * | 2015-10-15 | 2017-06-20 | Microsoft Technology Licensing, Llc | Imaging apparatus |
JP6702796B2 (ja) * | 2016-05-16 | 2020-06-03 | キヤノン株式会社 | 画像処理装置、撮像装置、画像処理方法および画像処理プログラム |
US10553029B1 (en) | 2016-09-30 | 2020-02-04 | Amazon Technologies, Inc. | Using reference-only decoding of non-viewed sections of a projected video |
US10609356B1 (en) * | 2017-01-23 | 2020-03-31 | Amazon Technologies, Inc. | Using a temporal enhancement layer to encode and decode stereoscopic video content |
US10586308B2 (en) * | 2017-05-09 | 2020-03-10 | Adobe Inc. | Digital media environment for removal of obstructions in a digital image scene |
EP3649913A4 (en) * | 2017-08-03 | 2020-07-08 | Sony Olympus Medical Solutions Inc. | MEDICAL OBSERVATION DEVICE |
WO2019119065A1 (en) | 2017-12-22 | 2019-06-27 | Maryanne Lynch | Camera projection technique system and method |
KR102113285B1 (ko) * | 2018-08-01 | 2020-05-20 | 한국원자력연구원 | 평행축 방식의 양안 카메라 시스템에서 근거리 물체의 입체영상을 위한 영상처리 방법 및 장치 |
JP7240115B2 (ja) * | 2018-08-31 | 2023-03-15 | キヤノン株式会社 | 情報処理装置及びその方法及びコンピュータプログラム |
US11361466B2 (en) * | 2018-11-30 | 2022-06-14 | Casio Computer Co., Ltd. | Position information acquisition device, position information acquisition method, recording medium, and position information acquisition system |
CN109887087B (zh) * | 2019-02-22 | 2021-02-19 | 广州小鹏汽车科技有限公司 | 一种车辆的slam建图方法及系统 |
KR102646521B1 (ko) | 2019-09-17 | 2024-03-21 | 인트린식 이노베이션 엘엘씨 | 편광 큐를 이용한 표면 모델링 시스템 및 방법 |
MX2022004163A (es) | 2019-10-07 | 2022-07-19 | Boston Polarimetrics Inc | Sistemas y metodos para la deteccion de estandares de superficie con polarizacion. |
KR20230116068A (ko) | 2019-11-30 | 2023-08-03 | 보스턴 폴라리메트릭스, 인크. | 편광 신호를 이용한 투명 물체 분할을 위한 시스템및 방법 |
JP7462769B2 (ja) | 2020-01-29 | 2024-04-05 | イントリンジック イノベーション エルエルシー | 物体の姿勢の検出および測定システムを特徴付けるためのシステムおよび方法 |
KR20220133973A (ko) | 2020-01-30 | 2022-10-05 | 인트린식 이노베이션 엘엘씨 | 편광된 이미지들을 포함하는 상이한 이미징 양식들에 대해 통계적 모델들을 훈련하기 위해 데이터를 합성하기 위한 시스템들 및 방법들 |
WO2021243088A1 (en) | 2020-05-27 | 2021-12-02 | Boston Polarimetrics, Inc. | Multi-aperture polarization optical systems using beam splitters |
US11290658B1 (en) | 2021-04-15 | 2022-03-29 | Boston Polarimetrics, Inc. | Systems and methods for camera exposure control |
US11954886B2 (en) | 2021-04-15 | 2024-04-09 | Intrinsic Innovation Llc | Systems and methods for six-degree of freedom pose estimation of deformable objects |
US11689813B2 (en) | 2021-07-01 | 2023-06-27 | Intrinsic Innovation Llc | Systems and methods for high dynamic range imaging using crossed polarizers |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004297540A (ja) * | 2003-03-27 | 2004-10-21 | Sharp Corp | 立体映像記録再生装置 |
JP2009169847A (ja) * | 2008-01-18 | 2009-07-30 | Fuji Heavy Ind Ltd | 車外監視装置 |
JP2010510600A (ja) * | 2006-11-21 | 2010-04-02 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | 画像の深度マップの生成 |
JP2011119995A (ja) * | 2009-12-03 | 2011-06-16 | Fujifilm Corp | 立体撮像装置及び立体撮像方法 |
JP2011120283A (ja) * | 2011-02-23 | 2011-06-16 | Nagoya Univ | 画像情報処理方法及び画像情報受信システム |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3826236B2 (ja) * | 1995-05-08 | 2006-09-27 | 松下電器産業株式会社 | 中間像生成方法、中間像生成装置、視差推定方法、及び画像伝送表示装置 |
JP3733358B2 (ja) * | 1996-04-05 | 2006-01-11 | 松下電器産業株式会社 | 画像伝送装置、送信装置、受信装置、送信方法および受信方法 |
JP3769850B2 (ja) * | 1996-12-26 | 2006-04-26 | 松下電器産業株式会社 | 中間視点画像生成方法および視差推定方法および画像伝送方法 |
US6163337A (en) | 1996-04-05 | 2000-12-19 | Matsushita Electric Industrial Co., Ltd. | Multi-view point image transmission method and multi-view point image display method |
KR100466458B1 (ko) | 1999-09-20 | 2005-01-14 | 마츠시타 덴끼 산교 가부시키가이샤 | 운전지원장치 |
JP4861574B2 (ja) | 2001-03-28 | 2012-01-25 | パナソニック株式会社 | 運転支援装置 |
JP2005020606A (ja) | 2003-06-27 | 2005-01-20 | Sharp Corp | デジタルカメラ |
JP2005210217A (ja) * | 2004-01-20 | 2005-08-04 | Olympus Corp | ステレオカメラ |
JP2005353047A (ja) | 2004-05-13 | 2005-12-22 | Sanyo Electric Co Ltd | 立体画像処理方法および立体画像処理装置 |
EP1784988A1 (en) * | 2004-08-06 | 2007-05-16 | University of Washington | Variable fixation viewing distance scanned light displays |
JP4624245B2 (ja) | 2005-11-29 | 2011-02-02 | イーストマン コダック カンパニー | 撮像装置 |
JP4772494B2 (ja) * | 2005-12-26 | 2011-09-14 | 富士重工業株式会社 | データ処理装置 |
EP2533541A4 (en) * | 2010-02-02 | 2013-10-16 | Konica Minolta Holdings Inc | STEREO CAMERA |
JP2011166285A (ja) * | 2010-02-05 | 2011-08-25 | Sony Corp | 画像表示装置、画像表示観察システム及び画像表示方法 |
JP2012085030A (ja) * | 2010-10-08 | 2012-04-26 | Panasonic Corp | 立体撮像装置および立体撮像方法 |
US20120200676A1 (en) * | 2011-02-08 | 2012-08-09 | Microsoft Corporation | Three-Dimensional Display with Motion Parallax |
-
2012
- 2012-12-20 WO PCT/JP2012/008155 patent/WO2013099169A1/ja active Application Filing
- 2012-12-20 JP JP2013525059A patent/JP5414947B2/ja not_active Expired - Fee Related
-
2013
- 2013-09-04 US US14/017,650 patent/US9204128B2/en not_active Expired - Fee Related
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20160060403A (ko) * | 2014-11-20 | 2016-05-30 | 삼성전자주식회사 | 영상 보정 방법 및 장치 |
KR102281184B1 (ko) * | 2014-11-20 | 2021-07-23 | 삼성전자주식회사 | 영상 보정 방법 및 장치 |
US11140374B2 (en) | 2014-11-20 | 2021-10-05 | Samsung Electronics Co., Ltd. | Method and apparatus for calibrating image |
JP2016158186A (ja) * | 2015-02-26 | 2016-09-01 | カシオ計算機株式会社 | 撮像装置、撮像方法、撮像プログラム |
JP2017112419A (ja) * | 2015-12-14 | 2017-06-22 | 日本電信電話株式会社 | 最適奥行き決定装置、最適奥行き決定方法及びコンピュータプログラム |
WO2019054304A1 (ja) * | 2017-09-15 | 2019-03-21 | 株式会社ソニー・インタラクティブエンタテインメント | 撮像装置 |
JP2019054463A (ja) * | 2017-09-15 | 2019-04-04 | 株式会社ソニー・インタラクティブエンタテインメント | 撮像装置 |
US11064182B2 (en) | 2017-09-15 | 2021-07-13 | Sony Interactive Entertainment Inc. | Imaging apparatus |
US11438568B2 (en) | 2017-09-15 | 2022-09-06 | Sony Interactive Entertainment Inc. | Imaging apparatus |
JP2020004219A (ja) * | 2018-06-29 | 2020-01-09 | キヤノン株式会社 | 3次元形状データを生成する装置、方法、及びプログラム |
JP7195785B2 (ja) | 2018-06-29 | 2022-12-26 | キヤノン株式会社 | 3次元形状データを生成する装置、方法、及びプログラム |
WO2021059695A1 (ja) * | 2019-09-24 | 2021-04-01 | ソニー株式会社 | 情報処理装置、情報処理方法および情報処理プログラム |
Also Published As
Publication number | Publication date |
---|---|
JP5414947B2 (ja) | 2014-02-12 |
US20130342641A1 (en) | 2013-12-26 |
JPWO2013099169A1 (ja) | 2015-04-30 |
US9204128B2 (en) | 2015-12-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5414947B2 (ja) | ステレオ撮影装置 | |
JP5140210B2 (ja) | 撮影装置および画像処理方法 | |
JP5320524B1 (ja) | ステレオ撮影装置 | |
JP5204350B2 (ja) | 撮影装置、再生装置、および画像処理方法 | |
JP5204349B2 (ja) | 撮影装置、再生装置、および画像処理方法 | |
JP5565001B2 (ja) | 立体映像撮像装置、立体映像処理装置および立体映像撮像方法 | |
JP6021541B2 (ja) | 画像処理装置及び方法 | |
CN102428707B (zh) | 立体视用图像对位装置和立体视用图像对位方法 | |
US9007442B2 (en) | Stereo image display system, stereo imaging apparatus and stereo display apparatus | |
US8599245B2 (en) | Image processing apparatus, camera, and image processing method | |
JP5432365B2 (ja) | 立体撮像装置および立体撮像方法 | |
JP5291755B2 (ja) | 立体視画像生成方法および立体視画像生成システム | |
JP3477023B2 (ja) | 多視点画像伝送方法および多視点画像表示方法 | |
JP5444452B2 (ja) | 立体撮像装置および立体撮像方法 | |
US20120242803A1 (en) | Stereo image capturing device, stereo image capturing method, stereo image display device, and program | |
US20130162764A1 (en) | Image processing apparatus, image processing method, and non-transitory computer-readable medium | |
WO2014148031A1 (ja) | 画像生成装置、撮像装置および画像生成方法 | |
JP5507693B2 (ja) | 撮像装置およびその動作制御方法 | |
KR20150003576A (ko) | 삼차원 영상 생성 또는 재생을 위한 장치 및 방법 | |
JP2005072674A (ja) | 三次元画像生成装置および三次元画像生成システム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2013525059 Country of ref document: JP Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12862007 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 12862007 Country of ref document: EP Kind code of ref document: A1 |
|