WO2013005477A1 - Imaging device, three-dimensional image capturing method and program - Google Patents

Imaging device, three-dimensional image capturing method and program

Info

Publication number
WO2013005477A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
template image
subject
template
search
Prior art date
Application number
PCT/JP2012/062369
Other languages
French (fr)
Japanese (ja)
Inventor
矢作 宏一 (Yahagi Kōichi)
Original Assignee
FUJIFILM Corporation (富士フイルム株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by FUJIFILM Corporation
Publication of WO2013005477A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/204 Image signal generators using stereoscopic image cameras
    • H04N13/239 Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/133 Equalising the characteristics of different image components, e.g. their average brightness or colour balance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/61 Control of cameras or camera modules based on recognised objects
    • H04N23/611 Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body

Definitions

  • the present invention relates to an imaging apparatus, a stereoscopic image imaging method, and a program, and more particularly to an imaging apparatus, a stereoscopic image imaging method, and a program capable of acquiring a plurality of viewpoint images obtained by capturing the same subject from a plurality of viewpoints.
  • Patent Document 1 describes that two images taken by a stereo camera are acquired and a head pose is detected from the images using a database created in advance.
  • In Patent Document 2, a plurality of images are acquired, an object is detected using distance information, and information other than the distance information, such as a grayscale image or a ternary edge image, is extracted as a template from the position of the object detected with the distance information; the object is then tracked using this template.
  • Patent Document 3 describes an invention for tracking a stationary subject through a time-series group of images. Specifically, a plurality of images are acquired, an initial tracking area (corresponding to a template) represented by boundary elements and image features is set in an image, a tracking area within the next frame is detected based on the initial tracking area, and a tracking area is further detected in the frame after that based on the detected tracking area.
  • In Patent Document 3, the tracking area is changed sequentially while the subject is tracked. The tracking area is changed by estimating the shape of the tracking area of the current frame based on the boundary elements, creating a plurality of candidates by deforming the tracking area of the previous frame based on the estimation result, and comparing and collating the plurality of candidates with the frame to be examined to determine the tracking area of the current frame.
  • JP 2009-169958 A; JP 11-252587 A; JP 2001-101419 A
  • In the invention described in Patent Document 1, a database used for detecting the head pose must be created in advance, so creating the database takes time. Further, the invention described in Patent Document 1 cannot detect a head that is not registered in the database; that is, when a photographer is photographing an arbitrary subject, the desire to track an object included in the image cannot be satisfied.
  • The invention described in Patent Document 2 has no problem with creating a database, but tracking is highly likely to fail when the orientation of the object to be detected changes.
  • The invention described in Patent Document 3 can solve the problem of losing sight of the tracking target when the orientation of the object to be detected changes. However, it assumes a stationary subject and cannot be applied when the subject moves, because in the case of a moving subject the shape of the tracking area of the current frame cannot be estimated based on the boundary elements.
  • The present invention has been made in view of such circumstances, and an object thereof is to provide an imaging apparatus, a stereoscopic image capturing method, and a program that can reduce the possibility of losing sight of a subject during tracking even when the subject moves or changes orientation.
  • In order to achieve the above object, an imaging apparatus according to a first aspect of the present invention includes: a first imaging unit and a second imaging unit that acquire two viewpoint images obtained by capturing the same subject from two viewpoints; a first template image generation unit that generates a first template image by extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging unit, and generates a second template image by extracting a partial region including the subject to be tracked from the viewpoint image captured by the second imaging unit; and a search unit that searches for the subject to be tracked, using the first template image and the second template image, from a viewpoint image captured by the first imaging unit at a time different from the viewpoint image on which the first template image is based.
  • According to this aspect, the first template image is generated by extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging unit, and the second template image is generated by extracting a partial region including the subject to be tracked from the viewpoint image captured by the second imaging unit. The subject to be tracked is then searched for, using the first template image and the second template image, from a viewpoint image captured by the first imaging unit at a time different from the viewpoint image on which the first template image is based. Since template images taken from two different viewpoints are used as search keys, the possibility of losing sight of the subject during tracking can be reduced even when the subject moves or changes orientation.
  • The imaging apparatus according to a second aspect further includes a second template image generation unit that generates a combined template image from the first template image and the second template image by image combining processing, and the search unit searches for the subject to be tracked using the combined template image generated by the second template image generation unit.
  • According to this aspect, a combined template image is generated from the first template image and the second template image by image combining processing, and the subject to be tracked is searched for using this combined template image. This can further reduce the possibility of losing sight of the subject during tracking.
  • The combined template image is an image in an intermediate state between the first template image and the second template image.
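  • As a non-limiting illustration, the following Python sketch shows one way such an intermediate (combined) template could be produced from the first and second template images. The patent does not specify the combining algorithm; simple alpha blending, the function name, and the weights are all assumptions.

```python
import cv2


def make_combined_templates(template_a, template_b, weights=(0.25, 0.5, 0.75)):
    """Blend two template images into intermediate (combined) templates.

    The text only states that the combined template is 'an image in an
    intermediate state' between the two; weighted averaging is one
    plausible realization. Several weights yield the plurality of
    combined templates used in the fifth aspect.
    """
    h, w = template_a.shape[:2]
    # Resize template B to template A's size so the two can be blended.
    template_b = cv2.resize(template_b, (w, h))
    return [cv2.addWeighted(template_a, 1.0 - alpha, template_b, alpha, 0.0)
            for alpha in weights]
```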
  • the search means when a plurality of search results are obtained, obtains a template image from which the search results are obtained and a result obtained by searching using the template images. Is calculated for each of the plurality of search results, and the result searched using the template image with the highest similarity is calculated as the subject to be tracked.
  • the similarity between the template image obtained from the search result and the result searched using the template image is calculated. Then, the search result using the template image having the highest calculated similarity is set as the subject to be tracked. Thereby, the accuracy of subject tracking can be increased.
  • In the imaging apparatus according to a fourth aspect, when a plurality of search results are obtained and a search result has been obtained using the first template image, the search unit sets the search result obtained using the first template image as the subject to be tracked.
  • According to this aspect, the result searched using the first template image, that is, a subject similar to the subject extracted from the same viewpoint image, is set as the subject to be tracked. Thereby, the accuracy of subject tracking can be increased.
  • In the imaging apparatus according to a fifth aspect, the second template image generation unit generates a plurality of types of combined template images, and when a plurality of search results are obtained but no search result is obtained using the first template image, the search unit sets the result searched using the combined template image closest to the first template image among the plurality of types of combined template images as the subject to be tracked.
  • According to this aspect, when no result is obtained with the first template image, the result searched using the combined template image closest to the first template image among the plurality of types of combined template images is set as the subject to be tracked. This can reduce the possibility of losing sight of the subject during tracking while maintaining accuracy.
  • The imaging apparatus according to a sixth aspect further includes a third template image generation unit that, when the search unit cannot find the subject to be tracked, generates a third template image by extracting, from the viewpoint image captured by the second imaging unit, a region moved in the left-right direction by an arbitrary amount from the region extracted in generating the second template image; the search unit searches for the subject to be tracked using the generated third template image.
  • According to this aspect, when the subject to be tracked cannot be found, a region moved in the left-right direction by an arbitrary amount from the region extracted in generating the second template image is extracted from the viewpoint image captured by the second imaging unit to generate a third template image, and the subject to be tracked is searched for using the generated third template image.
  • In the imaging apparatus according to a seventh aspect, when the search unit cannot obtain a search result using the third template image, the third template image generation unit generates a fourth template image by extracting, from the viewpoint image captured by the second imaging unit, a region moved in the left-right direction by a predetermined amount from the region extracted in generating the third template image, and the search unit searches for the subject to be tracked using the generated fourth template image.
  • According to this aspect, a region moved in the left-right direction by a predetermined amount from the region extracted in generating the third template image is extracted from the viewpoint image captured by the second imaging unit to generate a fourth template image, and the subject to be tracked is searched for using the generated fourth template image.
  • In the imaging apparatus according to an eighth aspect, the third template image generation unit estimates the position of the subject to be tracked based on the two viewpoint images and determines the arbitrary amount from the estimated position. As a result, the template image needs to be created only once, and the time required for tracking the subject can be shortened. A sketch of this shifted-template idea follows.
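  • The shifted-template generation of the sixth to eighth aspects can be illustrated with the Python sketch below: the crop window used for the second template image is moved horizontally and re-extracted from the second viewpoint image, with the shift amount derived from the estimated subject position (disparity). All names and the clamping behavior are illustrative assumptions, not the patent's implementation.

```python
import numpy as np


def shifted_template(viewpoint_image, rect, shift_px):
    """Re-extract a template after moving the crop window horizontally.

    rect = (x, y, w, h) is the region used when the second template image
    was generated; shift_px is the horizontal offset (the 'arbitrary
    amount' of the sixth aspect, derived from disparity in the eighth).
    """
    x, y, w, h = rect
    img_h, img_w = viewpoint_image.shape[:2]
    x = int(np.clip(x + shift_px, 0, img_w - w))  # keep the crop inside the frame
    return viewpoint_image[y:y + h, x:x + w]


# If the search with the third template (shifted by +d) fails, a fourth
# template shifted by a further predetermined amount can be tried, e.g.:
#   t3 = shifted_template(right_image, rect, +d)
#   t4 = shifted_template(right_image, rect, +d + step)  # or -d, per design
```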
  • In the imaging apparatus according to a ninth aspect, the first imaging unit and the second imaging unit continuously acquire the two viewpoint images, and the search unit searches for the subject to be tracked, using the first template image and the second template image, from the viewpoint images subsequently captured by the first imaging unit.
  • The imaging apparatus according to a tenth aspect further includes at least one of: an automatic exposure control unit that performs automatic exposure control based on the tracking target subject found by the search unit; an automatic focus adjustment unit that adjusts the focus so that the tracking target subject found by the search unit is in focus; and a zoom control unit that adjusts the angle of view based on the tracking target subject found by the search unit.
  • automatic exposure control, automatic focus adjustment, and zoom control are performed based on the searched subject to be tracked. Thereby, appropriate control can be performed automatically.
  • An imaging apparatus according to an eleventh aspect includes: a first imaging unit and a second imaging unit that acquire two viewpoint images obtained by capturing the same subject from two viewpoints; a template image generation unit that generates a first template image by extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging unit; a parallax acquisition unit that acquires parallax from the two viewpoint images; a fourth template image generation unit that generates a template image obtained by removing the background and foreground from the first template image based on the parallax acquired by the parallax acquisition unit; and a search unit that searches for the subject to be tracked, using the template image generated by the fourth template image generation unit, from a viewpoint image captured by the first imaging unit at a time different from the viewpoint image on which the first template image is based.
  • According to this aspect, the first template image is generated by extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging unit, the parallax is acquired from the two viewpoint images, and a template image is generated by removing the background and foreground from the first template image based on the parallax. The subject to be tracked is then searched for, using the generated template image, from a viewpoint image captured by the first imaging unit at a time different from the viewpoint image on which the first template image is based. This can reduce the possibility of losing sight of the subject even when the background changes as the subject moves or when a foreground object comes in front of the subject.
  • In the imaging apparatus according to a twelfth aspect, the search unit searches for the subject to be tracked using both the first template image generated by the template image generation unit and the template image generated by the fourth template image generation unit.
  • According to this aspect, the subject to be tracked is searched for using both the template image from which the background and foreground have not been removed and the template image from which they have been removed. This can further reduce the possibility of losing sight of the subject. A sketch of the removal step follows.
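  • A minimal Python sketch of the parallax-based removal follows, assuming a per-pixel disparity map aligned with the template region is available and that thresholding the disparity difference is an acceptable realization (the patent does not fix the exact criterion; names and the tolerance are assumptions).

```python
import numpy as np


def remove_background_foreground(template, disparity, subject_disparity, tol=2.0):
    """Zero out template pixels whose disparity differs from the subject's.

    Pixels much nearer (foreground) or farther (background) than the
    tracked subject are suppressed so they do not contribute to matching.
    """
    mask = np.abs(disparity - subject_disparity) <= tol
    out = template.copy()
    out[~mask] = 0  # suppress background/foreground pixels
    return out
```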
  • A stereoscopic image capturing method according to a thirteenth aspect includes: generating a first template image by extracting a partial region including the subject to be tracked from the viewpoint image captured by a first imaging unit; generating a second template image by extracting a partial region including the subject to be tracked from the viewpoint image captured by a second imaging unit; and searching for the subject to be tracked, using the first template image and the second template image, from a viewpoint image captured by the first imaging unit at a time different from the viewpoint image on which the first template image is based.
  • A program according to a fourteenth aspect causes a computer to execute: a step of acquiring, by a first imaging unit and a second imaging unit, two viewpoint images obtained by photographing the same subject from two viewpoints; a step of generating a first template image by extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging unit, and generating a second template image by extracting a partial region including the subject to be tracked from the viewpoint image captured by the second imaging unit; and a step of searching for the subject to be tracked using the first template image and the second template image.
  • A computer-readable non-transitory medium on which this program is recorded is also included in the present invention.
  • According to the present invention, even when the subject moves or changes its orientation, the possibility of losing sight of the subject during tracking can be reduced.
  • FIG. 2 is a block diagram showing the electrical configuration of the compound-eye digital camera 1.
  • A diagram illustrating the positional relationship between a subject and the compound-eye digital camera 1.
  • A flowchart showing the flow of the subject tracking processing of the compound-eye digital camera 1; an example of a parallax image; a diagram for explaining the method of creating a template image from a parallax image; an example of a template image.
  • A flowchart showing the flow of the subject tracking processing of the compound-eye digital camera 3; diagrams for explaining the method of creating template images from parallax images; examples of template images.
  • A diagram illustrating the positional relationship between a subject and the compound-eye digital camera 1.
  • A flowchart showing the flow of the subject tracking processing of a modified example of the compound-eye digital camera 3; a diagram for explaining the method of creating a template image from a parallax image.
  • FIG. 1 is a schematic diagram of a compound-eye digital camera 1 having a stereoscopic image display device according to the present invention, where (a) is a front view and (b) is a rear view.
  • The compound-eye digital camera 1 is provided with a plurality of imaging systems (two are illustrated in FIG. 1) and can capture stereoscopic images of the same subject viewed from a plurality of viewpoints (two left and right viewpoints are illustrated in FIG. 1) as well as single-viewpoint images (two-dimensional images).
  • the compound-eye digital camera 1 is capable of recording and reproducing not only still images but also moving images and sounds.
  • The camera body 10 of the compound-eye digital camera 1 is formed in a substantially rectangular parallelepiped box shape; as shown in FIG. 1A, a barrier 11, a right imaging system 12, a left imaging system 13, a flash 14, and a microphone 15 are mainly provided on its front. A release switch 20 and a zoom button 21 are mainly provided on the upper surface of the camera body 10.
  • On the other hand, on the back of the camera body 10, as shown in FIG. 1B, a monitor 16, a mode button 22, a parallax adjustment button 23, a 2D/3D switching button 24, a MENU/OK button 25, a cross button 26, and a DISP/BACK button 27 are provided.
  • the barrier 11 is slidably mounted on the front surface of the camera body 10, and is switched between an open state and a closed state when the barrier 11 slides up and down. Normally, as shown by a dotted line in FIG. 1A, the barrier 11 is positioned at the upper end, that is, in a closed state, and the objective lenses 12a, 13a and the like are covered with the barrier 11. Thereby, damage of a lens etc. is prevented.
  • When the barrier 11 is slid to the lower end, that is, to the open state, the lenses disposed on the front surface of the camera body 10 are exposed (see the solid line in FIG. 1A). When a sensor (not shown) detects that the barrier 11 is open, the CPU 110 (see FIG. 2) turns the power on, making photographing possible.
  • the right imaging system 12 that captures an image for the right eye and the left imaging system 13 that captures an image for the left eye acquire two viewpoint images obtained by capturing the same subject from two viewpoints, as shown in FIG.
  • the right imaging system 12 and the left imaging system 13 are optical units including a photographing lens group having a bending optical system, diaphragm and mechanical shutters 12d and 13d, and imaging elements 122 and 123 (see FIG. 2).
  • The photographing lens groups of the right imaging system 12 and the left imaging system 13 mainly include objective lenses 12a and 13a that take in light from a subject, a prism (not shown) that bends the optical path incident from the objective lens substantially vertically, zoom lenses 12c and 13c (see FIG. 2), focus lenses 12b and 13b (see FIG. 2), and the like.
  • the flash 14 is composed of a xenon tube, and emits light as necessary when shooting a dark subject or when backlit.
  • the monitor 16 is a liquid crystal monitor capable of color display having a general aspect ratio of 4: 3, and can display both a stereoscopic image and a planar image. Although the detailed structure of the monitor 16 is not shown, the monitor 16 is a parallax barrier type 3D monitor having a parallax barrier display layer on the surface thereof. The monitor 16 is used as a user interface display panel when performing various setting operations, and is used as an electronic viewfinder during image capturing.
  • the monitor 16 can be switched between a mode for displaying a stereoscopic image (3D mode) and a mode for displaying a planar image (2D mode).
  • In the 3D mode, a parallax barrier having a pattern in which light-transmitting portions and light-shielding portions are alternately arranged at a predetermined pitch is generated on the parallax barrier display layer of the monitor 16, and strip-shaped image fragments representing the left and right images are alternately arranged and displayed on the image display surface below it.
  • In the 2D mode, or when the monitor is used as a user interface display panel, nothing is displayed on the parallax barrier display layer, and a single image is displayed as it is on the image display surface below it.
  • the monitor 16 is not limited to the parallax barrier type, and a lenticular method, an integral photography method using a microlens array sheet, a holography method using an interference phenomenon, or the like may be employed.
  • the monitor 16 is not limited to a liquid crystal monitor, and an organic EL or the like may be employed.
  • the release switch 20 is composed of a two-stage stroke switch composed of a so-called “half press” and “full press”.
  • When the release switch 20 is pressed halfway, shooting preparation processing, that is, AE (Automatic Exposure), AF (Auto Focus), and AWB (Automatic White Balance), is performed; when it is fully pressed, image capturing and recording processing is performed.
  • The zoom button 21 is used for zoom operations of the right imaging system 12 and the left imaging system 13, and includes a zoom tele button 21T for instructing zooming to the telephoto side and a zoom wide button 21W for instructing zooming to the wide-angle side.
  • the mode button 22 functions as shooting mode setting means for setting the shooting mode of the digital camera 1, and the shooting mode of the digital camera 1 is set to various modes depending on the setting position of the mode button 22.
  • the shooting mode is divided into a “moving image shooting mode” in which moving image shooting is performed and a “still image shooting mode” in which still image shooting is performed.
  • the parallax adjustment button 23 is a button for electronically adjusting the parallax at the time of stereoscopic image shooting. By pressing the right side of the parallax adjustment button 23, the parallax between the image captured by the right imaging system 12 and the image captured by the left imaging system 13 is increased by a predetermined distance, and the left side of the parallax adjustment button 23 is pressed. Thus, the parallax between the image captured by the right imaging system 12 and the image captured by the left imaging system 13 is reduced by a predetermined distance.
  • the 2D / 3D switching button 24 is a switch for instructing switching between a 2D shooting mode for shooting a single viewpoint image and a 3D shooting mode for shooting a multi-viewpoint image.
  • The MENU/OK button 25 is used for calling up various setting screens (menu screens) of the shooting and playback functions (MENU function), as well as for confirming selections and instructing execution of processing (OK function); all adjustment items of the compound-eye digital camera 1 are set with this button.
  • When the MENU/OK button 25 is pressed during shooting, a setting screen for adjusting image quality, such as exposure value, hue, ISO sensitivity, and the number of recorded pixels, is displayed on the monitor 16; when it is pressed during playback, a setting screen for erasing images is displayed on the monitor 16.
  • the compound-eye digital camera 1 operates according to the conditions set on this menu screen.
  • the cross button 26 is a button for setting, selecting, or zooming various menus, and is provided so that it can be pressed in four directions, up, down, left, and right.
  • A function corresponding to the setting state of the camera is assigned to the button in each direction. For example, at the time of shooting, a function for switching the macro function ON/OFF is assigned to the left button, and a function for switching the flash mode is assigned to the right button.
  • a function for changing the brightness of the monitor 16 is assigned to the upper button, and a function for switching ON / OFF of the self-timer and time is assigned to the lower button.
  • During playback, a frame advance function is assigned to the right button, and a frame return function is assigned to the left button.
  • a function for deleting an image being reproduced is assigned to the upper button.
  • a function for moving the cursor displayed on the monitor 16 in the direction of each button is assigned.
  • The DISP/BACK button 27 functions as a button for instructing display switching of the monitor 16; when the DISP/BACK button 27 is pressed during shooting, the display of the monitor 16 switches from ON, to framing guide display, to OFF. When the DISP/BACK button 27 is pressed during playback, the playback mode switches from normal playback, to playback without character display, to multi playback.
  • the DISP / BACK button 27 functions as a button for instructing to cancel the input operation or return to the previous operation state.
  • FIG. 2 is a block diagram showing the main internal configuration of the compound-eye digital camera 1.
  • The compound-eye digital camera 1 mainly includes a CPU 110, operation means 112 (the release switch 20, the MENU/OK button 25, the cross button 26, and the like), an SDRAM 114, a VRAM 116, AF detection means 118, AE/AWB detection means 120, image sensors 122 and 123, CDS/AMPs 124 and 125, A/D converters 126 and 127, an image input controller 128, image signal processing means 130, compression/decompression processing means 132, a stereoscopic image generation unit 133, a video encoder 134, a template image generation unit 135, a subject search unit 136, a media controller 137, sound input processing means 138, a recording medium 140, focus lens driving means 142 and 143, zoom lens driving means 144 and 145, aperture driving means 146 and 147, and timing generators (TG) 148 and 149.
  • the CPU 110 comprehensively controls the overall operation of the compound-eye digital camera 1.
  • CPU 110 issues a command to each block in response to an input from operation means 112.
  • the CPU 110 controls the operations of the right imaging system 12 and the left imaging system 13.
  • The right imaging system 12 and the left imaging system 13 basically operate in conjunction with each other, but can also be operated individually. Further, from the two pieces of image data obtained by the right imaging system 12 and the left imaging system 13, the CPU 110 generates display image data in which strip-shaped image fragments are arranged alternately for display on the monitor 16.
  • That is, a parallax barrier having a pattern in which light-transmitting portions and light-shielding portions are alternately arranged at a predetermined pitch is generated on the parallax barrier display layer, and stereoscopic viewing is enabled by alternately displaying strip-shaped image fragments representing the left and right images on the image display surface below it.
  • the SDRAM 114 stores firmware, which is a control program executed by the CPU 110, various data necessary for control, camera setting values, captured image data, and the like.
  • the VRAM 116 is used as a work area for the CPU 110 and also as a temporary storage area for image data.
  • the image sensors 122 and 123 are constituted by color CCDs provided with R, G, and B color filters in a predetermined color filter array (for example, honeycomb array, Bayer array).
  • The image sensors 122 and 123 receive the subject light imaged by the focus lenses 12b and 13b and the zoom lenses 12c and 13c; the light incident on the light-receiving surface is received by the photodiodes arranged there and converted into signal charges of an amount corresponding to the amount of incident light. The photocharge accumulation time of the photodiodes corresponds to the electronic shutter speed.
  • The CDS/AMPs 124 and 125 perform, on the image signals output from the image sensors 122 and 123, correlated double sampling processing (processing that obtains accurate pixel data by taking the difference between the feedthrough component level and the pixel signal component level contained in the output signal of each pixel of the image sensor, with the aim of reducing noise, particularly thermal noise, contained in the output signal of the image sensor), amplify the signals, and generate analog R, G, and B image signals.
  • the A / D converters 126 and 127 convert the R, G, and B analog image signals generated by the CDS / AMPs 124 and 125 into digital image signals.
  • the image input controller 128 has a built-in line buffer with a predetermined capacity, accumulates an image signal for one image output from the CDS / AMP / AD conversion means, and records it in the VRAM 116 in accordance with a command from the CPU 110.
  • The image signal processing means 130 includes a synchronization circuit (a processing circuit that converts the color signals into a simultaneous form by interpolating the spatial shifts of the color signals associated with the color filter array of the single CCD), a white balance correction circuit, a gamma correction circuit, a contour correction circuit, a luminance/color difference signal generation circuit, and the like, and, in accordance with commands from the CPU 110, performs the necessary signal processing on the input image signals to generate luminance data (Y data) and color difference data (Cr, Cb data).
  • the image data generated from the image signal output from the image sensor 122 is referred to as a right eye image B
  • the image data generated from the image signal output from the image sensor 123 is referred to as a left eye image A.
  • the left-eye image A and the right-eye image B (3D image data) processed by the image signal processing unit 130 are input to the VRAM 50.
  • the VRAM 50 includes an A area and a B area each storing 3D image data representing a 3D image for one frame.
  • 3D image data representing a 3D image for one frame is rewritten alternately in the A area and the B area.
  • The written 3D image data is read from whichever of the A area and the B area of the VRAM 50 is not currently being rewritten.
  • The stereoscopic image generation unit 133 processes the 3D image data read from the VRAM 50, or the uncompressed 3D image data read from the recording medium 140 and decompressed by the compression/decompression processing means 132, so that it can be displayed on the monitor 16. For example, in the case of a parallax barrier monitor, the stereoscopic image generation unit 133 divides the right-eye image B and the left-eye image A used for reproduction into strips and generates display image data in which the strip-shaped right-eye images B and left-eye images A are arranged alternately, as sketched below. The display image data is output from the stereoscopic image generation unit 133 to the monitor 16 via the video encoder 134.
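  • For illustration, a simplified Python sketch of this strip interleaving is shown below; in a real camera the strip width is dictated by the barrier pitch of the panel, and the function name is an assumption.

```python
import numpy as np


def interleave_strips(left_image, right_image, strip_width=1):
    """Arrange vertical strips of the left and right images alternately,
    in the manner described for the parallax barrier monitor."""
    assert left_image.shape == right_image.shape
    out = right_image.copy()
    w = left_image.shape[1]
    # Copy every other strip from the left image over the right image.
    for x in range(0, w, 2 * strip_width):
        out[:, x:x + strip_width] = left_image[:, x:x + strip_width]
    return out
```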
  • The video encoder 134 controls display on the monitor 16. That is, it converts the display image data generated by the stereoscopic image generation unit 133 into a video signal (for example, an NTSC, PAL, or SECAM signal) for display on the monitor 16, outputs it to the monitor 16, and also outputs predetermined character and graphic information to the monitor 16 as necessary.
  • the right-eye image B and the left-eye image A are stereoscopically displayed on the monitor 16.
  • By alternately rewriting the 3D image data representing one frame of a 3D image in the VRAM 50 and reading the written 3D image data from the area other than the one being rewritten, 3D images are displayed continuously in real time on the monitor 16 (display of a live view image, a so-called through image).
  • the left-eye image A and the right-eye image B processed by the image signal processing means 130 are input to the template image generation unit 135.
  • the template image generation unit 135 extracts a predetermined region (for example, a rectangle) including the subject Z to be tracked from each of the left eye image A and the right eye image B, and generates a template image. Details of the contents performed by the template image generation unit 135 will be described later.
  • the left eye image A and the right eye image B processed by the image signal processing means 130 are input to the subject searching unit 136.
  • the subject search unit 136 uses the template image generated by the template image generation unit 135 to search for a portion similar to the template image from at least one of the left-eye image A and the right-eye image B. Accordingly, the subject Z is searched from at least one of the left-eye image A and the right-eye image B. Details of the contents performed by the subject searching unit 136 will be described later.
  • At least one of the left-eye image A and the right-eye image B in which the subject Z has been found by the subject search unit 136 is input to the template image generation unit 135, and the template image generation unit 135 extracts a predetermined region including the subject Z to generate a template image.
  • the AF detection unit 118 calculates a physical quantity necessary for AF control from the input image signal so that the subject Z searched by the subject search unit 136 is in focus according to a command from the CPU 110.
  • the AF detection unit 118 includes a right imaging system AF control circuit that performs AF control based on the image signal input from the right imaging system 12, and a left imaging that performs AF control based on the image signal input from the left imaging system 13. And a system AF control circuit.
  • In the present embodiment, AF control is performed based on the contrast of the images obtained from the image sensors 122 and 123 (so-called contrast AF), and the AF detection means 118 calculates, from the input image signal, a focus evaluation value indicating the sharpness of the image.
  • The CPU 110 detects the position where the focus evaluation value calculated by the AF detection means 118 is maximized and moves the focus lens group to that position. That is, the focus lens group is moved from the closest range to infinity in predetermined steps, the focus evaluation value is obtained at each position, the position with the maximum focus evaluation value is set as the in-focus position, and the focus lens group is moved there.
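  • The contrast AF loop can be sketched in Python as follows; the sharpness metric and the capture_at callback are assumptions standing in for the AF detection means 118 and the lens drive, not the camera's actual implementation.

```python
import numpy as np


def focus_evaluation_value(gray_af_area):
    """Sharpness metric: integrate high-frequency components (here a simple
    horizontal difference; the real AF detection means uses a dedicated filter)."""
    diff = np.diff(gray_af_area.astype(np.float64), axis=1)
    return float(np.sum(diff * diff))


def contrast_af(capture_at, lens_positions):
    """Sweep the focus lens from the closest range to infinity in steps,
    evaluate sharpness at each position, and return the position with the
    maximum focus evaluation value (the in-focus position)."""
    best_pos, best_val = None, -1.0
    for pos in lens_positions:
        val = focus_evaluation_value(capture_at(pos))  # move lens, grab AF area
        if val > best_val:
            best_pos, best_val = pos, val
    return best_pos
```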
  • The focus lens driving means 142 and 143 move the focus lenses 12b and 13b in the optical axis direction in accordance with commands from the CPU 110, varying the focal position so that the subject Z found by the subject search unit 136 is in focus. The zoom lens driving means 144 and 145 move the zoom lenses 12c and 13c in the optical axis direction and change the focal length in accordance with commands from the CPU 110, issued either in response to an instruction from the photographer or so that the subject Z found by the subject search unit 136 has a predetermined size.
  • AE / AWB detection means 120 calculates a physical quantity necessary for AE control and AWB control from the input image signal in accordance with a command from CPU 110.
  • The AE/AWB detection means 120 divides one screen into a plurality of areas (for example, 16 × 16) and calculates an integrated value of the R, G, and B image signals for each divided area as a physical quantity necessary for AE control.
  • the AE / AWB detection unit 120 calculates an integrated value of R, G, and B image signals in a predetermined area including the subject Z searched by the subject search unit 136.
  • the CPU 110 detects the brightness of the subject (subject brightness) based on the integrated value obtained from the AE / AWB detection means 120, and calculates an exposure value (shooting EV value) suitable for shooting. Then, an aperture value and a shutter speed are determined from the calculated shooting EV value and a predetermined program diagram.
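  • As a hedged Python sketch of this AE computation: the block integration follows the 16 × 16 division described above, while the subject-area weighting and the mapping from brightness to an EV number are illustrative assumptions (the patent gives no formula).

```python
import numpy as np


def block_integrals(rgb_image, blocks=16):
    """Integrate the R, G, B signals in each of blocks x blocks areas
    (border pixels beyond an even multiple of the block size are ignored
    for brevity)."""
    h, w = rgb_image.shape[:2]
    bh, bw = h // blocks, w // blocks
    trimmed = rgb_image[:bh * blocks, :bw * blocks].astype(np.float64)
    return trimmed.reshape(blocks, bh, blocks, bw, 3).sum(axis=(1, 3))


def shooting_ev(integrals, area_weights):
    """Weighted subject brightness mapped to an EV-like number; the weights
    would emphasize the areas containing the tracked subject Z."""
    g = integrals[..., 1]                                 # G as a luminance proxy
    mean_level = float((g * area_weights).sum() / area_weights.sum())
    return np.log2(max(mean_level, 1e-6))                 # illustrative mapping
```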
  • the aperture driving means 146 and 147 adjust the amounts of light incident on the image sensors 122 and 123 by varying the apertures of the aperture / mechanical shutters 12d and 13d in accordance with a command from the CPU 110. Further, the aperture driving units 146 and 147 open / close the aperture / mechanical shutters 12d and 13d in accordance with a command from the CPU 110 to perform exposure / light shielding on the image sensors 122 and 123, respectively.
  • The AE/AWB detection means 120 also divides one screen into a plurality of areas (for example, 16 × 16) and calculates an average integrated value for each color of the R, G, and B image signals for each divided area as a physical quantity necessary for AWB control.
  • The AE/AWB detection means 120 calculates the average integrated value for each color of the R, G, and B image signals in a predetermined area including the subject Z found by the subject search unit 136.
  • The CPU 110 obtains the R/G and B/G ratios for each divided area from the obtained R, B, and G integrated values, and determines the light source type and the white balance correction values based on the distribution of the obtained R/G and B/G values.
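  • A gray-world style Python sketch of deriving white balance gains from the per-area R/G and B/G ratios is given below; the camera's actual light-source discrimination is more elaborate, so treat this as an assumption-laden illustration (an RGB channel order is assumed).

```python
import numpy as np


def awb_gains(integrals):
    """Estimate R and B gains that pull the average ratios toward
    R/G = B/G = 1 (neutral gray); integrals is the blocks x blocks x 3
    array produced by the AE sketch above."""
    r, g, b = integrals[..., 0], integrals[..., 1], integrals[..., 2]
    r_g = float(np.mean(r / np.maximum(g, 1e-6)))  # average R/G over the areas
    b_g = float(np.mean(b / np.maximum(g, 1e-6)))  # average B/G over the areas
    return 1.0 / max(r_g, 1e-6), 1.0, 1.0 / max(b_g, 1e-6)
```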
  • the compression / decompression processing unit 132 performs compression processing in a predetermined format on the input image data in accordance with a command from the CPU 110 to generate compressed image data. Further, in accordance with a command from the CPU 110, the input compressed image data is subjected to a decompression process in a predetermined format to generate uncompressed image data.
  • the media controller 137 records each image data compressed by the compression / decompression processing unit 132 on the recording medium 140.
  • The recording medium 140 is any of various recording media, such as an xD-Picture Card (registered trademark) or a semiconductor memory card represented by SmartMedia (registered trademark) detachably attached to the compound-eye digital camera 1, a portable small hard disk, a magnetic disk, an optical disk, or a magneto-optical disk.
  • the sound input processing means 138 receives an audio signal input to the microphone 15 and amplified by a stereo microphone amplifier (not shown), and performs an encoding process on the audio signal.
  • When the barrier 11 is slid from the closed state to the open state, the compound-eye digital camera 1 is powered on and activated in the shooting mode.
  • As the shooting mode, a 2D shooting mode for shooting a single-viewpoint image and a 3D shooting mode for shooting a stereoscopic image of the same subject viewed from two viewpoints can be set. In the 3D shooting mode, a stereoscopic image is shot with a predetermined parallax using the right imaging system 12 and the left imaging system 13.
  • The shooting mode is set by pressing the MENU/OK button 25 while the compound-eye digital camera 1 is driven in the shooting mode, selecting "shooting mode" with the cross button 26 or the like on the menu screen displayed on the monitor 16, and making the setting from the shooting mode menu screen displayed on the monitor 16.
  • (1) 2D shooting mode: The CPU 110 selects the right imaging system 12 or the left imaging system 13 (the left imaging system 13 in the present embodiment) and starts shooting for the shooting confirmation image with the imaging device 123 of the left imaging system 13. That is, images are continuously picked up by the image sensor 123, the image signals are continuously processed, and image data for the shooting confirmation image is generated.
  • the CPU 110 sets the monitor 16 in the 2D mode, sequentially adds the generated image data to the video encoder 134, converts it into a signal format for display, and outputs it to the monitor 16.
  • Thereby, the image captured by the image sensor 123 is displayed as a through image on the monitor 16.
  • The video encoder 134 is not always required, but the generated image data must be converted into a signal form that matches the input specifications of the monitor 16.
  • the user performs framing while viewing the shooting confirmation image displayed in three dimensions on the monitor 16, checks the subject to be shot, checks the image after shooting, and sets shooting conditions.
  • When the release switch 20 is half-pressed, an S1 ON signal is input to the CPU 110.
  • the CPU 110 detects this and performs AE metering and AF control.
  • the brightness of the subject is measured based on the integrated value of the image signal captured via the image sensor 123.
  • Based on this photometric value, the flash 14 is pre-fired as necessary, and the light emission amount of the flash 14 at the time of actual photographing is determined based on the reflected light.
  • When the release switch 20 is fully pressed, an S2 ON signal is input to the CPU 110.
  • The CPU 110 executes photographing and recording processing in response to the S2 ON signal.
  • The CPU 110 drives the aperture-mechanical shutter 13d via the aperture driving means 147 based on the aperture value determined from the photometric value, and controls the charge accumulation time in the image sensor 123 (the so-called electronic shutter) so as to achieve the shutter speed determined from the photometric value.
  • During AF control, the CPU 110 sequentially moves the focus lens to lens positions corresponding to distances from the closest range to infinity, acquires from the AF detection means 118, for each lens position, an evaluation value obtained by integrating the high-frequency components of the image signal in the AF area of the image captured through the image sensor 123, finds the lens position where the evaluation value peaks, and performs contrast AF by moving the focus lens to that position.
  • When the flash 14 is caused to emit light, it is fired based on the light emission amount obtained from the result of the pre-emission.
  • the subject light is incident on the light receiving surface of the image sensor 123 via the focus lens 13b, the zoom lens 13c, the diaphragm-mechanical shutter 13d, the infrared cut filter 46, the optical low-pass filter 48, and the like.
  • the signal charge accumulated in each photodiode of the image sensor 123 is read according to the timing signal applied from the TG 149, sequentially output from the image sensor 123 as a voltage signal (image signal), and input to the CDS / AMP 125.
  • the CDS / AMP 125 performs correlated double sampling processing on the CCD output signal based on the CDS pulse, and amplifies the image signal output from the CDS circuit by the imaging sensitivity setting gain applied from the CPU 110.
  • The analog image signals output from the CDS/AMP 125 are converted into digital image signals by the A/D converter 127, and the converted image signals (R, G, B RAW data) are transferred to the SDRAM 114 and temporarily stored there.
  • the R, G, B image signals read from the SDRAM 114 are input to the image signal processing means 130.
  • white balance adjustment is performed by applying digital gain to each of the R, G, and B image signals by the white balance adjustment circuit, and gradation conversion processing according to gamma characteristics is performed by the gamma correction circuit.
  • the synchronization circuit interpolates the spatial shift of the color signals associated with the color filter array of the single CCD and performs the synchronization process for matching the phases of the color signals.
  • the synchronized R, G, B image signals are further converted into a luminance signal Y and color difference signals Cr, Cb (YC signal) by a luminance / color difference data generation circuit and subjected to predetermined signal processing such as edge enhancement.
  • the YC signal processed by the image signal processing means 130 is stored in the SDRAM 114 again.
  • the YC signal stored in the SDRAM 114 as described above is compressed by the compression / expansion processing means 132 and recorded on the recording medium 140 via the media controller 137 as an image file of a predetermined format.
  • Still image data is stored in the recording medium 140 as an image file according to the Exif standard.
  • the Exif file has an area for storing main image data and an area for storing reduced image (thumbnail image) data.
  • A thumbnail image having a specified size (for example, 160 × 120 or 80 × 60 pixels) is generated from the main image data obtained by shooting, through pixel thinning and other necessary data processing.
  • the thumbnail image generated in this way is written in the Exif file together with the main image.
  • tag information such as shooting date / time, shooting conditions, and face detection information is attached to the Exif file.
  • the CPU 110 When the mode of the compound-eye digital camera 1 is set to the playback mode, the CPU 110 outputs a command to the media controller 137 to read out the image file recorded last on the recording medium 140.
  • the compressed image data of the read image file is added to the compression / decompression processing unit 132, decompressed to an uncompressed luminance / color difference signal, and output to the monitor 16 via the video encoder 134.
  • the image recorded on the recording medium 140 is reproduced and displayed on the monitor 16 (reproduction of one image).
  • As in the photography mode, in the 2D mode a planar image is displayed on the entire surface of the monitor 16 during playback.
  • (2) 3D shooting mode: Shooting for the shooting confirmation image is started by the image sensor 122 and the image sensor 123. That is, the image sensor 122 and the image sensor 123 continuously capture the right-eye image B and the left-eye image A at a predetermined frame rate, the image signals are continuously processed, and stereoscopic image data for the shooting confirmation image is generated.
  • the CPU 110 sets the monitor 16 to the 3D mode, and the generated image data is sequentially converted into a signal format for display by the video encoder 134 and is output to the monitor 16. As a result, stereoscopic image data for the shooting confirmation image is stereoscopically displayed on the monitor 16.
  • a process of tracking the subject Z with respect to the left-eye image A is performed in parallel with the stereoscopic image data for the shooting confirmation image being stereoscopically displayed on the monitor 16.
  • FIG. 4 is a flowchart showing a flow of processing for tracking the subject Z with respect to the image A for the left eye continuously photographed at a predetermined frame rate. This process is controlled by the CPU 110. A program for causing the CPU 110 to execute this imaging process is stored in a program storage unit in the CPU 110.
  • the left-eye image A taken immediately before the processing target frame, in this case, the left-eye image A0 (see FIG. 5) taken immediately before the process of tracking the subject Z is input to the template image generation unit 135.
  • the template image generation unit 135 generates a template image TA0 from the left eye image A0 (step S10). A process in which the template image generation unit 135 generates the template image TA0 will be described in detail.
  • the left-eye image A0 shown in FIG. 5 includes a subject Z (here, a human face) that is a tracking target.
  • the template image generation unit 135 extracts a region including the subject Z from the left-eye image A0 and having a size that allows the shape of the subject Z to be recognized.
  • Specifically, a rectangular area having a margin of several pixels around the contour of the subject Z is extracted from the left-eye image A0, and the template image generation unit 135 sets the extracted image as the template image TA0, as shown in the figure. Thereby, a template image is generated. In code, this extraction could look like the sketch below.
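  • The margin value and the function name in the following Python sketch are assumptions; the rectangle-plus-margin behavior is as described above.

```python
def extract_template(image, subject_rect, margin=4):
    """Crop a rectangular area with a margin of several pixels around the
    subject's bounding rectangle (x, y, w, h), clamped to the image; the
    cropped patch becomes the template image (e.g. TA0 or TB0)."""
    x, y, w, h = subject_rect
    img_h, img_w = image.shape[:2]
    x0, y0 = max(x - margin, 0), max(y - margin, 0)
    x1, y1 = min(x + w + margin, img_w), min(y + h + margin, img_h)
    return image[y0:y1, x0:x1]
```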
  • Possible methods of selecting the subject Z as the tracking target include automatically detecting the subject by face detection or the like, having the photographer select it via the operation means 112, and automatically detecting a subject equipped with a transmitter.
  • For example, face detection may be performed on each of the left-eye image A and the right-eye image B, and after confirming that the same face is detected in both, that face may be set as the subject Z.
  • the photographer may select the face via the operation unit 112.
  • The photographer may make the selection in each of the left-eye image A and the right-eye image B, or may select in only one of them, in which case the same subject may be detected in the other image by corresponding-point detection.
  • the right-eye image B photographed immediately before the frame to be processed, in this case, the right-eye image B0 (see FIG. 5) photographed immediately before the process of tracking the subject Z is input to the template image generation unit 135.
  • the template image generation unit 135 generates a template image TB0 from the right eye image B0 by the same method as in step S10 (step S12).
  • The right-eye image B may not include the subject Z to be tracked (for example, when the subject is hidden by another subject). In the present embodiment, a template image is generated from images as close as possible in shooting timing. Therefore, when the template image TA0 can be generated from the left-eye image A0 but the subject Z is not included in the right-eye image B0, if the subject Z is included in the right-eye image captured immediately before the right-eye image B0, the template image generated from that image may be used as the template image TB0. If the subject Z is not included in the right-eye image captured immediately before the right-eye image B0 either, the template image TB0 may be generated from the right-eye image captured before that.
  • Next, photographing and processing of the i-th image is started (step S14; i is a positive integer).
  • Since the optical axis 13L of the left imaging system 13 and the optical axis 12L of the right imaging system 12 do not coincide, the result of photographing the subject Z with the left imaging system 13 differs from the result of photographing the subject Z with the right imaging system 12. That is, as shown in FIG. 6, the orientation of the subject Z included in the left-eye image A differs from the orientation of the subject Z included in the right-eye image B. Therefore, since the subject Z included in the template image TAi and the subject Z included in the template image TBi differ, the left-eye image Ai is searched using both the template images TA(i-1) and TB(i-1).
  • For the search, a pattern matching method such as template matching can be used, but the method is not limited to this; one possible realization is sketched below.
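  • As one concrete, non-limiting realization of this search, the following Python sketch runs normalized cross-correlation template matching with both templates; OpenCV's matchTemplate is used purely for illustration.

```python
import cv2


def search_subject(frame, template_a, template_b):
    """Search the left-eye image Ai with both TA(i-1) and TB(i-1)."""
    results = []
    for name, tmpl in (("TA", template_a), ("TB", template_b)):
        res = cv2.matchTemplate(frame, tmpl, cv2.TM_CCOEFF_NORMED)
        _, score, _, top_left = cv2.minMaxLoc(res)  # best match and its score
        results.append((name, score, top_left, tmpl.shape[1::-1]))
    return results  # [(template id, similarity, (x, y), (w, h)), ...]
```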
  • the subject search unit 136 determines whether or not the search for the left-eye image Ai has been completed (step S22). If the search has not been completed (NO in step S22), step S22 is performed again.
  • That is, a portion similar to the template image TA(i-1) or TB(i-1) is set as the position of the subject Z. In the case shown in FIG. 8A1, the search result is the area surrounded by the dotted line regardless of whether the template image TA0 or TB0 is used for the search, and the position surrounded by the dotted line is the position of the subject Z.
  • the subject search unit 136 inputs the search result and the left-eye image Ai to the template image generation unit 135. Thereafter, the process proceeds to step S32.
  • When a plurality of search results are obtained, the subject search unit 136 calculates the similarity between the result of searching the left-eye image Ai using the template image TA(i-1) and the template image TA(i-1), and also calculates the similarity between the result of searching the left-eye image Ai using the template image TB(i-1) and the template image TB(i-1).
  • The subject search unit 136 then determines whether the similarity between the result of searching the left-eye image Ai using the template image TA(i-1) and the template image TA(i-1) is higher than the similarity between the result of searching the left-eye image Ai using the template image TB(i-1) and the template image TB(i-1) (step S26).
  • If the similarity for the template image TA(i-1) is higher (YES in step S26), the subject search unit 136 sets the result of searching the left-eye image Ai using the template image TA(i-1), that is, the portion similar to the template image TA(i-1), as the position of the subject Z (step S28).
  • the subject search unit 136 inputs the position of the subject Z and the left eye image Ai to the template image generation unit 135.
  • If the similarity for the template image TA(i-1) is not higher (NO in step S26), the subject search unit 136 sets the result of searching the left-eye image Ai using the template image TB(i-1), that is, the portion similar to the template image TB(i-1), as the position of the subject Z (step S30).
  • the subject search unit 136 inputs the position of the subject Z and the left eye image Ai to the template image generation unit 135.
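  • The selection among multiple search results (steps S26 to S30) can be summarized with the short Python sketch below, using the result list from the search sketch above. The detection threshold is an assumption (the patent gives no value), and the TA-first tie-break mirrors the preference for the template from the same viewpoint.

```python
def choose_subject_position(results, threshold=0.6):
    """Pick the position of subject Z from [(id, score, (x, y), (w, h))]."""
    hits = [r for r in results if r[1] >= threshold]
    if not hits:
        return None  # subject lost; fall back (e.g. to a shifted template)
    # Higher similarity first; TA(i-1) before TB(i-1) on equal similarity.
    hits.sort(key=lambda r: (-r[1], r[0] != "TA"))
    _, _, (x, y), (w, h) = hits[0]
    return x, y, w, h  # position of subject Z in the left-eye image Ai
```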
  • The template image generation unit 135 extracts, from the left-eye image Ai, a region that includes the subject Z and has a size that allows the shape of the subject Z to be recognized, and generates a template image TAi (step S32). As in step S10, a rectangular area with a margin of several pixels around the contour of the subject Z is extracted and used as the template image. In the case shown in FIG. 8A1, the dotted-line portion is generated as the template image TA1.
  • the template image generation unit 135 generates a template image TBi from the right-eye image Bi by the same method as in step S12 (step S34).
  • If the subject Z is not included in the right-eye image Bi, the template image TBi may be generated from the right-eye image that was captured at the closest timing before the right-eye image Bi and that includes the subject Z.
  • In the manner described above, the subject Z is tracked in the left-eye image A.
  • During tracking, the user performs framing while viewing the shooting confirmation image displayed in three dimensions on the monitor 16, checks the subject to be shot, checks the image after shooting, and sets the shooting conditions. The zoom may be optimized based on the subject Z simultaneously with the tracking of the subject Z: the CPU 110 moves the zoom lenses 12c and 13c in the optical axis direction via the zoom lens driving units 144 and 145 so that the subject Z is rendered at a predetermined size. This lets the photographer recognize what the tracking target is, and, since the tracking target is an important subject for the photographer, an image in which that important subject is easy to see can be taken.
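  • The patent does not specify the zoom control law; the following sketch assumes a simple proportional rule with a dead band that chooses a zoom direction from the ratio of the subject's rendered size to the target size. All names are hypothetical:

```python
def zoom_step_toward_target(subject_height_px: float,
                            target_height_px: float,
                            deadband: float = 0.1) -> int:
    """Return -1 (zoom out), 0 (hold) or +1 (zoom in) so the tracked
    subject Z is rendered at roughly the predetermined size."""
    ratio = subject_height_px / target_height_px
    if ratio < 1.0 - deadband:
        return +1   # subject too small: move zoom lenses toward tele
    if ratio > 1.0 + deadband:
        return -1   # subject too large: move zoom lenses toward wide
    return 0
```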
  • When an S1 ON signal is input to the CPU 110, the CPU 110 detects this, ends the subject tracking process described above, and performs AE metering and AF control. AE metering and AF control are performed by the left imaging system 13, which performed the tracking of the subject Z, so that exposure and focus are optimized based on the tracked subject Z. In other words, AE photometry is performed so that the subject Z has an appropriate exposure, and AF processing is performed so that the subject Z is in focus. Since AE metering and AF control are the same as in the 2D shooting mode, their detailed description is omitted.
  • When an S2 ON signal is input to the CPU 110, the CPU 110 executes photographing and recording processing in response. The processing for generating the image data captured by the right imaging system 12 and the left imaging system 13 is the same as in the 2D shooting mode and is therefore omitted. Two pieces of compressed image data are generated by the same method as in the 2D shooting mode, associated with each other, and stored on the recording medium 140 as one file. The MP format or the like can be used as the storage format.
  • When the mode of the compound-eye digital camera 1 is set to the playback mode, the CPU 110 outputs a command to the media controller 137 to read the image file recorded last on the recording medium 140. The compressed image data of the read image file is passed to the compression/decompression processing unit 132, decompressed into an uncompressed luminance/color-difference signal, converted into a stereoscopic image by the stereoscopic image generation unit 133, and output to the monitor 16 via the video encoder 134. In this way, the image recorded on the recording medium 140 is reproduced and displayed on the monitor 16 (single-image playback).
  • Frame advance is performed by operating the left and right keys of the cross button 26. When the right key of the cross button 26 is pressed, the next image file is read from the recording medium 140 and reproduced on the monitor 16; when the left key is pressed, the previous image file is read and reproduced. While confirming an image reproduced on the monitor 16, the user can erase the image recorded on the recording medium 140 as necessary; the image is erased by pressing the MENU/OK button 25 while it is displayed.
  • As described above, the subject is tracked using images of the subject with different orientations as keys, so even if the subject moves or its orientation changes, the possibility of losing the subject during tracking can be reduced. Moreover, since the subject tracking process uses template images extracted from parallax images, accurate tracking can be performed, and as a result the amount of calculation can be reduced.
  • In the present embodiment, the subject tracking process is performed on the left-eye image A, but it may instead be performed on the right-eye image B, or on both the left-eye image A and the right-eye image B. When the tracking process is performed on the right-eye image B, it proceeds in the same manner as described above: the right-eye image Bi is searched with both template images TA(i-1) and TB(i-1); if the results are the same, that position is set as the position of the subject Z, and if they differ, the result with the higher similarity is set as the position of the subject Z.
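  • A minimal sketch of this dual-template search, using OpenCV template matching as a stand-in for the unspecified search method; the TM_CCOEFF_NORMED score plays the role of the similarity here:

```python
import cv2

def search_with_two_templates(image, template_a, template_b):
    """Search one viewpoint image with both templates TA(i-1) and TB(i-1).

    Returns the agreed position if both searches land on the same spot,
    otherwise the position whose match score (similarity) is higher.
    """
    results = []
    for templ in (template_a, template_b):
        res = cv2.matchTemplate(image, templ, cv2.TM_CCOEFF_NORMED)
        _, score, _, top_left = cv2.minMaxLoc(res)
        results.append((score, top_left))
    (score_a, pos_a), (score_b, pos_b) = results
    if pos_a == pos_b:          # results agree: that position is subject Z
        return pos_a
    return pos_a if score_a >= score_b else pos_b  # higher similarity wins
```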
  • The first embodiment of the present invention performs subject tracking by searching for the subject in at least one of the right-eye image and the left-eye image, using a template image generated by extracting a part of the right-eye image and a template image generated by extracting a part of the left-eye image; however, the method of performing subject tracking is not limited to this. The second embodiment performs subject tracking by searching for the subject in at least one of the right-eye image and the left-eye image using, in addition to those two template images, composite template images generated from them by image composition processing.
  • The compound-eye digital camera 2 of the second embodiment will now be described. Parts that are the same as in the first embodiment are denoted by the same reference numerals, and their description is omitted.
  • FIG. 9 is a block diagram showing the main internal configuration of the compound-eye digital camera 2.
  • The compound-eye digital camera 2 mainly includes a CPU 110, operation means (release switch 20, MENU/OK button 25, cross button 26, etc.) 112, SDRAM 114, VRAM 116, AF detection means 118, AE/AWB detection means 120, image sensors 122 and 123, CDS/AMPs 124 and 125, A/D converters 126 and 127, an image input controller 128, image signal processing means 130, compression/decompression processing means 132, a stereoscopic image generation unit 133, video encoder 134, template image generation unit 135, subject search unit 136, media controller 137, sound input processing unit 138, composite template image generation unit 139, recording medium 140, focus lens driving units 142 and 143, zoom lens driving units 144 and 145, aperture driving units 146 and 147, and timing generators (TG) 148 and 149.
  • The composite template image generation unit 139 generates composite template images, by image composition processing, from a template image generated by extracting a part of the right-eye image and a template image generated by extracting a part of the left-eye image.
  • FIG. 10 is a schematic diagram illustrating a method in which the composite template image generation unit 139 generates a composite template image.
  • The composite template image generation unit 139 first extracts feature points from the template image TA0 generated by extracting a part of the left-eye image A0. The feature points are, for example, points (pixels) with strong signal gradients in a plurality of directions, and can be extracted using the Harris method, the Shi-Tomasi method, or the like. The composite template image generation unit 139 then extracts corresponding points from the template image TB0 generated by extracting a part of the right-eye image B0; a corresponding point is the point that corresponds to a feature point extracted from the template image TA0.
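  • One plausible realization of these two steps (the patent names only the Harris and Shi-Tomasi detectors): detect feature points in TA0, then locate the corresponding points in TB0 with pyramidal Lucas-Kanade optical flow. The use of optical flow for the correspondences is an assumption:

```python
import cv2

def feature_and_corresponding_points(template_a, template_b, max_corners=50):
    """Feature points in TA0 and their corresponding points in TB0."""
    gray_a = cv2.cvtColor(template_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(template_b, cv2.COLOR_BGR2GRAY)
    # Shi-Tomasi corners; pass useHarrisDetector=True for the Harris method.
    pts_a = cv2.goodFeaturesToTrack(gray_a, maxCorners=max_corners,
                                    qualityLevel=0.01, minDistance=5)
    if pts_a is None:
        return None, None
    # Corresponding points located by pyramidal Lucas-Kanade optical flow.
    pts_b, status, _err = cv2.calcOpticalFlowPyrLK(gray_a, gray_b, pts_a, None)
    ok = status.ravel() == 1
    return pts_a[ok], pts_b[ok]
```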
  • The composite template image generation unit 139 aligns the feature points extracted from the template image TA0 with the corresponding points extracted from the template image TB0, and then generates intermediate-state images of the template images TA0 and TB0, that is, composite template images TM0-1 to TM0-5, by image composition processing. In the present embodiment the composite template images are created using a technique called morphing, which expresses the gradual deformation from one object into another, but the image composition processing is not limited to this.
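  • Full morphing warps geometry as well as intensity; as a reduced sketch, assuming the two templates have already been aligned and resized to the same shape, a cross-dissolve produces the five intermediate images TM0-1 to TM0-5:

```python
import cv2

def intermediate_templates(template_a, template_b, steps=5):
    """Cross-dissolve stand-in for the morphing step (geometry warp omitted).

    template_a, template_b: aligned, same-size templates TA0 and TB0.
    Returns [TM0-1, ..., TM0-5] for steps=5.
    """
    out = []
    for k in range(1, steps + 1):
        w = k / (steps + 1)   # blend weight moves from TA0 toward TB0
        out.append(cv2.addWeighted(template_a, 1.0 - w, template_b, w, 0))
    return out
```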
  • A composite template image may also be generated after some image processing is applied to the template images TA(i-1) and TB(i-1). For example, the composite template images TM(i-1)-1 to TM(i-1)-5 may be generated after the luminance is corrected so that the average luminance of TA(i-1) and the average luminance of TB(i-1) become the same value, or after the color balance of TA(i-1) and the color balance of TB(i-1) are corrected to the same value. Furthermore, the feature points extracted from TA(i-1) and the corresponding points extracted from TB(i-1) may be aligned, and TA(i-1) resized to the same size as TB(i-1), before the composite template images TM(i-1)-1 to TM(i-1)-5 are generated.
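  • A sketch of the average-luminance correction mentioned above, under the assumption that a global gain per template is sufficient (gamma and color balance are ignored):

```python
import numpy as np

def equalize_mean_luminance(template_a, template_b):
    """Scale both templates so their average luminance becomes the same value."""
    a = template_a.astype(np.float32)
    b = template_b.astype(np.float32)
    target = (a.mean() + b.mean()) / 2.0
    a *= target / max(a.mean(), 1e-6)
    b *= target / max(b.mean(), 1e-6)
    return a.clip(0, 255).astype(np.uint8), b.clip(0, 255).astype(np.uint8)
```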
  • FIG. 11 is a flowchart showing a flow of processing for tracking the subject Z with respect to the image A for the left eye continuously photographed at a predetermined frame rate. This process is controlled by the CPU 110. A program for causing the CPU 110 to execute this imaging process is stored in a program storage unit in the CPU 110.
  • The left-eye image A taken immediately before the processing target frame, in this case the left-eye image A0 (see FIG. 5) taken immediately before the process of tracking the subject Z starts, is input to the template image generation unit 135. The template image generation unit 135 generates a template image TA0 by extracting from the left-eye image A0 a region that includes the subject Z and is large enough for the shape of the subject Z to be recognized (step S10). Likewise, the right-eye image B photographed immediately before the processing target frame, in this case the right-eye image B0 (see FIG. 5) photographed immediately before the tracking process starts, or a right-eye image B whose photographing timing is as close as possible to the processing target frame, is input to the template image generation unit 135, and the template image generation unit 135 generates a template image TB0 from the right-eye image B0 by the same method as in step S10 (step S12).
  • In step S14, photographing and processing of the first image is started; hereinafter, i denotes a positive integer (the frame number).
  • The subject search unit 136 searches the left-eye image Ai using the template image TA(i-1) and the composite template images TM(i-1)-1 to TM(i-1)-5, looking for portions of the left-eye image Ai similar to them (step S42).
  • The subject search unit 136 determines whether the search of the left-eye image Ai has been completed (step S44); if not (NO in step S44), step S44 is performed again. When the search is complete (YES in step S44), the subject search unit 136 determines whether searches with a plurality of templates succeeded, that is, whether search results were obtained with a plurality of templates (step S46). If searches with a plurality of templates did not succeed (NO in step S46), only one search result for the subject Z was obtained, so the subject search unit 136 inputs that search result (the position of the subject Z) and the left-eye image Ai to the template image generation unit 135, and the process proceeds to step S54.
  • If searches with a plurality of templates succeeded (YES in step S46), search results were obtained with a plurality of templates, so for each template image that produced a result the subject search unit 136 calculates the similarity between that template image and its search result. The subject search unit 136 then determines whether a plurality of results share the highest similarity (step S48).
  • For the similarity calculation, a known method can be employed, such as the difference between feature values or the least-squares method on a feature space (a weighted space is also possible).
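  • The similarity measure is left open; the following sketch implements one reading of the least-squares option over an optionally weighted feature space, with pixel values standing in for the feature values:

```python
import numpy as np

def similarity(template, patch, weights=None):
    """Higher is more similar: negative (weighted) mean squared error
    between the template and the candidate patch."""
    t = template.astype(np.float32).ravel()
    p = patch.astype(np.float32).ravel()
    d = t - p
    if weights is not None:                 # weighted feature space
        d = d * np.asarray(weights, np.float32).ravel()
    return -float(np.dot(d, d) / d.size)
```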
  • If a plurality of results share the highest similarity (YES in step S48), the subject search unit 136 sets as the position of the subject Z the portion similar to the template image TA(i-1) or, failing that, to the composite template image closest to TA(i-1) (step S50).
  • Step S50 will be described concretely. If the similarity between the result of searching the left-eye image Ai with TA(i-1) and TA(i-1) itself is among the multiple highest similarities, the result searched with TA(i-1) is set as the position of the subject Z; the subject search unit 136 inputs the position of the subject Z and the left-eye image Ai to the template image generation unit 135, and the process proceeds to step S32. If it is not among them but the similarity between the result of searching the left-eye image Ai with the composite template image TM0-1 and TM0-1 itself is, the portion similar to TM0-1 is set as the position of the subject Z. If that similarity is not among the multiple highest similarities either, but the similarity between the result of searching with the composite template image TM0-2 and TM0-2 itself is, the portion similar to TM0-2 is set as the position of the subject Z. In this way, even when the most accurate detection result cannot be singled out, the possibility of losing the subject during tracking can be reduced while a certain level of accuracy is maintained.
  • If a plurality of results do not share the highest similarity (NO in step S48), the subject search unit 136 sets the search result with the highest similarity as the position of the subject Z (step S52) and inputs the position of the subject Z and the left-eye image Ai to the template image generation unit 135.
  • The template image generation unit 135 extracts from the left-eye image Ai a region that includes the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TAi (step S54).
  • The template image generation unit 135 generates a template image TBi from the right-eye image Bi by the same method as in step S12 (step S34).
  • As described above, the possibility of losing the subject during tracking can be reduced even when the subject moves or its orientation changes.
  • the subject tracking process is performed on the left-eye image A.
  • the subject tracking process may be performed on the right-eye image B, or the subject tracking process may be performed on both the left-eye image A and the right-eye image B. Processing may be performed.
  • the right eye image Bi is converted into the template image TB (i-1) and the combined template image TM (i-1) -1 to the same method as shown in FIG. Search with TM (i-1) -5.
  • In the present embodiment, when a plurality of results share the highest similarity (YES in step S48), the portion similar to the template image TA(i-1) or to the composite template image closest to TA(i-1) is set as the position of the subject Z (step S50); alternatively, when searches with a plurality of templates succeed (YES in step S46), that portion may be set as the position of the subject Z without calculating the similarity at all.
  • In the present embodiment, the composite template images TM(i-1)-1 to TM(i-1)-5 are generated from the template images TA(i-1) and TB(i-1) by image composition processing, but the method of generating a composite template is not limited to this. For example, after the feature points and corresponding points of TA(i-1) and TB(i-1) are aligned, the difference between the two may be extracted and the pixels that differ in luminance or hue masked, so that a template containing only the portion common to TA(i-1) and TB(i-1) is generated as the composite template.
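  • A sketch of this masking variant, assuming TA(i-1) and TB(i-1) are already aligned and equal-sized; the luminance and hue thresholds are illustrative, and hue wrap-around is ignored for brevity:

```python
import cv2
import numpy as np

def common_part_template(template_a, template_b, lum_thresh=20, hue_thresh=10):
    """Mask out pixels whose luminance or hue differs between the two
    aligned templates, keeping only their common part as the composite."""
    hsv_a = cv2.cvtColor(template_a, cv2.COLOR_BGR2HSV)
    hsv_b = cv2.cvtColor(template_b, cv2.COLOR_BGR2HSV)
    dv = cv2.absdiff(hsv_a[..., 2], hsv_b[..., 2])   # luminance difference
    dh = cv2.absdiff(hsv_a[..., 0], hsv_b[..., 0])   # hue difference
    common = (dv < lum_thresh) & (dh < hue_thresh)
    out = template_a.copy()
    out[~common] = 0                                  # differing pixels masked
    return out
```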
  • The composite template image to be generated is not limited to an image in an intermediate state between the template image TA(i-1) and the template image TB(i-1), that is, to an image generated by interpolation. An image outside TA(i-1) or TB(i-1) (that is, an image of the subject as seen from a viewpoint outside the right imaging system 12 or the left imaging system 13) may also be generated as a composite template image by extrapolation.
  • In the present embodiment, the left-eye image Ai is searched using the template image TA(i-1) and the composite template images TM(i-1)-1 to TM(i-1)-5, looking for portions similar to them (step S42), but the left-eye image Ai may instead be searched using only the composite template images TM(i-1)-1 to TM(i-1)-5. However, since a composite template image may be coarse, it is desirable to search with both the template images and the composite template images where possible. The left-eye image Ai may also be searched using the template images TA(i-1) and TB(i-1) together with the composite template images TM(i-1)-1 to TM(i-1)-5; in this case the processing takes longer, but the subject can be tracked reliably.
  • In the present embodiment, the left-eye image Ai is searched using the template image TA(i-1) and the composite template images TM(i-1)-1 to TM(i-1)-5 from the start, but the left-eye image Ai may first be searched using the template images TA(i-1) and TB(i-1), with the composite template images used for the search only when the subject is lost. This variation is described below with reference to FIG. 12.
  • FIG. 12 is a flowchart showing a flow of processing for tracking the subject Z with respect to the image A for the left eye continuously photographed at a predetermined frame rate. This process is controlled by the CPU 110. A program for causing the CPU 110 to execute this imaging process is stored in a program storage unit in the CPU 110.
  • Steps S10 to S22 are the same as in the first embodiment, so the description starts from step S60.
  • The subject search unit 136 determines whether the target was lost in the search of step S20, that is, whether a portion similar to the template images TA(i-1) and TB(i-1) was found as a result of searching the left-eye image Ai with them (step S60). If the target has not been lost (NO in step S60) and the search of the left-eye image Ai is complete (YES in step S22), the subject search unit 136 determines whether the result of searching the left-eye image Ai with TA(i-1) and the result of searching it with TB(i-1) are the same (step S24). Steps S24 to S30 are the same as in the first embodiment, and their description is omitted. Thereafter, the process proceeds to step S74.
  • If the target has been lost (YES in step S60), the composite template image generation unit 139 generates the composite template images TM(i-1)-1 to TM(i-1)-5 from the template images TA(i-1) and TB(i-1) (step S40). The subject search unit 136 then searches the left-eye image Ai using the composite template images TM(i-1)-1 to TM(i-1)-5, looking for portions similar to them (step S62).
  • The subject search unit 136 determines whether the search of the left-eye image Ai has ended (step S64); if not (NO in step S64), step S64 is performed again. When it has ended (YES in step S64), the subject search unit 136 determines whether searches with a plurality of templates succeeded, that is, whether search results were obtained with a plurality of templates (step S66). If not (NO in step S66), there is only one search result for the subject Z, so the subject search unit 136 inputs that search result (the position of the subject Z) and the left-eye image Ai to the template image generation unit 135, and the process proceeds to step S74. If searches with a plurality of templates succeeded (YES in step S66), results were obtained with a plurality of templates, so for each template image that produced a result the subject search unit 136 calculates the similarity between that template image and its search result, and then determines whether a plurality of results share the highest similarity (step S68). If so (YES in step S68), the subject search unit 136 sets the portion similar to the composite template image closest to the template image TA(i-1) as the position of the subject Z (step S70). If not (NO in step S68), the subject search unit 136 sets the search result with the highest similarity as the position of the subject Z (step S72). The subject search unit 136 inputs the position of the subject Z set in step S70 or S72 and the left-eye image Ai to the template image generation unit 135, and the process proceeds to step S74.
  • The template image generation unit 135 extracts from the left-eye image Ai, based on the position of the subject Z set in steps S22, S28, S30, S62, S70, and S72, a region that includes the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TAi (step S74). As in step S10, a rectangular region with a margin of several pixels around the contour of the subject Z is extracted and used as the template image.
  • As described above, tracking the subject with a plurality of templates representing different orientations as keys further reduces the possibility of losing the subject during tracking, even when the subject moves or its orientation changes. Moreover, since the composite template images are created, and a search with them performed, only when the search with the ordinary template images fails, no unnecessary processing is carried out and the processing time can be shortened.
  • The first embodiment of the present invention performs subject tracking by searching for the subject in at least one of the right-eye image and the left-eye image, using a template image generated by extracting a part of the right-eye image and a template image generated by extracting a part of the left-eye image; however, subject tracking may fail when the entire tracking target is hidden by an obstruction. The third embodiment addresses this case.
  • The compound-eye digital camera 3 according to the third embodiment is described below. Parts that are the same as in the first embodiment are denoted by the same reference numerals, and their description is omitted.
  • FIG. 13 is a block diagram showing the main internal configuration of the compound-eye digital camera 3.
  • The compound-eye digital camera 3 mainly includes a CPU 110, operation means (release switch 20, MENU/OK button 25, cross button 26, etc.) 112, SDRAM 114, VRAM 116, AF detection means 118, AE/AWB detection means 120, image sensors 122 and 123, CDS/AMPs 124 and 125, A/D converters 126 and 127, an image input controller 128, image signal processing means 130, compression/decompression processing means 132, a stereoscopic image generation unit 133, video encoder 134, template image generation unit 135, subject search unit 136, media controller 137, sound input processing unit 138, recording medium 140, template image regeneration unit 141, focus lens driving units 142 and 143, zoom lens driving units 144 and 145, aperture driving units 146 and 147, and timing generators (TG) 148 and 149.
  • When the tracking target subject is lost in one of the left-eye image A and the right-eye image B, the template image regeneration unit 141 generates, from the image in which the tracking target subject was still found, a template image for searching the image in which it was lost. That is, based on the search result in the image where the tracking target was found, a part of that image is extracted to generate a new template image. The subject search unit 136 uses the template image generated by the template image regeneration unit 141 to search the image in which the tracking target was lost. Details of the processing of the template image regeneration unit 141 are described later.
  • FIG. 14 is a flowchart showing a flow of processing for tracking the subject Z with respect to the image A for the left eye continuously photographed at a predetermined frame rate. This process is controlled by the CPU 110. A program for causing the CPU 110 to execute this imaging process is stored in a program storage unit in the CPU 110.
  • The left-eye image A taken immediately before the processing target frame, in this case the left-eye image A0 (see FIG. 15) taken immediately before the process of tracking the subject Z starts, is input to the template image generation unit 135. The template image generation unit 135 generates a template image TA0 (see FIG. 16) by extracting from the left-eye image A0 a region that includes the subject Z and is large enough for the shape of the subject Z to be recognized (step S10). Likewise, the right-eye image B photographed immediately before the processing target frame, in this case the right-eye image B0 (see FIG. 15) photographed immediately before the tracking process starts, or a right-eye image B whose photographing timing is as close as possible to the processing target frame, is input to the template image generation unit 135, and the template image generation unit 135 generates a template image TB0 (see FIG. 16) from the right-eye image B0 by the same method as in step S10 (step S12).
  • In step S14, photographing and processing of the first image is started; hereinafter, i denotes a positive integer (the frame number).
  • If the search of the left-eye image Ai succeeds, the subject search unit 136 inputs the search result (the position of the subject Z) and the left-eye image Ai to the template image generation unit 135, and the process proceeds to step S100. If the search fails, it is conceivable that the tracking target subject Z is hidden by an obstruction (here, a car C, another subject located in front of the subject Z) and cannot be seen.
  • In that case, the subject search unit 136 acquires the generated template image TB(i-1) from the template image generation unit 135. Since the right-eye image Bi acquired in step S16 has been input to the template image generation unit 135, the subject search unit 136 also acquires the right-eye image Bi from the template image generation unit 135 and searches it with TB(i-1). The subject search unit 136 then determines whether the search of the right-eye image Bi succeeded (step S86). If it failed, the subject search unit 136 performs tracking error processing (step S101); as the tracking error processing, for example, a message indicating a tracking error may be displayed on the monitor 16, but the processing is not limited to this.
  • If the search of the right-eye image Bi succeeds, the subject search unit 136 inputs the search result to the template image regeneration unit 141, and the template image regeneration unit 141 extracts a part of the right-eye image Bi to generate a template image TBi-L1 (step S88). The position of the template image TBi-L1 is the position estimated, from the positional relationship between the subject and the compound-eye digital camera, to be close to the position of the subject Z in the left-eye image A.
  • Suppose, for example, that the tracking target subject Z is found in step S84. The template image regeneration unit 141 sets a predetermined region including the subject Z as the template image TB1 (see the dotted lines in FIGS. 17 and 18); like the template image TB0, TB1 is extracted as a rectangular region with a margin of several pixels around the contour of the subject Z. When the subject Z is hidden by an obstruction, the obstruction is located on the left side of the subject Z in the viewpoint image to the right of the viewpoint being tracked. The template image regeneration unit 141 therefore extracts, from the right-eye image B1, the region moved left from the position of the template image TB1 by the width of TB1, and generates a template image TB1-L1 (see the dotted lines in FIGS. 17 and 18). The size of TB1-L1 is the same as that of TB1.
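  • A sketch of this regeneration rule: given the box where the subject was found in the right-eye image, take equally sized boxes shifted left by one and two template widths. The clipping at the image border is an added assumption:

```python
import numpy as np

def regenerate_shifted_templates(image_b: np.ndarray, bbox, shifts=(1, 2)):
    """Yield TBi-L1, TBi-L2, ...: boxes moved left of the found subject
    by one, two, ... template widths, clipped to the image."""
    x, y, w, h = bbox
    for n in shifts:
        x_shift = max(0, x - n * w)   # same size as TBi, shifted left
        yield image_b[y:y + h, x_shift:x_shift + w].copy()
```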
  • The subject search unit 136 searches the left-eye image Ai using the template image TBi-L1 generated by the template image regeneration unit 141 in step S88 (step S90), and determines whether the search succeeded (step S92). If it succeeded (YES in step S92), the subject search unit 136 inputs the search result (the position of the subject Z) and the left-eye image Ai to the template image generation unit 135, and the process proceeds to step S100.
  • If the search of the left-eye image Ai fails (NO in step S92), the search result is input to the template image regeneration unit 141, which, in the same manner as in step S88, extracts the region moved left from the position of the template image TBi-L1 by the width of TBi-L1 and generates a template image TBi-L2 (step S94). In the example above, a template image TB1-L2 (see the dotted line in FIG. 17) is generated from the right-eye image B1.
  • The subject search unit 136 searches the left-eye image Ai using the template image TBi-L2 generated by the template image regeneration unit 141 in step S94 (step S96), and determines whether the search succeeded (step S98). If it failed (NO in step S98), the subject search unit 136 performs tracking error processing (step S101). If it succeeded (YES in step S98), the subject search unit 136 inputs the search result (the position of the subject Z) and the left-eye image Ai to the template image generation unit 135, and the process proceeds to step S100.
  • If the search succeeded, the template image generation unit 135 extracts from the left-eye image Ai, based on the position of the subject Z set in step S80, a region that includes the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TAi (step S100). As in step S10, a rectangular region with a margin of several pixels around the contour of the subject Z is extracted and used as the template image. If it was determined in step S82 that the search failed, the template image generation unit 135 sets the template image TA(i-1) used in step S80 as the template image TAi (step S100); that is, the template image in which the tracking target subject Z was last found continues to be used for the next frame. In this way, even when the subject Z is hidden behind the car C in front and cannot be seen (see FIG. 17), the subject Z can be found again once the car C moves and the subject Z becomes visible.
  • As described above, when the subject is hidden, a search is performed at the position where the tracking target is estimated to be, based on the positional relationship between the subject and the compound-eye digital camera, so the possibility of losing sight of the subject can be reduced.
  • In the present embodiment, the template image TA(i-1) is used to search the left-eye image Ai (step S80), and only if this fails is the left-eye image Ai searched with the template image TBi-L1 generated by the template image regeneration unit 141; instead of step S80, however, the left-eye image Ai may be searched using the template images TA(i-1) and TB(i-1). In that case, the left-eye image Ai is searched with TA(i-1) and TB(i-1), and if this fails it is searched with the template image TBi-L1 generated by the template image regeneration unit 141. The next left-eye image Ai+1 may then again be searched using the template images TA(i-1) and TB(i-1).
  • In the present embodiment, the tracking error processing (step S102) is performed when the subject is lost; instead, when the subject is lost, the process may proceed to step S36 without performing the tracking error processing, and the next frame may be processed.
  • FIG. 20 is a flowchart showing the flow of processing for tracking the subject Z with respect to the right-eye image B continuously photographed at a predetermined frame rate.
  • This process is controlled by the CPU 110.
  • A program for causing the CPU 110 to execute this imaging process is stored in a program storage unit in the CPU 110.
  • Steps that are the same as those already described are denoted by the same reference numerals, and their description is omitted.
  • The subject search unit 136 acquires the generated template image TB(i-1) from the template image generation unit 135, searches the right-eye image Bi using TB(i-1), and looks for a portion similar to TB(i-1) in the right-eye image Bi (step S102). The subject search unit 136 then determines whether the search of the right-eye image Bi succeeded (step S104). If it succeeded (YES in step S104), the subject search unit 136 inputs the search result (the position of the subject Z) and the right-eye image Bi to the template image generation unit 135, and the process proceeds to step S122. If it failed, the subject search unit 136 acquires the generated template image TA(i-1) and the left-eye image Ai from the template image generation unit 135, searches the left-eye image Ai using TA(i-1), and looks for a portion similar to TA(i-1) in the left-eye image Ai (step S106).
  • The subject search unit 136 then determines whether the search of the left-eye image Ai succeeded (step S108). If it failed (NO in step S108), the subject search unit 136 performs tracking error processing (step S101).
  • If the search of the left-eye image Ai succeeded (YES in step S108), the subject search unit 136 inputs the search result to the template image regeneration unit 141, and the template image regeneration unit 141 extracts a part of the left-eye image Ai to generate a template image TAi-R1 (step S110). The position of the template image TAi-R1 is the position estimated, from the positional relationship between the subject and the compound-eye digital camera, to be close to the position of the subject Z in the right-eye image B. When the tracking target subject Z is hidden by an obstruction, the obstruction is located on the right side of the subject Z in the viewpoint image to the left of the viewpoint being tracked; that is, when the subject Z is hidden in the right-eye image B, the obstruction lies to the right of the subject Z in the left-eye image A. The template image regeneration unit 141 therefore extracts, from the left-eye image Ai, the region moved right from the position of the template image TAi by the width of TAi, and generates the template image TAi-R1. The size of TAi-R1 is the same as that of TAi.
  • The subject search unit 136 searches the right-eye image Bi using the template image TAi-R1 generated by the template image regeneration unit 141 in step S110 (step S112), and determines whether the search succeeded (step S114). If it succeeded (YES in step S114), the subject search unit 136 inputs the search result (the position of the subject Z) and the right-eye image Bi to the template image generation unit 135, and the process proceeds to step S122. If it failed (NO in step S114), the search result is input to the template image regeneration unit 141, which, in the same manner as in step S110, extracts the region moved right from the position of the template image TAi-R1 by the width of TAi-R1 and generates a template image TAi-R2 (step S116). The subject search unit 136 searches the right-eye image Bi using the template image TAi-R2 generated in step S116 (step S118), and determines whether the search succeeded (step S120). If it failed (NO in step S120), the subject search unit 136 performs tracking error processing (step S101); if it succeeded (YES in step S120), the subject search unit 136 inputs the search result (the position of the subject Z) and the right-eye image Bi to the template image generation unit 135, and the process proceeds to step S122.
  • The template image generation unit 135 extracts from the right-eye image Bi, based on the position of the subject Z set in step S102, a region that includes the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TBi (step S122). If it was determined in step S104 that the search failed, the template image generation unit 135 sets the template image TB(i-1) used in step S102 as the template image TBi (step S122).
  • As described above, performing a search at the position where the tracking target is estimated to be, based on the positional relationship between the subject and the compound-eye digital camera, further reduces the possibility of losing sight of the subject.
  • In the present embodiment, the template image regeneration unit 141 extracts the region moved left from the position of the template image TBi by the width of TBi to generate the template image TBi-L1, but the method of determining the position of TBi-L1 is not limited to this. The region moved left by half the width of TBi may be used as the position of TBi-L1; a region near TBi that is covered by a change in luminance or color may be used; or a region with many edges and contours may be used as the position of TBi-L1. The position of the subject Z may also be estimated from the left-eye image A and the right-eye image B and used as the position of TBi-L1.
  • An example of a method for estimating the position of the subject Z is as follows. The position (x0+dx, y0+dy) of the template image TA0 in the left-eye image A0 is calculated with reference to the position (x0, y0) of the template image TB0 in the right-eye image B0. This gives the difference (dx, dy) between the position of TB0 in the right-eye image B0 and the position of TA0 in the left-eye image A0. Accordingly, if the position of the template image TB1 in the right-eye image B1 is (x1, y1), the position of the tracking target subject Z in the left-eye image A1 can be estimated as (x1+dx, y1+dy), and a template image may be generated at that position. As a result, the template image regeneration unit 141 needs to create a template image only once, and the time required to find the subject again when it has been lost can be shortened.
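  • A sketch of the offset bookkeeping described above, where positions are template top-left corners:

```python
def estimate_position_in_left(pos_b0, pos_a0, pos_b1):
    """Estimate where subject Z sits in left-eye image A1.

    pos_b0: (x0, y0)         -- TB0 in right-eye image B0
    pos_a0: (x0+dx, y0+dy)   -- TA0 in left-eye image A0
    pos_b1: (x1, y1)         -- TB1 in right-eye image B1
    """
    dx = pos_a0[0] - pos_b0[0]
    dy = pos_a0[1] - pos_b0[1]
    return (pos_b1[0] + dx, pos_b1[1] + dy)   # (x1+dx, y1+dy)
```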
  • In the present embodiment, the template image regeneration unit 141 generates the template images TBi-L1 and TBi-L2 and performs error processing when the search with TBi-L2 fails, but the number of template images generated by the template image regeneration unit 141 is not limited to two and can be set arbitrarily.
  • The first embodiment of the present invention performs subject tracking by searching for the subject in at least one of the right-eye image and the left-eye image, using a template image generated by extracting a part of the right-eye image and a template image generated by extracting a part of the left-eye image; however, subject tracking may fail because of the background or foreground included in a template image. The fourth embodiment therefore removes the background and foreground from the template images generated by extracting parts of the right-eye and left-eye images, and performs subject tracking using the result.
  • The compound-eye digital camera according to the fourth embodiment is described below. Parts that are the same as in the first embodiment are denoted by the same reference numerals, and their description is omitted.
  • FIG. 23 is a block diagram showing the main internal configuration of the compound-eye digital camera of the fourth embodiment.
  • This camera mainly includes a CPU 110, operation means (release switch 20, MENU/OK button 25, cross button 26, etc.) 112, SDRAM 114, VRAM 116, AF detection means 118, AE/AWB detection means 120, image sensors 122 and 123, CDS/AMPs 124 and 125, A/D converters 126 and 127, an image input controller 128, image signal processing unit 130, compression/decompression processing unit 132, stereoscopic image generation unit 133, video encoder 134, subject search unit 136, media controller 137, sound input processing means 138, recording medium 140, focus lens driving means 142 and 143, zoom lens driving means 144 and 145, aperture driving means 146 and 147, timing generators (TG) 148 and 149, and a template image generation unit 150.
  • The template image generation unit 150 extracts a predetermined region (for example, a rectangle) including the tracking target subject Z from each of the left-eye image A and the right-eye image B, and generates template images. The template image generation unit 150 also generates template images with the background and foreground removed. Its processing is described in detail later.
  • FIG. 26 is a flowchart showing a flow of processing for tracking the subject Z with respect to the image A for the left eye continuously photographed at a predetermined frame rate. This process is controlled by the CPU 110. A program for causing the CPU 110 to execute this imaging process is stored in a program storage unit in the CPU 110.
  • The left-eye image A taken immediately before the processing target frame, in this case the left-eye image A0 taken immediately before the process of tracking the subject Z starts, is input to the template image generation unit 150. Likewise, the right-eye image B taken immediately before the processing target frame, in this case the right-eye image B0, is input to the template image generation unit 150.
  • The template image generation unit 150 generates parallax maps PA0 and PB0 (see FIG. 25) from the left-eye image A0 and the right-eye image B0 (step S130). A parallax map represents the amount of displacement between the left-eye image A and the right-eye image B; from it, the distance of each subject included in the image becomes clear. In FIG. 25, near distances are rendered with high density and far distances with low density. Various known methods can be used to generate the parallax map.
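  • The patent leaves the generation method open ("various known methods"); one common choice is OpenCV's block-matching stereo, sketched here for a rectified pair:

```python
import cv2

def parallax_map(left_bgr, right_bgr, num_disp=64, block=15):
    """Dense disparity (parallax) map from a rectified stereo pair.

    StereoBM returns fixed-point disparities scaled by 16.
    """
    gl = cv2.cvtColor(left_bgr, cv2.COLOR_BGR2GRAY)
    gr = cv2.cvtColor(right_bgr, cv2.COLOR_BGR2GRAY)
    stereo = cv2.StereoBM_create(numDisparities=num_disp, blockSize=block)
    return stereo.compute(gl, gr).astype(float) / 16.0
```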
  • The template image generation unit 150 extracts, from the left-eye image A captured immediately before the processing target frame (here, the left-eye image A0), a region that includes the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TA0 (step S10).
  • The template image generation unit 150 also extracts from the parallax map PA0 the same area as that of the template image TA0 in the left-eye image A0, and generates a template parallax map TPA0. It then marks as invalid the regions whose parallax is about 10 pixels or more away from the parallax of the subject Z on the far side, that is, the background, and the regions about 10 pixels or more away on the near side, that is, the foreground (step S132). In the case of the template image TA0 in FIG. 26, part of a car is included as background; no foreground is included here, but a tree, a telephone pole, or the like could appear as foreground. The value of ±10 pixels follows from the idea that the parallax within the subject Z is small while the parallax between the foreground or background and the subject Z is large; as long as this idea is respected, the threshold is not limited to ±10 pixels.
  • The template image generation unit 150 then generates a template image SA0 by removing the background and foreground from the template image TA0 (step S134). In step S134, the template image generation unit 150 generates, from the template parallax map TPA0, mask data that masks the invalid regions set in step S132, that is, everything outside the range of parallax within ±10 pixels of the parallax at the approximate center of TPA0. Using the template image TA0 and this mask data, it removes the background and foreground from TA0 and generates the template image SA0.
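  • A sketch of steps S132 and S134 under the assumptions above: the subject's parallax is read from the center of the template parallax map, and pixels outside the ±10-pixel parallax band are masked out:

```python
import numpy as np

def remove_background_foreground(template, template_parallax, band=10):
    """Build SA0 from TA0: keep only pixels whose parallax is within
    `band` pixels of the subject's parallax (approximate center of TPA0)."""
    h, w = template_parallax.shape
    subject_par = template_parallax[h // 2, w // 2]
    valid = np.abs(template_parallax - subject_par) < band
    out = template.copy()
    out[~valid] = 0        # background and foreground pixels masked
    return out
```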
  • In step S14, photographing and processing of the first image is started; hereinafter, i denotes a positive integer (the frame number).
  • The left imaging system 13 captures the left-eye image Ai (step S16). Since the subject tracking process is performed on the left-eye image Ai, the captured left-eye image Ai is input to the subject search unit 136. At the same time, the left-eye image Ai is input to the video encoder 134, sequentially converted into a display signal format, and output to the monitor 16.
  • The right imaging system 12 captures the right-eye image Bi (step S18). Since the subject tracking process is not performed on the right-eye image Bi, the captured right-eye image Bi is input to the template image generation unit 135. At the same time, the right-eye image Bi is input to the video encoder 134, sequentially converted into a display signal format, and output to the monitor 16.
  • The subject search unit 136 determines whether the search of the left-eye image Ai has been completed (step S138); if not (NO in step S138), step S138 is performed again. When the search is complete (YES in step S138), the subject search unit 136 determines whether the result of searching the left-eye image Ai with the template image TA(i-1) and the result of searching it with the background- and foreground-removed template image SA(i-1) are the same (step S140). If they are the same (YES in step S140), the portion similar to TA(i-1) and SA(i-1) is set as the position of the subject Z, and the subject search unit 136 inputs the search result and the left-eye image Ai to the template image generation unit 135; the process then proceeds to step S148.
  • If the results differ (NO in step S140), the subject search unit 136 calculates the similarity between the result of searching the left-eye image Ai with TA(i-1) and TA(i-1) itself, and the similarity between the result of searching with the background- and foreground-removed template image SA(i-1) and SA(i-1) itself, and determines whether the former is higher than the latter (step S142). If the similarity for TA(i-1) is higher (YES in step S142), the result of searching the left-eye image Ai with TA(i-1), that is, the portion similar to TA(i-1), is set as the position of the subject Z (step S144); otherwise (NO in step S142), the result of searching with SA(i-1), that is, the portion similar to the background- and foreground-removed template image SA(i-1), is set as the position of the subject Z (step S146). In either case, the subject search unit 136 inputs the position of the subject Z and the left-eye image Ai to the template image generation unit 135.
  • The template image generation unit 135 extracts from the left-eye image Ai a region that includes the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TAi (step S148). As in step S10, a rectangular region with a margin of several pixels around the contour of the subject Z is extracted and used as the template image.
  • In the present embodiment, the left-eye image Ai is searched using the template image TA(i-1) and the background- and foreground-removed template image SA(i-1), but it may be searched using only the background- and foreground-removed template image SA(i-1). Alternatively, the left-eye image Ai may first be searched using the template images TA(i-1) and TB(i-1), and only if this search fails may it be searched using the background- and foreground-removed template image SA(i-1). The left-eye image Ai may also be searched using a template image obtained by removing the background and foreground from the template image TB(i-1).
  • In the embodiments above, the case of capturing a live view image was described as an example, but the present invention can also be applied to movie recording. The only difference between live view shooting and movie shooting is that, while the continuously captured right-eye image B and left-eye image A are not recorded in the case of a live view image, in the case of movie shooting they are recorded on the recording medium 54. Since the process of recording the continuously captured right-eye image B data and left-eye image A data on the recording medium 54 is well known, its description is omitted.
  • In the embodiments above, the case of capturing two viewpoint images, the left-eye image A and the right-eye image B, was described as an example, but the present invention can also be applied when three or more viewpoint images are captured. In that case, the subject tracking process may be performed on at least one of the three or more viewpoint images. If subject tracking is performed on all viewpoint images, it becomes possible to optimize not only focus, zoom, and exposure but also the display, for example by overlaying a frame on the tracked subject Z or highlighting it. Since various known methods can be used for frame display and highlighting, their description is omitted.
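  • A sketch of the frame-overlay display, assuming the tracked position and template size are known:

```python
import cv2

def draw_tracking_frame(view, top_left, size, color=(0, 255, 0)):
    """Overlay a rectangle on the tracked subject Z in one viewpoint image."""
    x, y = top_left
    w, h = size
    cv2.rectangle(view, (x, y), (x + w, y + h), color, thickness=2)
    return view
```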
  • In the embodiments above, subject tracking is performed while the live view image is being captured, and AE metering and AF control are performed on the tracked subject when the S1 ON signal is input; that is, subject tracking is performed during live view shooting, and AE metering and AF control for the still image to be captured are then performed on the tracked subject. AE metering and AF control may instead be performed continuously on the tracked subject. Similarly, the frame display and highlighting may be performed continuously while the live view image is captured, and a subject may be searched for in a still-image viewpoint image using a template image extracted from a live-view viewpoint image.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

A region including a subject (Z) and having a size at which the shape of the subject (Z) can be identified is extracted from a left-eye image (A (i-1)) to create a template image (TA (i-1)) (step S10), and a predetermined region is extracted from a right-eye image (B (i-1)) by the same method to create a template image (TB (i-1)) (step S12). The created template images (TA (i-1), TB (i-1)) are used to search a left-eye image (Ai) for the subject (step S20). If the result of searching the left-eye image (Ai) by means of the template image (TA (i-1)) is not the same as the result of searching the left-eye image (Ai) by means of the template image (TB (i-1)), the result having the highest degree of similarity is employed (steps S24 to S30).

Description

Imaging apparatus, stereoscopic image imaging method, and program
The present invention relates to an imaging apparatus, a stereoscopic image imaging method, and a program, and more particularly to an imaging apparatus, a stereoscopic image imaging method, and a program capable of acquiring a plurality of viewpoint images obtained by capturing the same subject from a plurality of viewpoints.
Acquiring a plurality of images and detecting or tracking a subject included in them is commonly done. For example, Patent Document 1 describes acquiring two images taken by a stereo camera and detecting a head pose from the images using a database created in advance.
Patent Document 2 describes acquiring a plurality of images, detecting an object using distance information, extracting as a template information other than the distance information, such as a grayscale image or a ternary edge image, from the position of the object detected using the distance information, and tracking the object using this template.
Patent Document 3 is an invention for tracking a stationary subject in a time-series group of images. Specifically, a plurality of images are acquired, an initial tracking area (corresponding to a template) represented by boundary elements and image features is set in an image, a tracking area is detected in the next frame based on the initial tracking area, and a further tracking area is detected in the frame after that based on the detected one. That is, in the invention of Patent Document 3, the tracking area is updated successively as the subject is tracked: the shape of the current frame's tracking area is estimated from the boundary elements, a plurality of candidates is created by deforming the previous frame's tracking area according to the estimate, and the tracking area of the current frame is determined by comparing these candidates with the frame under examination.
Patent Document 1: JP 2009-169958 A
Patent Document 2: JP 11-252587 A
Patent Document 3: JP 2001-101419 A
However, the invention described in Patent Document 1 requires that the database used for detecting the head pose be created in advance, which takes time and effort. Moreover, the invention described in Patent Document 1 cannot detect a head that is not registered in the database; that is, when a photographer is shooting an arbitrary subject, it cannot satisfy the desire to track an object included in that image.
The invention described in Patent Document 2 has no such database-related problem. However, tracking is highly likely to fail when, for example, the orientation of the object to be detected changes.
The invention described in Patent Document 3 can solve the problem of losing sight of the tracking target when, for example, the orientation of the object changes. However, the invention described in Patent Document 3 targets a stationary subject and cannot be applied when the subject moves, because for a moving subject the shape of the tracking area in the current frame cannot be estimated from the boundary elements.
The present invention has been made in view of these circumstances, and an object thereof is to provide an imaging apparatus, a stereoscopic image capturing method, and a program that can reduce the possibility of losing sight of a subject during tracking even when the subject moves or changes orientation.
To achieve the above object, an imaging apparatus according to one aspect of the present invention comprises: a first imaging means and a second imaging means for acquiring two viewpoint images obtained by photographing the same subject from two viewpoints; a first template image generation means for extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging means to generate a first template image, and extracting a partial region including the subject to be tracked from the viewpoint image captured by the second imaging means to generate a second template image; and a search means for searching for the subject to be tracked, using the first template image and the second template image, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based.
According to this aspect of the present invention, a first template image is generated by extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging means, and a second template image is generated by extracting a partial region including the subject to be tracked from the viewpoint image captured by the second imaging means. Then, the subject to be tracked is searched for, using the first template image and the second template image, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based. In other words, the subject is tracked using images of the subject seen from different directions as keys. Consequently, even when the subject moves or changes orientation, the possibility of losing sight of it during tracking is reduced. Moreover, because the subject tracking processing uses template images extracted from the parallax images, the tracking is accurate, and as a result the amount of computation can be reduced.
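By way of illustration only, the dual-template search described above can be sketched as follows. This is a minimal sketch assuming OpenCV-style normalized template matching; the function names, window parameters, and choice of matching score are illustrative assumptions, not the implementation of the embodiments.

    import cv2

    def make_template(image, x, y, w, h):
        # Extract a partial region just large enough for the subject's
        # shape to be identified (illustrative cropping only).
        return image[y:y + h, x:x + w]

    def search_with_templates(frame, templates):
        # Match each template against the new frame from the first imaging
        # means and keep the location with the highest similarity score.
        best_score, best_loc = -1.0, None
        for tpl in templates:
            result = cv2.matchTemplate(frame, tpl, cv2.TM_CCOEFF_NORMED)
            _, score, _, loc = cv2.minMaxLoc(result)
            if score > best_score:
                best_score, best_loc = score, loc
        return best_loc, best_score

    # template_a comes from the left-eye image and template_b from the
    # right-eye image, so the two keys show the subject from slightly
    # different directions:
    # loc, score = search_with_templates(next_left_frame,
    #                                    [template_a, template_b])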
In an imaging apparatus according to another aspect of the present invention, a second template image generation means is provided that, when the search means fails to find the subject to be tracked, generates a composite template image from the first template image and the second template image by image synthesis processing; when the composite template image is generated by the second template image generation means, the search means searches for the subject to be tracked using the composite template image.
According to this aspect, when the subject to be tracked is not found by the search means, a composite template image is generated from the first template image and the second template image by image synthesis processing, and the subject to be tracked is searched for using this composite template image. This further reduces the possibility of losing sight of the subject during tracking.
In an imaging apparatus according to still another aspect of the present invention, the composite template image is an image in an intermediate state between the first template image and the second template image.
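As a hedged illustration only, such an intermediate-state composite could be approximated by weighted blending of the two templates, assuming both have been cropped to the same size; the actual image synthesis processing of the embodiments may be more elaborate.

    import cv2

    def composite_template(tpl_a, tpl_b, alpha=0.5):
        # Blend the first and second template images; alpha = 0.5 yields
        # an image roughly halfway between the two viewpoints.
        return cv2.addWeighted(tpl_a, alpha, tpl_b, 1.0 - alpha, 0)

    # Several intermediate states, as used by the aspect that generates a
    # plurality of composite templates, can be produced by varying alpha:
    # candidates = [composite_template(a, b, w) for w in (0.75, 0.5, 0.25)]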
In an imaging apparatus according to still another aspect of the present invention, when a plurality of search results are obtained, the search means calculates, for each of the plurality of search results, the degree of similarity between the template image that yielded the result and the result found using that template image, and takes the result found using the template image with the highest calculated similarity as the subject to be tracked.
According to this aspect, when a plurality of search results are obtained, the degree of similarity between each template image that yielded a result and the result found using it is calculated, and the result found using the template image with the highest calculated similarity is taken as the subject to be tracked. This increases the accuracy of subject tracking.
In an imaging apparatus according to still another aspect of the present invention, when a plurality of search results are obtained and a result has been obtained using the first template image, the search means takes the result found using the first template image as the subject to be tracked. This increases the accuracy of subject tracking.
According to this aspect, when a search result is obtained using the first template image, the result found using the first template image is taken as the subject to be tracked. In other words, a subject similar to the subject extracted from the same viewpoint image becomes the subject to be tracked, which increases the accuracy of subject tracking.
In an imaging apparatus according to still another aspect of the present invention, the second template image generation means generates a plurality of types of composite template images, and when a plurality of search results are obtained and no result has been obtained using the first template image, the search means takes as the subject to be tracked the result found using the composite template image closest to the first template image among the plurality of types of generated composite template images.
According to this aspect, when no search result is obtained using the first template image, the result found using the composite template image closest to the first template image among the plurality of types of generated composite template images is taken as the subject to be tracked. This reduces the possibility of losing sight of the subject during tracking while maintaining accuracy.
In an imaging apparatus according to still another aspect of the present invention, a third template image generation means is provided that, when the search means fails to find the subject to be tracked, extracts from the viewpoint image captured by the second imaging means a region shifted in the left-right direction by an arbitrary amount from the region extracted when generating the second template image, to generate a third template image; the search means searches for the subject to be tracked using the generated third template image.
According to this aspect, when the subject to be tracked is not found, a region shifted in the left-right direction by an arbitrary amount from the region extracted when generating the second template image is extracted from the viewpoint image captured by the second imaging means to generate a third template image, and the subject to be tracked is searched for using this third template image. As a result, even when the subject to be tracked is blocked by another subject, the possibility of losing sight of it can be reduced.
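Purely as a sketch (NumPy-style slicing; the clamping policy at the image border is an assumption), the third template can be cut from the second viewpoint image with its extraction window shifted horizontally:

    def shifted_template(image, x, y, w, h, shift):
        # Move the extraction window left or right by `shift` pixels,
        # clamp it to the image border, and crop the shifted region.
        x2 = max(0, min(image.shape[1] - w, x + shift))
        return image[y:y + h, x2:x2 + w]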
In an imaging apparatus according to still another aspect of the present invention, when no search result is obtained by the search means using the third template image, the third template image generation means extracts, from the viewpoint image captured by the second imaging means, a region shifted in the left-right direction by a predetermined amount from the region extracted when generating the third template image, to generate a fourth template image, and the search means searches for the subject to be tracked using the generated fourth template image.
According to this aspect, when no search result is obtained using the third template image, a region shifted in the left-right direction by a predetermined amount from the region extracted when generating the third template image is extracted from the viewpoint image captured by the second imaging means to generate a fourth template image, and the subject to be tracked is searched for using this fourth template image. This further reduces the possibility of losing sight of the subject.
In an imaging apparatus according to still another aspect of the present invention, the third template image generation means estimates, based on the two viewpoint images, the position where the subject to be tracked is likely to be, and determines the arbitrary amount accordingly. As a result, the template image only needs to be created once, and the time required for tracking the subject can be shortened.
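One plausible way to derive that amount, sketched here under the assumption that the subject's horizontal disparity between the two viewpoint images approximates the required shift (the matching-based estimate below is an illustrative choice, not the method prescribed by the embodiments):

    import cv2

    def estimate_shift(other_view, template, x_known):
        # Locate the subject in the other viewpoint image and use the
        # horizontal offset from its known position as the shift amount.
        result = cv2.matchTemplate(other_view, template, cv2.TM_CCOEFF_NORMED)
        _, _, _, (x_found, _) = cv2.minMaxLoc(result)
        return x_found - x_known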
In an imaging apparatus according to still another aspect of the present invention, the first imaging means and the second imaging means continuously acquire the two viewpoint images, and when the search means has found the subject to be tracked using a template image generated by the third template image generation means, it searches the viewpoint images subsequently captured by the first imaging means using the first template image and the second template image.
According to this aspect, when the subject to be tracked is not found in the first search but is found in the second or a later search, the first template image and the second template image are used for the viewpoint images subsequently captured by the first imaging means. As a result, when the subject to be tracked has been hidden behind an obstruction in front of it, the subject can be searched for accurately once the obstruction moves and the subject becomes visible again.
An imaging apparatus according to still another aspect of the present invention comprises an automatic exposure control means for performing automatic exposure control based on the subject to be tracked found by the search means, an automatic focus adjustment means for adjusting the focus so that the subject to be tracked found by the search means is in focus, or a zoom control means for adjusting the angle of view based on the subject to be tracked found by the search means.
According to this aspect, automatic exposure control, automatic focus adjustment, or zoom control is performed based on the found subject to be tracked, so that appropriate control can be performed automatically.
An imaging apparatus according to still another aspect of the present invention comprises: a first imaging means and a second imaging means for acquiring two viewpoint images obtained by photographing the same subject from two viewpoints; a template image generation means for generating a first template image by extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging means; a parallax acquisition means for acquiring parallax from the two viewpoint images; a fourth template image generation means for generating a template image in which the background and foreground have been removed from the first template image based on the parallax acquired by the parallax acquisition means; and a search means for searching for the subject to be tracked, using the template image generated by the fourth template image generation means, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based.
According to this aspect, a first template image is generated by extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging means, parallax is acquired from the two viewpoint images, and a template image in which the background and foreground have been removed from the first template image is generated based on this parallax. Then, the subject to be tracked is searched for, using the generated template image, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based. As a result, even when the background changes because the subject has moved, or when a foreground object enters in front of the subject, the possibility of losing sight of the subject can be reduced.
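For illustration, a minimal sketch of such background and foreground removal, assuming a per-pixel disparity map aligned with the template region is available and treating the tolerance value as a hypothetical tuning parameter:

    import numpy as np

    def strip_back_and_foreground(template, disparity, subject_disparity, tol=2.0):
        # Keep only pixels whose disparity is close to the subject's;
        # background (farther) and foreground (nearer) pixels are zeroed
        # so they no longer influence the matching score.
        mask = np.abs(disparity - subject_disparity) <= tol
        if template.ndim == 3:
            mask = mask[..., None]
        return template * mask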
In an imaging apparatus according to still another aspect of the present invention, the search means searches for the subject to be tracked using the first template image generated by the template image generation means and the template image generated by the fourth template image generation means.
According to this aspect, the subject to be tracked is searched for using both the template image from which the background and foreground have not been removed and the template image from which they have been removed. This further reduces the possibility of losing sight of the subject.
A stereoscopic image capturing method according to still another aspect of the present invention includes: a step of acquiring, with a first imaging means and a second imaging means, two viewpoint images obtained by photographing the same subject from two viewpoints; a step of extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging means to generate a first template image, and extracting a partial region including the subject to be tracked from the viewpoint image captured by the second imaging means to generate a second template image; and a step of searching for the subject to be tracked, using the first template image and the second template image, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based.
A program according to still another aspect of the present invention causes an arithmetic device to execute: a step of acquiring, with a first imaging means and a second imaging means, two viewpoint images obtained by photographing the same subject from two viewpoints; a step of extracting a partial region including the subject to be tracked from the viewpoint image captured by the first imaging means to generate a first template image, and extracting a partial region including the subject to be tracked from the viewpoint image captured by the second imaging means to generate a second template image; and a step of searching for the subject to be tracked, using the first template image and the second template image, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based. A non-transitory computer-readable medium on which this program is recorded is also included in the present invention.
According to the present invention, even when the subject moves or changes orientation, the possibility of losing sight of the subject during tracking can be reduced.
FIG. 1 is a schematic view of a compound-eye digital camera 1 according to a first embodiment of the present invention, in which (a) is a front view and (b) is a rear view.
FIG. 2 is a block diagram showing the electrical configuration of the compound-eye digital camera 1.
FIG. 3 is a diagram showing the positional relationship between a subject and the compound-eye digital camera 1.
FIG. 4 is a flowchart showing the flow of subject tracking processing of the compound-eye digital camera 1.
FIG. 5 is an example of parallax images.
FIG. 6 is a diagram for explaining a method of creating a template image from parallax images.
FIG. 7 is an example of a template image.
FIG. 8 is a diagram for explaining a method of creating a template image from parallax images.
FIG. 9 is a block diagram showing the electrical configuration of a compound-eye digital camera 2 according to a second embodiment of the present invention.
FIG. 10 is a diagram for explaining a method of generating a template image by image synthesis processing.
FIG. 11 is a flowchart showing the flow of subject tracking processing of the compound-eye digital camera 2.
FIG. 12 is a flowchart showing the flow of subject tracking processing in a modification of the compound-eye digital camera 2.
FIG. 13 is a block diagram showing the electrical configuration of a compound-eye digital camera 3 according to a third embodiment of the present invention.
FIG. 14 is a flowchart showing the flow of subject tracking processing of the compound-eye digital camera 3.
FIG. 15 is a diagram for explaining a method of creating a template image from parallax images.
FIG. 16 is an example of a template image.
FIG. 17 is a diagram for explaining a method of creating a template image from parallax images.
FIG. 18 is an example of a template image.
FIG. 19 is a diagram showing the positional relationship between a subject and the compound-eye digital camera 1.
FIG. 20 is a flowchart showing the flow of subject tracking processing of a modification of the compound-eye digital camera 3.
FIG. 21 is a diagram for explaining a method of creating a template image from parallax images.
FIG. 22 is a diagram for explaining a method of creating a template image from parallax images.
FIG. 23 is a block diagram showing the electrical configuration of a compound-eye digital camera 4 according to a fourth embodiment of the present invention.
FIG. 24 is a flowchart showing the flow of subject tracking processing of a modification of the compound-eye digital camera 4.
FIG. 25 is an example of a parallax map.
FIG. 26 is a diagram for explaining a method of creating a template image.
Hereinafter, the best mode for carrying out an imaging apparatus, a stereoscopic image capturing method, and a program according to the present invention will be described in detail with reference to the accompanying drawings.
<First Embodiment>
FIG. 1 is a schematic view of a compound-eye digital camera 1 having a stereoscopic image display device according to the present invention, in which (a) is a front view and (b) is a rear view. The compound-eye digital camera 1 is provided with a plurality of imaging systems (two are illustrated in FIG. 1) and can shoot stereoscopic images of the same subject viewed from a plurality of viewpoints (two viewpoints, left and right, are illustrated in FIG. 1) as well as single viewpoint images (two-dimensional images). The compound-eye digital camera 1 can also record and reproduce not only still images but also moving images and sound.
The camera body 10 of the compound-eye digital camera 1 is formed in a substantially rectangular parallelepiped box shape. As shown in FIG. 1(a), a barrier 11, a right imaging system 12, a left imaging system 13, a flash 14, and a microphone 15 are mainly provided on its front surface, and a release switch 20 and a zoom button 21 are mainly provided on the upper surface of the camera body 10.
On the other hand, as shown in FIG. 1(b), a monitor 16, a mode button 22, a parallax adjustment button 23, a 2D/3D switching button 24, a MENU/OK button 25, a cross button 26, and a DISP/BACK button 27 are provided on the back of the camera body 10.
The barrier 11 is slidably mounted on the front surface of the camera body 10 and is switched between an open state and a closed state by sliding up and down. Normally, as indicated by the dotted line in FIG. 1(a), the barrier 11 is positioned at the upper end, that is, in the closed state, and the objective lenses 12a and 13a and the like are covered by the barrier 11, which protects the lenses and other parts from damage. When the barrier 11 is slid to the lower end, that is, to the open state (see the solid line in FIG. 1(a)), the lenses and the like disposed on the front surface of the camera body 10 are exposed. When a sensor (not shown) recognizes that the barrier 11 is in the open state, the power is turned on by the CPU 110 (see FIG. 2) and photographing becomes possible.
The right imaging system 12, which captures an image for the right eye, and the left imaging system 13, which captures an image for the left eye, acquire two viewpoint images obtained by photographing the same subject from two viewpoints, as shown in FIG. 3. The right imaging system 12 and the left imaging system 13 are optical units including photographing lens groups having bending optical systems, diaphragm-combined mechanical shutters 12d and 13d, and image sensors 122 and 123 (see FIG. 2). The photographing lens groups of the right imaging system 12 and the left imaging system 13 are mainly composed of objective lenses 12a and 13a that take in light from the subject, prisms (not shown) that bend the optical path incident from the objective lenses substantially vertically, zoom lenses 12c and 13c (see FIG. 2), focus lenses 12b and 13b (see FIG. 2), and the like.
The flash 14 is composed of a xenon tube and emits light as necessary, for example when photographing a dark subject or when shooting against backlight.
The monitor 16 is a liquid crystal monitor capable of color display with a common 4:3 aspect ratio, and can display both stereoscopic images and planar images. Although its detailed structure is not illustrated, the monitor 16 is a parallax barrier type 3D monitor having a parallax barrier display layer on its surface. The monitor 16 is used as a user interface display panel when performing various setting operations, and as an electronic viewfinder during image capturing.
The monitor 16 can be switched between a mode for displaying stereoscopic images (3D mode) and a mode for displaying planar images (2D mode). In the 3D mode, a parallax barrier consisting of a pattern in which light transmitting portions and light shielding portions are alternately arranged at a predetermined pitch is generated on the parallax barrier display layer of the monitor 16, and strip-shaped image fragments representing the left and right images are alternately arranged and displayed on the image display surface below it. When used in the 2D mode or as a user interface display panel, nothing is displayed on the parallax barrier display layer, and a single image is displayed as it is on the image display surface below.
The monitor 16 is not limited to the parallax barrier type; a lenticular method, an integral photography method using a microlens array sheet, a holography method using an interference phenomenon, or the like may be adopted. The monitor 16 is also not limited to a liquid crystal monitor; an organic EL display or the like may be adopted.
The release switch 20 is a two-stage stroke switch with a so-called "half press" and "full press". During still image shooting (for example, when the still image shooting mode is selected with the mode button 22 or from the menu), half-pressing the release switch 20 causes the compound-eye digital camera 1 to perform shooting preparation processing, that is, AE (Automatic Exposure), AF (Auto Focus), and AWB (Automatic White Balance), and fully pressing it causes the camera to perform image capturing and recording processing. During movie shooting (for example, when the movie shooting mode is selected with the mode button 22 or from the menu), fully pressing the release switch 20 starts movie shooting, and fully pressing it again ends shooting.
The zoom button 21 is used for zoom operations of the right imaging system 12 and the left imaging system 13, and consists of a zoom tele button 21T for instructing zooming to the telephoto side and a zoom wide button 21W for instructing zooming to the wide-angle side.
The mode button 22 functions as shooting mode setting means for setting the shooting mode of the digital camera 1; depending on its setting position, the shooting mode of the digital camera 1 is set to one of various modes. The shooting modes are divided into a "movie shooting mode" for shooting moving images and a "still image shooting mode" for shooting still images. The "still image shooting mode" includes, for example, an "auto shooting mode" in which the aperture, shutter speed, and the like are set automatically by the digital camera 1, a "face extraction shooting mode" in which a person's face is extracted for shooting, a "sports shooting mode" suitable for photographing moving bodies, a "landscape shooting mode" suitable for photographing scenery, a "night scene shooting mode" suitable for photographing evening and night scenes, an "aperture priority shooting mode" in which the user sets the aperture value and the digital camera 1 automatically sets the shutter speed, a "shutter speed priority shooting mode" in which the user sets the shutter speed and the digital camera 1 automatically sets the aperture value, and a "manual shooting mode" in which the user sets the aperture, shutter speed, and the like.
The parallax adjustment button 23 is a button for electronically adjusting the parallax during stereoscopic image shooting. Pressing the right side of the parallax adjustment button 23 increases the parallax between the image captured by the right imaging system 12 and the image captured by the left imaging system 13 by a predetermined amount, and pressing its left side decreases that parallax by a predetermined amount.
The 2D/3D switching button 24 is a switch for instructing switching between a 2D shooting mode for shooting single viewpoint images and a 3D shooting mode for shooting multi-viewpoint images.
The MENU/OK button 25 is used for calling up various setting screens (menu screens) for the shooting and playback functions (MENU function), as well as for confirming a selection and instructing execution of processing (OK function); all adjustment items of the compound-eye digital camera 1 are set through it. When the MENU/OK button 25 is pressed during shooting, a setting screen for image quality adjustments such as exposure value, hue, ISO sensitivity, and the number of recorded pixels is displayed on the monitor 16; when it is pressed during playback, a setting screen for operations such as erasing images is displayed on the monitor 16. The compound-eye digital camera 1 operates according to the conditions set on this menu screen.
The cross button 26 is a button for setting and selecting various menus or for zooming, and can be pressed in four directions: up, down, left, and right. A function corresponding to the camera's setting state is assigned to the button in each direction. For example, during shooting, a function for switching the macro function ON/OFF is assigned to the left button, and a function for switching the flash mode is assigned to the right button. A function for changing the brightness of the monitor 16 is assigned to the up button, and a function for switching the self-timer ON/OFF and its duration is assigned to the down button. During playback, a frame advance function is assigned to the right button and a frame return function to the left button, and a function for deleting the image being played back is assigned to the up button. During various settings, a function for moving the cursor displayed on the monitor 16 in the direction of each button is assigned.
The DISP/BACK button 27 functions as a button for instructing display switching of the monitor 16. When the DISP/BACK button 27 is pressed during shooting, the display on the monitor 16 is switched in the order ON, framing guide display, OFF. When it is pressed during playback, the display is switched in the order normal playback, playback without character display, multi playback. The DISP/BACK button 27 also functions as a button for canceling an input operation or for instructing a return to the previous operation state.
FIG. 2 is a block diagram showing the main internal configuration of the compound-eye digital camera 1. The compound-eye digital camera 1 mainly comprises a CPU 110, operation means 112 (the release switch 20, the MENU/OK button 25, the cross button 26, and the like), an SDRAM 114, a VRAM 116, AF detection means 118, AE/AWB detection means 120, image sensors 122 and 123, CDS/AMPs 124 and 125, A/D converters 126 and 127, an image input controller 128, image signal processing means 130, compression/decompression processing means 132, a stereoscopic image generation unit 133, a video encoder 134, a template image generation unit 135, a subject search unit 136, a media controller 137, sound input processing means 138, a recording medium 140, focus lens driving means 142 and 143, zoom lens driving means 144 and 145, aperture driving means 146 and 147, and timing generators (TG) 148 and 149.
The CPU 110 comprehensively controls the overall operation of the compound-eye digital camera 1. The CPU 110 issues commands to each block in response to inputs from the operation means 112, and controls the operation of the right imaging system 12 and the left imaging system 13. The right imaging system 12 and the left imaging system 13 basically operate in conjunction with each other, but can also be operated individually. The CPU 110 also turns the two sets of image data obtained by the right imaging system 12 and the left imaging system 13 into strip-shaped image fragments and generates display image data in which these are displayed alternately on the monitor 16. When displaying in the 3D mode, a parallax barrier consisting of a pattern in which light transmitting portions and light shielding portions are alternately arranged at a predetermined pitch is generated on the parallax barrier display layer, and strip-shaped image fragments representing the left and right images are alternately arranged and displayed on the image display surface below it, thereby enabling stereoscopic viewing.
The SDRAM 114 stores the firmware, which is the control program executed by the CPU 110, various data necessary for control, camera setting values, captured image data, and the like.
The VRAM 116 is used as a work area for the CPU 110 and as a temporary storage area for image data.
The image sensors 122 and 123 are color CCDs provided with R, G, and B color filters in a predetermined color filter array (for example, a honeycomb array or a Bayer array). The image sensors 122 and 123 receive the subject light formed by the focus lenses 12b and 13b, the zoom lenses 12c and 13c, and the like; the light incident on each light receiving surface is converted by the photodiodes arrayed on that surface into an amount of signal charge corresponding to the amount of incident light. In the photocharge accumulation and transfer operations of the image sensors 122 and 123, the electronic shutter speed (photocharge accumulation time) is determined based on charge discharge pulses input from the TGs 148 and 149, respectively.
That is, while charge discharge pulses are being input to the image sensors 122 and 123, charges are discharged without being accumulated. When the charge discharge pulses stop being input, the charges are no longer discharged, so charge accumulation, that is, exposure, starts in the image sensors 122 and 123. The imaging signals acquired by the image sensors 122 and 123 are output to the CDS/AMPs 124 and 125 based on drive pulses given from the TGs 148 and 149, respectively.
The CDS/AMPs 124 and 125 perform correlated double sampling processing on the image signals output from the image sensors 122 and 123 (processing that obtains accurate pixel data by taking the difference between the feedthrough component level and the pixel signal component level contained in the output signal of each pixel of the image sensor, with the aim of reducing noise, particularly thermal noise, contained in the output signals), and amplify the signals to generate analog R, G, and B image signals.
The A/D converters 126 and 127 convert the analog R, G, and B image signals generated by the CDS/AMPs 124 and 125 into digital image signals.
The image input controller 128 has a built-in line buffer of a predetermined capacity and, in accordance with commands from the CPU 110, accumulates the image signal for one image output from the CDS/AMP and A/D conversion means and records it in the VRAM 116.
The image signal processing means 130 includes a synchronization circuit (a processing circuit that interpolates the spatial displacement of the color signals caused by the color filter array of the single-chip CCD and converts the color signals into a synchronized form), a white balance correction circuit, a gamma correction circuit, a contour correction circuit, a luminance/color difference signal generation circuit, and the like. In accordance with commands from the CPU 110, it applies the required signal processing to the input image signals to generate image data (YUV data) consisting of luminance data (Y data) and color difference data (Cr, Cb data). Hereinafter, the image data generated from the image signal output from the image sensor 122 is referred to as the right-eye image B, and the image data generated from the image signal output from the image sensor 123 is referred to as the left-eye image A.
The left-eye image A and the right-eye image B (3D image data) processed by the image signal processing means 130 are input to the VRAM 50. The VRAM 50 includes an A area and a B area, each of which stores 3D image data representing one frame of a 3D image. In the VRAM 50, the 3D image data representing one frame of the 3D image is rewritten alternately in the A area and the B area, and the written 3D image data is read from whichever of the two areas is not currently being rewritten.
The stereoscopic image generation unit 133 processes the 3D image data read from the VRAM 50, or the uncompressed 3D image data read from the recording medium 140 and generated by the compression/decompression processing means 132, so that it can be displayed stereoscopically on the monitor 16. For example, in the case of a parallax barrier monitor, the stereoscopic image generation unit 133 divides each of the right-eye image B and the left-eye image A used for reproduction into strips, and generates display image data in which the strip-shaped right-eye image B and left-eye image A are arranged alternately. The display image data is output from the stereoscopic image generation unit 133 to the monitor 16 via the video encoder 134.
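For example, the column-wise interleaving of the two images for a parallax barrier could be sketched as below (NumPy; this assumes equal-sized left and right images and a one-pixel strip pitch, whereas the actual strip width depends on the barrier pitch):

    import numpy as np

    def interleave_for_barrier(left, right):
        # Alternate vertical strips of the left- and right-eye images so
        # the barrier steers each strip to the corresponding eye.
        out = right.copy()
        out[:, ::2] = left[:, ::2]
        return out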
The video encoder 134 controls display on the monitor 16. That is, it converts the display image data and the like generated by the stereoscopic image generation unit 133 into a video signal for display on the monitor 16 (for example, an NTSC, PAL, or SECAM signal), outputs it to the monitor 16, and also outputs predetermined character and graphic information to the monitor 16 as necessary.
As a result, the right-eye image B and the left-eye image A are stereoscopically displayed on the monitor 16. When the 3D image data representing one frame of the 3D image is alternately rewritten in the VRAM 50 and the written 3D image data is read from the area other than the one being rewritten, 3D images are displayed continuously on the monitor 16 in real time (display of a live view image (through image)).
The left-eye image A and the right-eye image B processed by the image signal processing means 130 are also input to the template image generation unit 135. The template image generation unit 135 extracts a predetermined region (for example, a rectangle) including the subject Z to be tracked from each of the left-eye image A and the right-eye image B to generate template images. The details of the processing performed by the template image generation unit 135 will be described later.
Furthermore, the left-eye image A and the right-eye image B processed by the image signal processing means 130 are input to the subject search unit 136. The subject search unit 136 uses the template images generated by the template image generation unit 135 to search at least one of the left-eye image A and the right-eye image B for a portion similar to a template image. The subject Z is thereby searched for in at least one of the left-eye image A and the right-eye image B. The details of the processing performed by the subject search unit 136 will be described later.
At least one of the left-eye image A and the right-eye image B in which the subject Z has been found by the subject search unit 136 is input to the template image generation unit 135, which extracts a predetermined region including the subject Z and generates a template image.
The AF detection means 118 calculates, in accordance with commands from the CPU 110, the physical quantities necessary for AF control from the input image signals so that the subject Z found by the subject search unit 136 is brought into focus. The AF detection means 118 consists of a right imaging system AF control circuit that performs AF control based on the image signal input from the right imaging system 12, and a left imaging system AF control circuit that performs AF control based on the image signal input from the left imaging system 13. In the digital camera 1 of the present embodiment, AF control is performed based on the contrast of the images obtained from the image sensors 122 and 123 (so-called contrast AF), and the AF detection means 118 calculates a focus evaluation value indicating the sharpness of the image from the input image signal. The CPU 110 detects the position at which the focus evaluation value calculated by the AF detection means 118 is maximized and moves the focus lens group to that position. That is, the focus lens group is moved from the closest distance to infinity in predetermined steps, the focus evaluation value is acquired at each position, and the position with the maximum focus evaluation value is taken as the in-focus position, to which the focus lens group is moved.
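Schematically, this contrast AF can be sketched as follows, assuming a Laplacian-variance sharpness measure (one common focus evaluation value; the exact evaluation function is not specified here) and hypothetical capture_roi and move_lens interfaces standing in for the camera hardware:

    import cv2

    def focus_value(gray_roi):
        # Focus evaluation value: variance of the Laplacian, which grows
        # as the region around the tracked subject gets sharper.
        return cv2.Laplacian(gray_roi, cv2.CV_64F).var()

    def contrast_af(capture_roi, move_lens, positions):
        # Step the focus lens from near to infinity, score each position,
        # then drive the lens back to the maximum-score position.
        scores = []
        for p in positions:
            move_lens(p)                               # hypothetical driver
            scores.append(focus_value(capture_roi()))  # hypothetical capture
        best = positions[scores.index(max(scores))]
        move_lens(best)
        return best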
The focus lens driving means 142 and 143 move the focus lenses 12b and 13b in the optical axis direction in accordance with commands from the CPU 110, varying the focus position so that the subject Z found by the subject search unit 136 is in focus.
The zoom lens driving means 144 and 145 move the zoom lenses 12c and 13c in the optical axis direction in accordance with commands from the CPU 110, varying the focal length in response to the photographer's instructions or so that the subject Z found by the subject search unit 136 appears at a predetermined size.
The AE/AWB detection means 120 calculates, in accordance with commands from the CPU 110, the physical quantities necessary for AE control and AWB control from the input image signals. For example, as a physical quantity necessary for AE control, the AE/AWB detection means 120 divides one screen into a plurality of areas (for example, 16 x 16) and calculates the integrated value of the R, G, and B image signals for each divided area. Alternatively, the AE/AWB detection means 120 calculates the integrated value of the R, G, and B image signals in a predetermined area including the subject Z found by the subject search unit 136. The CPU 110 detects the brightness of the subject (subject luminance) based on the integrated values obtained from the AE/AWB detection means 120 and calculates an exposure value suitable for shooting (shooting EV value). It then determines the aperture value and shutter speed from the calculated shooting EV value and a predetermined program diagram.
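As an illustration of the integration step only (NumPy; the conversion of the integrated values into an EV, aperture, and shutter speed via the program diagram is omitted):

    import numpy as np

    def area_integrals(image, grid=16):
        # Divide the frame into grid x grid areas and integrate the R, G,
        # and B signals within each area for AE control.
        h, w = image.shape[:2]
        hs, ws = h // grid, w // grid
        cropped = image[:hs * grid, :ws * grid].astype(np.float64)
        blocks = cropped.reshape(grid, hs, grid, ws, -1)
        return blocks.sum(axis=(1, 3))  # shape: (grid, grid, channels)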
The aperture driving means 146 and 147 adjust the amounts of light incident on the image sensors 122 and 123 by varying the apertures of the diaphragm-combined mechanical shutters 12d and 13d in accordance with commands from the CPU 110. The aperture driving means 146 and 147 also open and close the diaphragm-combined mechanical shutters 12d and 13d in accordance with commands from the CPU 110 to expose and shield the image sensors 122 and 123, respectively.
As physical quantities necessary for AWB control, the AE/AWB detection unit 120 likewise divides one screen into a plurality of areas (for example, 16 × 16) and calculates the average integrated value of each color of the R, G, and B image signals for each divided area. Alternatively, the AE/AWB detection unit 120 calculates these per-color average integrated values over a predetermined area containing the subject Z found by the subject search unit 136. The CPU 110 obtains the R/G and B/G ratios for each divided area from the obtained R, B, and G integrated values, and determines the light source type based on, for example, the distribution of the obtained R/G and B/G values in the R/G-B/G color space. Then, in accordance with a white balance adjustment value suited to the determined light source type, the CPU 110 determines the gain values (white balance correction values) that the white balance adjustment circuit applies to the R, G, and B signals so that, for example, each ratio becomes approximately 1 (that is, the RGB integration ratio over one screen becomes R:G:B = 1:1:1).
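The gain computation at the end of this step can be sketched as follows, reusing the integrate_blocks helper above. Treating G as the reference channel is a common convention and an assumption here, and the light-source classification from the R/G-B/G distribution is omitted for brevity.

    def awb_gains(sums):
        """Derive per-channel gains so that the frame-wide integration
        ratio approaches R:G:B = 1:1:1, with G as the reference."""
        r = max(sums[..., 0].mean(), 1e-6)
        g = sums[..., 1].mean()
        b = max(sums[..., 2].mean(), 1e-6)
        return g / r, 1.0, g / b   # (gain_R, gain_G, gain_B)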
The compression/decompression processing unit 132 applies compression processing in a predetermined format to the input image data in accordance with a command from the CPU 110 to generate compressed image data. It also applies decompression processing in a predetermined format to input compressed image data in accordance with a command from the CPU 110 to generate uncompressed image data.

The media controller 137 records each piece of image data compressed by the compression/decompression processing unit 132 on the recording medium 140.

The recording medium 140 may be any of various recording media removable from the compound-eye digital camera 1, such as an xD-Picture Card (registered trademark), a semiconductor memory card typified by SmartMedia (registered trademark), a portable small hard disk, a magnetic disk, an optical disk, or a magneto-optical disk.

The sound input processing unit 138 receives an audio signal that has been input to the microphone 15 and amplified by a stereo microphone amplifier (not shown), and encodes the audio signal.

The operation of the compound-eye digital camera 1 configured as described above will now be explained.
When the barrier 11 is slid from the closed state to the open state, the compound-eye digital camera 1 is powered on and starts up in shooting mode. As shooting modes, a 2D mode and a 3D shooting mode for shooting a stereoscopic image of the same subject viewed from two viewpoints can be set. As the 3D mode, a 3D shooting mode in which the right imaging system 12 and the left imaging system 13 simultaneously shoot a stereoscopic image with a predetermined parallax can be set. The shooting mode is set from the shooting mode menu screen displayed on the monitor 16 by selecting "shooting mode" with the cross button 26 or the like on the menu screen that appears when the MENU/OK button 25 is pressed while the compound-eye digital camera 1 is operating in shooting mode.
(1) 2D shooting mode

The CPU 110 selects the right imaging system 12 or the left imaging system 13 (in this embodiment, the left imaging system 13) and starts shooting for the shooting confirmation image with the image sensor 123 of the left imaging system 13. That is, images are continuously captured by the image sensor 123, their image signals are continuously processed, and image data for the shooting confirmation image is generated.
The CPU 110 sets the monitor 16 to the 2D mode, sequentially supplies the generated image data to the video encoder 134, converts it into a signal format for display, and outputs it to the monitor 16. The image captured by the image sensor 123 is thereby displayed on the monitor 16. When the input of the monitor 16 accepts a digital signal, the video encoder 134 is unnecessary, but the data must still be converted into a signal form that matches the input specifications of the monitor 16.

The user frames the shot while viewing the shooting confirmation image displayed on the monitor 16, checks the subject to be shot, reviews the captured image, and sets the shooting conditions.
When the release switch 20 is half-pressed in the shooting standby state described above, an S1ON signal is input to the CPU 110. The CPU 110 detects this and performs AE photometry and AF control. During AE photometry, the brightness of the subject is measured based on the integrated values of the image signal captured via the image sensor 123. The measured value (photometric value) is used to determine the aperture value of the combined aperture/mechanical shutter 13d and the shutter speed for the actual shooting. At the same time, the CPU 110 determines from the detected subject luminance whether the flash 14 needs to fire. When it is determined that the flash 14 needs to fire, the flash 14 is pre-fired, and the emission amount of the flash 14 for the actual shooting is determined based on the reflected light.

When the release switch 20 is fully pressed, an S2ON signal is input to the CPU 110. In response to this S2ON signal, the CPU 110 executes the shooting and recording processing.

First, the CPU 110 drives the combined aperture/mechanical shutter 13d via the aperture driving unit 147 based on the aperture value determined from the photometric value, and controls the charge accumulation time of the image sensor 123 (a so-called electronic shutter) so as to achieve the shutter speed determined from the photometric value.

During AF control, the CPU 110 sequentially moves the focus lens to lens positions ranging from the closest distance to infinity, acquires from the AF detection unit 118 an evaluation value obtained by integrating the high-frequency components of the image signal in the AF area of the image captured via the image sensor 123 at each lens position, finds the lens position at which this evaluation value peaks, and performs contrast AF by moving the focus lens to that lens position.

When the flash 14 is to be fired, it is fired based on the emission amount of the flash 14 determined from the result of the pre-firing.

The subject light is incident on the light-receiving surface of the image sensor 123 via the focus lens 13b, the zoom lens 13c, the combined aperture/mechanical shutter 13d, the infrared cut filter 46, the optical low-pass filter 48, and so on.

The signal charges accumulated in the photodiodes of the image sensor 123 are read out in accordance with a timing signal applied from the TG 149, sequentially output from the image sensor 123 as a voltage signal (image signal), and input to the CDS/AMP 125.

The CDS/AMP 125 performs correlated double sampling on the CCD output signal based on a CDS pulse and amplifies the image signal output from the CDS circuit with a shooting sensitivity setting gain applied from the CPU 110.

The analog image signal output from the CDS/AMP 125 is converted into a digital image signal by the A/D converter 127, and the converted image signal (R, G, B RAW data) is transferred to the SDRAM 114 and temporarily stored there.

The R, G, and B image signals read out from the SDRAM 114 are input to the image signal processing unit 130. In the image signal processing unit 130, the white balance adjustment circuit performs white balance adjustment by applying a digital gain to each of the R, G, and B image signals, the gamma correction circuit performs tone conversion in accordance with the gamma characteristics, and the synchronization circuit performs synchronization processing that interpolates the spatial displacement of the color signals caused by the color filter array of the single-chip CCD and aligns the phases of the color signals. The synchronized R, G, and B image signals are further converted into a luminance signal Y and color-difference signals Cr and Cb (YC signal) by the luminance/color-difference data generation circuit and subjected to predetermined signal processing such as edge enhancement. The YC signal processed by the image signal processing unit 130 is stored in the SDRAM 114 again.
The YC signal stored in the SDRAM 114 as described above is compressed by the compression/decompression processing unit 132 and recorded on the recording medium 140 via the media controller 137 as an image file of a predetermined format. Still image data is stored on the recording medium 140 as an image file conforming to the Exif standard. An Exif file has an area for storing the data of the main image and an area for storing the data of a reduced image (thumbnail image). A thumbnail image of a specified size (for example, 160 × 120 or 80 × 60 pixels) is generated from the data of the main image obtained by shooting, through pixel decimation and other necessary data processing. The thumbnail image generated in this way is written into the Exif file together with the main image. Tag information such as the shooting date and time, the shooting conditions, and face detection information is also attached to the Exif file.
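Such a decimated thumbnail could be produced, for instance, with the Pillow library; the file path and the use of Pillow are illustrative assumptions, not part of the camera's firmware.

    from PIL import Image

    def make_exif_thumbnail(main_image_path, size=(160, 120)):
        """Decimate the main image down to the specified thumbnail size,
        preserving the aspect ratio."""
        img = Image.open(main_image_path)
        img.thumbnail(size)   # in-place reduction
        return img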
When the mode of the compound-eye digital camera 1 is set to playback mode, the CPU 110 outputs a command to the media controller 137 to read out the image file most recently recorded on the recording medium 140.

The compressed image data of the read image file is supplied to the compression/decompression processing unit 132, decompressed into an uncompressed luminance/color-difference signal, and output to the monitor 16 via the video encoder 134. The image recorded on the recording medium 140 is thereby reproduced and displayed on the monitor 16 (single-image playback). An image shot in the 2D shooting mode is displayed as a planar image over the entire monitor 16 in the 2D mode.
(2) 3D shooting mode

Shooting for the shooting confirmation image is started with the image sensor 122 and the image sensor 123. That is, the image sensors 122 and 123 continuously capture the right-eye image B and the left-eye image A at a predetermined frame rate, their image signals are continuously processed, and stereoscopic image data for the shooting confirmation image is generated. The CPU 110 sets the monitor 16 to the 3D mode, and the generated image data is sequentially converted into a signal format for display by the video encoder 134 and output to the monitor 16. The stereoscopic image data for the shooting confirmation image is thereby displayed stereoscopically on the monitor 16.
In this embodiment, in parallel with the stereoscopic display of the stereoscopic image data for the shooting confirmation image on the monitor 16, processing for tracking the subject Z is performed on the left-eye image A.

FIG. 4 is a flowchart showing the flow of the processing for tracking the subject Z in the left-eye images A continuously shot at a predetermined frame rate. This processing is controlled by the CPU 110. A program for causing the CPU 110 to execute this processing is stored in a program storage section within the CPU 110.

The left-eye image A shot immediately before the frame to be processed, in this case the left-eye image A0 (see FIG. 5) shot immediately before the processing for tracking the subject Z, is input to the template image generation unit 135, and the template image generation unit 135 generates a template image TA0 from the left-eye image A0 (step S10). The processing by which the template image generation unit 135 generates the template image TA0 will now be described in detail.
The left-eye image A0 shown in FIG. 5 contains the subject Z to be tracked (here, a person's face). The template image generation unit 135 extracts from the left-eye image A0 a region that contains the subject Z and is large enough for the shape of the subject Z to be recognized. In this embodiment, as shown by the dotted line in FIG. 6, a rectangular region with a margin of several pixels around the contour of the subject Z is extracted from the left-eye image A0. The template image generation unit 135 then takes the extracted image as the template TA0 shown in FIG. 7. The template image is generated in this way.
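A sketch of this extraction step, assuming the subject's bounding box is already known from face detection or manual selection; the NumPy array layout and the four-pixel margin are assumptions for illustration.

    def extract_template(image, bbox, margin=4):
        """Cut a rectangular template around the subject's bounding box,
        padded by a few pixels and clamped to the image border."""
        x0, y0, x1, y1 = bbox
        h, w = image.shape[:2]
        x0, y0 = max(0, x0 - margin), max(0, y0 - margin)
        x1, y1 = min(w, x1 + margin), min(h, y1 + margin)
        return image[y0:y1, x0:x1].copy()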
Possible methods of selecting the subject Z as the tracking target include automatically detecting the subject by face detection or the like, having the photographer select it via the operation unit 112, and automatically detecting a subject carrying a transmitter. When the subject is detected automatically by face detection, face detection is performed on each of the left-eye image A and the right-eye image B, and once it is confirmed that the same face has been detected in both the left-eye image A and the right-eye image B, that face is taken as the subject Z. When different faces are detected in the left-eye image A and the right-eye image B, the photographer may select the face via the operation unit 112. When the photographer makes the selection via the operation unit 112, the photographer may select the subject in each of the left-eye image A and the right-eye image B, or may select it in either the left-eye image A or the right-eye image B and have the same subject detected in the other image by corresponding point detection or the like.

The right-eye image B shot immediately before the frame to be processed, in this case the right-eye image B0 (see FIG. 5) shot immediately before the processing for tracking the subject Z, is input to the template image generation unit 135, and the template image generation unit 135 generates a template image TB0 from the right-eye image B0 by the same method as in step S10 (step S12).

When the left-eye image A is used as the reference in selecting the subject Z as the tracking target, the right-eye image B may not contain the tracking target Z (for example, when it is covered by another subject). In that case, the template image is generated from an image shot at as close a time as possible. For example, when the template image TA0 could be generated from the left-eye image A0 but the right-eye image B0 does not contain the subject Z, then if the right-eye image shot immediately before the right-eye image B0 contains the subject Z, the template image generated from that image may be used as the template image TB0; and if the right-eye image shot immediately before the right-eye image B0 does not contain the subject Z either, the template image TB0 may be generated from an earlier right-eye image that does contain the subject Z.

i is set to 1 (step S14); that is, the shooting and processing of the first image begins. Here, i is a positive integer.

The left imaging system 13 shoots the left-eye image Ai (currently A1, since i = 1; see FIG. 8) (step S16). Because subject tracking is performed on the left-eye image Ai, the shot left-eye image Ai is input to the subject search unit 136. At the same time, the left-eye image Ai is input to the video encoder 134, sequentially converted into a signal format for display, and output to the monitor 16.

The right imaging system 12 shoots the right-eye image Bi (currently B1, since i = 1; see FIG. 8) (step S18). Because subject tracking is not performed on the right-eye image Bi, the shot right-eye image Bi is input to the template image generation unit 135. At the same time, the right-eye image Bi is input to the video encoder 134, sequentially converted into a signal format for display, and output to the monitor 16.

The subject search unit 136 searches the left-eye image Ai using the generated template images TA(i-1) and TB(i-1), looking in the left-eye image Ai for portions similar to the template images TA(i-1) and TB(i-1) (step S20). Since i = 1 at this point, the left-eye image A1 is searched using the template images TA0 and TB0 generated in steps S10 and S12.

As shown in FIG. 3, because the left imaging system 13 and the right imaging system 12 shoot from different positions, the optical axis 13L of the left imaging system 13 and the optical axis 12L of the right imaging system 12 do not coincide, and the result of shooting the subject Z with the left imaging system 13 differs from the result of shooting the subject Z with the right imaging system 12. That is, as shown in FIG. 6, the orientation of the subject Z in the left-eye image A differs from the orientation of the subject Z in the right-eye image B. Since the subject Z in the template image TAi therefore differs from the subject Z in the template image TBi, the left-eye image Ai is searched with both of the template images TA(i-1) and TB(i-1).
As the method of searching the left-eye image Ai for portions similar to the template images TA(i-1) and TB(i-1), a pattern matching technique such as template matching can be used, although the method is not particularly limited to these.
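One common realization of template matching is normalized cross-correlation, sketched below with OpenCV; the document leaves the matching method open, so this specific choice is an assumption.

    import cv2

    def search_with_template(frame_gray, template_gray):
        """Normalized cross-correlation search over the whole frame;
        returns the best-match top-left corner and its score in [-1, 1]."""
        res = cv2.matchTemplate(frame_gray, template_gray, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(res)
        return max_loc, max_val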
The subject search unit 136 determines whether the search of the left-eye image Ai has finished (step S22); if it has not finished (NO in step S22), step S22 is performed again.

When the search of the left-eye image Ai has finished (YES in step S22), the subject search unit 136 determines whether the result of searching the left-eye image Ai with the template image TA(i-1) and the result of searching the left-eye image Ai with the template image TB(i-1) are the same (step S24). Since i = 1 at this point, it is determined whether the result of searching the left-eye image A1 with the template image TA0 and the result of searching the left-eye image A1 with the template image TB0 are the same.

When the result of searching the left-eye image Ai with the template image TA(i-1) and the result of searching the left-eye image Ai with the template image TB(i-1) are the same (YES in step S24), the portion similar to the template images TA(i-1) and TB(i-1) is taken as the position of the subject Z. When the left-eye image A1 shown in FIG. 8 is searched, the region enclosed by the dotted line in FIG. 8 (A1) is obtained as the search result whether the search uses the template image TA0 or TB0, so the subject search unit 136 takes the position enclosed by the dotted line in FIG. 8 (A1) as the position of the subject Z. The subject search unit 136 passes the search result and the left-eye image Ai to the template image generation unit 135. The processing then proceeds to step S32.
When the result of searching the left-eye image Ai with the template image TA(i-1) and the result of searching the left-eye image Ai with the template image TB(i-1) are not the same (NO in step S24), the subject search unit 136 calculates the similarity between the result of searching the left-eye image Ai with the template image TA(i-1) and the template image TA(i-1), and also calculates the similarity between the result of searching the left-eye image Ai with the template image TB(i-1) and the template image TB(i-1). A known method can be adopted for calculating the similarity, for example the difference between feature values, or the least-squares method in a feature space (a weighted space is also possible). The subject search unit 136 then determines whether the similarity between the TA(i-1) search result and the template image TA(i-1) is higher than the similarity between the TB(i-1) search result and the template image TB(i-1) (step S26).

When the similarity between the TA(i-1) search result and the template image TA(i-1) is higher than the similarity between the TB(i-1) search result and the template image TB(i-1) (YES in step S26), the subject search unit 136 takes the result of searching the left-eye image Ai with the template image TA(i-1), that is, the portion similar to the template image TA(i-1), as the position of the subject Z (step S28). The subject search unit 136 passes this position of the subject Z and the left-eye image Ai to the template image generation unit 135.

When the similarity between the TA(i-1) search result and the template image TA(i-1) is not higher than the similarity between the TB(i-1) search result and the template image TB(i-1) (NO in step S26), the subject search unit 136 takes the result of searching the left-eye image Ai with the template image TB(i-1), that is, the portion similar to the template image TB(i-1), as the position of the subject Z (step S30). The subject search unit 136 passes this position of the subject Z and the left-eye image Ai to the template image generation unit 135.
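Steps S24 through S30 amount to the following comparison, sketched here with the search_with_template helper above; its correlation score stands in for the similarity measure, which the document leaves unspecified.

    def locate_subject(frame_gray, tmpl_a, tmpl_b):
        """Search with both templates; if they agree, use that position
        (YES in S24); otherwise keep the result whose similarity to its
        own template is higher (S26 to S30)."""
        loc_a, score_a = search_with_template(frame_gray, tmpl_a)
        loc_b, score_b = search_with_template(frame_gray, tmpl_b)
        if loc_a == loc_b:
            return loc_a
        return loc_a if score_a > score_b else loc_b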
The subject Z is thus found in the left-eye image Ai. Based on the position of the subject Z set in step S24, S28, or S30, the template image generation unit 135 extracts from the left-eye image Ai a region that contains the subject Z and is large enough for the shape of the subject Z to be recognized, and generates the template image TAi (step S32). As in step S10, a rectangular region with a margin of several pixels around the contour of the subject Z is extracted and used as the template image. In the case shown in FIG. 8 (A1), the dotted-line portion is generated as the template image TA1.

The template image generation unit 135 generates the template image TBi from the right-eye image Bi by the same method as in step S12 (step S34). When the right-eye image Bi does not contain the tracking target Z, the template image TBi may be generated from a right-eye image that contains the subject Z and was shot at as close a time as possible before the right-eye image Bi.

Then i is incremented to i + 1 (step S36), the processing returns to step S16, and steps S16 to S36 are performed again. That is, when the shooting of the first image (i = 1) and the search for the subject have finished, the second image (i = 1 + 1 = 2) is processed; when the shooting of the second image (i = 2) and the search for the subject have finished, the third image (i = 2 + 1 = 3) is processed; and so on, with steps S16 to S36 repeated in sequence.

The subject Z is thereby searched for continuously in the left-eye images Ai (i = 1, 2, ...) shot continuously at the predetermined frame rate; that is, the subject Z is tracked in the left-eye image A among the shooting confirmation images displayed stereoscopically on the monitor 16.

The user frames the shot while viewing the shooting confirmation image displayed stereoscopically on the monitor 16, checks the subject to be shot, reviews the captured image, and sets the shooting conditions.

The zoom may also be optimized with reference to the subject Z at the same time as the subject Z is tracked. For example, the CPU 110 moves the zoom lenses 12c and 13c in the optical axis direction via the zoom lens driving units 144 and 145 so that the subject Z appears at a predetermined size. This allows the photographer to recognize what the tracking target is. Moreover, since the tracking target is an important subject for the photographer, an image in which the important subject is easy to see can be shot.

When the release switch 20 is half-pressed while the stereoscopic image data for the shooting confirmation image is being displayed stereoscopically on the monitor 16 (in the shooting standby state), an S1ON signal is input to the CPU 110. The CPU 110 detects this and ends the subject tracking processing shown in FIG. 4. The CPU 110 also performs AE photometry and AF control. In this embodiment, AE photometry and AF control are performed with the left imaging system 13, which performed the tracking processing for the subject Z. The exposure and focus are optimized with reference to the subject Z tracked by the subject tracking processing shown in FIG. 4. That is, AE photometry is performed so that the subject Z is properly exposed, and AF processing is performed so that the subject Z is in focus. Since the AE photometry and AF control are the same as in the 2D shooting mode, a detailed description is omitted.

When the release switch 20 is fully pressed, an S2ON signal is input to the CPU 110. In response to this S2ON signal, the CPU 110 executes the shooting and recording processing. Since the processing for generating the image data shot by each of the right imaging system 12 and the left imaging system 13 is the same as in the 2D shooting mode, a description is omitted.

From the two pieces of image data generated by the CDS/AMPs 124 and 125, two pieces of compressed image data are generated by the same method as in the 2D shooting mode. The two pieces of compressed image data are associated with each other and stored on the recording medium 140 as one file. The MP format or the like can be used as the storage format.

When the mode of the compound-eye digital camera 1 is set to playback mode, the CPU 110 outputs a command to the media controller 137 to read out the image file most recently recorded on the recording medium 140. The compressed image data of the read image file is supplied to the compression/decompression processing unit 132, decompressed into an uncompressed luminance/color-difference signal, converted into a stereoscopic image by the stereoscopic image generation unit 133, and then output to the monitor 16 via the video encoder 134. The image recorded on the recording medium 140 is thereby reproduced and displayed on the monitor 16 (single-image playback).

Frame-by-frame advance of the images is performed with the left and right keys of the cross button 26: when the right key of the cross button 26 is pressed, the next image file is read from the recording medium 140 and reproduced on the monitor 16, and when the left key of the cross button is pressed, the previous image file is read from the recording medium 140 and reproduced on the monitor 16.

While checking the image reproduced on the monitor 16, the user can erase images recorded on the recording medium 140 as necessary. An image is erased by pressing the MENU/OK button 25 while the image is reproduced and displayed on the monitor 16.

According to this embodiment, the subject is tracked using images of the subject in a plurality of different orientations as keys, so even when the subject moves or the orientation of the subject changes, the possibility of losing sight of the subject during tracking can be reduced. Moreover, because the subject tracking processing uses template images extracted from the parallax images, accurate tracking is possible; as a result, the amount of computation can be reduced.

In this embodiment, the subject tracking processing is performed on the left-eye image A, but it may instead be performed on the right-eye image B, or on both the left-eye image A and the right-eye image B. Subject tracking on the right-eye image B is performed by the same method as shown in FIG. 4: the right-eye image Bi is searched with both of the template images TA(i-1) and TB(i-1); if the results are the same, that position is set as the position of the subject Z, and if the results differ, the result with the higher similarity is set as the position of the subject Z.
<Second Embodiment>

In the first embodiment of the present invention, subject tracking is performed by searching for the subject in at least one of the right-eye image and the left-eye image using a template image generated by extracting a part of the right-eye image and a template image generated by extracting a part of the left-eye image; however, the method of performing subject tracking is not limited to this.

In the second embodiment of the present invention, subject tracking is performed by searching for the subject in at least one of the right-eye image and the left-eye image using a template image generated by extracting a part of the right-eye image, together with template images generated by image synthesis processing from a template image generated by extracting a part of the right-eye image and a template image generated by extracting a part of the left-eye image. The compound-eye digital camera 2 of the second embodiment is described below. Parts identical to those of the first embodiment are given the same reference numerals, and their description is omitted.
FIG. 9 is a block diagram showing the main internal configuration of the compound-eye digital camera 2. The compound-eye digital camera 2 mainly comprises a CPU 110, operation means (the release switch 20, the MENU/OK button 25, the cross button 26, and so on) 112, an SDRAM 114, a VRAM 116, an AF detection unit 118, an AE/AWB detection unit 120, image sensors 122 and 123, CDS/AMPs 124 and 125, A/D converters 126 and 127, an image input controller 128, an image signal processing unit 130, a compression/decompression processing unit 132, a stereoscopic image generation unit 133, a video encoder 134, a template image generation unit 135, a subject search unit 136, a media controller 137, a sound input processing unit 138, a composite template image generation unit 139, a recording medium 140, focus lens driving units 142 and 143, zoom lens driving units 144 and 145, aperture driving units 146 and 147, and timing generators (TG) 148 and 149.

The composite template image generation unit 139 generates composite template images by image synthesis processing from a template image generated by extracting a part of the right-eye image and a template image generated by extracting a part of the left-eye image. FIG. 10 is a schematic diagram showing how the composite template image generation unit 139 generates the composite template images.

First, the composite template image generation unit 139 extracts feature points from the template image TA0 generated by extracting a part of the left-eye image A0. The feature points are, for example, points (pixels) having strong signal gradients in a plurality of directions, and can be extracted using the Harris method, the Shi-Tomasi method, or the like.

Next, the composite template image generation unit 139 extracts corresponding points from the template image TB0 generated by extracting a part of the right-eye image B0. The corresponding points are the points that correspond to the feature points extracted from the template image TA0.
The composite template image generation unit 139 then aligns the feature points extracted from the template image TA0 with the corresponding points extracted from the template image TB0, and thereafter generates, by image synthesis processing, images in intermediate states between the template image TA0 and the template image TB0, namely the composite template images TM0-1 to TM0-5. In this embodiment, the composite template images are created using morphing, a technique that expresses how one object deforms into another, but the image synthesis processing is not limited to this.
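The following sketch generates five intermediate templates. As a simplification, it blends the two templates by cross-dissolve after resizing them to a common size; a full morph would additionally warp the geometry along the feature point and corresponding point pairs (which OpenCV can detect with, for example, cv2.goodFeaturesToTrack), so this is an illustrative approximation rather than the patented method itself.

    import cv2
    import numpy as np

    def composite_templates(tmpl_a, tmpl_b, n=5):
        """Generate n intermediate templates TM-1..TM-n between TA and TB.
        Cross-dissolve after resizing stands in for true morphing, which
        would also warp along the feature/corresponding points."""
        h = min(tmpl_a.shape[0], tmpl_b.shape[0])
        w = min(tmpl_a.shape[1], tmpl_b.shape[1])
        a = cv2.resize(tmpl_a, (w, h)).astype(np.float32)
        b = cv2.resize(tmpl_b, (w, h)).astype(np.float32)
        blends = []
        for k in range(1, n + 1):
            t = k / (n + 1)                 # blend weight, 0 < t < 1
            blends.append(((1 - t) * a + t * b).astype(np.uint8))
        return blends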
In this embodiment, five composite template images TM0-1 to TM0-5 are generated, but the number of composite templates generated is not limited to five.

The composite template images may also be generated after some image processing has been applied to the template images TA(i-1) and TB(i-1). For example, the composite template images TM(i-1)-1 to TM(i-1)-5 may be generated after the luminance has been corrected so that the average luminance of the template image TA(i-1) and the average luminance of the template image TB(i-1) take the same value, or after the color balance has been corrected so that the color balance of the template image TA(i-1) and the color balance of the template image TB(i-1) take the same value. The composite template images TM(i-1)-1 to TM(i-1)-5 may also be generated after the feature points extracted from the template image TA(i-1) have been aligned with the corresponding points extracted from the template image TB(i-1) and the template image TA(i-1) has been scaled to approximately the same size as the template image TB(i-1).
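The luminance equalization mentioned first could look like the following, assuming 8-bit grayscale templates and NumPy imported as np as in the earlier sketches.

    def match_mean_luminance(tmpl_a, tmpl_b):
        """Scale TB(i-1) so that its average luminance equals that of
        TA(i-1) before the composite templates are generated."""
        gain = tmpl_a.mean() / max(tmpl_b.mean(), 1e-6)
        scaled = tmpl_b.astype(np.float32) * gain
        return np.clip(scaled, 0, 255).astype(np.uint8)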
The operation of the compound-eye digital camera 2 configured as described above will now be explained. Since the only difference between the first embodiment and the second embodiment is the subject tracking processing in the 3D shooting mode, only the subject tracking processing is described, and the remaining description is omitted.

FIG. 11 is a flowchart showing the flow of the processing for tracking the subject Z in the left-eye images A continuously shot at a predetermined frame rate. This processing is controlled by the CPU 110. A program for causing the CPU 110 to execute this processing is stored in a program storage section within the CPU 110.

The left-eye image A shot immediately before the frame to be processed, in this case the left-eye image A0 (see FIG. 5) shot immediately before the processing for tracking the subject Z, is input to the template image generation unit 135. The template image generation unit 135 extracts from the left-eye image A0 a region that contains the subject Z and is large enough for the shape of the subject Z to be recognized, and generates the template image TA0 (step S10).

The right-eye image B shot immediately before the frame to be processed, in this case the right-eye image B0 (see FIG. 5) shot immediately before the processing for tracking the subject Z, or a right-eye image B shot at as close a time as possible to the frame to be processed, is input to the template image generation unit 135, and the template image generation unit 135 generates the template image TB0 from the right-eye image B0 by the same method as in step S10 (step S12).

i is set to 1 (step S14); that is, the shooting and processing of the first image begins. Here, i is a positive integer.

The left imaging system 13 shoots the left-eye image Ai (currently A1, since i = 1; see FIG. 8) (step S16). Because subject tracking is performed on the left-eye image Ai, the shot left-eye image Ai is input to the subject search unit 136. At the same time, the left-eye image Ai is input to the video encoder 134, sequentially converted into a signal format for display, and output to the monitor 16.

The right imaging system 12 shoots the right-eye image Bi (currently B1, since i = 1; see FIG. 8) (step S18). Because subject tracking is not performed on the right-eye image Bi, the shot right-eye image Bi is input to the template image generation unit 135. At the same time, the right-eye image Bi is input to the video encoder 134, sequentially converted into a signal format for display, and output to the monitor 16.

The composite template image generation unit 139 generates the composite template images TM(i-1)-1 to TM(i-1)-5 from the template images TA(i-1) and TB(i-1) (step S40). Since i = 1 at this point, the composite template images TM0-1 to TM0-5 are generated from the template images TA0 and TB0.

The subject search unit 136 searches the left-eye image Ai using the template image TA(i-1) and the composite template images TM(i-1)-1 to TM(i-1)-5, looking in the left-eye image Ai for portions similar to the template image TA(i-1) and the composite template images TM(i-1)-1 to TM(i-1)-5 (step S42). Since i = 1 at this point, the left-eye image A1 is searched using the template image TA0 and the composite template images TM0-1 to TM0-5 generated in steps S10 and S40.

The subject search unit 136 determines whether the search of the left-eye image Ai has finished (step S44); if it has not finished (NO in step S44), step S44 is performed again.

When the search of the left-eye image Ai has finished (YES in step S44), the subject search unit 136 determines whether the search succeeded with a plurality of templates, that is, whether search results were obtained with a plurality of templates (step S46).

When the search did not succeed with a plurality of templates (NO in step S46), there is only one search result for the subject Z. The subject search unit 136 therefore passes the search result (the position of the subject Z) and the left-eye image Ai to the template image generation unit 135. The processing then proceeds to step S54.

When the search succeeded with a plurality of templates (YES in step S46), search results were obtained with a plurality of templates, so the subject search unit 136 calculates, for each template image (each search result), the similarity between the template image with which the search succeeded and the portion found using that template image. The subject search unit 136 then determines whether there are a plurality of results with the highest similarity (step S48). A known method can be adopted for calculating the similarity, for example the difference between feature values, or the least-squares method in a feature space (a weighted space is also possible).

When there are a plurality of results with the highest similarity (YES in step S48), the subject search unit 136 takes as the position of the subject Z the portion similar to the template image TA(i-1) or to the composite template image closest to the template image TA(i-1) (step S50).

Step S50 is described concretely as follows. When the similarity between the result of searching the left-eye image Ai with the template image TA(i-1) and the template image TA(i-1) is among the plurality of highest similarities, the result found using the template image TA(i-1) is taken as the position of the subject Z. This improves the accuracy of the subject tracking. The subject search unit 136 passes the position of the subject Z and the left-eye image Ai to the template image generation unit 135. The processing then proceeds to step S54.

When the similarity between the result of searching the left-eye image Ai with the template image TA(i-1) and the template image TA(i-1) is not among the plurality of highest similarities, the portion similar to the composite template image closest to the template image TA(i-1) among the composite template images TM(i-1)-1 to TM(i-1)-5 is taken as the position of the subject Z. When the composite template images TM0-1 to TM0-5 shown in FIG. 10 have been generated, the composite template image TM0-1 is closest to the template image TA0 and the composite template image TM0-5 is farthest from it. Therefore, when the similarity between the result of searching the left-eye image Ai with the composite template image TM0-1 and the composite template image TM0-1 is among the plurality of highest similarities, the portion similar to the composite template image TM0-1 is taken as the position of the subject Z. When the similarity for the composite template image TM0-1 is not among the plurality of highest similarities but the similarity between the result of searching the left-eye image Ai with the composite template image TM0-2 and the composite template image TM0-2 is, the portion similar to the composite template image TM0-2 is taken as the position of the subject Z. As a result, even when the most accurate detection result is not used, the possibility of losing sight of the subject during tracking can be reduced while a certain degree of accuracy is maintained.
When there are not a plurality of results with the highest similarity (NO in step S48), the subject search unit 136 takes the search result with the highest similarity as the position of the subject Z (step S52). The subject search unit 136 passes this position of the subject Z and the left-eye image Ai to the template image generation unit 135.
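Steps S46 through S52 can be condensed into a selection over the whole template set, again using the search_with_template helper from the first embodiment's sketch; the ordering convention of the template list is an assumption introduced here to express "closest to TA(i-1)".

    def locate_with_template_set(frame_gray, templates):
        """Search with TA(i-1) and all composite templates. A unique top
        score wins (S52); on a tie, the earliest entry wins (S50), assuming
        templates is ordered [TA(i-1), TM-1, ..., TM-n], i.e. from the
        template nearest TA(i-1) to the farthest."""
        results = [search_with_template(frame_gray, t) for t in templates]
        best = max(score for _, score in results)
        for loc, score in results:   # first match = closest to TA(i-1)
            if score == best:
                return loc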
 これにより、左目用画像Aiから被写体Zが探索される。テンプレート画像生成部135は、ステップS44、S50、S52で設定された被写体Zの位置に基づいて、左目用画像Aiから被写体Zを含み、被写体Zの形状が認識できる大きさの領域を抜き出して、テンプレート画像TAiを生成する(ステップS54)。 Thereby, the subject Z is searched from the left-eye image Ai. Based on the position of the subject Z set in steps S44, S50, and S52, the template image generation unit 135 extracts a region that includes the subject Z from the left-eye image Ai and has a size that allows the shape of the subject Z to be recognized. A template image TAi is generated (step S54).
 テンプレート画像生成部135は、ステップS12と同様の方法により、右目用画像Biからテンプレート画像TBiを生成する(ステップS34)。 The template image generation unit 135 generates a template image TBi from the right-eye image Bi by the same method as in step S12 (step S34).
 Thereafter, i is set to i+1 (step S36), the process returns to step S16, and the processing of steps S16 to S36 is performed again.
 According to the present embodiment, the subject is tracked using a plurality of templates showing the subject in different orientations as keys, so even when the subject moves or changes orientation, the possibility of losing the subject during tracking can be reduced.
 In the present embodiment the subject tracking processing was performed on the left-eye image A, but it may instead be performed on the right-eye image B, or on both the left-eye image A and the right-eye image B. When subject tracking is performed on the right-eye image B, it can be done by the same method as shown in FIG. 11, searching the right-eye image Bi with the template image TB(i-1) and the composite template images TM(i-1)-1 to TM(i-1)-5.
 Further, in the present embodiment, when the search with a plurality of templates succeeds (YES in step S46) and there are a plurality of results with the highest similarity (YES in step S48), the portion similar to the template image TA(i-1), or to the composite template image closest to it, was taken as the position of the subject Z (step S50). Alternatively, whenever the search with a plurality of templates succeeds (YES in step S46), the portion similar to the template image TA(i-1), or to the composite template image closest to it, may be taken as the position of the subject Z without calculating the similarities.
 Further, in the present embodiment, the composite template images TM(i-1)-1 to TM(i-1)-5 were generated by applying image composition processing to the template images TA(i-1) and TB(i-1), but the method of generating a composite template image is not limited to image composition processing. For example, after aligning the feature points and corresponding points of the template images TA(i-1) and TB(i-1), the difference between them may be extracted, pixels differing in luminance or hue may be masked, and a template consisting only of the portion common to TA(i-1) and TB(i-1) may be generated as the composite template. Also, the composite template images to be generated are not limited to images in intermediate states between the template image TA(i-1) and the template image TB(i-1), that is, to images generated by interpolation. For example, images outside the template images TA(i-1) and TB(i-1) (that is, images of the subject as seen from viewpoints outside the right imaging system 12 and the left imaging system 13) may be generated by extrapolation as composite template images.
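 The following sketch illustrates two of the generation strategies just described, interpolated blending and common-part masking, assuming the two templates are already aligned and of identical size; the blending weights and the difference threshold are illustrative assumptions.

```python
import cv2
import numpy as np

def blend_templates(tpl_a, tpl_b, n=5):
    """Interpolated composite templates between TA and TB (sketch).

    Returns n intermediate images, ordered from closest-to-TA to closest-to-TB.
    """
    weights = np.linspace(1.0, 0.0, n + 2)[1:-1]  # exclude the endpoints TA, TB
    return [cv2.addWeighted(tpl_a, w, tpl_b, 1.0 - w, 0) for w in weights]

def common_part_template(tpl_a, tpl_b, thresh=30):
    """Composite template keeping only the part common to TA and TB (sketch).

    Pixels whose absolute difference exceeds `thresh` are masked to zero.
    """
    diff = cv2.absdiff(tpl_a, tpl_b)
    if diff.ndim == 3:
        diff = diff.max(axis=2)  # a difference in any channel masks the pixel
    mask = (diff <= thresh).astype(np.uint8)
    if tpl_a.ndim == 3:
        mask = mask[:, :, None]
    return tpl_a * mask
```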
 Further, in the present embodiment, the left-eye image Ai was searched using the template image TA(i-1) and the composite template images TM(i-1)-1 to TM(i-1)-5, looking for portions similar to them (step S42); however, the left-eye image Ai may be searched using only the composite template images TM(i-1)-1 to TM(i-1)-5. Since a composite template image may be coarse, though, it is preferable to search with both the template image and the composite template images where possible. The left-eye image Ai may also be searched using the template images TA(i-1) and TB(i-1) together with the composite template images TM(i-1)-1 to TM(i-1)-5. In that case the processing takes more time, but the subject can be tracked reliably.
 <Modification of the Second Embodiment>
 In the second embodiment, the left-eye image Ai was searched using the template image TA(i-1) and the composite template images TM(i-1)-1 to TM(i-1)-5. Alternatively, as in the first embodiment, the left-eye image Ai may first be searched with the template images TA(i-1) and TB(i-1), and the composite template images may be used to search the left-eye image Ai only when the subject has been lost.
 FIG. 12 is a flowchart showing the flow of processing for tracking the subject Z in the left-eye images A continuously captured at a predetermined frame rate. This processing is controlled by the CPU 110. A program for causing the CPU 110 to execute this processing is stored in the program storage section within the CPU 110.
 Steps S10 to S22 are the same as in the first embodiment, so the description starts from step S60.
 The subject search unit 136 determines whether the search result of step S20 is a target lost, that is, whether no portion similar to the template images TA(i-1) and TB(i-1) was found when the left-eye image Ai was searched with them (step S60).
 If the target has not been lost (NO in step S60), then when the search of the left-eye image Ai has finished (YES in step S22), the subject search unit 136 determines whether the result of searching the left-eye image Ai with TA(i-1) and the result of searching it with TB(i-1) are the same (step S24). Steps S24 to S30 are the same as in the first embodiment, so their description is omitted. Thereafter, the process proceeds to step S74.
 If the target has been lost (YES in step S60), the composite template image generation unit 139 generates the composite template images TM(i-1)-1 to TM(i-1)-5 from the template images TA(i-1) and TB(i-1) (step S40).
 The subject search unit 136 searches the left-eye image Ai using the composite template images TM(i-1)-1 to TM(i-1)-5, looking for portions similar to the composite template images TM(i-1)-1 to TM(i-1)-5 (step S62).
 The subject search unit 136 determines whether the search of the left-eye image Ai has finished (step S64); if it has not (NO in step S64), step S64 is performed again.
 When the search of the left-eye image Ai has finished (YES in step S64), the subject search unit 136 determines whether the search succeeded with a plurality of templates, that is, whether search results were obtained with a plurality of templates (step S66).
 If the search did not succeed with a plurality of templates (NO in step S66), there is only one search result for the subject Z. The subject search unit 136 therefore inputs the search result (the position of the subject Z) and the left-eye image Ai to the template image generation unit 135. Thereafter, the process proceeds to step S74.
 If the search succeeded with a plurality of templates (YES in step S66), search results were obtained with a plurality of templates, so the subject search unit 136 calculates, for each template image (each search result), the similarity between the template image that yielded a successful search and the portion found with it. The subject search unit 136 then determines whether there are a plurality of results with the highest similarity (step S68).
 If there are a plurality of results with the highest similarity (YES in step S68), the subject search unit 136 takes the portion similar to the composite template image closest to the template image TA(i-1) as the position of the subject Z (step S70). If there are not (NO in step S68), the subject search unit 136 takes the search result with the highest similarity as the position of the subject Z (step S72). The subject search unit 136 inputs the position of the subject Z obtained in step S70 or S72 and the left-eye image Ai to the template image generation unit 135. Thereafter, the process proceeds to step S74.
 Based on the position of the subject Z set in step S22, S28, S30, S62, S70 or S72, the template image generation unit 135 extracts from the left-eye image Ai a region that contains the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TAi (step S74). As in step S10, a rectangular region with a margin of several pixels around the contour of the subject Z is extracted and used as the template image.
 The template image generation unit 135 generates a template image TBi from the right-eye image Bi by the same method as in step S12 (step S34). Thereafter, i is set to i+1 (step S36), the process returns to step S16, and the processing of steps S16 to S36 is performed again.
 According to this embodiment, the subject is tracked using a plurality of templates showing the subject in different orientations as keys, so the possibility of losing the subject during tracking can be further reduced even when the subject moves or changes orientation. In addition, since composite template images are created and used for searching only when the search with the template images fails, no unnecessary processing is performed and the processing time can be shortened.
 <Third Embodiment>
 In the first embodiment of the present invention, subject tracking was performed by searching at least one of the right-eye image and the left-eye image using a template image generated by extracting part of the right-eye image and a template image generated by extracting part of the left-eye image; however, subject tracking may fail when, for example, the entire subject to be tracked is hidden behind an obstruction.
 The third embodiment of the present invention is a mode for avoiding losing sight of the tracking target even when, for example, the entire subject to be tracked is hidden behind an obstruction. The compound-eye digital camera 3 of the third embodiment is described below. Parts identical to those of the first embodiment are given the same reference numerals and their description is omitted.
 FIG. 13 is a block diagram showing the main internal configuration of the compound-eye digital camera 3. The compound-eye digital camera 3 mainly comprises a CPU 110, operation means 112 (the release switch 20, MENU/OK button 25, cross button 26, etc.), an SDRAM 114, a VRAM 116, AF detection means 118, AE/AWB detection means 120, image sensors 122 and 123, CDS/AMPs 124 and 125, A/D converters 126 and 127, an image input controller 128, image signal processing means 130, compression/decompression processing means 132, a stereoscopic image generation unit 133, a video encoder 134, a template image generation unit 135, a subject search unit 136, a media controller 137, sound input processing means 138, a recording medium 140, a template image regeneration unit 141, focus lens driving means 142 and 143, zoom lens driving means 144 and 145, aperture driving means 146 and 147, and timing generators (TG) 148 and 149.
 When, as a result of searching with the template images generated by the template image generation unit 135, the tracking target is lost in one of the left-eye image A and the right-eye image B, the template image regeneration unit 141 generates, using the image in which the tracking target was found, a template image for searching the image in which the tracking target was lost. In the present embodiment, based on the search result of the image in which the tracking target was found, a part of that image is extracted to generate the template image. The subject search unit 136 uses the template image generated by the template image regeneration unit 141 to search the image in which the tracking target was lost. Details of the processing of the template image regeneration unit 141 are described later.
 The operation of the compound-eye digital camera 3 configured as described above will now be described. Since the only difference between the first embodiment and the third embodiment is the subject tracking processing in the 3D shooting mode, only the subject tracking processing is described and other description is omitted.
 FIG. 14 is a flowchart showing the flow of processing for tracking the subject Z in the left-eye images A continuously captured at a predetermined frame rate. This processing is controlled by the CPU 110. A program for causing the CPU 110 to execute this processing is stored in the program storage section within the CPU 110.
 The left-eye image A captured immediately before the frame to be processed, in this case the left-eye image A0 captured immediately before the processing for tracking the subject Z (see FIG. 15), is input to the template image generation unit 135. The template image generation unit 135 extracts from the left-eye image A0 a region that contains the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TA0 (see FIG. 16) (step S10).
 The right-eye image B captured immediately before the frame to be processed, in this case the right-eye image B0 captured immediately before the processing for tracking the subject Z (see FIG. 15), or a right-eye image B whose capture timing is as close as possible to the frame to be processed, is input to the template image generation unit 135, and the template image generation unit 135 generates a template image TB0 (see FIG. 16) from the right-eye image B0 by the same method as in step S10 (step S12).
 i is set to 1 (step S14); that is, capture and processing of the first image starts. Note that i is a positive integer.
 The left imaging system 13 captures a left-eye image Ai (A1 here, since i = 1; see FIG. 17) (step S16). Since subject tracking processing is performed on the left-eye image Ai, the captured left-eye image Ai is input to the subject search unit 136. At the same time, the left-eye image Ai is input to the video encoder 134, sequentially converted into a display signal format by the video encoder 134, and output to the monitor 16.
 The right imaging system 12 captures a right-eye image Bi (B1 here, since i = 1; see FIG. 17) (step S18). Since subject tracking processing is not performed on the right-eye image Bi, the captured right-eye image Bi is input to the template image generation unit 135. At the same time, the right-eye image Bi is input to the video encoder 134, sequentially converted into a display signal format by the video encoder 134, and output to the monitor 16.
 The subject search unit 136 acquires the generated template image TA(i-1) from the template image generation unit 135 and searches the left-eye image Ai with the template image TA(i-1), looking for a portion similar to the template image TA(i-1) (step S80). Since i = 1 here, the left-eye image A1 is searched using the template image TA0 generated in step S10.
 The subject search unit 136 determines whether the search of the left-eye image Ai succeeded (step S82). Since i = 1 here, it is determined whether a portion similar to the template image TA0 was found in the left-eye image A1.
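 The embodiment does not fix a criterion for judging search success; one common choice, shown here purely as an assumed sketch, is to threshold the best normalized matching score.

```python
import cv2

def search_succeeded(image, template, score_thresh=0.7):
    """Sketch of a success test for one template search (assumed criterion).

    Returns (success, top_left): success is True when the best normalized
    cross-correlation score reaches the illustrative threshold.
    """
    res = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(res)
    return max_val >= score_thresh, max_loc
```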
 If the search of the left-eye image Ai succeeded (YES in step S82), the subject search unit 136 inputs the search result (the position of the subject Z) and the left-eye image Ai to the template image generation unit 135. Thereafter, the process proceeds to step S100.
 If the search of the left-eye image Ai did not succeed (NO in step S82), a possible cause is that, as shown for example in FIG. 17, the tracking target subject Z is covered by an obstruction (here, a car C, another subject located in front of the subject Z) and cannot be seen. In this case, the subject search unit 136 acquires the generated template image TB(i-1) from the template image generation unit 135. Since the right-eye image Bi captured in step S18 has been input to the template image generation unit 135, the subject search unit 136 also acquires the right-eye image Bi from the template image generation unit 135. The subject search unit 136 then searches the right-eye image Bi with the template image TB(i-1), looking for a portion similar to the template image TB(i-1) (step S84). Since i = 1 here, it is determined whether a portion similar to the template image TB0 is found in the right-eye image B1.
 The subject search unit 136 determines whether the search of the right-eye image Bi succeeded (step S86).
 If the search of the right-eye image Bi failed (NO in step S86), the search as a whole has failed, so the subject search unit 136 performs tracking error processing (step S101). The tracking error processing may, for example, display a message on the monitor 16 indicating a tracking error, but is not limited to this.
 If the search of the right-eye image Bi succeeded (YES in step S86), the subject search unit 136 inputs the search result to the template image regeneration unit 141, and the template image regeneration unit 141 extracts a part of the right-eye image Bi to generate a template image TBi-L1 (step S88). The position of the template image TBi-L1 is a position estimated, from the positional relationship between the subject and the compound-eye digital camera, to be close to the position of the subject Z in the left-eye image A. The processing by which the template image regeneration unit 141 generates the template image TBi-L1 is described in detail below.
 In the right-eye image B1 shown in FIG. 17, the tracking target subject Z was found in step S84. First, the template image regeneration unit 141 takes a predetermined region containing this subject Z as a template image TB1 (dotted line in FIG. 17; see FIG. 18). Like the template image TB0, the template image TB1 is generated by extracting a rectangular region with a margin of several pixels around the contour of the subject Z.
 As shown in FIG. 19, when the tracking target subject Z is covered by an obstruction (here, the car C), in the image from the viewpoint to the right of the viewpoint used for tracking, the obstruction is located to the left of the subject Z. Here, since tracking is performed on the left-eye image A, the obstruction is located to the left of the subject Z in the right-eye image B. Therefore, for the right-eye image B1, the template image regeneration unit 141 extracts the region shifted to the left of the position of the template image TB1 by the width of the template image TB1, generating a template image TB1-L1 (dotted line in FIG. 17; see FIG. 18). The size of the template image TB1-L1 is the same as that of the template image TB1.
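 A minimal sketch of this shifted extraction, assuming the found template's region is given as a bounding box (x, y, w, h) on a NumPy array of the right-eye image; the behavior at the image edge is an illustrative assumption.

```python
def shifted_template(image, box, steps=1):
    """Extract a same-size region shifted left by `steps` template widths (sketch).

    With steps=1 this corresponds to TBi-L1, with steps=2 to TBi-L2.
    Returns None when the shifted region would leave the frame.
    """
    x, y, w, h = box
    x_shifted = x - steps * w
    if x_shifted < 0:
        return None  # the shifted region would fall outside the image
    return image[y:y + h, x_shifted:x_shifted + w].copy()
```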
 The subject search unit 136 searches the left-eye image Ai with the template image TBi-L1 generated by the template image regeneration unit 141 in step S88 (step S90).
 The subject search unit 136 determines whether the search of the left-eye image Ai succeeded (step S92).
 If the search of the left-eye image Ai succeeded (YES in step S92), the subject search unit 136 inputs the search result (the position of the subject Z) and the left-eye image Ai to the template image generation unit 135. Thereafter, the process proceeds to step S100.
 If the search of the left-eye image Ai failed (NO in step S92), the search result is input to the template image regeneration unit 141, and by the same method as in step S88 the template image regeneration unit 141 extracts the region shifted to the left of the position of the template image TBi-L1 by the width of the template image TBi-L1, generating a template image TBi-L2 (step S94). From the right-eye image B1 shown in FIG. 17, a template image TB1-L2 (see the dotted line in FIG. 17) is generated.
 The subject search unit 136 searches the left-eye image Ai with the template image TBi-L2 generated by the template image regeneration unit 141 in step S94 (step S96).
 The subject search unit 136 determines whether the search of the left-eye image Ai succeeded (step S98).
 If the search of the left-eye image Ai failed (NO in step S98), the search as a whole has failed, so the subject search unit 136 performs tracking error processing (step S101). If the search of the left-eye image Ai succeeded (YES in step S98), the subject search unit 136 inputs the search result (the position of the subject Z) and the left-eye image Ai to the template image generation unit 135. Thereafter, the process proceeds to step S100.
 If the search is judged successful in step S82, the template image generation unit 135 extracts, based on the position of the subject Z set in step S80, a region from the left-eye image Ai that contains the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TAi (step S100). As in step S10, a rectangular region with a margin of several pixels around the contour of the subject Z is extracted and used as the template image.
 If the search is judged unsuccessful in step S82, the template image generation unit 135 takes the template image TA(i-1) used in the search of step S80 as the template image TAi (step S100). That is, when the search is judged unsuccessful in step S82, the template image with which the tracking target subject Z was last found continues to be used for the next frame. Thus, when the tracking target subject Z is covered by the car C in front of it and cannot be seen (see FIG. 17), the subject Z can be found again once the car C moves and the subject Z becomes visible.
 Thereafter, i is set to i+1 (step S36), the process returns to step S16, and the processing of steps S16 to S36 is performed again.
 According to the present embodiment, even when the tracking target is blocked by another subject, the position where the tracking target is likely to be is estimated from the positional relationship between the subject and the compound-eye digital camera and searched, so the possibility of losing sight of the subject can be reduced.
 In the present embodiment, the left-eye image Ai was searched with the template image TA(i-1) (step S80), and when that failed, the left-eye image Ai was searched with the template image TBi-L1 generated by the template image regeneration unit 141 (step S90). Instead of step S80, however, the left-eye image Ai may be searched using the template images TA(i-1) and TB(i-1). In that case, the left-eye image Ai is searched with the template images TA(i-1) and TB(i-1), and only when that fails is it searched with the template image TBi-L1 generated by the template image regeneration unit 141. That is, accurate tracking is attempted first, and only when accurate tracking is impossible is processing performed that is less accurate but reduces the possibility of losing sight of the subject. This provides subject tracking processing that is both accurate and unlikely to lose the subject. In this case, the left-eye image Ai+1 may then be searched using the template images TA(i-1) and TB(i-1).
 In the present embodiment, tracking error processing (step S101) was performed when the subject was lost; alternatively, the tracking error processing may be omitted when the subject is lost and the process may proceed to step S36, that is, to processing of the next frame.
 In the present embodiment, processing for tracking the subject Z in the left-eye images A continuously captured at a predetermined frame rate has been described, but the subject Z may instead be tracked in the right-eye images B, or in both the left-eye images A and the right-eye images B. FIG. 20 is a flowchart showing the flow of processing for tracking the subject Z in the right-eye images B continuously captured at a predetermined frame rate. This processing is controlled by the CPU 110. A program for causing the CPU 110 to execute this processing is stored in the program storage section within the CPU 110. Parts of the processing of FIG. 20 identical to those of FIG. 14 are given the same reference numerals and their description is omitted.
 The subject search unit 136 acquires the generated template image TB(i-1) from the template image generation unit 135 and searches the right-eye image Bi with the template image TB(i-1), looking for a portion similar to the template image TB(i-1) (step S102).
 The subject search unit 136 determines whether the search of the right-eye image Bi succeeded (step S104). If it succeeded (YES in step S104), the subject search unit 136 inputs the search result (the position of the subject Z) and the right-eye image Bi to the template image generation unit 135. Thereafter, the process proceeds to step S122.
 If the search of the right-eye image Bi did not succeed (NO in step S104), the subject search unit 136 acquires the generated template image TA(i-1) from the template image generation unit 135. The subject search unit 136 also acquires the left-eye image Ai from the template image generation unit 135. The subject search unit 136 then searches the left-eye image Ai with the template image TA(i-1), looking for a portion similar to the template image TA(i-1) (step S106).
 The subject search unit 136 determines whether the search of the left-eye image Ai succeeded (step S108).
 If the search of the left-eye image Ai failed (NO in step S108), the search as a whole has failed, so the subject search unit 136 performs tracking error processing (step S101).
 If the search of the left-eye image Ai succeeded (YES in step S108), the subject search unit 136 inputs the search result to the template image regeneration unit 141, and the template image regeneration unit 141 extracts a part of the left-eye image Ai to generate a template image TAi-R1 (step S110). The position of the template image TAi-R1 is a position estimated, from the positional relationship between the subject and the compound-eye digital camera, to be close to the position of the subject Z in the right-eye image B.
 When the tracking target subject Z is covered by an obstruction, in the image from the viewpoint to the left of the viewpoint used for tracking, the obstruction is located to the right of the subject Z. That is, when the subject Z is covered by an obstruction in the right-eye image B, the obstruction is located to the right of the subject Z in the left-eye image A. Therefore, for the left-eye image Ai, the template image regeneration unit 141 extracts the region shifted to the right of the position of the template image TAi by the width of the template image TAi, generating a template image TAi-R1. The size of the template image TAi-R1 is the same as that of the template image TAi.
 The subject search unit 136 searches the right-eye image Bi with the template image TAi-R1 generated by the template image regeneration unit 141 in step S110 (step S112). The subject search unit 136 then determines whether the search of the right-eye image Bi succeeded (step S114).
 If the search of the right-eye image Bi succeeded (YES in step S114), the subject search unit 136 inputs the search result (the position of the subject Z) and the right-eye image Bi to the template image generation unit 135. Thereafter, the process proceeds to step S122.
 If the search of the right-eye image Bi failed (NO in step S114), the search result is input to the template image regeneration unit 141, and by the same method as in step S110 the template image regeneration unit 141 extracts the region shifted to the right of the position of the template image TAi-R1 by the width of the template image TAi-R1, generating a template image TAi-R2 (step S116).
 The subject search unit 136 searches the right-eye image Bi with the template image TAi-R2 generated by the template image regeneration unit 141 in step S116 (step S118). The subject search unit 136 then determines whether the search of the right-eye image Bi succeeded (step S120).
 If the search of the right-eye image Bi failed (NO in step S120), the search as a whole has failed, so the subject search unit 136 performs tracking error processing (step S101). If the search of the right-eye image Bi succeeded (YES in step S120), the subject search unit 136 inputs the search result (the position of the subject Z) and the right-eye image Bi to the template image generation unit 135. Thereafter, the process proceeds to step S122.
 If the search is judged successful in step S104, the template image generation unit 135 extracts, based on the position of the subject Z set in step S102, a region from the right-eye image Bi that contains the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TBi (step S122). If the search is judged unsuccessful in step S104, the template image generation unit 135 takes the template image TB(i-1) used in the search of step S102 as the template image TBi (step S122).
 In this way, even when the tracking target is blocked by another subject, the position where the tracking target is likely to be is estimated from the positional relationship between the subject and the compound-eye digital camera and searched, so the possibility of losing sight of the subject can be further reduced.
 In the present embodiment, the template image regeneration unit 141 extracted the region shifted to the left of the position of the template image TBi by the width of the template image TBi to generate the template image TBi-L1, but the way of determining the position of the template image TBi-L1 is not limited to this. For example, the position of the template image TBi-L1 may be a region shifted to the left by half the width of the template image TBi, a region near the template image TBi with large changes in luminance or color, or a region with many edges and contours.
 Alternatively, the position of the subject Z may be estimated from the left-eye image A and the right-eye image B and used as the position of the template image TBi-L1. An example of a method of estimating the position of the subject Z is described with reference to FIGS. 21 and 22. When the tracking target subject Z cannot be found in the left-eye image A1 of FIG. 22, as shown in FIG. 21 the position (x0+dx, y0+dy) of the template image TA0 in the left-eye image A0 is calculated with the position (x0, y0) of the template image TB0 in the right-eye image B0 as a reference. The difference (dx, dy) between the position (x0, y0) of the template image TB0 in the right-eye image B0 and the position (x0+dx, y0+dy) of the template image TA0 in the left-eye image A0 is thereby obtained. Accordingly, if the position of the template image TB1 in the right-eye image B1 is (x1, y1), the position of the tracking target subject Z in the left-eye image A1 can be estimated as (x1+dx, y1+dy), and a template image may be generated at this position. As a result, the template image regeneration unit 141 needs to generate a template image only once, and the time required to track the subject after it is lost can be shortened.
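 A minimal sketch of this estimation, assuming the template positions are available as (x, y) top-left coordinates; the helper name is hypothetical.

```python
def predict_lost_position(tb0_pos, ta0_pos, tb1_pos):
    """Estimate the subject position in the lost view from the other view (sketch).

    (dx, dy) is the inter-view offset observed in the previous frame; it is
    applied to the position found in the surviving view in the current frame.
    """
    dx = ta0_pos[0] - tb0_pos[0]
    dy = ta0_pos[1] - tb0_pos[1]
    x1, y1 = tb1_pos
    return (x1 + dx, y1 + dy)  # estimated position of subject Z in A1
```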
 In the present embodiment, the template image regeneration unit 141 generated the template images TBi-L1 and TBi-L2 and performed error processing when the search with the template image TBi-L2 failed, but the number of template images generated by the template image regeneration unit 141 is not limited to two and may be set arbitrarily.
 <Fourth Embodiment>
 In the first embodiment of the present invention, subject tracking was performed by searching at least one of the right-eye image and the left-eye image using a template image generated by extracting part of the right-eye image and a template image generated by extracting part of the left-eye image; however, subject tracking may fail because of the background, foreground, or the like included in the template image.
 In the fourth embodiment of the present invention, the background and foreground are removed from the template images generated by extracting parts of the right-eye and left-eye images, and subject tracking is performed using the result. The compound-eye digital camera 4 of the fourth embodiment is described below. Parts identical to those of the first embodiment are given the same reference numerals and their description is omitted.
 FIG. 23 is a block diagram showing the main internal configuration of the compound-eye digital camera 4. The compound-eye digital camera 4 mainly comprises a CPU 110, operation means 112 (the release switch 20, MENU/OK button 25, cross button 26, etc.), an SDRAM 114, a VRAM 116, AF detection means 118, AE/AWB detection means 120, image sensors 122 and 123, CDS/AMPs 124 and 125, A/D converters 126 and 127, an image input controller 128, image signal processing means 130, compression/decompression processing means 132, a stereoscopic image generation unit 133, a video encoder 134, a subject search unit 136, a media controller 137, sound input processing means 138, a recording medium 140, focus lens driving means 142 and 143, zoom lens driving means 144 and 145, aperture driving means 146 and 147, timing generators (TG) 148 and 149, and a template image generation unit 150.
 The template image generation unit 150 extracts a predetermined region (for example, a rectangle) containing the tracking target subject Z from each of the left-eye image A and the right-eye image B to generate template images. The template image generation unit 150 also generates template images from which the background and foreground have been removed. The processing performed by the template image generation unit 150 is described in detail later.
 The operation of the compound-eye digital camera 4 configured as described above will now be described. Since the only difference between the first embodiment and the fourth embodiment is the subject tracking processing in the 3D shooting mode, only the subject tracking processing is described and other description is omitted.
 FIG. 24 is a flowchart showing the flow of processing for tracking the subject Z in the left-eye images A continuously captured at a predetermined frame rate. This processing is controlled by the CPU 110. A program for causing the CPU 110 to execute this processing is stored in the program storage section within the CPU 110.
 The left-eye image A captured immediately before the frame to be processed, in this case the left-eye image A0 captured immediately before the processing for tracking the subject Z, is input to the template image generation unit 150. The right-eye image B captured immediately before the frame to be processed, in this case the right-eye image B0 captured immediately before the processing for tracking the subject Z, is also input to the template image generation unit 150. The template image generation unit 150 generates parallax maps PA0 and PB0 (see FIG. 25) from the left-eye image A0 and the right-eye image B0 (step S130). A parallax map represents the amount of displacement between the left-eye image A and the right-eye image B; by referring to the parallax map, the distance of each subject contained in the image becomes clear. In FIG. 25, distance is represented by density: high density for near distances and low density for far distances. Various known methods can be used to generate the parallax map.
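 As one example of the known methods mentioned here, the following hedged sketch computes a dense disparity map with OpenCV's semi-global matching; the parameter values are illustrative assumptions, and the input pair is assumed to be rectified grayscale images.

```python
import cv2

def parallax_map(left_gray, right_gray):
    """Compute a disparity (parallax) map from a rectified stereo pair (sketch)."""
    sgbm = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=64,  # must be a multiple of 16
        blockSize=9,
    )
    disp = sgbm.compute(left_gray, right_gray)
    return disp.astype("float32") / 16.0  # SGBM returns fixed-point (1/16 px) values
```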
 The template image generation unit 150 extracts from the left-eye image A captured immediately before the frame to be processed (here, the left-eye image A0) a region that contains the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TA0 (step S10).
 As shown in FIG. 25, the template image generation unit 150 extracts from the parallax map PA0 the same region as that of the template image TA0 in the left-eye image A0, generating a template parallax map TPA0. The template image generation unit 150 then sets as invalid regions the background, namely regions whose parallax is 10 or more pixels farther than the parallax at the approximate center of the template parallax map TPA0 (that is, the parallax of the subject Z), and the foreground, namely regions whose parallax is 10 or more pixels nearer (step S132). In the case of the template image TA0 in FIG. 26, part of the car is included as background. The template image TA0 in FIG. 26 contains no foreground, but examples of foreground include trees, utility poles, and the like. The value of ±10 pixels was chosen on the consideration that parallax within the subject Z is small while the parallax between the foreground/background and the subject Z is large; as long as this purpose is not departed from, the value is not limited to ±10 pixels.
 The template image generation unit 150 generates a template image SA0 by removing the background and foreground from the template image TA0 (step S134). The processing of step S134 is described below.
 As shown in FIG. 26, the template image generation unit 150 generates, from the template parallax map TPA0, mask data that masks the invalid regions set in step S132, that is, everything outside the parallax range of ±10 pixels around the parallax at the approximate center of the template parallax map TPA0.
 Then, as shown in FIG. 26, using the template image TA0 and the mask data, the template image generation unit 150 removes the background and foreground from the template image TA0 and generates a background/foreground-removed template image SA0.
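 A minimal sketch of steps S132 to S134, assuming the template parallax map is available as a per-pixel disparity array; the ±10-pixel band follows the text above, while sampling the subject disparity at the map center and the helper name are illustrative assumptions.

```python
import numpy as np

def remove_background_foreground(template, template_disp, band=10.0):
    """Mask pixels whose disparity differs from the subject's by more than `band` (sketch).

    The subject disparity is sampled at the approximate center of the
    template parallax map, as described for TPA0.
    """
    h, w = template_disp.shape
    center_disp = template_disp[h // 2, w // 2]
    mask = np.abs(template_disp - center_disp) <= band  # True inside the subject
    out = template.copy()
    out[~mask] = 0  # invalid (background/foreground) pixels are zeroed
    return out
```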
 i is set to 1 (step S14); that is, capture and processing of the first image starts. Note that i is a positive integer.
 The left imaging system 13 captures a left-eye image Ai (step S16). Since subject tracking processing is performed on the left-eye image Ai, the captured left-eye image Ai is input to the subject search unit 136. At the same time, the left-eye image Ai is input to the video encoder 134, sequentially converted into a display signal format by the video encoder 134, and output to the monitor 16.
 The right imaging system 12 captures a right-eye image Bi (step S18). Since subject tracking processing is not performed on the right-eye image Bi, the captured right-eye image Bi is input to the template image generation unit 150. At the same time, the right-eye image Bi is input to the video encoder 134, sequentially converted into a display signal format by the video encoder 134, and output to the monitor 16.
 The subject search unit 136 searches the left-eye image Ai using the generated template image TA(i-1) and the background/foreground-removed template image SA(i-1), looking for portions similar to the template image TA(i-1) and the background/foreground-removed template image SA(i-1) (step S136). Since i = 1 here, the left-eye image A1 is searched using the template image TA0 generated in step S10 and the background/foreground-removed template image SA0 generated in step S134.
 The subject search unit 136 determines whether the search of the left-eye image Ai has finished (step S138); if it has not (NO in step S138), step S138 is performed again.
 When the search of the left-eye image Ai has finished (YES in step S138), the subject search unit 136 determines whether the result of searching the left-eye image Ai with the template image TA(i-1) and the result of searching it with the background/foreground-removed template image SA(i-1) are the same (step S140).
 If the result of searching the left-eye image Ai with the template image TA(i-1) and the result of searching it with the background/foreground-removed template image SA(i-1) are the same (YES in step S140), the portion similar to the template image TA(i-1) and the background/foreground-removed template image SA(i-1) is taken as the position of the subject Z. The subject search unit 136 inputs the search result and the left-eye image Ai to the template image generation unit 150. Thereafter, the process proceeds to step S148.
 If the two search results are not the same (NO in step S140), the subject search unit 136 calculates the similarity between the result of searching the left-eye image Ai with the template image TA(i-1) and the template image TA(i-1) itself, and likewise the similarity between the result of searching with the background- and foreground-removed template image SA(i-1) and that template image. Known methods can be used to calculate the similarity, for example the difference between feature values, or the least-squares method in a feature space (a weighted space is also possible). The subject search unit 136 then determines whether the similarity obtained with the template image TA(i-1) is higher than the similarity obtained with the background- and foreground-removed template image SA(i-1) (step S142).
 If the similarity obtained with the template image TA(i-1) is higher (YES in step S142), the subject search unit 136 takes the result of searching the left-eye image Ai with the template image TA(i-1), that is, the portion similar to the template image TA(i-1), as the position of the subject Z (step S144). The subject search unit 136 inputs this position of the subject Z and the left-eye image Ai to the template image generation unit 135.
 If the similarity obtained with the template image TA(i-1) is not higher (NO in step S142), the subject search unit 136 takes the result of searching the left-eye image Ai with the background- and foreground-removed template image SA(i-1), that is, the portion similar to the background- and foreground-removed template image SA(i-1), as the position of the subject Z (step S146). The subject search unit 136 inputs this position of the subject Z and the left-eye image Ai to the template image generation unit 135.
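 The branch in steps S140 to S146 can be summarized as follows; this is a hypothetical sketch built on the search routine above, not the patented implementation itself.

```python
def decide_subject_position(results):
    """Choose the position of subject Z (steps S140 to S146).

    `results` maps "TA"/"SA" to (location, similarity), as returned
    by search_with_templates above.
    """
    loc_ta, sim_ta = results["TA"]
    loc_sa, sim_sa = results["SA"]
    if loc_ta == loc_sa:
        # Step S140 YES: both templates found the same portion.
        return loc_ta
    # Step S142: otherwise adopt the result whose template matched
    # its found portion more closely (steps S144 / S146).
    return loc_ta if sim_ta > sim_sa else loc_sa
```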
 In this way, the subject Z is located in the left-eye image Ai. Based on the position of the subject Z set in steps S136, S144, or S146, the template image generation unit 135 extracts from the left-eye image Ai a region that contains the subject Z and is large enough for the shape of the subject Z to be recognized, and generates a template image TAi (step S148). As in step S10, a rectangular region with a margin of several pixels around the contour of the subject Z is extracted and used as the template image. In the case shown in FIG. 8A1, the dotted-line portion is generated as the template image TA1.
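 The extraction in step S148 amounts to enlarging the subject's bounding box by a few pixels and clipping it to the image, as in this sketch. The bounding box input and the margin value are assumptions; the specification only states "a margin of several pixels".

```python
def make_template(image, bbox, margin=4):
    """Extract a new template TAi around subject Z (step S148).

    bbox = (x, y, w, h) is the subject's bounding box in `image`;
    `margin` stands in for the "margin of several pixels" around
    the contour, and its value here is an assumption.
    """
    img_h, img_w = image.shape[:2]
    x, y, w, h = bbox
    x0, y0 = max(x - margin, 0), max(y - margin, 0)
    x1, y1 = min(x + w + margin, img_w), min(y + h + margin, img_h)
    return image[y0:y1, x0:x1].copy()
```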
 Thereafter, i is set to i + 1 (step S36), the process returns to step S16, and the processing of steps S16 to S36 is performed again.
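 Putting the steps together, the per-frame loop of steps S14 to S36 has roughly the following shape. The function names refer to the sketches above and are illustrative; regeneration of the background- and foreground-removed template SA from parallax is omitted.

```python
def tracking_loop(left_frames, right_frames, ta0, sa0):
    """One possible shape for the per-frame loop (steps S14 to S36).

    left_frames / right_frames yield successive left- and right-eye
    viewpoint images; ta0 / sa0 are the initial templates from steps
    S10 and S12.
    """
    ta, sa = ta0, sa0
    for image_ai, _image_bi in zip(left_frames, right_frames):  # S16, S18
        results = search_with_templates(image_ai, ta, sa)       # S136
        x, y = decide_subject_position(results)                 # S140-S146
        h, w = ta.shape[:2]        # assume the subject keeps its size
        ta = make_template(image_ai, (x, y, w, h))              # S148
```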
 According to the present embodiment, even when the background changes because the subject has moved, or a foreground object passes in front of the subject, the possibility of losing sight of the subject can be reduced.
 In the present embodiment, the left-eye image Ai is searched in step S136 using both the template image TA(i-1) and the background- and foreground-removed template image SA(i-1); however, the left-eye image Ai may instead be searched using only the background- and foreground-removed template image SA(i-1). To reduce the possibility of losing sight of the subject, it is nevertheless desirable to search using both the template image TA(i-1) and the background- and foreground-removed template image SA(i-1).
 Also, in the present embodiment, the left-eye image Ai is searched from the outset using the template image TA(i-1) and the background- and foreground-removed template image SA(i-1); alternatively, the left-eye image Ai may first be searched using the template images TA(i-1) and TB(i-1), and only if that search fails may it be searched using the background- and foreground-removed template image SA(i-1). Further, the left-eye image Ai may be searched using a template image obtained by removing the background and foreground from the template image TB(i-1).
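 This fallback order might be sketched as below. The acceptance threshold and the single-template `match` helper are assumptions introduced for the sketch; the specification does not define how a failed search is detected.

```python
import cv2

def match(image, tmpl):
    """Single-template search returning best location and score."""
    score_map = cv2.matchTemplate(image, tmpl, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(score_map)
    return max_loc, max_val

def search_with_fallback(image_ai, ta, tb, sa, threshold=0.6):
    """Variant order: try TA(i-1) and TB(i-1) first; fall back to the
    background- and foreground-removed SA(i-1) only if both fail.
    `threshold` is an assumed acceptance score, not from the
    specification.
    """
    for tmpl in (ta, tb):
        loc, score = match(image_ai, tmpl)
        if score >= threshold:
            return loc  # search succeeded with an unmodified template
    loc, score = match(image_ai, sa)
    return loc if score >= threshold else None
```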
 The first to fourth embodiments have been described taking the capture of live-view images as an example, but they can also be applied whenever the right-eye image B and the left-eye image A are acquired continuously, for example during movie recording. The only difference between live-view capture and movie recording is that, while the continuously captured right-eye images B and left-eye images A are not recorded in the live-view case, in movie recording they are recorded on the recording medium 54. Since the process of recording the continuously captured right-eye image B data and left-eye image A data on the recording medium 54 is already known, its description is omitted.
 The first to fourth embodiments have also been described for the case where two viewpoint images, the left-eye image A and the right-eye image B, are captured, but they are equally applicable when three or more viewpoint images are captured. In this case as well, subject tracking may be performed on at least one of the three or more viewpoint images. When subject tracking is performed on all viewpoint images, it becomes possible to optimize not only focus, zoom, and exposure but also the display, for example by superimposing a frame on the tracked subject Z or by highlighting the tracked subject Z. Since various known methods are available for displaying the frame and for highlighting, their description is omitted.
 Furthermore, in the first to fourth embodiments, subject tracking is performed while live-view images are captured, and when the S1 ON signal is input, AE metering and AF control are performed on the tracked subject; that is, subject tracking is performed during live-view capture, and AE metering and AF control for the still image to be captured afterwards are performed on the tracked subject. Alternatively, AE metering and AF control may be performed continuously on the tracked subject during live-view capture. The frame display and highlighting may likewise be performed continuously during live-view capture. A subject may also be searched for in the still-image viewpoint image using a template image extracted from a live-view viewpoint image.
 1: compound-eye digital camera, 10: camera body, 11: barrier, 12: right imaging system, 13: left imaging system, 14: flash, 15: microphone, 16: monitor, 20: release switch, 21: zoom button, 22: mode button, 23: parallax adjustment button, 24: 2D/3D switching button, 25: MENU/OK button, 26: cross button, 27: DISP/BACK button, 110: CPU, 112: operation means, 114: SDRAM, 116: VRAM, 118: AF detection circuit, 120: AE/AWB detection means, 122, 123: imaging elements, 124, 125: CDS/AMP, 126, 127: A/D converters, 128: image input controller, 130: image signal processing means, 132: compression/expansion processing means, 133: stereoscopic image generation unit, 134: video encoder, 135, 150: template image generation means, 136: subject search unit, 137: media controller, 138: sound input processing means, 139: composite template image generation unit, 140: recording medium, 141: template image regeneration unit, 142, 143: focus lens driving means, 144, 145: zoom lens driving means, 146, 147: aperture driving means, 148, 149: timing generator (TG)

Claims (16)

  1.  An imaging device comprising:
     first imaging means and second imaging means for acquiring two viewpoint images of the same subject photographed from two viewpoints;
     first template image generation means for generating a first template image by extracting, from the viewpoint image captured by the first imaging means, a partial region containing a subject to be tracked, and generating a second template image by extracting, from the viewpoint image captured by the second imaging means, a partial region containing the subject to be tracked; and
     search means for searching for the subject to be tracked, using the first template image and the second template image, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based.
  2.  The imaging device according to claim 1, further comprising second template image generation means for generating, when the search means does not find the subject to be tracked, a composite template image from the first template image and the second template image by image composition processing,
     wherein, when the composite template image has been generated by the second template image generation means, the search means searches for the subject to be tracked using the composite template image.
  3.  The imaging device according to claim 2, wherein the composite template image is an image in an intermediate state between the first template image and the second template image.
  4.  The imaging device according to claim 2 or 3, wherein, when a plurality of search results are obtained, the search means calculates, for each of the plurality of search results, the similarity between the template image from which that search result was obtained and the result found using that template image, and takes the result found using the template image with the highest calculated similarity as the subject to be tracked.
  5.  The imaging device according to claim 2, 3 or 4, wherein, when a plurality of search results are obtained and a search result has been obtained using the first template image, the search means takes the result found using the first template image as the subject to be tracked.
  6.  The imaging device according to claim 2, 3 or 4, wherein the second template image generation means generates a plurality of types of the composite template image, and
     wherein, when a plurality of search results are obtained and no search result has been obtained using the first template image, the search means takes as the subject to be tracked the result found using the composite template image, among the plurality of generated composite template images, that is closest to the first template image.
  7.  The imaging device according to claim 1, further comprising third template image generation means for generating, when the search means does not find the subject to be tracked, a third template image by extracting, from the viewpoint image captured by the second imaging means, a region shifted in the left-right direction by an arbitrary amount from the region extracted when the second template image was generated,
     wherein the search means searches for the subject to be tracked using the generated third template image.
  8.  The imaging device according to claim 7, wherein, when the search means obtains no search result using the third template image, the third template image generation means generates a fourth template image by extracting, from the viewpoint image captured by the second imaging means, a region shifted in the left-right direction by a predetermined amount from the region extracted when the third template image was generated, and
     the search means searches for the subject to be tracked using the generated fourth template image.
  9.  The imaging device according to claim 7 or 8, wherein the third template image generation means determines the arbitrary amount by estimating, based on the two viewpoint images, the position where the subject to be tracked is likely to be.
  10.  The imaging device according to claim 7, 8 or 9, wherein the first imaging means and the second imaging means continuously acquire the two viewpoint images, and
     wherein, once the search means has found the subject to be tracked using a template image generated by the third template image generation means, the search means searches viewpoint images subsequently captured by the first imaging means using the first template image and the second template image.
  11.  The imaging device according to any one of claims 1 to 10, further comprising automatic exposure control means for performing automatic exposure control based on the subject to be tracked found by the search means, automatic focus adjustment means for adjusting the focus so that the subject to be tracked found by the search means is in focus, or zoom control means for adjusting the angle of view based on the subject to be tracked found by the search means.
  12.  An imaging device comprising:
     first imaging means and second imaging means for acquiring two viewpoint images of the same subject photographed from two viewpoints;
     template image generation means for generating a first template image by extracting, from the viewpoint image captured by the first imaging means, a partial region containing a subject to be tracked;
     parallax acquisition means for acquiring parallax from the two viewpoint images;
     fourth template image generation means for generating, based on the parallax acquired by the parallax acquisition means, a template image in which the background and foreground have been removed from the first template image; and
     search means for searching for the subject to be tracked, using the template image generated by the fourth template image generation means, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based.
  13.  The imaging device according to claim 12, wherein the search means searches for the subject to be tracked using the first template image generated by the template image generation means and the template image generated by the fourth template image generation means.
  14.  A stereoscopic image capturing method comprising the steps of:
     acquiring, with first imaging means and second imaging means, two viewpoint images of the same subject photographed from two viewpoints;
     generating a first template image by extracting, from the viewpoint image captured by the first imaging means, a partial region containing a subject to be tracked, and generating a second template image by extracting, from the viewpoint image captured by the second imaging means, a partial region containing the subject to be tracked; and
     searching for the subject to be tracked, using the first template image and the second template image, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based.
  15.  A program causing an arithmetic device to execute the steps of:
     acquiring, with first imaging means and second imaging means, two viewpoint images of the same subject photographed from two viewpoints;
     generating a first template image by extracting, from the viewpoint image captured by the first imaging means, a partial region containing a subject to be tracked, and generating a second template image by extracting, from the viewpoint image captured by the second imaging means, a partial region containing the subject to be tracked; and
     searching for the subject to be tracked, using the first template image and the second template image, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based.
  16.  A computer-readable recording medium on which is recorded a program causing an arithmetic device to execute the steps of:
     acquiring, with first imaging means and second imaging means, two viewpoint images of the same subject photographed from two viewpoints;
     generating a first template image by extracting, from the viewpoint image captured by the first imaging means, a partial region containing a subject to be tracked, and generating a second template image by extracting, from the viewpoint image captured by the second imaging means, a partial region containing the subject to be tracked; and
     searching for the subject to be tracked, using the first template image and the second template image, in a viewpoint image captured by the first imaging means at a time different from the viewpoint image on which the first template image is based.
PCT/JP2012/062369 2011-07-05 2012-05-15 Imaging device, three-dimensional image capturing method and program WO2013005477A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2011149440 2011-07-05
JP2011-149440 2011-07-05

Publications (1)

Publication Number Publication Date
WO2013005477A1 true WO2013005477A1 (en) 2013-01-10

Family

ID=47436835

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2012/062369 WO2013005477A1 (en) 2011-07-05 2012-05-15 Imaging device, three-dimensional image capturing method and program

Country Status (1)

Country Link
WO (1) WO2013005477A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002269570A (en) * 2001-03-09 2002-09-20 Toyota Motor Corp Recognition system for surrounding
JP2008059148A (en) * 2006-08-30 2008-03-13 Fujifilm Corp Image processing apparatus

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103595916A (en) * 2013-11-11 2014-02-19 南京邮电大学 Double-camera target tracking system and implementation method thereof
JP2017028655A (en) * 2015-07-28 2017-02-02 日本電気株式会社 Tracking system, tracking method and tracking program
JP2020141426A (en) * 2020-06-15 2020-09-03 日本電気株式会社 Tracking system, tracking method and tracking program
JP7001125B2 (en) 2020-06-15 2022-01-19 日本電気株式会社 Tracking system, tracking method and tracking program

Similar Documents

Publication Title
JP4783465B1 (en) Imaging device and display device
US9077976B2 (en) Single-eye stereoscopic image capturing device
US20110018970A1 (en) Compound-eye imaging apparatus
JP5269252B2 (en) Monocular stereoscopic imaging device
US9258545B2 (en) Stereoscopic imaging apparatus
US20130113892A1 (en) Three-dimensional image display device, three-dimensional image display method and recording medium
JP5415170B2 (en) Compound eye imaging device
US8823778B2 (en) Imaging device and imaging method
JP4763827B2 (en) Stereoscopic image display device, compound eye imaging device, and stereoscopic image display program
JP5231771B2 (en) Stereo imaging device
JP4533735B2 (en) Stereo imaging device
US20110050856A1 (en) Stereoscopic imaging apparatus
JP2011075675A (en) Compound-eye imaging apparatus
JP2011022501A (en) Compound-eye imaging apparatus
JP2009128969A (en) Imaging device and method, and program
JP2007225897A (en) Focusing position determination device and method
WO2013005477A1 (en) Imaging device, three-dimensional image capturing method and program
JP2007279333A (en) Device and method for deciding focusing position
JP2012028871A (en) Stereoscopic image display device, stereoscopic image photographing device, stereoscopic image display method, and stereoscopic image display program
JP2010200024A (en) Three-dimensional image display device and three-dimensional image display method
JP4874923B2 (en) Image recording apparatus and image recording method
JP5307189B2 (en) Stereoscopic image display device, compound eye imaging device, and stereoscopic image display program
JP5087027B2 (en) Compound eye imaging device
JP2011259405A (en) Imaging device and imaging method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 12807374; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 12807374; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)