JP2013074397A

JP2013074397A - Image processing system, image processing method, and image processing program

Info

Publication number: JP2013074397A
Application number: JP2011210901A
Authority: JP
Inventors: Motohiro Asano; 基広浅野
Original assignee: Konica Minolta Inc
Current assignee: Konica Minolta Inc
Priority date: 2011-09-27
Filing date: 2011-09-27
Publication date: 2013-04-22
Anticipated expiration: 2031-09-27
Also published as: JP5741353B2

Abstract

PROBLEM TO BE SOLVED: To provide an image processing system, an image processing method, and an image processing program that maintain the quality of stereoscopic display even if one input image includes a defect such as blur. SOLUTION: The image processing system comprises: first imaging means which takes an image of a subject to obtain a first input image; second imaging means which takes an image of the subject from a viewpoint different from that of the first imaging means to obtain a second input image; frequency characteristic obtaining means which obtains the frequency characteristics of the first and second input images; and stereoscopic vision generation means for generating a stereoscopic image for stereoscopic vision display of the subject, from the first and second input images, mainly using the input image determined to have relatively high quality on the basis of the obtained frequency characteristics.

Description

The present invention relates to an image processing system, an image processing method, and an image processing program for generating images for stereoscopic display of a subject.

Along with the recent development of display devices, image processing techniques for stereoscopically displaying an object (subject) have been under development. A typical way to realize such stereoscopic display uses the binocular parallax perceived by humans. Using binocular parallax requires generating a pair of images (hereinafter also referred to as a “stereo image” or “3D image”) whose parallax depends on the distance from the imaging means to the subject. When generating such a stereo image as well, it is preferable to use input images of good quality.

Techniques for evaluating the image quality of an input image and techniques for removing noise and the like are conventionally known, although they do not relate to stereo images.

For example, Japanese Patent Laying-Open No. 2011-055259 (Patent Document 1) discloses an electronic shake correction technique that reduces shake by aligning a plurality of images and then combining them. Japanese Patent Laying-Open No. 2010-279002 (Patent Document 2) discloses a technique that appropriately detects a flare component and performs effective flare correction even when part of an image is saturated and the original luminance level cannot be obtained. Japanese Patent Laying-Open No. 2009-200749 (Patent Document 3) discloses a technique that, when a ghost image appears in a captured image, removes the ghost image to obtain a captured image free of it. Japanese Patent Laying-Open No. 2010-055194 (Patent Document 4) discloses a technique for reliably evaluating whether a subject is captured sharply.

Patent Document 1: Japanese Patent Laying-Open No. 2011-055259
Patent Document 2: Japanese Patent Laying-Open No. 2010-279002
Patent Document 3: Japanese Patent Laying-Open No. 2009-200749
Patent Document 4: Japanese Patent Laying-Open No. 2010-055194

In a configuration that generates a stereo image using a plurality of imaging means, each capturing the subject from a different viewpoint, the input images captured by the respective imaging means may differ in image quality, making the result hard to view. More specifically, in a digital camera or mobile phone equipped with a pair of cameras, only the input image captured by one of the cameras may be blurred. Possible causes include a difference in autofocus operation, a difference in the amount of camera shake, and (particularly on a mobile phone) dirt on one lens from the user touching it unintentionally. In such cases, the degree of blur differs between the left-eye image and the right-eye image, and the quality of the resulting stereoscopic image deteriorates.

Likewise, when a phenomenon such as flare or ghosting occurs in only one input image, the resulting mismatch between the left-eye image and the right-eye image degrades the quality of the resulting stereoscopic image.

The present invention has been made to solve this problem, and its object is to provide an image processing system, an image processing method, and an image processing program that can maintain the quality of stereoscopic display even when one input image has a defect such as blur.

An image processing system according to an aspect of the present invention includes: first imaging means for capturing a subject to acquire a first input image; second imaging means for capturing the subject from a viewpoint different from that of the first imaging means to acquire a second input image; frequency characteristic acquisition means for acquiring frequency characteristics of the first and second input images; and stereoscopic generation means for generating, from the first and second input images, a stereo image for stereoscopic display of the subject, mainly using the input image determined to have relatively good image quality based on the acquired frequency characteristics.

Preferably, the stereoscopic generation means determines that the input image containing more high-frequency components has relatively good image quality.

Preferably, the frequency characteristic acquisition means acquires the frequency characteristics using at least one of extraction of the edge amount contained in an input image and frequency analysis of the input image.

Preferably, based on the acquired frequency characteristics and the positional relationship between the first and second imaging means at the time of imaging, the stereoscopic generation means switches between a processing mode that generates the stereo image mainly using one of the first and second input images and a processing mode that generates the stereo image using both.

Preferably, the stereoscopic generation means switches the processing mode for generating the stereo image for each partial image.

Preferably, when the first and second imaging means capture a moving image of the subject, the stereoscopic generation means determines the processing mode to apply for generating the stereo image once every plurality of frames.

More preferably, when the first and second imaging means capture a moving image of the subject, the stereoscopic generation means maintains a processing mode, once determined, for a predetermined period.

More preferably, when the first and second imaging means capture a moving image of the subject, the stereoscopic generation means determines the processing mode based on the first and second input images corresponding to the first frame.

Preferably, the image processing system further includes warning means for warning the user of lens dirt on an imaging means when the input image from that imaging means has been determined to be degraded in image quality a plurality of times.

Preferably, the image processing system further includes imaging prompting means for prompting the user to perform imaging with the first and second imaging means in order to evaluate the frequency characteristics.

Preferably, the frequency characteristic acquisition means acquires the frequency characteristics of first and second input images acquired at initial setup, at power-on, or during the focusing operation immediately before imaging.

Preferably, the imaging means corresponding to the input image containing more high-frequency components is set as the default imaging means to be mainly used when generating the stereo image.

More preferably, the stereoscopic generation means generates the stereo image mainly using the input image from the imaging means set as the default.

An image processing method according to another aspect of the present invention includes the steps of: capturing a subject to acquire a first input image; capturing the subject from a viewpoint different from the viewpoint from which the first input image was captured to acquire a second input image; acquiring frequency characteristics of the first and second input images; and generating, from the first and second input images, a stereo image for stereoscopic display of the subject, mainly using the input image determined to have relatively good image quality based on the acquired frequency characteristics.

According to still another aspect of the present invention, an image processing program for causing a computer to execute image processing is provided. The image processing program causes the computer to function as: first acquisition means for acquiring a first input image obtained by capturing a subject; second acquisition means for acquiring a second input image obtained by capturing the subject from a viewpoint different from the viewpoint from which the first input image was captured; frequency characteristic acquisition means for acquiring frequency characteristics of the first and second input images; and stereoscopic generation means for generating, from the first and second input images, a stereo image for stereoscopic display of the subject, mainly using the input image determined to have relatively good image quality based on the acquired frequency characteristics.

According to the present invention, the quality of stereoscopic display can be maintained even when one input image has a defect such as blur.

FIG. 1 is a block diagram showing the basic configuration of an image processing system according to an embodiment of the present invention.
FIG. 2 is a diagram showing a specific configuration example of the imaging unit shown in FIG. 1.
FIG. 3 is a block diagram showing the configuration of a digital camera that embodies the image processing system shown in FIG. 1.
FIG. 4 is a block diagram showing the configuration of a personal computer that embodies the image processing system shown in FIG. 1.
FIG. 5 is an external view of a mobile phone that embodies the image processing system shown in FIG. 1.
FIG. 6 is a diagram showing the procedure of the image processing method according to the first embodiment.
FIG. 7 is a diagram showing an example of a pair of input images captured by the imaging unit shown in FIG. 1.
FIG. 8 is a flowchart showing the procedure of the edge extraction process shown in FIG. 6.
FIG. 9 is a diagram showing an example of the averaging filter for edge extraction used in the smoothing process of FIG. 8.
FIG. 10 is a diagram showing an example of a distance image generated from the pair of input images shown in FIG. 7 by the image processing method according to the first embodiment.
FIG. 11 is a diagram showing an example of the averaging filter used in the smoothing process of FIG. 6.
FIG. 12 is a diagram for explaining the parallax adjustment process shown in FIG. 6.
FIG. 13 is a diagram for explaining the procedure of the stereo image generation process of FIG. 6.
FIG. 14 is a flowchart showing the procedure of the stereo image generation process shown in FIG. 13.
FIG. 15 is a diagram showing an example of a stereo image generated from the pair of input images shown in FIG. 7 by the image processing method according to the first embodiment.
FIG. 16 is a diagram showing the procedure of the image processing method according to the second embodiment.
FIG. 17 is a diagram showing an example of evaluating image quality for each partial image according to the second modification.
FIG. 18 is a diagram for explaining the process of generating a stereo image according to the second modification.
FIG. 19 is a diagram for explaining the process of generating a stereo image according to the second modification.
FIG. 20 is a diagram showing an example of the user interface provided in the third modification.

Embodiments of the present invention will be described in detail with reference to the drawings. Identical or corresponding parts in the drawings are given the same reference characters, and their description is not repeated.

<A. Overview>
The image processing system according to the embodiment of the present invention generates a stereo image for stereoscopic display from a plurality of input images obtained by capturing a subject from a plurality of viewpoints. In generating the stereo image, the image processing system acquires the frequency characteristics of each input image and, mainly using the input image determined to have relatively good image quality based on those characteristics, generates from the input images a stereo image for stereoscopic display of the subject.

As a result, the quality of stereoscopic display can be maintained even when one input image has a defect such as blur.

<B. System configuration>
First, the configuration of the image processing system according to the embodiment of the present invention will be described.

(b1: Basic configuration)
FIG. 1 is a block diagram showing the basic configuration of an image processing system 1 according to the embodiment of the present invention. Referring to FIG. 1, the image processing system 1 includes an imaging unit 2, an image processing unit 3, and a 3D image output unit 4. In the image processing system 1 shown in FIG. 1, the imaging unit 2 captures a subject to acquire a pair of input images (input image 1 and input image 2), and the image processing unit 3 performs the image processing described later on the acquired pair of input images to generate a stereo image (a left-eye image and a right-eye image) for stereoscopic display of the subject. The 3D image output unit 4 then outputs this stereo image (the left-eye image and the right-eye image) to a display device or the like.

The imaging unit 2 captures the same object (subject) from different viewpoints to generate a pair of input images. More specifically, the imaging unit 2 includes a first camera 21, a second camera 22, an A/D (Analog to Digital) conversion unit 23 connected to the first camera 21, and an A/D conversion unit 24 connected to the second camera 22. The A/D conversion unit 23 outputs input image 1, which shows the subject captured by the first camera 21, and the A/D conversion unit 24 outputs input image 2, which shows the subject captured by the second camera 22.

That is, the first camera 21 and the A/D conversion unit 23 correspond to the first imaging means, which captures the subject to acquire the first input image, and the second camera 22 and the A/D conversion unit 24 correspond to the second imaging means, which captures the subject from a viewpoint different from that of the first imaging means to acquire the second input image.

The first camera 21 includes a lens 21a, which is an optical system for imaging the subject, and an image sensor 21b, which is a device that converts the light collected by the lens 21a into an electrical signal. The A/D conversion unit 23 converts the video signal (analog electrical signal) representing the subject output from the image sensor 21b into a digital signal and outputs it. Similarly, the second camera 22 includes a lens 22a, which is an optical system for imaging the subject, and an image sensor 22b, which is a device that converts the light collected by the lens 22a into an electrical signal. The A/D conversion unit 24 converts the video signal (analog electrical signal) representing the subject output from the image sensor 22b into a digital signal and outputs it. The imaging unit 2 may further include a control processing circuit or the like for controlling each part.

As described later, the image processing according to the present embodiment can generate a stereo image (a left-eye image and a right-eye image) using only the input image captured by one of the cameras. Therefore, even if the image quality of one input image is degraded, a stereo image can be generated by using the other input image, whose quality is good. That is, a stereo image of good quality can be generated even when only the input image captured by one camera is blurred because of a difference in autofocus operation between the first camera 21 and the second camera 22, a difference in the amount of camera shake affecting each camera, dirt on the lens 21a or the lens 22a from the user touching it unintentionally, or the like.

FIG. 2 is a diagram showing a specific configuration example of the imaging unit 2 shown in FIG. 1. More specifically, FIG. 2 shows an example of the imaging unit 2 made up of lenses 21a and 22a with identical basic specifications. In this imaging unit 2, either lens may be provided with an optical zoom function.

In the image processing method according to the present embodiment, it suffices that the cameras view the same subject from different line-of-sight directions (viewpoints), so the arrangement of the lenses 21a and 22a in the imaging unit 2 (vertical or horizontal) can be set arbitrarily. That is, the lenses may be arranged vertically for imaging (vertical stereo) as shown in FIG. 2(a), or horizontally (horizontal stereo) as shown in FIG. 2(b).

In the image processing method according to the present embodiment, input image 1 and input image 2 do not necessarily have to be acquired at the same time. That is, as long as the positional relationship of the imaging unit 2 to the subject is substantially the same at the imaging timings for input image 1 and input image 2, the two input images may be acquired at different timings. The image processing method according to the present embodiment can also generate stereo images for stereoscopic display of moving images, not only still images. In that case, a series of images can be acquired for each camera by capturing the subject continuously in time while keeping the first camera 21 and the second camera 22 synchronized. In the image processing method according to the present embodiment, the input images may be color images or monochrome images.

Referring to FIG. 1 again, the image processing unit 3 applies the image processing method according to the present embodiment to the pair of input images acquired by the imaging unit 2 to generate a stereo image (a left-eye image and a right-eye image) for stereoscopic display of the subject. More specifically, the image processing unit 3 includes an edge extraction unit 30, a corresponding point search unit 31, a distance image generation unit 32, a smoothing processing unit 33, a parallax adjustment unit 34, and a 3D image generation unit 35.

The edge extraction unit 30 acquires the frequency characteristics of the pair of input images (input image 1 and input image 2). More specifically, the edge extraction unit 30 calculates the edge amount contained in each input image. A larger calculated edge amount indicates that the input image contains more high-frequency components. That is, the input image with relatively good image quality is the one containing more high-frequency components.

In the present embodiment, an input image containing more high-frequency components is determined to have relatively good image quality, so an input image with a larger edge amount is determined to have higher image quality. This relies on the fact that a blurred input image looks soft overall, so it has a small edge amount and its frequency components shift toward the low-frequency side.
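The comparison described above can be sketched as follows. This is a minimal illustration, not the edge extraction process of the embodiment: the functions `edge_amount` and `pick_sharper` are hypothetical names, images are assumed to be lists of rows of grayscale values, and the edge amount is assumed to be the sum of squared differences between neighbouring pixels.

```python
def edge_amount(image):
    """Sum of squared differences between horizontally and vertically
    neighbouring pixels (an assumed edge measure, chosen for illustration)."""
    height = len(image)
    width = len(image[0])
    total = 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                total += (image[y][x + 1] - image[y][x]) ** 2
            if y + 1 < height:
                total += (image[y + 1][x] - image[y][x]) ** 2
    return total


def pick_sharper(image1, image2):
    """Return 1 or 2: the input image with the larger edge amount."""
    return 1 if edge_amount(image1) >= edge_amount(image2) else 2


# A sharp step edge and the same edge spread over several pixels.
sharp = [[0, 0, 255, 255],
         [0, 0, 255, 255]]
blurred = [[0, 85, 170, 255],
           [0, 85, 170, 255]]

print(edge_amount(sharp), edge_amount(blurred), pick_sharper(sharp, blurred))
```

A squared difference is used here because a plain sum of absolute differences gives the same total for a sharp step and for a gradual ramp between the same two levels, and so could not tell the two images apart.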

As a way of acquiring the frequency characteristics of an input image, instead of extracting the edge amount contained in the input image, any of various frequency analyses may be performed on the input image and the image quality of that input image evaluated based on the result. That is, the frequency characteristics are typically determined from the edge amount or by frequency analysis.

The corresponding point search unit 31 performs a corresponding point search on the pair of input images (input image 1 and input image 2). Typical methods usable for this search include the POC (Phase-Only Correlation) method, the SAD (Sum of Absolute Difference) method, the SSD (Sum of Squared Difference) method, and the NCC (Normalized Cross Correlation) method. That is, the corresponding point search unit 31 searches for the correspondence of each point of the subject between input image 1 and input image 2.
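Of the methods listed above, SAD is the simplest to illustrate. The following sketch assumes a horizontal camera arrangement, so that a point in input image 1 is searched for along the same row of input image 2; the functions `sad` and `find_disparity`, the window size, and the search range are hypothetical choices for this example.

```python
def sad(image1, image2, x1, x2, y, half):
    """Sum of absolute differences between two (2*half+1)-square windows
    centred on (x1, y) in image1 and (x2, y) in image2."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(image1[y + dy][x1 + dx] - image2[y + dy][x2 + dx])
    return total


def find_disparity(image1, image2, x, y, max_disparity, half=1):
    """Return the shift d in 0..max_disparity that minimises SAD, so that
    pixel (x, y) of image1 corresponds to pixel (x - d, y) of image2."""
    width = len(image2[0])
    best_d, best_cost = 0, None
    for d in range(max_disparity + 1):
        x2 = x - d
        if x2 - half < 0 or x2 + half >= width:
            continue  # window would leave the image
        cost = sad(image1, image2, x, x2, y, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


row = [10, 20, 30, 200, 50, 60, 70, 80]
image1 = [row[:] for _ in range(3)]
# image2 shows the same scene shifted two pixels to the left.
image2 = [row[2:] + [0, 0] for _ in range(3)]

print(find_disparity(image1, image2, x=3, y=1, max_disparity=3))
```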

The distance image generation unit 32 acquires distance information for the two input images. This distance information is calculated from the differences in how the same subject appears in each image. Typically, the distance image generation unit 32 calculates the distance information from the correspondence between the input images found by the corresponding point search unit 31 for each point of the subject. The imaging unit 2 captures the subject from different viewpoints. Therefore, between the two input images, the pixel representing a given point of the subject (a point of interest) is displaced by an amount that depends on the distance between the imaging unit 2 and that point. In this specification, the difference between the image coordinates of the pixel corresponding to the point of interest in input image 1 and the image coordinates of the pixel corresponding to the same point in input image 2 is referred to as “parallax”. The distance image generation unit 32 calculates the parallax for each point of interest of the subject found by the corresponding point search unit 31.

This parallax is an index value indicating the distance from the imaging unit 2 to the corresponding point of interest on the subject. A larger parallax means a shorter distance from the imaging unit 2 to that point, that is, the point is closer to the imaging unit 2. In this specification, the term “distance information” is used collectively for the parallax and for the distance from the imaging unit 2 to each point of the subject that the parallax indicates.
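The inverse relationship between parallax and distance can be made concrete with the standard pinhole stereo model. The symbols below are assumptions introduced for illustration and do not appear in this description: two parallel cameras with focal length $f$ (in pixels) separated by a baseline $B$, observing a point at distance $Z$ with parallax $d$ (in pixels).

```latex
d = \frac{f\,B}{Z}
\qquad\Longleftrightarrow\qquad
Z = \frac{f\,B}{d}
```

Under this idealized model, doubling the parallax halves the distance, which is consistent with a larger parallax indicating a point closer to the imaging unit 2.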

The direction in which parallax arises between the input images depends on the positional relationship between the first camera 21 and the second camera 22 in the imaging unit 2. For example, when the first camera 21 and the second camera 22 are arranged a predetermined distance apart in the vertical direction, the parallax between input image 1 and input image 2 arises in the vertical direction.

The distance image generation unit 32 calculates the parallax for each point of the subject as distance information, and generates a distance image (parallax image) in which each calculated value is associated with its coordinates in the image coordinate system.

The smoothing processing unit 33 applies smoothing to the distance image generated by the distance image generation unit 32.
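A minimal sketch of smoothing a distance image with an averaging filter is shown below. The function name `smooth`, the 3x3 window, and the handling of image borders (averaging only over pixels inside the image) are assumptions for illustration; the filter actually used in the embodiment may differ.

```python
def smooth(distance_image, half=1):
    """Replace each value by the mean of its (2*half+1)-square neighbourhood,
    clipped at the image borders."""
    height, width = len(distance_image), len(distance_image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            total, count = 0.0, 0
            for yy in range(max(0, y - half), min(height, y + half + 1)):
                for xx in range(max(0, x - half), min(width, x + half + 1)):
                    total += distance_image[yy][xx]
                    count += 1
            row.append(total / count)
        result.append(row)
    return result


# A flat parallax map with one outlier, as a mismatch in the search might produce.
noisy = [[4, 4, 4],
         [4, 13, 4],
         [4, 4, 4]]
smoothed = smooth(noisy)
print(smoothed[1][1])
```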

The parallax adjustment unit 34 adjusts the generated distance image so that it fits within the allowable parallax range (the range from the maximum pop-out position to the maximum depth position). Details of the adjustment performed by the parallax adjustment unit 34 are described later.
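One simple way to fit a distance image into an allowable range is a linear rescaling, sketched below. This is an assumed method chosen for illustration, not necessarily the adjustment used by the parallax adjustment unit 34; the function name `adjust_parallax` and its parameters are hypothetical.

```python
def adjust_parallax(distance_image, allowed_min, allowed_max):
    """Linearly map the parallax values so that the smallest becomes
    allowed_min and the largest becomes allowed_max."""
    values = [v for row in distance_image for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat image carries no depth variation: place it mid-range.
        mid = (allowed_min + allowed_max) / 2
        return [[mid for _ in row] for row in distance_image]
    scale = (allowed_max - allowed_min) / (hi - lo)
    return [[allowed_min + (v - lo) * scale for v in row]
            for row in distance_image]


adjusted = adjust_parallax([[0, 10], [20, 40]], allowed_min=-5, allowed_max=5)
print(adjusted)
```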

３Ｄ画像生成部３５は、視差調整部３４による調整後の距離画像に基づいて、入力画像を構成する各画素を対応する距離情報（画素数）だけずらすことで、被写体を立体視表示するためのステレオ画像（左眼用画像および右眼用画像）を生成する。このように、距離情報に基づいて、入力画像に含まれる画素を横方向にずらすことで被写体を立体視表示するためのステレオ画像が生成される。左眼用画像と右眼用画像との間について見れば、被写体の各点は、距離画像によって示される距離情報（画素数）に応じた距離だけ離れて、すなわち距離情報（画素数）に応じた視差が与えられて表現される。これにより、被写体を立体視表示することができる。 Based on the distance image adjusted by the parallax adjustment unit 34, the 3D image generation unit 35 shifts each pixel of the input image by the corresponding distance information (number of pixels), thereby generating a stereo image (a left-eye image and a right-eye image) for stereoscopic display of the subject. In this manner, a stereo image for stereoscopically displaying the subject is generated by shifting the pixels of the input image in the horizontal direction based on the distance information. Between the left-eye image and the right-eye image, each point of the subject is separated by a distance corresponding to the distance information (number of pixels) indicated by the distance image; that is, each point is rendered with a parallax corresponding to the distance information (number of pixels). As a result, the subject can be stereoscopically displayed.
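
The pixel shifting described above can be sketched as follows. This is a minimal illustration, not the implementation of the 3D image generation unit 35. The function names, the shift direction, the use of `None` for unfilled pixels, and the rule that later pixels overwrite earlier ones on collision are all assumptions made for this example.

```python
def shift_row(row, disparities, fill=None):
    """Shift each pixel of one image row horizontally by its own disparity.

    Pixels shifted outside the row are dropped; positions that receive no
    pixel keep `fill` (hole filling is outside the scope of this sketch).
    """
    width = len(row)
    out = [fill] * width
    for x, (value, d) in enumerate(zip(row, disparities)):
        target_x = x + d
        if 0 <= target_x < width:
            out[target_x] = value  # later pixels overwrite earlier ones
    return out


def make_stereo_pair(image, distance_image):
    """Use the input image as one eye's view and synthesize the other.

    `image` and `distance_image` are lists of rows of equal size; each
    distance value is a horizontal shift expressed in pixels.
    """
    left = [list(row) for row in image]
    right = [shift_row(row, d_row) for row, d_row in zip(image, distance_image)]
    return left, right
```

Calling `make_stereo_pair(image, distance_image)` returns the unmodified image together with a shifted copy, which corresponds to the variant in which one input image is used as it is for one eye.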

本実施の形態においては、３Ｄ画像生成部３５は、エッジ抽出部３０などによって取得された周波数特性に基づいていずれの入力画像の画質が相対的によいかを判断し、そして、相対的に画質がよいと判断された入力画像を主体的に用いて、入力画像１および２から、被写体を立体視表示するためのステレオ画像を生成する。すなわち、本実施の形態においては、少なくとも２つの撮像系により撮像されたそれぞれの入力画像に対し、それぞれの画像についての周波数特性を検出し、その検出結果に基づいて、画質のよい方を優先的に用いてステレオ画像を生成する。 In the present embodiment, the 3D image generation unit 35 determines which input image has relatively better image quality based on the frequency characteristics acquired by the edge extraction unit 30 and the like, and then generates a stereo image for stereoscopically displaying the subject from the input images 1 and 2, mainly using the input image determined to have relatively better image quality. In other words, in the present embodiment, the frequency characteristic of each input image captured by at least two imaging systems is detected, and, based on the detection result, the image with the better quality is used preferentially to generate the stereo image.

例えば、３Ｄ画像生成部３５は、入力画像１の画質が相対的によい場合には、入力画像１をそれぞれの画素について対応する距離情報（画素数）だけ横方向にずらした一対の画像（左眼用画像および右眼用画像）を生成する。なお、入力画像１をそのまま左眼用画像または右眼用画像として用いるとともに、入力画像１の各画素を対応する距離情報（画素数）だけ横方向にずらすことで他方の右眼用画像または左眼用画像を生成してもよい。 For example, when the image quality of the input image 1 is relatively better, the 3D image generation unit 35 generates a pair of images (a left-eye image and a right-eye image) in which each pixel of the input image 1 is shifted in the horizontal direction by the corresponding distance information (number of pixels). Alternatively, the input image 1 may be used as it is as the left-eye image or the right-eye image, and the other image (the right-eye image or the left-eye image) may be generated by shifting each pixel of the input image 1 in the horizontal direction by the corresponding distance information (number of pixels).

これに対して、３Ｄ画像生成部３５は、入力画像２の画質が相対的によい場合には、入力画像２をそれぞれの画素について対応する距離情報（画素数）だけ横方向にずらした一対の画像（左眼用画像および右眼用画像）を生成する。上述と同様に、入力画像２をそのまま左眼用画像または右眼用画像として用いるとともに、入力画像２の各画素を対応する距離情報（画素数）だけ横方向にずらすことで他方の右眼用画像または左眼用画像を生成してもよい。 On the other hand, when the image quality of the input image 2 is relatively better, the 3D image generation unit 35 generates a pair of images (a left-eye image and a right-eye image) in which each pixel of the input image 2 is shifted in the horizontal direction by the corresponding distance information (number of pixels). As above, the input image 2 may be used as it is as the left-eye image or the right-eye image, and the other image (the right-eye image or the left-eye image) may be generated by shifting each pixel of the input image 2 in the horizontal direction by the corresponding distance information (number of pixels).

このとき、相対的に画質が悪い入力画像については、距離画像の生成にのみ利用されるので、ステレオ画像には、一方の入力画像における画質の劣化部分が含まれることはない。これにより、一方の入力画像にぼけなどの欠陥が存在する場合であっても、立体視表示の品質を維持できる。 At this time, since an input image with relatively poor image quality is used only for generating a distance image, the stereo image does not include a deteriorated image quality portion of one input image. Thereby, even if a defect such as blur exists in one input image, the quality of the stereoscopic display can be maintained.

３Ｄ画像出力部４は、画像処理部３によって生成されるステレオ画像（左眼用画像および右眼用画像）を表示デバイスなどへ出力する。 The 3D image output unit 4 outputs the stereo image (left-eye image and right-eye image) generated by the image processing unit 3 to a display device or the like.

各部の処理動作の詳細については、後述する。 Details of the processing operation of each unit will be described later.
図１に示す画像処理システム１は、各部を独立に構成することもできるが、汎用的には、以下に説明するデジタルカメラやパーソナルコンピューターなどとして具現化される場合が多い。そこで、本実施の形態に従う画像処理システム１の具現化例について説明する。 Although each unit of the image processing system 1 shown in FIG. 1 can be configured independently, the system is generally embodied as a digital camera, a personal computer, or the like, as described below. Implementation examples of the image processing system 1 according to the present embodiment are therefore described.

（ｂ２：具現化例１） (b2: Implementation example 1)
図３は、図１に示す画像処理システム１を具現化したデジタルカメラ１００の構成を示すブロック図である。図３に示すデジタルカメラ１００は、２つのカメラ（第１カメラ１２１および第２カメラ１２２）を搭載しており、被写体を立体視表示するためのステレオ画像を撮像することができる。図３において、図１に示す画像処理システム１を構成するそれぞれのブロックに対応するコンポーネントには、図１と同一の参照符号を付している。 FIG. 3 is a block diagram showing the configuration of a digital camera 100 that embodies the image processing system 1 shown in FIG. 1. The digital camera 100 shown in FIG. 3 includes two cameras (a first camera 121 and a second camera 122) and can capture a stereo image for stereoscopically displaying a subject. In FIG. 3, components corresponding to the respective blocks constituting the image processing system 1 shown in FIG. 1 are denoted by the same reference numerals as in FIG. 1.

デジタルカメラ１００では、第１カメラ１２１で被写体を撮像することで取得される入力画像が記憶および出力され、第２カメラ１２２で当該被写体を撮像することで取得される入力画像については、主として、上述の対応点探索処理および距離画像生成処理に用いられる。そのため、第１カメラ１２１についてのみ光学ズーム機能が搭載されているとする。 In the digital camera 100, the input image acquired by imaging a subject with the first camera 121 is stored and output, while the input image acquired by imaging the same subject with the second camera 122 is used mainly for the corresponding point search processing and the distance image generation processing described above. For this reason, it is assumed that only the first camera 121 has an optical zoom function.

図３を参照して、デジタルカメラ１００は、ＣＰＵ（Central Processing Unit）１０２と、デジタル処理回路１０４と、画像表示部１０８と、カードインターフェイス（Ｉ／Ｆ）１１０と、記憶部１１２と、ズーム機構１１４と、加速度センサー１１６と、第１カメラ１２１と、第２カメラ１２２とを含む。 Referring to FIG. 3, a digital camera 100 includes a CPU (Central Processing Unit) 102, a digital processing circuit 104, an image display unit 108, a card interface (I / F) 110, a storage unit 112, and a zoom mechanism. 114, an acceleration sensor 116, a first camera 121, and a second camera 122.

ＣＰＵ１０２は、予め格納されたプログラム（画像処理プログラムを含む）などを実行することで、デジタルカメラ１００の全体を制御する。デジタル処理回路１０４は、本実施の形態に従う画像処理を含む各種のデジタル処理を実行する。デジタル処理回路１０４は、典型的には、ＤＳＰ（Digital Signal Processor）、ＡＳＩＣ（Application Specific Integrated Circuit）、ＬＳＩ（Large Scale Integration）、ＦＰＧＡ（Field-Programmable Gate Array）などによって構成される。このデジタル処理回路１０４は、図１に示す画像処理部３が提供する機能を実現するための画像処理回路１０６を含む。 The CPU 102 controls the entire digital camera 100 by executing a program (including an image processing program) stored in advance. The digital processing circuit 104 executes various digital processes including image processing according to the present embodiment. The digital processing circuit 104 is typically configured by a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an LSI (Large Scale Integration), an FPGA (Field-Programmable Gate Array), or the like. The digital processing circuit 104 includes an image processing circuit 106 for realizing the functions provided by the image processing unit 3 shown in FIG.

画像表示部１０８は、第１カメラ１２１および／または第２カメラ１２２により提供される画像、デジタル処理回路１０４（画像処理回路１０６）によって生成される画像、デジタルカメラ１００に係る各種設定情報、および、制御用ＧＵＩ（Graphical User Interface）画面などを表示する。画像表示部１０８は、画像処理回路１０６によって生成されるステレオ画像を用いて、被写体を立体視表示できることが好ましい。この場合、画像表示部１０８は、３次元表示方式に対応した任意の表示デバイス（３次元表示用の液晶表示装置）によって構成される。このような３次元表示方式としては、パララックスバリア方式などを採用することができる。このパララックスバリア方式では、液晶表示面にパララックスバリアを設けることで、ユーザーの右眼で右眼用画像を視認させ、ユーザーの左眼で左眼用画像を視認させることができる。あるいは、シャッタメガネ方式を採用してもよい。このシャッタメガネ方式では、左眼用画像および右眼用画像を交互に高速で切り替えて表示するとともに、この画像の切り替えに同期して開閉するシャッターが搭載された専用メガネをユーザーが装着することで、立体視表示を楽しむことができる。 The image display unit 108 displays images provided by the first camera 121 and/or the second camera 122, images generated by the digital processing circuit 104 (image processing circuit 106), various setting information related to the digital camera 100, a control GUI (Graphical User Interface) screen, and the like. It is preferable that the image display unit 108 can stereoscopically display the subject using a stereo image generated by the image processing circuit 106. In this case, the image display unit 108 is configured by an arbitrary display device that supports a three-dimensional display method (a liquid crystal display device for three-dimensional display). As such a three-dimensional display method, a parallax barrier method or the like can be employed. In the parallax barrier method, a parallax barrier provided on the liquid crystal display surface lets the user's right eye see the right-eye image and the user's left eye see the left-eye image. Alternatively, a shutter glasses method may be adopted. In the shutter glasses method, the left-eye image and the right-eye image are displayed alternately at high speed, and the user can enjoy stereoscopic display by wearing dedicated glasses equipped with shutters that open and close in synchronization with the image switching.

カードインターフェイス（Ｉ／Ｆ）１１０は、画像処理回路１０６によって生成された画像データを記憶部１１２へ書き込み、あるいは、記憶部１１２から画像データなどを読み出すためのインターフェイスである。記憶部１１２は、画像処理回路１０６によって生成された画像データや各種情報（デジタルカメラ１００の制御パラメータや動作モードなどの設定値）を格納する記憶デバイスである。この記憶部１１２は、フラッシュメモリ、光学ディスク、磁気ディスクなどからなり、データを不揮発的に記憶する。 A card interface (I / F) 110 is an interface for writing image data generated by the image processing circuit 106 to the storage unit 112 or reading image data from the storage unit 112. The storage unit 112 is a storage device that stores image data generated by the image processing circuit 106 and various information (setting values such as control parameters and operation modes of the digital camera 100). The storage unit 112 includes a flash memory, an optical disk, a magnetic disk, and the like, and stores data in a nonvolatile manner.

ズーム機構１１４は、ユーザー操作などに応じて、第１カメラ１２１の撮像倍率を変更する機構である。ズーム機構１１４は、典型的には、サーボモーターなどを含み、第１カメラ１２１を構成するレンズ群を駆動することで、焦点距離を変化させる。 The zoom mechanism 114 is a mechanism that changes the imaging magnification of the first camera 121 according to a user operation or the like. The zoom mechanism 114 typically includes a servo motor and the like, and drives the lens group constituting the first camera 121 to change the focal length.

加速度センサー１１６は、重力加速度を検出することで、デジタルカメラ１００の姿勢を判断する。 The acceleration sensor 116 determines the attitude of the digital camera 100 by detecting gravitational acceleration.

第１カメラ１２１は、被写体を撮像することでステレオ画像を生成するための入力画像を生成する。第１カメラ１２１は、ズーム機構１１４によって駆動される複数のレンズ群からなる。第２カメラ１２２は、後述するような対応点探索処理や距離画像生成処理に用いられ、第１カメラ１２１によって撮像される同一の被写体を別の視点から撮像する。 The first camera 121 generates an input image for generating a stereo image by imaging a subject. The first camera 121 includes a plurality of lens groups that are driven by the zoom mechanism 114. The second camera 122 is used for corresponding point search processing and distance image generation processing, which will be described later, and images the same subject imaged by the first camera 121 from different viewpoints.

このように、図３に示すデジタルカメラ１００は、本実施の形態に従う画像処理システム１の全体を単体の装置として実装したものである。すなわち、ユーザーは、デジタルカメラ１００を用いて被写体を撮像することで、画像表示部１０８において当該被写体を立体的に視認することができる。 As described above, the digital camera 100 shown in FIG. 3 is obtained by mounting the entire image processing system 1 according to the present embodiment as a single device. That is, the user can visually recognize the subject in a three-dimensional manner on the image display unit 108 by imaging the subject using the digital camera 100.

（ｂ３：具現化例２） (b3: Implementation example 2)
図４は、図１に示す画像処理システム１を具現化したパーソナルコンピューター２００の構成を示すブロック図である。図４に示すパーソナルコンピューター２００では、一対の入力画像を取得するための撮像部２が搭載されておらず、任意の撮像部２によって取得された一対の入力画像（入力画像１および入力画像２）が外部から入力される構成となっている。このような構成であっても、実施の形態に従う画像処理システム１に含まれ得る。なお、図４においても、図１に示す画像処理システム１を構成するそれぞれのブロックに対応するコンポーネントには、図１と同一の参照符号を付している。 FIG. 4 is a block diagram showing the configuration of a personal computer 200 that embodies the image processing system 1 shown in FIG. 1. The personal computer 200 shown in FIG. 4 is not equipped with the imaging unit 2 for acquiring a pair of input images; instead, a pair of input images (an input image 1 and an input image 2) acquired by an arbitrary imaging unit 2 is input from the outside. Even such a configuration can be included in the image processing system 1 according to the embodiment. In FIG. 4 as well, components corresponding to the respective blocks constituting the image processing system 1 shown in FIG. 1 are denoted by the same reference numerals as in FIG. 1.

図４を参照して、パーソナルコンピューター２００は、パーソナルコンピューター本体２０２と、モニター２０６と、マウス２０８と、キーボード２１０と、外部記憶装置２１２とを含む。 Referring to FIG. 4, personal computer 200 includes a personal computer main body 202, a monitor 206, a mouse 208, a keyboard 210, and an external storage device 212.

パーソナルコンピューター本体２０２は、典型的には、汎用的なアーキテクチャーに従う汎用コンピューターであり、基本的な構成要素として、ＣＰＵ、ＲＡＭ（Random Access Memory）、ＲＯＭ（Read Only Memory）などを含む。パーソナルコンピューター本体２０２は、図１に示す画像処理部３が提供する機能を実現するための画像処理プログラム２０４が実行可能になっている。このような画像処理プログラム２０４は、ＣＤ−ＲＯＭ（Compact Disk-Read Only Memory）などの記憶媒体に格納されて流通し、あるいは、ネットワークを介してサーバー装置から配信される。そして、画像処理プログラム２０４は、パーソナルコンピューター本体２０２のハードディスクなどの記憶領域内に格納される。 The personal computer main body 202 is typically a general-purpose computer according to a general-purpose architecture, and includes a CPU, a RAM (Random Access Memory), a ROM (Read Only Memory), and the like as basic components. The personal computer main body 202 can execute an image processing program 204 for realizing a function provided by the image processing unit 3 shown in FIG. Such an image processing program 204 is stored in a storage medium such as a CD-ROM (Compact Disk-Read Only Memory) and distributed, or distributed from a server device via a network. The image processing program 204 is stored in a storage area such as a hard disk of the personal computer main body 202.

このような画像処理プログラム２０４は、パーソナルコンピューター本体２０２で実行されるオペレーティングシステム（ＯＳ）の一部として提供されるプログラムモジュールのうち必要なモジュールを、所定のタイミングおよび順序で呼出して処理を実現するように構成されてもよい。この場合、画像処理プログラム２０４自体には、ＯＳが提供するモジュールは含まれず、ＯＳと協働して画像処理が実現される。また、画像処理プログラム２０４は、単体のプログラムではなく、何らかのプログラムの一部に組込まれて提供されてもよい。このような場合にも、画像処理プログラム２０４自体には、当該何らかのプログラムにおいて共通に利用されるようなモジュールは含まれず、当該何らかのプログラムと協働して画像処理が実現される。このような一部のモジュールを含まない画像処理プログラム２０４であっても、本実施の形態に従う画像処理システム１の趣旨を逸脱するものではない。 Such an image processing program 204 implements processing by calling necessary modules among program modules provided as part of an operating system (OS) executed by the personal computer main body 202 at a predetermined timing and order. It may be configured as follows. In this case, the image processing program 204 itself does not include a module provided by the OS, and image processing is realized in cooperation with the OS. Further, the image processing program 204 may be provided by being incorporated in a part of some program instead of a single program. Even in such a case, the image processing program 204 itself does not include a module that is commonly used in the program, and image processing is realized in cooperation with the program. Even such an image processing program 204 that does not include some modules does not depart from the spirit of the image processing system 1 according to the present embodiment.

もちろん、画像処理プログラム２０４によって提供される機能の一部または全部を専用のハードウェアによって実現してもよい。 Of course, part or all of the functions provided by the image processing program 204 may be realized by dedicated hardware.

モニター２０６は、オペレーティングシステム（ＯＳ）が提供するＧＵＩ画面、画像処理プログラム２０４によって生成される画像などを表示する。モニター２０６は、図３に示す画像表示部１０８と同様に、画像処理プログラム２０４によって生成されるステレオ画像を用いて、被写体を立体視表示できることが好ましい。この場合、モニター２０６としては、画像表示部１０８において説明したのと同様に、パララックスバリア方式やシャッタメガネ方式などの表示デバイスによって構成される。 The monitor 206 displays a GUI screen provided by an operating system (OS), an image generated by the image processing program 204, and the like. As with the image display unit 108 shown in FIG. 3, the monitor 206 is preferably capable of stereoscopically displaying a subject using a stereo image generated by the image processing program 204. In this case, the monitor 206 is configured by a display device such as a parallax barrier method or a shutter glasses method, as described in the image display unit 108.

マウス２０８およびキーボード２１０は、それぞれユーザー操作を受付け、その受付けたユーザー操作の内容をパーソナルコンピューター本体２０２へ出力する。 The mouse 208 and the keyboard 210 each accept a user operation and output the contents of the accepted user operation to the personal computer main body 202.

外部記憶装置２１２は、何らかの方法で取得された一対の入力画像（入力画像１および入力画像２）を格納しており、この一対の入力画像をパーソナルコンピューター本体２０２へ出力する。外部記憶装置２１２としては、フラッシュメモリ、光学ディスク、磁気ディスクなどのデータを不揮発的に記憶するデバイスが用いられる。 The external storage device 212 stores a pair of input images (an input image 1 and an input image 2) obtained by some method, and outputs the pair of input images to the personal computer main body 202. As the external storage device 212, a device that stores data in a nonvolatile manner such as a flash memory, an optical disk, or a magnetic disk is used.

このように、図４に示すパーソナルコンピューター２００は、本実施の形態に従う画像処理システム１の一部を単体の装置として実装したものである。このようなパーソナルコンピューター２００を用いることで、ユーザーは、任意の撮像部（ステレオカメラ）を用いて異なる視点で被写体を撮像することで取得された一対の入力画像から、当該被写体を立体視表示するためのステレオ画像（左眼用画像および右眼用画像）を生成することができる。さらに、この生成したステレオ画像をモニター２０６で表示することで、立体視表示を楽しむこともできる。 As described above, personal computer 200 shown in FIG. 4 is obtained by mounting a part of image processing system 1 according to the present embodiment as a single device. By using such a personal computer 200, the user stereoscopically displays the subject from a pair of input images acquired by imaging the subject from different viewpoints using an arbitrary imaging unit (stereo camera). Therefore, a stereo image (a left-eye image and a right-eye image) can be generated. Furthermore, by displaying the generated stereo image on the monitor 206, stereoscopic display can be enjoyed.

（ｂ４：具現化例３） (b4: Implementation example 3)
図５は、図１に示す画像処理システム１を具現化した携帯電話３００の外観図である。図５を参照して、レンズ２１ａおよび撮像素子２１ｂからなる撮像部２は、携帯電話３００の一方面に実装されることになる。なお、携帯電話３００のステレオ画像生成に係る構成は、図３に示すデジタルカメラ１００の構成と同様であるので、詳細な説明は繰り返さない。 FIG. 5 is an external view of a mobile phone 300 that embodies the image processing system 1 shown in FIG. 1. Referring to FIG. 5, the imaging unit 2 including a lens 21a and an imaging element 21b is mounted on one surface of the mobile phone 300. Note that the configuration of the mobile phone 300 relating to stereo image generation is the same as that of the digital camera 100 shown in FIG. 3, so the detailed description will not be repeated.

＜Ｃ．第１の実施の形態（縦ステレオ）＞ <C. First Embodiment (Vertical Stereo)>
まず、第１の実施の形態に従う画像処理方法として、上述した図２（ａ）に示すように、同一種類のレンズ（光学ズーム機能無し）を縦方向に所定間隔だけ離して２つ配置した構成（ステレオカメラ）を想定する。 First, as the image processing method according to the first embodiment, a configuration (stereo camera) is assumed in which two lenses of the same type (without an optical zoom function) are arranged a predetermined interval apart in the vertical direction, as shown in FIG. 2(a) described above.

上述したように、市販されているステレオ画像を生成可能なデジタルカメラや携帯電話などは、撮像された画像をそのままステレオ画像として出力するため、片側だけぼけた画像が撮像され、立体視として見づらいことがある。これは、オートフォーカスがばらばらに動作することで合焦位置（ピント位置）が異なってしまったり、特に、携帯電話の場合にレンズ面がむき出しになっている（露出している）ため、片側のレンズを触って汚してしまったりすることに起因する。本実施の形態においては、このような場合であっても、高画質なステレオ画像を生成する。 As described above, commercially available digital cameras, mobile phones, and the like that can generate stereo images output the captured images as a stereo image without modification, so an image blurred on only one side may be captured and the result may be hard to view stereoscopically. This happens because the autofocus of each camera operates independently and the in-focus positions end up differing, or, particularly in the case of mobile phones, because the lens surface is exposed and one of the lenses gets touched and soiled. In the present embodiment, a high-quality stereo image is generated even in such cases.

図６は、第１の実施の形態に従う画像処理方法の手順を示す図である。図６を参照して、第１の実施の形態に従う画像処理方法は、第１カメラ２１および第２カメラ２２が同一の被写体を撮像することでそれぞれ取得される入力画像１および入力画像２に対して、それぞれエッジ抽出処理が実行される（ステップＳ１およびＳ２）。それぞれの入力画像から抽出されたエッジ量に基づいてエッジ量比較処理が実行される（ステップＳ３）。そして、このエッジ量比較処理の結果に基づいて、出力用画像決定処理が実行される（ステップＳ４）。この出力用画像決定処理によって、ステレオ画像を生成するために用いる入力画像が決定される。 FIG. 6 is a diagram showing the procedure of the image processing method according to the first embodiment. Referring to FIG. 6, in the image processing method according to the first embodiment, edge extraction processing is executed on each of the input image 1 and the input image 2, which are acquired by the first camera 21 and the second camera 22 imaging the same subject (steps S1 and S2). Edge amount comparison processing is then executed based on the edge amounts extracted from the respective input images (step S3). Based on the result of the edge amount comparison processing, output image determination processing is executed (step S4). This output image determination processing determines the input image to be used for generating the stereo image.

また、ステレオ画像を生成するために用いる入力画像が決定されると、対応点探索処理（ステップＳ１０Ａ，Ｓ１０Ｂ）、視差画像生成処理（ステップＳ１１Ａ，Ｓ１１Ｂ）、スムージング処理（ステップＳ１２Ａ，Ｓ１２Ｂ）、視差調整処理（ステップＳ１３Ａ，Ｓ１３Ｂ）、ステレオ画像生成処理（ステップＳ１４Ａ，Ｓ１４Ｂ）が順次実行され、ステレオ画像が生成される。以下、各ステップについて詳述する。 Once the input image to be used for generating the stereo image has been determined, corresponding point search processing (steps S10A, S10B), parallax image generation processing (steps S11A, S11B), smoothing processing (steps S12A, S12B), parallax adjustment processing (steps S13A, S13B), and stereo image generation processing (steps S14A, S14B) are executed in sequence to generate the stereo image. Each step is described in detail below.

（ｃ１：入力画像） (c1: Input image)
図７は、図１に示す撮像部２によって撮像された一対の入力画像の一例を示す図である。図７（ａ）は、第１カメラ２１によって撮像された入力画像１を示し、図７（ｂ）は、第２カメラ２２によって撮像された入力画像２を示す。図７（ａ）に示す入力画像１は、最終的に出力されるステレオ画像の一方の画像（この例では、左眼用画像）としてそのまま用いることができ、図７（ｂ）に示す入力画像２は、最終的に出力されるステレオ画像の他方の画像（この例では、右眼用画像）としてそのまま用いることができるものとする。但し、図７の例では、レンズを縦方向に所定間隔だけ離して２つ配置した構成において撮像したものであり、入力画像１および入力画像２の両方を出力しただけでは、ステレオ画像にはならない。 FIG. 7 is a diagram illustrating an example of a pair of input images captured by the imaging unit 2 shown in FIG. 1. FIG. 7(a) shows the input image 1 captured by the first camera 21, and FIG. 7(b) shows the input image 2 captured by the second camera 22. It is assumed that the input image 1 shown in FIG. 7(a) can be used as it is as one image of the finally output stereo image (in this example, the left-eye image), and that the input image 2 shown in FIG. 7(b) can be used as it is as the other image (in this example, the right-eye image). In the example of FIG. 7, however, the images were captured with two lenses arranged a predetermined interval apart in the vertical direction, so simply outputting both the input image 1 and the input image 2 does not yield a stereo image.

図７には、説明を容易にするために、画像座標系を便宜的に定義している。より具体的には、入力画像の横方向をＸ軸とし、入力画像の縦方向をＹ軸とする直交座標系を採用する。このＸ軸およびＹ軸の原点は、便宜上、入力画像の左上端であるとする。また、撮像部２（図１）の視線方向をＺ軸とする。この直交座標系は、本明細書中の他の図面についての説明においても利用する場合がある。 In FIG. 7, an image coordinate system is defined for convenience in order to facilitate the explanation. More specifically, an orthogonal coordinate system is employed in which the horizontal direction of the input image is the X axis and the vertical direction of the input image is the Y axis. The origin of the X and Y axes is assumed to be the upper left corner of the input image for convenience. Further, the line-of-sight direction of the imaging unit 2 (FIG. 1) is taken as the Z axis. This orthogonal coordinate system may be used in the description of other drawings in this specification.

上述したように、第１カメラ１２１および第２カメラ１２２を縦方向に配列した撮像部２を用いているので、図７（ａ）に示す入力画像１と図７（ｂ）に示す入力画像２との間には、Ｙ軸方向に沿って視差が生じている。 As described above, since the imaging unit 2 in which the first camera 121 and the second camera 122 are arranged in the vertical direction is used, parallax arises along the Y-axis direction between the input image 1 shown in FIG. 7(a) and the input image 2 shown in FIG. 7(b).

（ｃ２：エッジ抽出処理） (c2: Edge extraction processing)
まず、図６に示すエッジ抽出処理（ステップＳ１およびＳ２）の詳細について説明する。図８は、図６に示すエッジ抽出処理（ステップＳ１およびＳ２）の処理手順を示すフローチャートである。図９は、図８のスムージング処理（ステップＳ２）において用いられるエッジ抽出用の平均化フィルタの一例を示す図である。このエッジ抽出処理は、図１に示すエッジ抽出部３０によって実行される。このエッジ抽出処理では、入力画像１および入力画像２のそれぞれの周波数特性が取得される。 First, the details of the edge extraction processing (steps S1 and S2) shown in FIG. 6 will be described. FIG. 8 is a flowchart showing the processing procedure of the edge extraction processing (steps S1 and S2) shown in FIG. 6. FIG. 9 is a diagram illustrating an example of an averaging filter for edge extraction used in the smoothing process (step S2) of FIG. 8. This edge extraction processing is executed by the edge extraction unit 30 shown in FIG. 1. In this edge extraction processing, the frequency characteristics of the input image 1 and the input image 2 are acquired.

図８を参照して、エッジ抽出部３０は、入力画像に対してスムージング処理を行なう（ステップＳ１０１）。このスムージング処理では、図９に示すようなエッジ抽出用の平均化フィルタを用いて、入力画像の各画素についての平均化処理後の値が算出される。本実施の形態においては、１つの対象画素を中心とする９画素×９画素の大きさのスムージングフィルタが用いられる。平均化処理後の画素からなる画像をスムージング画像と称す。 Referring to FIG. 8, the edge extraction unit 30 performs a smoothing process on the input image (step S101). In this smoothing process, an averaging filter for edge extraction such as that shown in FIG. 9 is used to calculate the averaged value for each pixel of the input image. In the present embodiment, a smoothing filter of 9 pixels × 9 pixels centered on the target pixel is used. The image composed of the averaged pixels is referred to as a smoothed image.

続いて、エッジ抽出部３０は、元の入力画像とスムージング画像との間で差分処理を行なう（ステップＳ１０２）。より具体的には、エッジ抽出部３０は、元の入力画像とスムージング画像との間で、各画素について画素値の差分を算出し、算出した各画素の差分の絶対値をすべての画素について積算する。 Subsequently, the edge extraction unit 30 performs difference processing between the original input image and the smoothed image (step S102). More specifically, the edge extraction unit 30 calculates, for each pixel, the difference in pixel value between the original input image and the smoothed image, and sums the absolute values of these differences over all pixels.

入力画像がぼけている場合には、平均化フィルタを用いてスムージング処理を実行したとしても、元の入力画像からの変化が小さい。この原理を利用して、元の入力画像とスムージング画像との間の差分の絶対値が大きい方が、より多く高周波成分が含まれる入力画像であると判断できる。 When the input image is blurred, smoothing it with the averaging filter changes it only slightly from the original. Using this principle, the input image with the larger absolute difference between the original input image and the smoothed image can be judged to contain more high-frequency components.

最終的に、エッジ抽出部３０は、算出した差分の絶対値（エッジ量）を出力する（ステップＳ１０３）。このエッジ量が多いほど、より多くの高周波成分が含まれることを意味する。 Finally, the edge extraction unit 30 outputs the calculated absolute value of the difference (the edge amount) (step S103). The larger this edge amount, the more high-frequency components the image contains.
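
Steps S101 to S103 can be sketched as follows. This is an illustrative sketch rather than the actual implementation of the edge extraction unit 30. It assumes a grayscale image stored as a list of rows, a uniform (mean) 9 × 9 filter, and an averaging window that is clipped at the image border; the border handling and the function names are assumptions, since the text does not specify them.

```python
def box_smooth(image, size=9):
    """Step S101: mean filter of size x size centered on each pixel.

    The window is clipped at the image border (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    radius = size // 2
    smoothed = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            count = 0
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[yy][xx]
                    count += 1
            smoothed[y][x] = total / count
    return smoothed


def edge_amount(image, size=9):
    """Steps S102 and S103: sum of |original - smoothed| over all pixels."""
    smoothed = box_smooth(image, size)
    return sum(
        abs(pixel - smooth)
        for row, smooth_row in zip(image, smoothed)
        for pixel, smooth in zip(row, smooth_row)
    )
```

A perfectly flat image yields an edge amount of zero, and a sharp pattern yields a larger edge amount than a blurred copy of the same pattern, which is the property the comparison in step S3 relies on.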

（ｃ３：エッジ量比較処理および出力用画像決定処理） (c3: Edge amount comparison processing and output image determination processing)
続いて、図６に示すエッジ量比較処理（ステップＳ３）および出力用画像決定処理（ステップＳ４）の詳細について説明する。 Next, details of the edge amount comparison processing (step S3) and the output image determination processing (step S4) shown in FIG. 6 will be described.

エッジ量比較処理においては、入力画像１に対するエッジ抽出処理（ステップＳ１）によって算出されたエッジ量と、入力画像２に対するエッジ抽出処理（ステップＳ２）によって算出されたエッジ量とが比較される。すなわち、エッジ量比較処理は、エッジ量がより多い、すなわちより多くの高周波成分を含む入力画像がいずれであるかを判断する。このエッジ量の大小関係の判断により、より多くの高周波成分を含む入力画像を決定できる。 In the edge amount comparison process, the edge amount calculated by the edge extraction process (step S1) for the input image 1 is compared with the edge amount calculated by the edge extraction process (step S2) for the input image 2. That is, in the edge amount comparison process, it is determined which of the input images has a larger edge amount, that is, a higher frequency component. An input image including more high-frequency components can be determined by determining the size relationship of the edge amounts.

出力用画像決定処理（ステップＳ４）においては、エッジ量（差分の絶対値）がより大きい入力画像を出力用画像として選択する。図７に示す例では、入力画像１の方が、相対的に画質がよいと判断されたものとする。この出力用画像決定処理（ステップＳ４）による判断結果に応じて、入力画像１を主体的に用いてステレオ画像が生成するための処理（ステップＳ１０Ａ〜Ｓ１４Ａ）、または、入力画像２を主体的に用いてステレオ画像が生成するための処理（ステップＳ１０Ｂ〜Ｓ１４Ｂ）が実行される。 In the output image determination processing (step S4), the input image with the larger edge amount (absolute value of the difference) is selected as the output image. In the example shown in FIG. 7, it is assumed that the input image 1 has been determined to have relatively better image quality. Depending on the result of the output image determination processing (step S4), either the processing for generating a stereo image mainly using the input image 1 (steps S10A to S14A) or the processing for generating a stereo image mainly using the input image 2 (steps S10B to S14B) is executed.
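
The comparison and branching of steps S3 and S4 can be sketched as follows. The function names are hypothetical, and the tie-breaking rule (preferring the input image 1 when both edge amounts are equal) is an assumption, since the text does not say what happens in that case. The step labels are the ones used in FIG. 6.

```python
def choose_primary_image(edge_amount_1, edge_amount_2):
    """Steps S3 and S4: return 1 or 2, the input image to be used mainly.

    Ties go to the input image 1 (an assumption of this sketch).
    """
    return 1 if edge_amount_1 >= edge_amount_2 else 2


def steps_for(primary):
    """Return the labels of the processing steps executed for the chosen image."""
    suffix = "A" if primary == 1 else "B"
    return [f"S{number}{suffix}" for number in range(10, 15)]
```

For example, if the edge amount of the input image 2 is the larger one, `steps_for(choose_primary_image(e1, e2))` lists steps S10B through S14B.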

（ｃ４：対応点探索処理および距離画像生成処理） (c4: Corresponding point search processing and distance image generation processing)
次に、対応点探索処理（ステップＳ１０Ａ，Ｓ１０Ｂ）および距離画像生成処理（ステップＳ１１Ａ，Ｓ１１Ｂ）について説明する。なお、後述の説明においては、入力画像１を主体的に用いる場合について例示的に説明するが、入力画像２を主体的に用いる場合も同様の処理を行なうことができる。 Next, the corresponding point search processing (steps S10A, S10B) and the distance image generation processing (steps S11A, S11B) will be described. The following description takes the case where the input image 1 is mainly used as an example, but the same processing can be performed when the input image 2 is mainly used.

対応点探索処理においては、一対の入力画像（入力画像１および入力画像２）の間の位置関係の対応付けが探索される。この対応点探索処理は、図１に示す対応点探索部３１によって実行される。より具体的には、対応点探索処理では、一方の入力画像の注目点にそれぞれ対応する他方の入力画像の画素（座標値）を特定する。このような対応点探索処理は、ＰＯＣ演算法、ＳＡＤ演算法、ＳＳＤ演算法、ＮＣＣ演算法などを用いたマッチング処理が利用される。 In the corresponding point search process, the correspondence of the positional relationship between the pair of input images (input image 1 and input image 2) is searched. This corresponding point search processing is executed by the corresponding point search unit 31 shown in FIG. More specifically, in the corresponding point search process, the other input image pixel (coordinate value) corresponding to the target point of one input image is specified. For such a corresponding point search process, a matching process using a POC calculation method, an SAD calculation method, an SSD calculation method, an NCC calculation method, or the like is used.

対応点探索処理においては、一方の入力画像を基準画像に設定するとともに、他方の入力画像を参照画像に設定して、両画像間の対応付けが行なわれる。いずれの入力画像を主体的に用いるかに応じて、この基準画像に設定される入力画像が変更されることになる。 In the corresponding point search processing, one input image is set as the standard image and the other input image is set as the reference image, and the two images are associated with each other. Which input image is set as the standard image changes depending on which input image is mainly used.
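
As an illustration of the corresponding point search, the following sketch uses SAD (sum of absolute differences), one of the matching methods named above, and searches only along the Y axis, as would suit the vertically arranged cameras of this embodiment. It is a simplified sketch under those assumptions and not the method of the corresponding point search unit 31; the function names, the search range parameter, and the restriction to integer shifts are all introduced for this example.

```python
def get_block(image, top, left, size):
    """Cut a size x size block whose upper-left corner is (left, top)."""
    return [row[left:left + size] for row in image[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def find_vertical_parallax(standard, reference, top, left, size, max_shift):
    """Return the vertical shift dy with the smallest SAD cost.

    The block at (left, top) in the standard image is compared with blocks
    at (left, top + dy) in the reference image for dy in
    [-max_shift, max_shift]; candidates outside the image are skipped.
    """
    height = len(reference)
    target = get_block(standard, top, left, size)
    best_dy, best_cost = 0, None
    for dy in range(-max_shift, max_shift + 1):
        candidate_top = top + dy
        if candidate_top < 0 or candidate_top + size > height:
            continue
        cost = sad(target, get_block(reference, candidate_top, left, size))
        if best_cost is None or cost < best_cost:
            best_dy, best_cost = dy, cost
    return best_dy
```

Swapping the `standard` and `reference` arguments corresponds to changing which input image is mainly used.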

続いて、対応点探索処理によって特定された注目点と対応点との間の対応関係に基づいて、被写体の各点の座標に関連付けられた距離情報を示す距離画像を生成するための距離画像生成処理が実行される。この距離画像生成処理は、図１に示す距離画像生成部３２によって実行される。この距離画像生成処理では、注目点の各々について、入力画像１の画像座標系における当該注目点の座標と、入力画像２の画像座標系における対応点の座標との差（視差）が算出される。 Subsequently, distance image generation processing is executed to generate a distance image indicating the distance information associated with the coordinates of each point of the subject, based on the correspondence between the points of interest and the corresponding points identified by the corresponding point search processing. This distance image generation processing is executed by the distance image generation unit 32 shown in FIG. 1. In this processing, for each point of interest, the difference (parallax) between the coordinates of that point in the image coordinate system of the input image 1 and the coordinates of the corresponding point in the image coordinate system of the input image 2 is calculated.

算出される視差は、入力画像１を主体的に用いる場合には、対応する入力画像１の注目点の座標に関連付けて記憶され、入力画像２を主体的に用いる場合には、対応する入力画像２の注目点の座標に関連付けて記憶される。距離情報としては、対応点探索処理によって探索されたそれぞれの注目点について、入力画像１または入力画像２上の座標および対応する視差が関連付けられる。この距離情報を入力画像１または入力画像２の画素配列に対応付けて配列することで、入力画像１または入力画像２の画像座標系に対応して各点の視差を表す距離画像が生成される。 When the input image 1 is mainly used, the calculated parallax is stored in association with the coordinates of the corresponding point of interest in the input image 1; when the input image 2 is mainly used, it is stored in association with the coordinates of the corresponding point of interest in the input image 2. As the distance information, the coordinates on the input image 1 or the input image 2 and the corresponding parallax are associated with each point of interest found by the corresponding point search processing. By arranging this distance information in correspondence with the pixel array of the input image 1 or the input image 2, a distance image representing the parallax of each point is generated in the image coordinate system of the input image 1 or the input image 2.

なお、このような対応点探索処理および距離画像生成処理としては、特開２００８−２１６１２７号公報に記載された方法を採用してもよい。特開２００８−２１６１２７号公報には、サブピクセルの粒度で視差（距離情報）を算出するための方法が開示されているが、ピクセルの粒度で視差（距離情報）を算出するようにしてもよい。 The method described in Japanese Patent Laid-Open No. 2008-216127 may be adopted for the corresponding point search process and the distance image generation process. Japanese Patent Laid-Open No. 2008-216127 discloses a method for calculating parallax (distance information) with subpixel granularity, but the parallax (distance information) may instead be calculated with pixel granularity.

なお、入力画像がＲＧＢなどのカラー画像である場合には、グレイ画像に変換した後に対応点探索処理を行なってもよい。 When the input image is a color image such as RGB, the corresponding point search process may be performed after conversion to a gray image.

図１０は、第１の実施の形態に従う画像処理方法に従って図７に示す一対の入力画像から生成された距離画像の一例を示す図である。すなわち、図１０（ａ）には、図７（ａ）に示す入力画像１を基準画像に設定し、図７（ｂ）に示す入力画像２を参照画像に設定した上で、対応点探索処理を行なうことで得られた距離画像（視差画像）の一例を示す。なお、図１０（ｂ）には、後述するスムージング処理後の距離画像の一例を示す。図１０に示すように、入力画像１の各点の各点に関連付けられた視差（距離情報）の大きさは、対応する点の濃淡によって表現される。 FIG. 10 is a diagram showing an example of a distance image generated from the pair of input images shown in FIG. 7 according to the image processing method according to the first embodiment. That is, in FIG. 10 (a), the input image 1 shown in FIG. 7 (a) is set as a standard image, and the input image 2 shown in FIG. 7 (b) is set as a reference image. An example of a distance image (parallax image) obtained by performing is shown. In addition, in FIG.10 (b), an example of the distance image after the smoothing process mentioned later is shown. As shown in FIG. 10, the magnitude of the parallax (distance information) associated with each point of the input image 1 is expressed by the shade of the corresponding point.

上述した対応点探索処理および距離画像生成処理において、相関演算を行なうことで注目点およびその対応点を特定するので、所定の画素サイズを有する単位領域毎に対応点が探索される。図１０には、３２画素×３２画素の単位領域毎に対応点探索が実行された一例を示す。すなわち、図７に示す例では、Ｘ軸（横方向）およびＹ軸（縦方向）のいずれも３２画素間隔で規定された単位領域毎に対応点が探索され、その探索された対応点との間の距離が算出される。この探索された対応点との間の距離を示す距離画像は、入力画像の画素サイズと一致するように生成される。例えば、入力画像１が３４５６画素×２５９２画素のサイズを有している場合には、１０８点×８１点の探索点において距離が算出され、この算出されたそれぞれの距離から入力画像の画素サイズに対応する距離画像が生成される。 In the corresponding point search process and the distance image generation process described above, the attention point and its corresponding point are identified by a correlation operation, so a corresponding point is searched for each unit region having a predetermined pixel size. FIG. 10 shows an example in which the corresponding point search is performed for each unit region of 32 pixels × 32 pixels. That is, in the example shown in FIG. 7, a corresponding point is searched for each unit region defined at 32-pixel intervals along both the X axis (horizontal direction) and the Y axis (vertical direction), and the distance to each corresponding point found is calculated. The distance image indicating these distances is generated so as to match the pixel size of the input image. For example, when the input image 1 has a size of 3456 pixels × 2592 pixels, the distance is calculated at 108 × 81 search points, and a distance image corresponding to the pixel size of the input image is generated from the calculated distances.
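The arithmetic in the example above can be checked with a short sketch. The function names `search_grid` and `expand_to_pixels` are hypothetical, and the nearest-neighbour expansion is an assumption made only for illustration: the description states that a full-size distance image is generated from the sparse search points but does not say how.

```python
def search_grid(width, height, unit=32):
    """Number of search points along X and Y when one corresponding
    point is searched per unit x unit region."""
    return width // unit, height // unit


def expand_to_pixels(sparse, width, height, unit=32):
    """Expand a sparse disparity grid (one value per unit region) to the
    full image size. Nearest-neighbour copying is an assumption; the
    source does not specify the expansion method."""
    rows, cols = len(sparse), len(sparse[0])
    full = []
    for y in range(height):
        gy = min(y // unit, rows - 1)
        full.append([sparse[gy][min(x // unit, cols - 1)] for x in range(width)])
    return full


# The 3456 x 2592 example from the text gives 108 x 81 search points.
print(search_grid(3456, 2592))
```

Any interpolation that yields one disparity per input pixel would serve the same purpose; the subsequent smoothing step reduces the blockiness that nearest-neighbour copying would leave.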

なお、入力画像の最外周にある３２画素分の領域（探索ウィンドウ）については、対応点が存在しないと誤って判断される可能性があるため、対応点探索処理を行なわず、最も近接した位置にある画素の距離（視差）データで代用した。すなわち、入力画像の最外周にある３２画素分の領域については、最外周から３２画素だけ内側に入った位置にある画素の値を用いた。 Note that, for the 32-pixel-wide region (search window) along the outermost periphery of the input image, a corresponding point might be erroneously determined not to exist; therefore, the corresponding point search process was not performed there, and the distance (parallax) data of the nearest pixel was substituted. That is, for the 32-pixel region at the outermost periphery of the input image, the value of the pixel located 32 pixels inward from the outermost periphery was used.

（ｃ５：スムージング処理） (C5: smoothing process)
距離画像が取得されると、当該取得された距離画像に対して、スムージング処理（図６のステップＳ１２Ａ，Ｓ１２Ｂ）が実行される。このスムージング処理は、図１に示すスムージング処理部３３によって実行される。このスムージング処理では、距離画像の全体が平均化される。 When the distance image is acquired, smoothing processing (steps S12A and S12B in FIG. 6) is performed on the acquired distance image. This smoothing process is executed by the smoothing processing unit 33 shown in FIG. 1. In this smoothing process, the entire distance image is averaged.

このようなスムージング処理の具現化例として、所定サイズの二次元フィルタを用いる方法がある。 As an embodiment of such smoothing processing, there is a method using a two-dimensional filter of a predetermined size.

図１１は、図６のスムージング処理（ステップＳ１２Ａ，Ｓ１２Ｂ）において用いられる平均化フィルタの一例を示す図である。距離画像に対するスムージング処理では、例えば、図１１に示すような８１画素×８１画素の平均化フィルタが適用される。平均化フィルタでは、対象画素を中心とする縦方向８１画素および横方向８１画素の範囲に含まれる距離画像の画素値（視差）の平均値が当該対象画素の新たな画素値として算出される。より具体的には、フィルタ内に含まれる画素が有する画素値の総和をフィルタの画素サイズで除算することで、対象画素の新たな画素値が算出される。 FIG. 11 is a diagram illustrating an example of an averaging filter used in the smoothing process (steps S12A and S12B) in FIG. 6. In the smoothing process for the distance image, for example, an 81 × 81 pixel averaging filter as shown in FIG. 11 is applied. With the averaging filter, the average of the pixel values (parallaxes) of the distance image within a range of 81 pixels vertically and 81 pixels horizontally, centered on the target pixel, is calculated as the new pixel value of the target pixel. More specifically, the new pixel value of the target pixel is calculated by dividing the sum of the pixel values of the pixels included in the filter by the pixel size of the filter (the number of pixels it covers).

なお、フィルタ内に含まれるすべての画素の操作をとるのではなく、所定間隔毎（例えば、９画素）に間引いて抽出した画素の平均値を用いてもよい。このような間引き処理を行なった場合であっても、全画素の平均値を用いた場合と同様の平滑化結果が得られる場合があり、そのような場合には、間引き処理を行なうことで処理量を低減できる。 Instead of using all the pixels included in the filter, the average value of pixels sampled at a predetermined interval (for example, every 9 pixels) may be used. Even with such thinning, a smoothing result similar to that obtained with the average of all pixels may be obtained, and in such cases the thinning reduces the amount of processing.
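A minimal sketch of the averaging filter with optional thinning follows. The function name `box_average` and its parameters are hypothetical. Two details are assumptions not stated in the description: samples falling outside the image are simply skipped, and the thinned sampling grid is aligned so that it always includes the target pixel itself.

```python
def box_average(disparity, size=81, step=1):
    """Average each pixel over a size x size window centred on it.

    step > 1 thins the window: only every step-th pixel is sampled.
    Out-of-image samples are skipped (assumed border handling).
    """
    half = size // 2
    k = half // step
    # Offsets are multiples of step and always include 0 (the target pixel).
    offsets = [i * step for i in range(-k, k + 1)]
    h, w = len(disparity), len(disparity[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in offsets:
                for dx in offsets:
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += disparity[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out
```

With `size=81` and `step=9` this samples a 9 × 9 grid of offsets (81 samples) instead of all 6561 pixels in the window, which is where the reduction in processing comes from.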

図１０（ｂ）は、図１０（ａ）に示す距離画像に対してスムージング処理を行なった結果を示す図である。図１０（ｂ）に示すスムージング処理後の距離画像では、隣接する画素間で画素値（視差）が大きく変化しないようになっていることがわかる。 FIG. 10B is a diagram illustrating a result of performing the smoothing process on the distance image illustrated in FIG. In the distance image after the smoothing process shown in FIG. 10B, it can be seen that the pixel value (parallax) does not change greatly between adjacent pixels.

なお、スムージング処理によって得られた距離画像の画素サイズは、入力画像と同一の画素サイズであることが好ましい。画素サイズを同一にすることで、後述するステレオ画像生成処理において、各画素の距離を一対一で決定することができる。 Note that the pixel size of the distance image obtained by the smoothing process is preferably the same pixel size as the input image. By making the pixel size the same, the distance of each pixel can be determined on a one-to-one basis in a stereo image generation process to be described later.

（ｃ６：視差調整処理） (C6: parallax adjustment processing)
スムージング処理後の距離画像が取得されると、より快適な視差量で立体視表示できるように、視差調整処理が実行される。この視差調整処理は、スムージング処理後の距離画像の画素値（視差）が予め定められた視差範囲（ターゲット視差レンジ）内に存在するように、スケーリングを行なう。 When the distance image after the smoothing process is acquired, the parallax adjustment process is executed so that stereoscopic display can be performed with a more comfortable parallax amount. In this parallax adjustment process, scaling is performed so that the pixel values (parallaxes) of the distance image after the smoothing process fall within a predetermined parallax range (target parallax range).

図１２は、図６に示す視差調整処理（ステップＳ１３Ａ，Ｓ１３Ｂ）の処理内容を説明するための図である。図１２には、スムージング処理後の距離画像における視差（距離）量を横軸とし、視差調整処理後の視差（距離）量を縦軸としている。視差調整処理では、図１２における視差調整関数を決定する。 FIG. 12 is a diagram for explaining the processing content of the parallax adjustment processing (steps S13A and S13B) illustrated in FIG. 6. In FIG. 12, the horizontal axis represents the amount of parallax (distance) in the distance image after the smoothing process, and the vertical axis represents the amount of parallax (distance) after the parallax adjustment process. In the parallax adjustment process, the parallax adjustment function shown in FIG. 12 is determined.

より具体的な手順としては、まず、ターゲット視差レンジｒの大きさを決定する。このターゲット視差レンジｒは、入力画像の横幅を基準として経験的に決定される。例えば、１９２０画素×１０８０画素の入力画像であれば、４９画素に決定される。このターゲット視差レンジｒは、動的に決定してもよいし、予め設定された固定値を用いるようにしてもよい。 As a more specific procedure, first, the size of the target parallax range r is determined. This target parallax range r is determined empirically based on the width of the input image. For example, in the case of an input image of 1920 pixels × 1080 pixels, it is determined to be 49 pixels. The target parallax range r may be determined dynamically or a fixed value set in advance may be used.

また、スムージング処理後の距離画像に含まれる視差（距離）量の最大値を最大視差Ｐｍａｘとし、最小値を最小視差Ｐｍｉｎとする。すなわち、最大視差Ｐｍａｘを有する画素の画像は最も飛び出して立体視表示され、最小視差Ｐｍｉｎを有する画素の画像は最も奥行きに存在するように立体視表示される。視差調整処理においては、最大視差Ｐｍａｘ（表示面に直交する方向に最も飛び出した位置）と、最小視差Ｐｍｉｎ（表示面に直交する方向（Ｚ軸方向）における最も奥行き側の位置）との間のレンジが、ターゲット視差レンジｒと一致するように、視差調整関数の傾きである視差増減係数ｃと、視差調整関数の切片であるオフセットｏとが算出される。より具体的には、以下の式に従って、視差増減係数ｃおよびオフセットｏが算出される。 Further, the maximum value of the parallax (distance) amount included in the distance image after the smoothing process is set as the maximum parallax Pmax, and the minimum value is set as the minimum parallax Pmin. In other words, the image of the pixel having the maximum parallax Pmax is projected out and displayed stereoscopically, and the image of the pixel having the minimum parallax Pmin is displayed stereoscopically so as to exist at the depth. In the parallax adjustment processing, between the maximum parallax Pmax (the position that protrudes most in the direction orthogonal to the display surface) and the minimum parallax Pmin (the position on the deepest side in the direction orthogonal to the display surface (Z-axis direction)). A parallax increase / decrease coefficient c that is the inclination of the parallax adjustment function and an offset o that is an intercept of the parallax adjustment function are calculated so that the range matches the target parallax range r. More specifically, the parallax increase / decrease coefficient c and the offset o are calculated according to the following equations.

視差増減係数ｃ＝ターゲット視差レンジｒ／（最大視差Ｐｍａｘ−最小視差Ｐｍｉｎ） Parallax increase/decrease coefficient c = target parallax range r / (maximum parallax Pmax − minimum parallax Pmin)
オフセットｏ＝（最大視差Ｐｍａｘ＋最小視差Ｐｍｉｎ）／２ Offset o = (maximum parallax Pmax + minimum parallax Pmin) / 2
調整後視差（距離）量＝視差増減係数ｃ×（調整前視差（距離）量−オフセットｏ） Post-adjustment parallax (distance) amount = parallax increase/decrease coefficient c × (pre-adjustment parallax (distance) amount − offset o)

（ｃ７：ステレオ画像生成処理） (C7: Stereo image generation process)
スムージング処理後の距離画像が取得されると、当該取得された距離画像を用いて、ステレオ画像生成処理（図６のステップＳ１４Ａ，Ｓ１４Ｂ）が実行される。このステレオ画像生成処理は、図１に示す３Ｄ画像生成部３５によって実行される。 When the distance image after the smoothing process is acquired, a stereo image generation process (steps S14A and S14B in FIG. 6) is executed using the acquired distance image. This stereo image generation process is executed by the 3D image generation unit 35 shown in FIG. 1.
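Before the pixel shifting is described, the three parallax adjustment equations of (c6) can be illustrated with a small sketch. The function name `adjust_parallax` is hypothetical, and the handling of a completely flat distance image (Pmax equal to Pmin, where the coefficient c is undefined) is an assumption added only to keep the example safe to run.

```python
def adjust_parallax(disparity, target_range):
    """Linearly rescale disparities so that (max - min) equals target_range.

    Returns (adjusted image, coefficient c, offset o), following
      c = r / (Pmax - Pmin)
      o = (Pmax + Pmin) / 2
      adjusted = c * (p - o)
    """
    values = [v for row in disparity for v in row]
    p_max, p_min = max(values), min(values)
    o = (p_max + p_min) / 2
    if p_max == p_min:
        # Assumption: a flat image maps to zero parallax everywhere.
        return [[0.0 for _ in row] for row in disparity], 0.0, o
    c = target_range / (p_max - p_min)
    adjusted = [[c * (v - o) for v in row] for row in disparity]
    return adjusted, c, o
```

Because the offset o is the midpoint of Pmin and Pmax, the adjusted values are centred on zero and span exactly the target range, from −r/2 to +r/2. For a 1920 × 1080 input, the description gives r = 49 pixels as an empirically chosen value.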

ステレオ画像生成処理では、入力画像１が主体的に用いられる場合には、入力画像１をそのまま左眼用画像として出力するとともに、入力画像１の各画素を対応する距離（視差）に応じて位置をずらすことで、右眼用画像を生成する。一方、入力画像２が主体的に用いられる場合には、入力画像２をそのまま右眼用画像として出力するとともに、入力画像２の各画素を対応する距離（視差）に応じて位置をずらすことで、左眼用画像を生成する。 In the stereo image generation process, when the input image 1 is mainly used, the input image 1 is output as it is as the left-eye image, and each pixel of the input image 1 is positioned according to the corresponding distance (parallax). The right eye image is generated by shifting. On the other hand, when the input image 2 is mainly used, the input image 2 is output as it is as an image for the right eye, and the position of each pixel of the input image 2 is shifted according to the corresponding distance (parallax). The left eye image is generated.

なお、被写体を立体視表示するためには、左眼用画像および右眼用画像との間で、対応する画素が指定された距離（視差）だけ離れていればよいので、入力画像から左眼用画像および右眼用画像をそれぞれ生成してもよい。 Note that, in order to display the subject stereoscopically, it is sufficient that corresponding pixels in the left-eye image and the right-eye image are separated by the specified distance (parallax); therefore, both the left-eye image and the right-eye image may be generated from the input image.

図１３は、図６のステレオ画像生成処理（ステップＳ１４Ａ，Ｓ１４Ｂ）における処理手順を説明するための図である。図１３には、入力画像１を主体的に用いる場合の処理例を示す。図１４は、図１３に示すステレオ画像生成処理の処理手順を示すフローチャートである。 FIG. 13 is a diagram for explaining a processing procedure in the stereo image generation processing (steps S14A and S14B) in FIG. FIG. 13 shows a processing example when the input image 1 is used proactively. FIG. 14 is a flowchart showing a processing procedure of the stereo image generation processing shown in FIG.

図１３を参照して、ステレオ画像生成処理においては、距離画像に基づいて、主体的に用いられる入力画像からステレオ画像（左眼用画像および右眼用画像）が生成される。より具体的には、主体的に用いられる入力画像を構成するライン単位で画素の位置をずらすことで、他方の右眼用画像または左眼用画像が生成される。図１３には、入力画像１をそのまま左眼用画像として用いるとともに、入力画像２を右眼用画像として用いる一例を示す。図１３には、左眼用画像として用いる入力画像１のあるラインについて、画素位置（座標）が「１０１」，「１０２」，・・・，「１１０」である１０個の画素が示されている。各画素位置の画素に対応する距離（視差）がそれぞれ「４０」，「４０」，「４１」，「４１」，「４１」，「４２」，「４２」，「４１」，「４０」，「４０」であるとする。これらの情報を用いて、各画素について、ずらし後の画素位置（右眼用画像における座標）が算出される。より具体的には、（ずらし後の画素位置）＝（左眼用画像における座標）−（対応する距離（視差））に従って、１ライン分の各画素についてのずらし後の画素位置が算出される。 Referring to FIG. 13, in the stereo image generation process, a stereo image (a left-eye image and a right-eye image) is generated from the mainly used input image on the basis of the distance image. More specifically, the other image (the right-eye image or the left-eye image) is generated by shifting pixel positions line by line in the mainly used input image. FIG. 13 shows an example in which the input image 1 is used as it is as the left-eye image and the input image 2 is used as the right-eye image. FIG. 13 shows, for one line of the input image 1 used as the left-eye image, ten pixels whose pixel positions (coordinates) are "101", "102", ..., "110". The distances (parallaxes) corresponding to the pixels at these positions are assumed to be "40", "40", "41", "41", "41", "42", "42", "41", "40", and "40", respectively. Using this information, the shifted pixel position (the coordinate in the right-eye image) is calculated for each pixel. More specifically, the shifted pixel position of each pixel in the line is calculated according to (shifted pixel position) = (coordinate in the left-eye image) − (corresponding distance (parallax)).

そして、それぞれの画素値と対応するずらし後の画素位置とに基づいて、右眼用画像の対応する１ライン分の画像が生成される。このとき、距離（視差）の値によっては、対応する画素が存在しない場合がある。図１３に示す例では、右眼用画像の画素位置「６６」および「６８」の画素の情報が存在しない。このような場合には、隣接する画素からの情報を用いて、不足する画素の画素値が補間される。 Then, a corresponding one line image of the right-eye image is generated based on each pixel value and the corresponding shifted pixel position. At this time, there may be no corresponding pixel depending on the value of the distance (parallax). In the example illustrated in FIG. 13, there is no information on the pixels at the pixel positions “66” and “68” of the right-eye image. In such a case, the pixel values of the deficient pixels are interpolated using information from adjacent pixels.

このような処理を入力画像に含まれるすべてのライン分だけ繰り返すことで、右眼用画像が生成される。 By repeating such processing for all lines included in the input image, the right-eye image is generated.
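The per-line procedure can be sketched as follows, using the positions and parallaxes of the FIG. 13 example. The function names and the pixel values in the usage lines are hypothetical. Two points are assumptions: when two source pixels land on the same target position, the later one overwrites the earlier one, and a missing pixel is filled with the mean of its nearest known neighbours on either side (the description only says that missing pixels are interpolated from adjacent pixels).

```python
def shifted_positions(positions, disparities):
    """(shifted position) = (coordinate in left-eye image) - (parallax)."""
    return [p - d for p, d in zip(positions, disparities)]


def render_line(positions, values, disparities):
    """Build one line of the right-eye image; return (line, holes).

    line maps target position -> pixel value; holes lists the positions
    that received no source pixel and were filled by interpolation.
    """
    targets = shifted_positions(positions, disparities)
    known = {}
    for t, v in zip(targets, values):
        known[t] = v  # assumption: on a collision the later pixel wins
    lo, hi = min(targets), max(targets)
    holes = [x for x in range(lo, hi + 1) if x not in known]
    line = dict(known)
    for x in holes:
        left, right = x - 1, x + 1
        while left not in known:
            left -= 1   # terminates: lo is always known
        while right not in known:
            right += 1  # terminates: hi is always known
        line[x] = (known[left] + known[right]) / 2
    return line, holes


positions = list(range(101, 111))
disparities = [40, 40, 41, 41, 41, 42, 42, 41, 40, 40]
values = list(range(10, 110, 10))  # made-up pixel values for illustration
line, holes = render_line(positions, values, disparities)
print(shifted_positions(positions, disparities))
print(holes)
```

For the FIG. 13 values, the shifted positions are 61, 62, 62, 63, 64, 64, 65, 67, 69, 70, so positions 66 and 68 receive no pixel, which agrees with the missing positions named in the text.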

なお、この画素位置をずらす方向は、視差を生じさせるべき方向であり、具体的には、ユーザーに向けて表示した場合に、水平方向となる方向に相当する。 Note that the direction in which the pixel position is shifted is a direction in which parallax should be generated, and specifically corresponds to a direction that becomes the horizontal direction when displayed toward the user.

このような処理手順を示すと、図１４のようになる。すなわち、図１４を参照して、３Ｄ画像生成部３５（図１）は、入力画像１の１ライン分の画素について、それぞれのずらし後の画素位置を算出する（ステップＳ１４０１）。続いて、３Ｄ画像生成部３５は、ステップＳ１４０１において算出されたずらし後の画素位置から１ライン分の画像（右眼用画像）を生成する（ステップＳ１４０２）。 This processing procedure is shown in FIG. 14. That is, referring to FIG. 14, the 3D image generation unit 35 (FIG. 1) calculates the shifted pixel position of each pixel in one line of the input image 1 (step S1401). Subsequently, the 3D image generation unit 35 generates one line of the image (the right-eye image) from the shifted pixel positions calculated in step S1401 (step S1402).

その後、３Ｄ画像生成部３５（図１）は、入力画像に処理を行なっていないラインが存在するか否かを判断する（ステップＳ１４０３）。入力画像に処理を行なっていないラインが存在していれば（ステップＳ１４０３においてＮＯ）、次のラインが選択され、ステップＳ１４０１およびＳ１４０２の処理が繰り返される。 Thereafter, the 3D image generation unit 35 (FIG. 1) determines whether or not there is an unprocessed line in the input image (step S1403). If there is an unprocessed line in the input image (NO in step S1403), the next line is selected, and the processes in steps S1401 and S1402 are repeated.

入力画像のすべてのラインについて処理が完了していれば（ステップＳ１４０３においてＹＥＳ）、３Ｄ画像生成部３５は、入力画像１（左眼用画像）とともに、生成した右眼用画像を出力する。そして、処理は終了する。 If the processing has been completed for all the lines of the input image (YES in step S1403), the 3D image generation unit 35 outputs the generated right eye image together with the input image 1 (left eye image). Then, the process ends.

図１３および図１４には、入力画像１が出力画像として決定された場合、すなわち入力画像１が主体的に用いられる場合の処理について説明したが、入力画像２が出力画像として決定された場合、すなわち入力画像２が主体的に用いられる場合の処理についても同様である。但し、上述の対応点探索処理においては、基準画像と参照画像との関係が入れ替わる。 FIGS. 13 and 14 illustrate the processing for the case where the input image 1 is determined as the output image, that is, where the input image 1 is mainly used; the same applies to the case where the input image 2 is determined as the output image, that is, where the input image 2 is mainly used. In that case, however, the roles of the base image and the reference image in the corresponding point search process described above are interchanged.

図１５は、第１の実施の形態に従う画像処理方法に従って図７に示す一対の入力画像から生成されたステレオ画像の一例を示す図である。図７と図１５とを比較すると分かるように、ぼけのないクリアなステレオ画像が得られていることがわかる。 FIG. 15 is a diagram showing an example of a stereo image generated from the pair of input images shown in FIG. 7 according to the image processing method according to the first embodiment. As can be seen by comparing FIG. 7 and FIG. 15, it can be seen that a clear stereo image without blur is obtained.

すなわち、上述したような一連の処理を採用することで、相対的に画質のよい（ぼけていない）入力画像を用いて、出力されるステレオ画像が生成されるので、一方の入力画像にぼけなどの欠陥が存在する場合であっても、立体視表示の品質を維持できる。 That is, by adopting a series of processes as described above, a stereo image to be output is generated using an input image with relatively good image quality (not blurred). Even when there is a defect, the quality of stereoscopic display can be maintained.

（ｃ８：変形例） (C8: Modification)
上述の画像処理方法においては、入力画像の周波数特性に基づいて、入力画像に含まれるぼけなどの欠陥を検出する場合を想定したが、入力画像の周波数特性に基づいて、入力画像における露出オーバーなどを検出することもできる。例えば、一方の入力画像だけについて部分的に白く飛んでしまっていると、高周波成分が存在しない。そのため、上述と同様の方法によって、露出オーバーなどを検出できる。また、強い光がレンズに入射したときにレンズ内部での反射などにより、フレアやゴーストと呼ばれる輪や玉状のにじみなどが片側だけで撮影される場合がある。この場合にも、にじむことで高周波成分が少なくなっているため、上述と同様の方法によって、これらのフレアやゴーストを検出できる。 The image processing method described above assumes that defects such as blur in the input image are detected based on the frequency characteristics of the input image; however, overexposure and the like in the input image can also be detected based on the frequency characteristics. For example, if only one of the input images is partially blown out to white, that part contains no high-frequency components. Overexposure and the like can therefore be detected by the same method as described above. In addition, when strong light enters the lens, reflections inside the lens may cause ring-shaped or ball-shaped smears, called flare or ghosts, to be captured in only one of the images. In this case as well, the smearing reduces the high-frequency components, so such flare and ghosts can be detected by the same method as described above.

このような露出オーバー、フレア、ゴーストなどの生じていない入力画像からステレオ画像を生成することが可能である。 A stereo image can be generated from an input image in which such overexposure, flare, and ghost are not generated.

＜Ｄ．第２の実施の形態（縦ステレオ／横ステレオ）＞ <D. Second Embodiment (Vertical Stereo / Horizontal Stereo)>
上述した第１の実施の形態においては、図２（ａ）に示す縦ステレオを利用できる場合に、入力画像１を主体的に用いる処理モードと、入力画像２を主体的に用いる処理モードとを選択的に実行する例を示した。ここで、図２（ｂ）に示す横ステレオについても利用できる場合には、入力画像１および入力画像２をそのままステレオ画像として用いることができる。典型的には、図５に示すように、撮像部２を搭載した携帯電話３００では、ユーザーの持ち方（携帯電話３００の姿勢方向）に依存して、縦ステレオおよび横ステレオのいずれにもなる。 In the first embodiment described above, an example was shown in which, when the vertical stereo shown in FIG. 2(a) can be used, the processing mode that mainly uses the input image 1 and the processing mode that mainly uses the input image 2 are selectively executed. When the horizontal stereo shown in FIG. 2(b) can also be used, the input image 1 and the input image 2 can be used as they are as a stereo image. Typically, as shown in FIG. 5, a mobile phone 300 equipped with the imaging unit 2 takes either the vertical stereo or the horizontal stereo arrangement, depending on how the user holds it (the orientation of the mobile phone 300).

そこで、第２の実施の形態においては、このような３つの処理モードを選択可能な構成について例示する。すなわち、ステレオ画像を生成する際に、１つの入力画像を用いるか、２つの入力画像を用いるかを切り替える。 Therefore, in the second embodiment, a configuration in which such three processing modes can be selected will be exemplified. That is, when a stereo image is generated, whether to use one input image or two input images is switched.

図１６は、第２の実施の形態に従う画像処理方法の手順を示す図である。図１６においては、図６に示す画像処理方法と同様の処理を同一のステップ番号を付して示す。図１６に示す画像処理方法は、図６に示す第１の実施の形態に従う画像処理方法に比較して、出力用モード決定処理（ステップＳ５）およびステレオ画像出力処理（ステップＳ７）が新たに追加されたものであり、その他の処理については、第１の実施の形態と同様であるので、詳細な説明は繰り返さない。 FIG. 16 is a diagram showing the procedure of the image processing method according to the second embodiment. In FIG. 16, the same process as the image processing method shown in FIG. 6 is given the same step number. Compared with the image processing method according to the first embodiment shown in FIG. 6, the image processing method shown in FIG. 16 has newly added an output mode determination process (step S5) and a stereo image output process (step S7). Since other processes are the same as those in the first embodiment, detailed description will not be repeated.

本実施の形態に従う画像処理方法においては、入力画像１および入力画像２の一方を主体的に用いてステレオ画像を生成する処理モードと、入力画像１および入力画像２の両方を用いてステレオ画像を生成する処理モードとを切り替える。この処理モードの切り替えは、入力画像１および入力画像２から取得された周波数特性、および、撮像時における第１カメラ２１および第２カメラ２２の位置関係に基づいて行なわれる。 The image processing method according to the present embodiment switches between a processing mode in which a stereo image is generated mainly using one of the input image 1 and the input image 2, and a processing mode in which a stereo image is generated using both the input image 1 and the input image 2. The processing mode is switched based on the frequency characteristics acquired from the input image 1 and the input image 2 and on the positional relationship between the first camera 21 and the second camera 22 at the time of imaging.

より具体的には、入力画像１および入力画像２から取得された周波数特性としては、エッジ抽出処理によって算出されるそれぞれの入力画像のエッジ量についての絶対値および相対差が用いられる。また、撮像時における第１カメラ２１および第２カメラ２２の位置関係は、加速度センサー１１６（図３）により取得される姿勢情報が用いられる。すなわち、撮像部２が縦ステレオの状態で撮像されたものであるか、横ステレオの状態で撮像されたものであるかが判断される。 More specifically, as the frequency characteristics acquired from the input image 1 and the input image 2, an absolute value and a relative difference for the edge amount of each input image calculated by the edge extraction process are used. In addition, posture information acquired by the acceleration sensor 116 (FIG. 3) is used for the positional relationship between the first camera 21 and the second camera 22 at the time of imaging. That is, it is determined whether the imaging unit 2 is captured in a vertical stereo state or a horizontal stereo state.

図１６を参照して、入力画像１および入力画像２が取得されると、これらの入力画像に対して、エッジ抽出処理（ステップＳ１およびＳ２）が実行され、それぞれの入力画像についてのエッジ量が算出される。このエッジ抽出処理（エッジ量の算出）は、上述の第１の実施の形態と同様であるので、詳細な説明は繰り返さない。 Referring to FIG. 16, when input image 1 and input image 2 are acquired, edge extraction processing (steps S1 and S2) is performed on these input images, and the edge amount for each input image is determined. Calculated. Since this edge extraction processing (edge amount calculation) is the same as that in the first embodiment described above, detailed description will not be repeated.

続いて、算出されたそれぞれのエッジ量に基づいて、エッジ量比較処理（ステップＳ３）が実行される。このエッジ量比較処理においては、入力画像１および入力画像２の一方のみに、ぼけなどの欠陥が存在するか否かが判断される。より具体的には、入力画像１のエッジ量と入力画像２のエッジ量との差分が予め定められたしきい値以下であるか否かが判断される。エッジ量の差分がしきい値以下である場合には、入力画像１と入力画像２との間でぼけ度合いに有意な相違はないと判断できるので、入力画像１および入力画像２をそのままステレオ画像として出力する。 Subsequently, an edge amount comparison process (step S3) is executed based on the calculated edge amounts. In this edge amount comparison process, it is determined whether a defect such as blur exists in only one of the input image 1 and the input image 2. More specifically, it is determined whether the difference between the edge amount of the input image 1 and the edge amount of the input image 2 is equal to or less than a predetermined threshold value. When the difference in edge amount is equal to or less than the threshold value, it can be determined that there is no significant difference in the degree of blur between the input image 1 and the input image 2, so the input image 1 and the input image 2 are output as they are as a stereo image.

但し、エッジ量比較処理（ステップＳ３）は、姿勢情報によって、第１カメラ２１および第２カメラ２２が横ステレオになっていると判断できる場合に実行される。第１カメラ２１および第２カメラ２２が縦ステレオになっている場合には、入力画像１および入力画像２をそのままステレオ画像として用いることができないからである。 However, the edge amount comparison process (step S3) is executed when it can be determined from the posture information that the first camera 21 and the second camera 22 are in horizontal stereo. This is because when the first camera 21 and the second camera 22 are in a vertical stereo, the input image 1 and the input image 2 cannot be used as they are as a stereo image.

すなわち、第１カメラ２１および第２カメラ２２が横ステレオになっており、かつ、入力画像１のエッジ量と入力画像２のエッジ量との差分がしきい値以下である場合に限って、ステレオ画像出力処理（ステップＳ７）が実行される。このステレオ画像出力処理（ステップＳ７）では、入力画像１および入力画像２がそのままステレオ画像として出力される。 That is, only when the first camera 21 and the second camera 22 are in the horizontal stereo and the difference between the edge amount of the input image 1 and the edge amount of the input image 2 is equal to or less than the threshold value, the stereo is limited. Image output processing (step S7) is executed. In this stereo image output process (step S7), the input image 1 and the input image 2 are output as they are as a stereo image.

これに対して、入力画像１のエッジ量と入力画像２のエッジ量との差分がしきい値を超えている場合には、入力画像の一方がぼけた状態であると判断できるので、上述した第１の実施の形態と同様に、出力用画像決定処理が実行される（ステップＳ４）。 On the other hand, if the difference between the edge amount of the input image 1 and the edge amount of the input image 2 exceeds the threshold value, it can be determined that one of the input images is blurred. As in the first embodiment, output image determination processing is executed (step S4).

また、第１カメラ２１および第２カメラ２２が縦ステレオになっている場合にも、入力画像１および入力画像２をそのままステレオ画像として使用することができないので、出力用画像決定処理が実行される（ステップＳ４）。 Further, even when the first camera 21 and the second camera 22 are in vertical stereo, the input image 1 and the input image 2 cannot be used as they are as a stereo image, so that an output image determination process is executed. (Step S4).

出力用画像決定処理において、よりエッジ量の多い入力画像が選択され、この選択された入力画像を主体としてステレオ画像が生成される。この出力用画像決定処理の実行後の処理については、上述した第１の実施の形態と同様であるので、詳細な説明は繰り返さない。 In the output image determination process, an input image with a larger amount of edge is selected, and a stereo image is generated mainly using the selected input image. Since the processing after execution of the output image determination processing is the same as that in the first embodiment described above, detailed description will not be repeated.
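The decision logic of steps S3, S4, and S7 described above can be condensed into a short sketch. The names `Mode` and `select_mode` are hypothetical, the comparison uses only the absolute difference of the edge amounts as described for step S3, and the tie-break in favour of the input image 1 when both edge amounts are equal is an assumption.

```python
from enum import Enum


class Mode(Enum):
    USE_IMAGE_1 = 1  # generate the stereo image mainly from input image 1
    USE_IMAGE_2 = 2  # generate the stereo image mainly from input image 2
    USE_BOTH = 3     # output input images 1 and 2 as they are (step S7)


def select_mode(edge1, edge2, horizontal_stereo, threshold):
    """Choose the processing mode from the edge amounts and camera layout."""
    if horizontal_stereo and abs(edge1 - edge2) <= threshold:
        # Horizontal stereo and no significant difference in blur.
        return Mode.USE_BOTH
    # Vertical stereo, or one image is blurred: pick the sharper image
    # (output image determination, step S4). Ties go to image 1 (assumed).
    return Mode.USE_IMAGE_1 if edge1 >= edge2 else Mode.USE_IMAGE_2
```

Note that in the vertical stereo arrangement the two inputs are never passed through unchanged, regardless of how similar their edge amounts are.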

＜Ｅ．第１の変形例（動画）＞ <E. First Modification (Moving Image)>
上述した第１および第２の実施の形態に従う画像処理方法については、静止画および動画のいずれについても適用できるが、動画に適用する場合には、以下に述べるような処理を付加してもよい。 The image processing methods according to the first and second embodiments described above can be applied to both still images and moving images; when they are applied to moving images, the processing described below may be added.

すなわち、本変形例に従う画像処理方法においては、第１カメラ２１および第２カメラ２２が被写体を動画撮像する場合に、複数フレームごとに、ステレオ画像を生成するために適用する処理モードを決定する。動画は、時間的に連続した一連のフレームで構成される。このような動画撮像において、フレームごとに処理モード（（１）入力画像１を主体的に用いたステレオ画像の生成、（２）入力画像２を主体的に用いたステレオ画像の生成、（３）入力画像１および入力画像２をそのまま用いたステレオ画像の生成）が異なると、ステレオ画像の生成規則が短時間で切り替わることになる。これによって、出力されるステレオ画像の連続性が損なわれ、ユーザーに違和感を与える可能性がある。 That is, in the image processing method according to the present modification, when the first camera 21 and the second camera 22 capture a moving image of a subject, the processing mode to be applied for generating a stereo image is determined once for every plurality of frames. A moving image is composed of a series of temporally continuous frames. In such moving image capturing, if the processing mode ((1) generation of a stereo image mainly using the input image 1, (2) generation of a stereo image mainly using the input image 2, or (3) generation of a stereo image using the input image 1 and the input image 2 as they are) differs from frame to frame, the stereo image generation rule switches within a short time. This impairs the continuity of the output stereo image and may make the user feel uncomfortable.

そこで、本変形例に従う画像処理方法においては、第１カメラ２１および第２カメラ２２が被写体を動画撮像する場合に、一旦決定した処理モードを所定期間（典型的には、複数フレーム）に亘って維持する。このような処理モードを維持する典型例としては、ある区間の先頭フレームに基づいて、処理モード（ステレオ画像の生成規則）を決定する方法がある。すなわち、本変形例に従う画像処理方法においては、第１カメラ２１および第２カメラ２２が被写体を動画撮像する場合に、先頭フレームに対応する入力画像１および入力画像２に基づいて処理モードを決定する。 Therefore, in the image processing method according to the present modification, when the first camera 21 and the second camera 22 capture a moving image of the subject, the processing mode once determined is set over a predetermined period (typically, a plurality of frames). maintain. As a typical example of maintaining such a processing mode, there is a method of determining a processing mode (stereo image generation rule) based on the first frame of a certain section. That is, in the image processing method according to the present modification, when the first camera 21 and the second camera 22 capture a moving image of the subject, the processing mode is determined based on the input image 1 and the input image 2 corresponding to the first frame. .

より具体的には、動画を構成する所定区間（複数のフレームを含む）のうち、先頭フレームに相当する入力画像１および入力画像２について、それぞれエッジ抽出処理を行なうことで、周波数特性を取得する。そして、取得された周波数特性（エッジ量）を利用して、上述した第１の実施の形態または第２の実施の形態に従う画像処理方法に従って、ステレオ画像の処理モードが決定される。一旦、処理モードが決定されると、当該決定された処理モードに従って、後続のフレーム（２フレーム以降）に相当する入力画像１および入力画像２から順次ステレオ画像を生成する。一連のフレーム群に対して、入力画像１を主体とするステレオ画像の生成処理（ステップＳ１１Ａ〜Ｓ１４Ａ）、入力画像２を主体とするステレオ画像の生成処理（ステップＳ１１Ｂ〜Ｓ１４Ｂ）、入力画像１および入力画像２をステレオ画像として出力する処理（ステップＳ７）が繰り返し実行される。 More specifically, the frequency characteristics are acquired by performing edge extraction processing for each of the input image 1 and the input image 2 corresponding to the first frame in a predetermined section (including a plurality of frames) constituting the moving image. . Then, using the acquired frequency characteristic (edge amount), the stereo image processing mode is determined according to the image processing method according to the first embodiment or the second embodiment described above. Once the processing mode is determined, stereo images are sequentially generated from the input image 1 and the input image 2 corresponding to subsequent frames (after the second frame) according to the determined processing mode. For a series of frames, a stereo image generation process mainly including the input image 1 (steps S11A to S14A), a stereo image generation process mainly including the input image 2 (steps S11B to S14B), the input image 1 and The process of outputting the input image 2 as a stereo image (step S7) is repeatedly executed.

このように、一連のフレーム群に対して、同一の生成規則を適用することにより、出力画像の連続性を保ち、ユーザーに対して自然な立体視表示（動画）を提供できる。 In this way, by applying the same generation rule to a series of frames, it is possible to maintain the continuity of the output image and provide a natural stereoscopic display (moving image) to the user.

なお、１つの動画に含まれるすべてのフレームに対して、先頭フレームに基づいて決定した処理モードを適用するようにしてもよいが、人間の視覚を利用して、動画に含まれるシーンの単位で、生成規則を変更できるようにしてもよい。例えば、特開２００２−１５２６６９号公報に開示されるような技術を用いて、動画内のシーンチェンジを検出することでシーンに区別し、各シーンの先頭フレーム（シーンの切り替わりタイミング）で処理モード（生成規則）を決定（変更）してもよい。あるいは、一定時間（例えば、３０ｆｐｓで６００フレーム）ごとに処理モード（生成規則）を決定（変更）してもよい。 Note that the processing mode determined based on the first frame may be applied to all frames included in one moving image; alternatively, taking advantage of human visual perception, the generation rule may be made changeable in units of the scenes included in the moving image. For example, a technique such as that disclosed in Japanese Patent Laid-Open No. 2002-152669 may be used to detect scene changes and divide the moving image into scenes, and the processing mode (generation rule) may be determined (changed) at the first frame of each scene (the scene switching timing). Alternatively, the processing mode (generation rule) may be determined (changed) at fixed intervals (for example, every 600 frames at 30 fps).
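The fixed-interval variant can be sketched as below. The function name `modes_for_video` and the `decide` callback are hypothetical; `decide` stands in for whatever per-frame evaluation (for example, the edge amount comparison) selects a processing mode. Scene-change detection is not modelled here.

```python
def modes_for_video(frames, decide, period=600):
    """Return the processing mode applied to each frame.

    The mode is re-evaluated only on the first frame of every `period`
    frames and held for the frames in between, so the generation rule
    does not flip from one frame to the next.
    """
    modes = []
    current = None
    for index, frame in enumerate(frames):
        if index % period == 0:
            current = decide(frame)  # evaluate only at the section head
        modes.append(current)
    return modes
```

With `period=600` at 30 fps the mode is held for 20 seconds at a time; passing a `period` at least as large as the number of frames reproduces the case where the mode is decided once from the first frame of the whole moving image.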

As described above, in this modification the condition for deciding the generation rule is switched between still images and moving images so that, for moving images, the rule is switched less frequently than for still images. That is, for a moving image, the generation rule is maintained over a certain period. More specifically, for a moving image, the generation rule is preferably decided only at the first frame.

<F. Second Modification (Evaluation for Each Partial Image)>
The first and second embodiments described above illustrate configurations that evaluate the image quality of the input image as a whole, but the image quality may instead be evaluated for partial images of the input image. The processing mode for generating the stereo image may then be switched for each partial image. That is, the processing is switched for each partial region of the input image.

Basically, this configuration, in which the image quality is evaluated and the processing mode is switched for each partial image, is suited to horizontal stereo, that is, to the case where input image 1 and input image 2 acquired by the first camera 21 and the second camera 22, respectively, can be used directly as the stereo image. Of course, the method of this modification can also be applied in vertical stereo as preprocessing of the input images.

FIG. 17 is a diagram illustrating an example of processing that evaluates the image quality for each partial image according to the second modification. FIGS. 18 and 19 are diagrams for describing the processing that generates a stereo image according to the second modification.

As shown in FIG. 17(a), the input image may be divided into a plurality of partial regions, and the frequency characteristics may be evaluated for the partial image corresponding to each partial region. In the example shown in FIG. 17(a), the longer side of the input image is divided into four and the shorter side into three, giving a total of 12 partial images. Evaluating the frequency characteristics for each partial image in this way improves the accuracy of detecting localized defects such as blown-out highlights caused by overexposure, flare, and ghosting.
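The 4 x 3 division can be illustrated with a short sketch. It assumes a landscape image (the long side is horizontal) stored as a list of rows of gray levels, and it uses neighbor differences inside each tile as a stand-in for the edge amount; the function name and defaults are hypothetical.

```python
def tile_edge_amounts(img, cols=4, rows=3):
    """Return a rows x cols grid of edge amounts, one per partial image.

    Differences are taken only between neighbors inside the same tile, so an
    edge lying exactly on a tile boundary is not counted (a simplification).
    """
    h, w = len(img), len(img[0])
    grid = []
    for r in range(rows):
        y0, y1 = r * h // rows, (r + 1) * h // rows
        grid_row = []
        for c in range(cols):
            x0, x1 = c * w // cols, (c + 1) * w // cols
            total = 0
            for y in range(y0, y1):
                for x in range(x0, x1):
                    if x + 1 < x1:
                        total += abs(img[y][x + 1] - img[y][x])
                    if y + 1 < y1:
                        total += abs(img[y + 1][x] - img[y][x])
            grid_row.append(total)
        grid.append(grid_row)
    return grid
```

A tile washed out by flare or overexposure has little local contrast, so its entry in the grid is small relative to the same tile in the other input image.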

As a more specific method of evaluating the frequency characteristics for each partial region, the edge amount of each partial image shown in FIG. 17(a) is first calculated following the same procedure as in the first and second embodiments described above. An intermediate determination is then made for each partial image based on its edge amount. More specifically, it is determined whether the absolute value of the difference between the edge amounts of the same partial image in input image 1 and input image 2 is equal to or less than a predetermined threshold. That is, it is determined whether input image 1 and input image 2 show the same tendency for the corresponding partial image. If they do, whether the partial image is reliable is determined by evaluating the absolute value of its difference relative to the absolute values of the differences for the other partial images.

On the other hand, when the edge amounts differ greatly between input image 1 and input image 2 only for a particular pair of corresponding partial images, the partial image with the relatively larger edge amount is selected. A relatively large absolute difference means that input image 1 and input image 2 show contradictory tendencies, and in that case the partial image showing the larger edge amount is used to generate the stereo image.
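The per-tile comparison can be sketched as follows. This is a simplified illustration: the function name, the labels, and the `threshold` value are assumptions, and the additional reliability check against the differences of the other partial images is omitted.

```python
def choose_tiles(edges1, edges2, threshold):
    """Compare per-tile edge amounts of input image 1 and input image 2.

    For each tile, returns "either" when the two images show the same
    tendency (absolute difference at or below `threshold`), otherwise the
    label of the image whose tile has the larger edge amount.
    """
    choice = []
    for row1, row2 in zip(edges1, edges2):
        out_row = []
        for e1, e2 in zip(row1, row2):
            if abs(e1 - e2) <= threshold:
                out_row.append("either")
            elif e1 > e2:
                out_row.append("image1")
            else:
                out_row.append("image2")
        choice.append(out_row)
    return choice
```

The inputs are grids of edge amounts with the same shape, for example one grid per input image computed over the 12 partial images.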

Because the first camera 21 and the second camera 22 view the subject from different viewpoints, the subject captured may differ between input image 1 and input image 2 when the fields of view of the two cameras differ, that is, when occlusion occurs. To prevent this from adversely affecting the evaluation, the correspondence between input image 1 and input image 2 may be acquired in advance using a technique such as pattern matching, as shown in FIG. 18, and the frequency characteristics may be evaluated between corresponding partial images.

That is, when the first camera 21 and the second camera 22 image the subject from mutually different viewpoints, the partial images set in input image 1 do not necessarily coincide with those set in input image 2, as shown in FIG. 18. Therefore, a plurality of partial images may be set in one input image, and the region corresponding to each partial image may then be searched for in the other input image.

After the image quality of each partial image has been evaluated based on the frequency characteristics as described above, the stereo image is generated by the procedure shown in FIG. 19. In this example, in a horizontal stereo configuration, input image 1 acquired by the first camera 21 imaging the subject is used as the left-eye image as it is, while input image 2 acquired by the second camera 22 imaging the subject is used as the right-eye image after being corrected as appropriate. FIG. 19 shows an example of processing in which a partial image contained in input image 2 is interpolated using the corresponding partial image of input image 1. That is, when it is determined that a defect such as flare or ghosting exists only in a specific partial image of input image 2, only the partial image (region) determined to be defective is interpolated using the partial image of the input image that is free of flare or ghosting. In other words, the right-eye image is generated by partially transplanting partial images of input image 1 into input image 2.
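The transplanting step can be sketched as below. This is a deliberately simplified illustration: it copies the tile at the same pixel position in input image 1, whereas an actual implementation would presumably use the corresponding region found by pattern matching so that parallax is accounted for. The function name, the tile indexing, and the grid defaults are assumptions.

```python
def transplant_tiles(img1, img2, bad_tiles, cols=4, rows=3):
    """Build a right-eye image from input image 2, patching defective tiles.

    `bad_tiles` is an iterable of (row, col) tile indices judged defective in
    input image 2. Each such tile is overwritten with the pixels at the same
    position in input image 1 (no disparity shift; simplification).
    The inputs are left unmodified.
    """
    h, w = len(img2), len(img2[0])
    out = [row[:] for row in img2]           # copy so img2 is not changed
    for r, c in bad_tiles:
        y0, y1 = r * h // rows, (r + 1) * h // rows
        x0, x1 = c * w // cols, (c + 1) * w // cols
        for y in range(y0, y1):
            out[y][x0:x1] = img1[y][x0:x1]   # transplant one tile row
    return out
```

Input image 1 is used unchanged as the left-eye image, so only the right-eye image needs to be rebuilt.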

Evaluating the image quality for each partial image in this way makes it possible to generate a high-quality stereo image while repairing localized defects such as flare and ghosting.

<G. Third Modification (User Interface)>
An image processing system or apparatus that implements the image processing method described above is preferably equipped with a user interface such as the following.

FIG. 20 is a diagram illustrating an example of the user interface provided in the third modification.

(G1: Warning to the user)
As described above, the image processing method according to the present embodiment can evaluate the image quality of the input images, so the camera (lens) corresponding to an input image with degraded quality can be identified. Accordingly, warning means may be provided that warns the user of lens contamination on the corresponding camera when the input image from a specific camera (imaging means) has been judged to be degraded a plurality of times. That is, when the quality of the input image from one camera (lens) is judged to be poor (the calculated edge amount is small) several times in succession, the lens surface on that side may be dirty, so a warning message prompting the user to wipe the lens may be presented. In other words, when only the input images from the same lens are of poor quality over consecutive (at least two) imaging operations, a warning that the lens surface may be dirty is presented.
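The consecutive-judgment rule can be sketched as a small counter. The class name, the `required_streak` parameter, and the camera labels are hypothetical; the sketch assumes that some other step has already decided, for each shot, which camera (if any) produced the worse image.

```python
class LensDirtMonitor:
    """Tracks how many shots in a row the same camera gave the worse image."""

    def __init__(self, required_streak=2):
        self.required_streak = required_streak  # at least two consecutive shots
        self.worse_camera = None
        self.streak = 0

    def update(self, worse):
        """Record one shot.

        `worse` is the label of the camera judged worse (e.g. 1 or 2), or
        None when neither image was judged degraded. Returns the camera whose
        lens may be dirty once the streak is long enough, otherwise None.
        """
        if worse is None:
            self.worse_camera, self.streak = None, 0
        elif worse == self.worse_camera:
            self.streak += 1
        else:
            self.worse_camera, self.streak = worse, 1
        if worse is not None and self.streak >= self.required_streak:
            return worse
        return None
```

A caller would show the lens-cleaning message whenever `update` returns a camera label.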

A warning message such as that shown in FIG. 20(a) is conceivable.

(G2: Prompting the user to perform imaging)
To evaluate the first camera 21 and the second camera 22 as described above, a subject must actually be imaged using the first camera 21 and the second camera 22. Accordingly, imaging prompting means may be provided that prompts the user to perform imaging with the first camera 21 and the second camera 22 so that the frequency characteristics of input image 1 and input image 2 can be evaluated. That is, the user is prompted to capture input images for judging the first camera 21 and the second camera 22 (both lenses).

A prompting message such as that shown in FIG. 20(b) is conceivable. As the chart for judgment, a sample in which specific edges are present over the entire image (for example, an image of a field of grass) is used.

Note that a chart need not necessarily be used. Instead, the user may be asked to image some real subject, and it is determined whether the edge amount of the input image obtained by imaging that subject is equal to or greater than a predetermined threshold. If the edge amount is equal to or greater than the threshold, the input image (subject) is judged to be suitable for evaluating the first camera 21 and the second camera 22 and may be used in place of the chart. In this case, the subject is preferably also displayed on, for example, the monitor of the imaging apparatus so that it can be seen at a glance which subject was used in place of the chart to evaluate the first camera 21 and the second camera 22.

(G3: Default setting)
The evaluation of the first camera 21 and the second camera 22 described above is preferably performed based on input image 1 and input image 2 acquired at any of the following times: at initial setting, at power-on, or during the focusing operation immediately before imaging.

Acquiring the frequency characteristics of input image 1 and input image 2 acquired at initial setting makes it easy to find initial defects, such as a lens surface on one side having been manufactured with a slight tilt. It also makes it easy to find cases where the quality of the lens on one side is poor and its image is always slightly blurred. In addition to the time of initial setting, the first camera 21 and the second camera 22 may also be evaluated routinely, for example at power-on or during the focusing operation immediately before imaging (shutter half-pressed).

Furthermore, a default camera (lens) may be determined based on the result of evaluating the first camera 21 and the second camera 22. Once the default camera (lens) has been determined in this way, the first camera 21 and the second camera 22 need not be evaluated for every imaging operation, and the stereo image is generated mainly using the input image acquired by the preset default camera (lens).

That is, the camera (lens) corresponding to the input image containing more high-frequency components is set by default as the imaging means mainly used when generating the stereo image. The stereo image is then generated mainly using the input image from the default camera (lens).

<H. Modification of the Method for Acquiring Frequency Characteristics>
As described above, in the image processing method according to the present embodiment, the image quality is evaluated based on the frequency characteristics of the input image. The method is therefore not limited to the edge extraction processing described above, and various methods for acquiring frequency characteristics can be adopted.

As the processing for acquiring such frequency characteristics, a method using the Fourier transform such as that disclosed in Japanese Patent Laid-Open No. 2011-128926 may be adopted. Alternatively, a first-order differential filter such as a Sobel filter, or a second-order differential filter, may be used.
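As one concrete instance of a first-order differential filter, the standard 3 x 3 Sobel kernels can be used to compute an edge amount. The sketch below is illustrative only: the function name and the choice of summing |gx| + |gy| over interior pixels are assumptions, and the image is a plain list of rows of gray levels.

```python
def sobel_edge_amount(img):
    """Sum of |gx| + |gy| over interior pixels using the 3x3 Sobel kernels.

    Border pixels are skipped because the 3x3 window does not fit there.
    """
    h, w = len(img), len(img[0])
    total = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column, weights 1-2-1.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]) \
               - (img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row, weights 1-2-1.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]) \
               - (img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1])
            total += abs(gx) + abs(gy)
    return total
```

A blurred image has weaker gradients than a sharp image of the same scene, so its edge amount is smaller.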

To improve the accuracy of the determination regarding the frequency characteristics contained in the input image, processing that uses a median filter or the like for noise removal may be included. Alternatively, after the pixel components of the input image have been converted into frequency characteristics, the image quality may be evaluated while ignoring noisy frequency bands such as the highest-frequency region.
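Both ideas can be illustrated on a one-dimensional signal such as a single image row. This is a sketch under stated assumptions: the function names are hypothetical, the median window of three samples is an arbitrary choice, and a naive discrete Fourier transform is used for brevity rather than speed.

```python
import cmath


def median3(signal):
    """3-sample median filter; the two end samples are left unchanged."""
    out = list(signal)
    for i in range(1, len(signal) - 1):
        out[i] = sorted(signal[i - 1:i + 2])[1]
    return out


def band_energy(signal, lo, hi):
    """Sum of squared DFT magnitudes over frequency bins lo .. hi-1.

    Choosing `hi` below the highest bin (n // 2) ignores the noisiest,
    highest-frequency band when judging image quality.
    """
    n = len(signal)
    energy = 0.0
    for k in range(lo, hi):
        coeff = sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        energy += abs(coeff) ** 2
    return energy
```

For an 8-sample signal the highest bin is 4, so `band_energy(signal, 1, 4)` measures the detail content while excluding that bin.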

<I. Advantages>
According to the embodiments of the present invention, even when an input image has a defect such as blur, the stereo image is generated mainly (preferentially) using the input image with the better image quality so that the defect is not shown to the user. Therefore, even if the capture of one input image has failed for some reason, the quality of the stereoscopic display can be maintained by generating the stereo image from the image with the better quality.

The embodiments disclosed herein should be considered illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims rather than by the description above, and is intended to include all modifications within the meaning and scope equivalent to the claims.

DESCRIPTION OF SYMBOLS: 1 image processing system; 2 imaging unit; 3 image processing unit; 4 image output unit; 21, 121 first camera; 21a, 22a lens; 21b, 22b image sensor; 22, 122 second camera; 23, 24 A/D conversion unit; 30 edge extraction unit; 31 corresponding point search unit; 32 distance image generation unit; 33 smoothing processing unit; 34 parallax adjustment unit; 35 image generation unit; 100 digital camera; 102 CPU; 104 digital processing circuit; 106 image processing circuit; 108 image display unit; 112 storage unit; 114 zoom mechanism; 116 acceleration sensor; 200 personal computer; 202 personal computer main body; 204 image processing program; 206 monitor; 208 mouse; 210 keyboard; 212 external storage device; 300 mobile phone.

Claims

An image processing system comprising:
first imaging means for imaging a subject to obtain a first input image;
second imaging means for imaging the subject from a viewpoint different from that of the first imaging means to obtain a second input image;
frequency characteristic acquisition means for acquiring frequency characteristics of the first and second input images; and
a stereoscopic generation unit that generates, from the first and second input images, a stereo image for stereoscopic display of the subject, mainly using the input image determined to have relatively good image quality based on the acquired frequency characteristics.

The image processing system according to claim 1, wherein the stereoscopic generation unit determines that an input image including more high-frequency components has a relatively high image quality.

The image processing system according to claim 1, wherein the frequency characteristic acquisition unit acquires the frequency characteristics using at least one of extraction of an edge amount included in the input image and frequency analysis of the input image.

The image processing system according to claim 1, wherein the stereoscopic generation unit switches, based on the acquired frequency characteristics and the positional relationship between the first and second imaging means at the time of imaging, between a processing mode that generates the stereo image mainly using one of the first and second input images and a processing mode that generates the stereo image using both the first and second input images.

The image processing system according to claim 1, wherein the stereoscopic generation unit switches a processing mode for generating the stereo image for each partial image.

The image processing system according to any one of claims 1 to 5, wherein the stereoscopic generation unit determines a processing mode to be applied to generate the stereo image for each of a plurality of frames when the first and second imaging means capture a moving image of the subject.

The image processing system according to claim 6, wherein the stereoscopic generation unit maintains the processing mode once determined for a predetermined period when the first and second imaging means capture the moving image of the subject.

The image processing system according to claim 6 or 7, wherein the stereoscopic generation unit determines the processing mode based on the first and second input images corresponding to the first frame when the first and second imaging means capture a moving image of the subject.

The image processing system according to any one of claims 1 to 8, further comprising a warning unit that warns the user of lens contamination of the corresponding imaging means when the image quality of the input image from a specific imaging means has been determined to be degraded a plurality of times.

The image processing system according to any one of claims 1 to 9, further comprising an imaging inducing unit that prompts the user to perform imaging using the first and second imaging means in order to evaluate the frequency characteristics.

The image processing system according to any one of claims 1 to 10, wherein the frequency characteristic acquisition means acquires the frequency characteristics of the first and second input images acquired at any of initial setting, power-on, and a focusing operation immediately before imaging.

The image processing system according to any one of claims 1 to 11, wherein the imaging means corresponding to the input image including more high-frequency components is set by default as the imaging means mainly used when generating the stereo image.

The image processing system according to claim 12, wherein the stereoscopic vision generation unit generates the stereo image mainly using an input image from an imaging unit set as a default.

An image processing method comprising:
imaging a subject to obtain a first input image;
imaging the subject from a viewpoint different from the viewpoint from which the first input image was captured to obtain a second input image;
acquiring frequency characteristics of the first and second input images; and
generating, from the first and second input images, a stereo image for stereoscopic display of the subject, mainly using the input image determined to have relatively good image quality based on the acquired frequency characteristics.

An image processing program for causing a computer to execute image processing, the image processing program causing the computer to function as:
first acquisition means for acquiring a first input image obtained by imaging a subject;
second acquisition means for acquiring a second input image obtained by imaging the subject from a viewpoint different from the viewpoint from which the first input image was captured;
frequency characteristic acquisition means for acquiring frequency characteristics of the first and second input images; and
a stereoscopic generation unit that generates, from the first and second input images, a stereo image for stereoscopic display of the subject, mainly using the input image determined to have relatively good image quality based on the acquired frequency characteristics.