JP2020009351A

JP2020009351A - Image processing device, image processing method, and program

Info

Publication number: JP2020009351A
Application number: JP2018132152A
Authority: JP
Inventors: Takahiro Takahashi
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-07-12
Filing date: 2018-07-12
Publication date: 2020-01-16
Anticipated expiration: 2038-07-12
Also published as: JP7327911B2

Abstract

To detect an object even when the subject distance is large.

SOLUTION: An image processing device (13) processes an output signal from an image sensor (11) in which imaging pixels and ranging pixels capable of ranging by an imaging surface phase difference method are arranged in a predetermined arrangement pattern. The device includes first generation means (130) for generating a first image based on the output signals of the imaging pixels, ranging means (132) for generating distance information based on the output signals of the ranging pixels, setting means (133) for setting, in the first image, a region farther than a predetermined distance based on the distance information, second generation means (130) for generating, for the set region, a second image having a resolution equal to or higher than the resolution corresponding to the arrangement pattern of the imaging pixels based on the output signals of the imaging pixels and the ranging pixels, and detection means (134) for detecting the region of a predetermined object from either the first image or the second image.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing technique for using an image acquired by an imaging device for object detection.

A technology is known in which a moving body such as an automobile or a robot is equipped with an imaging device and a distance measuring device, recognizes its surrounding environment, and moves autonomously. Specifically, an image obtained from the imaging device installed on the moving body is analyzed to detect a specific object (a car, a pedestrian, or the like). Next, the detected object, the distance information from the distance measuring device installed on the moving body, and the planned travel path of the moving body are evaluated together to determine the possibility of a collision with the object. An action plan, such as stopping or avoiding the object, is then created, and the behavior of the moving body is controlled accordingly. When installed in an automobile, these technologies assist the driver and are referred to as driving assistance, ADAS (advanced driver assistance systems), automated driving, and the like.

In this processing flow, there are several methods for detecting a specific object such as a car or a pedestrian. One is a feature-based method, in which feature amounts are extracted from an image acquired by the imaging device and a classifier trained in advance on those feature amounts makes the decision. Examples of feature amounts include Haar-like features and HOG (Histograms of Oriented Gradients) features. Examples of classification methods include AdaBoost and SVM (Support Vector Machine). Another is a neural network method, in which a convolutional neural network acquires both the feature extraction and the classifier directly through deep learning. In either method, the position and size of each object to be detected in an image are specified by a rectangle and used for training. However, the size of an object in the image changes with the relative distance between the object and the imaging device. When the distance between the object and the imaging device is short, the object appears large in the image. To handle such cases, hierarchical processing is commonly performed: a plurality of images with progressively reduced resolution are generated from the acquired image, object detection is performed on each image, and the results are integrated. For example, Patent Document 1 discloses a technique that detects objects more efficiently by using distance information from a distance measuring device to control how the resolution is reduced.
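The hierarchical processing mentioned above can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the method of Patent Document 1: the function names `downsample_by_two`, `build_pyramid`, and `to_original_coords`, the 2×2 averaging filter, and the fixed reduction factor of two are all hypothetical choices made for brevity. A detector would be run on every level, and the resulting rectangles mapped back to the coordinates of the original image before integration.

```python
def downsample_by_two(image):
    """Halve width and height by averaging each 2x2 block (assumed filter)."""
    rows = len(image) // 2
    cols = len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, min_size=2):
    """Return [full-res, half-res, quarter-res, ...] down to min_size pixels."""
    levels = [image]
    while len(levels[-1]) // 2 >= min_size and len(levels[-1][0]) // 2 >= min_size:
        levels.append(downsample_by_two(levels[-1]))
    return levels


def to_original_coords(box, level):
    """Map a box (x, y, w, h) found at a pyramid level back to level 0."""
    scale = 2 ** level
    return tuple(v * scale for v in box)
```

A large, nearby object that does not fit the detector's trained rectangle size at full resolution may fit at a coarser level, which is why detection is repeated on every level.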

Patent Document 2 discloses an imaging device with a ranging function that can acquire the distance from the imaging device to a subject (hereinafter, the subject distance) at a plurality of pixel positions at the same time as the captured image. Specifically, the ranging function is realized by arranging, row by row on the image plane of the image sensor, a plurality of ranging pixels that can detect the subject distance by a phase difference method.

Patent Document 1: JP 2014-142202 A
Patent Document 2: JP 2017-163539 A

In an imaging device equipped with the above-described image sensor, in which imaging pixels and ranging pixels are arranged, object detection is performed using an image generated from the imaging pixels. However, because ranging pixels also occupy part of the image sensor, the resolution of the generated image is reduced, and an object that appears small because of a long subject distance, for example, may not be detected.

Therefore, an object of the present invention is to make it possible to detect an object even when the subject distance is long.

The present invention is an image processing device that processes an output signal from an image sensor in which imaging pixels and ranging pixels capable of ranging by an imaging surface phase difference method are arranged in a predetermined arrangement pattern, the image processing device comprising: first generation means for generating a first image based on the output signals of the imaging pixels; ranging means for generating distance information based on the output signals of the ranging pixels; setting means for setting, in the first image, a region farther than a predetermined distance based on the distance information; second generation means for generating, for the region set by the setting means, a second image having a resolution equal to or higher than the resolution corresponding to the arrangement pattern of the imaging pixels, based on the output signals of the imaging pixels and the ranging pixels; and detection means for detecting the region of a predetermined object from either the first image or the second image.

According to the present invention, an object can be detected even when the subject distance is long.

FIG. 1 is an explanatory diagram of the configuration of an imaging device including an image processing device, and of the image sensor.
FIG. 2 is an explanatory diagram of the imaging optical system and the ranging pixels, and of the relationship between the parallax amount and the defocus amount.
FIG. 3 is a flowchart showing the flow of image processing.
FIG. 4 is an explanatory diagram of the generation of an image used for object detection.
FIG. 5 is an explanatory diagram of the setting of the region to be enlarged.
FIG. 6 is an explanatory diagram of a pixel arrangement shifted by half a pixel.
FIG. 7 is a diagram showing an example of application to a driving support system.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The image processing device of the present embodiment processes an output signal from an image sensor in which imaging pixels and ranging pixels capable of ranging by the imaging surface phase difference method are arranged in a predetermined arrangement pattern, and has a function of detecting a predetermined object from an image generated based on the output signal of the image sensor. In the drawings used in the following description, parts that have the same configuration or perform the same processing are denoted by the same reference numerals even when the figure numbers differ.

FIG. 1(A) is a diagram showing a schematic configuration example in which the image processing device of the present embodiment (image processing unit 13) is applied to an imaging device 1. The imaging device 1 includes an imaging optical system 10, an image sensor 11, a control unit 12, an image processing unit 13, a storage unit 14, an input unit 15, a display unit 16, and a communication unit 17.

The imaging optical system 10 is the photographing lens of the imaging device 1 and has a function of forming an optical image of a subject on the image sensor 11. The imaging optical system 10 is composed of a plurality of lens groups (not shown) and has an exit pupil 101 at a position a predetermined distance away from the image sensor 11. Regarding the x-, y-, and z-axes drawn in the figures of the present embodiment, the z-axis is parallel to the optical axis 102 of the imaging optical system 10, and the x-axis and y-axis are perpendicular to each other and to the optical axis (z-axis).

The image sensor 11 is composed of a CMOS (complementary metal-oxide-semiconductor) sensor or a CCD (charge-coupled device), and provides not only the function of imaging pixels but also the function of ranging pixels based on the imaging surface phase difference method. That is, the image sensor 11 photoelectrically converts the subject image formed on the imaging surface by the imaging optical system 10, and generates an image signal based on the subject image as well as distance information from the imaging device 1 to the subject. The imaging pixels and the ranging pixels are described in detail later.

The control unit 12 controls each unit of the imaging device 1. For example, the control unit 12 controls automatic focusing by autofocus (AF), changes of the focus position, changes of the F-number (aperture), image capture, and the storage unit 14, the input unit 15, the display unit 16, the communication unit 17, and the like.

The image processing unit 13 includes an image generation unit 130, a memory 131, a distance generation unit 132, a region setting unit 133, an object detection unit 134, and an information integration unit 135.
The image generation unit 130 performs various kinds of signal processing, such as noise removal, demosaicing, luminance signal conversion, aberration correction, white balance adjustment, and color correction, on the imaging signal supplied from the image sensor 11, and generates an image for viewing and an image used for object detection. The image for viewing is generated based on the imaging signals from the imaging pixels of the image sensor 11. The generation of the image used for object detection is described in detail later. The image data output from the image generation unit 130 is temporarily stored in the memory 131. Of the image data stored in the memory 131, the data of the image used for object detection is sent to the object detection unit 134 described later, and the data of the image for viewing is used for display on the display unit 16, transmitted to other devices, and so on. The image data used for object detection may also be transmitted to other devices.

The distance generation unit 132 generates, as described later, a distance image representing the distribution of distance information, using the signals (distance information) acquired by the ranging pixels of the image sensor 11.
The region setting unit 133 sets, as described later, a region to be enlarged based on the image generated by the image generation unit 130 or the distance image generated by the distance generation unit 132.
The object detection unit 134 detects, as described later, the region of a specific object from the image for object detection generated by the image generation unit 130, and identifies the position and size of the detected object region within the image.
The information integration unit 135 integrates, as described later, the information from the distance generation unit 132 and the object detection unit 134 and, based on that information, calculates at what distance from the imaging device 1 the object detected by the object detection unit 134 is located. The information integration unit 135 may also evaluate the detected object together with the distance information, the planned travel path of the moving body, and the like, and determine, for example, whether the detected object is an obstacle.
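This passage does not specify how the information integration unit 135 combines the detected rectangle with the distance image. One possible sketch, assuming the distance image is a 2-D grid aligned with the detection image and that missing measurements are stored as `None`, is to take the median distance inside the detected rectangle. The function name `object_distance`, the `(x, y, width, height)` box layout, and the use of the median are all assumptions of this example.

```python
from statistics import median


def object_distance(distance_image, box):
    """Median distance inside box = (x, y, width, height), in pixels.

    Entries equal to None are treated as "no measurement" (an assumption).
    Returns None when the box contains no valid measurement.
    """
    x, y, width, height = box
    samples = [
        distance_image[row][col]
        for row in range(y, y + height)
        for col in range(x, x + width)
        if distance_image[row][col] is not None
    ]
    return median(samples) if samples else None
```

The median is used here only because it tolerates a few background pixels inside the rectangle; a mean or a histogram peak would be equally plausible choices.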

Each unit of the image processing unit 13 can be implemented in hardware using logic circuits. Alternatively, the image processing unit 13 may be composed of, for example, a central processing unit (CPU) and a memory that stores an arithmetic processing program. In that case, the processing of each unit described above is realized by a software configuration in which the CPU executes the arithmetic processing program in the memory. The image processing unit 13 may also be implemented partly in hardware and partly in software.

The storage unit 14 is a non-volatile storage medium that stores data acquired by the imaging device 1, intermediate data, parameter data used by the imaging device 1, and the like. Any storage medium may be used as the storage unit 14 as long as it can be read and written at high speed and has a large capacity; a flash memory is one example.
The input unit 15 is an interface operated by the user to input information to the imaging device 1 or change its settings. For example, a dial, a button, a switch, or a touch panel can be used as the input unit 15.
The display unit 16 displays images for checking the composition at the time of shooting, various setting screens, message information, object detection results, and the like. The display unit 16 is a display device such as a liquid crystal display or an organic EL display.
The communication unit 17 has a function of transmitting to other devices the captured images and distance information generated by the image processing unit 13, information on detected objects, obstacle determination information when the information integration unit 135 has performed obstacle determination, and the like, and of receiving such information transmitted from other devices.

Next, the configuration and function of the image sensor 11 included in the imaging device 1 of the present embodiment will be described in detail with reference to FIGS. 1(B) to 1(E).
As described above, the image sensor 11 of the present embodiment includes imaging pixels and ranging pixels based on the imaging surface phase difference method, and has a function of photoelectrically converting the subject image formed on the imaging surface via the imaging optical system 10 and generating an image signal and distance information based on the subject image.

FIG. 1(B) is a schematic x-y cross-sectional view of the image sensor 11. As shown in FIG. 1(B), the image sensor 11 is formed by arranging a plurality of pixel groups 110, each of 4 rows × 4 columns. Each pixel group 110 is composed of R pixels, which are imaging pixels corresponding to red (R); G pixels and B pixels, which are imaging pixels corresponding to green (G) and blue (B); W pixels, which are imaging pixels corresponding to white (W); and M pixels, which are ranging pixels based on the imaging surface phase difference ranging method, arranged row by row. Accordingly, the pixel group 110 outputs an image signal containing the three colors R, G, and B from the R, G, and B pixels and white information from the W pixels, together with distance information from the M pixels.

FIG. 1(D) schematically shows the pass wavelength bands of the R color filter provided on the R pixels, the G color filter provided on the G pixels, and the B color filter provided on the B pixels. In FIG. 1(D), the passband 1121 is that of the R color filter, the passband 1122 that of the G color filter, and the passband 1123 that of the B color filter. The W pixels need not use a color filter, or may use a filter with a wide pass wavelength band 1124 that covers the R, G, and B pass wavelength bands, as shown in FIG. 1(E). Using the signals from the W pixels improves sensitivity and makes it possible to obtain an image with a high S/N ratio.

Next, the ranging pixels (M) of the image sensor 11 will be described.
FIG. 1(C) is a diagram schematically showing the I-I' cross section of one ranging pixel M in the pixel group 110 of FIG. 1(B). As shown in FIG. 1(C), one ranging pixel M is composed of a light guide layer 113 and a light receiving layer 114. The light guide layer 113 contains a microlens 111 for efficiently guiding the light flux incident on the ranging pixel to two photoelectric conversion units, a filter 112 that passes light in a predetermined wavelength band, wiring (not shown) for image readout and pixel driving, and the like. The filter 112 is, for example, a filter with a wide pass wavelength band like that of the W pixels, but it need not necessarily be provided. The light receiving layer 114 is composed of two photoelectric conversion units, a photoelectric conversion unit 115 and a photoelectric conversion unit 116. The wiring for image readout and pixel driving (not shown) may instead be provided in the light receiving layer 114.

Next, the operation of the photoelectric conversion units 115 and 116 of the ranging pixel M of the image sensor 11 of the present embodiment, and the light fluxes received by these two photoelectric conversion units, will be described with reference to FIGS. 2(A) and 2(B).
FIG. 2(A) is a schematic diagram showing the exit pupil 101 of the imaging optical system 10 and the light flux received by the photoelectric conversion unit 115 of a pixel in the image sensor 11 (the light flux is indicated by dotted lines in the figure). FIG. 2(B) is a schematic diagram that likewise shows the light flux received by the photoelectric conversion unit 116.

The microlens 111 shown in FIGS. 2(A) and 2(B) is arranged so that the exit pupil 101 and the light receiving layer 114 are optically conjugate. A light flux that has passed through the exit pupil 101 of the imaging optical system 10 is condensed by the microlens 111 and guided to the photoelectric conversion unit 115 or the photoelectric conversion unit 116. As shown in FIGS. 2(A) and 2(B), respectively, the photoelectric conversion units 115 and 116 mainly receive light fluxes that have passed through different pupil regions: the photoelectric conversion unit 115 receives the light flux that has passed through the pupil region 210, and the photoelectric conversion unit 116 receives the light flux that has passed through the pupil region 220.

The plurality of photoelectric conversion units 115 of the image sensor 11 mainly receive the light flux that has passed through the pupil region 210, so their output signals yield an image signal S1 based on that light flux. Similarly, the plurality of photoelectric conversion units 116 mainly receive the light flux that has passed through the pupil region 220, so their output signals yield an image signal S2 based on that light flux. The image signal S1 gives the intensity distribution of the optical image formed on the image sensor 11 by the light flux that has passed through the pupil region 210, and the image signal S2 likewise gives the intensity distribution of the optical image formed on the image sensor 11 by the light flux that has passed through the pupil region 220.

Here, the relative positional shift between the image signal S1 and the image signal S2 (the so-called parallax amount) takes a value that depends on the defocus amount. The relationship between the parallax amount and the defocus amount will be described with reference to FIGS. 2(C), 2(D), and 2(E), which schematically show the image sensor 11 and the imaging optical system 10 of the present embodiment.

In FIGS. 2(C) to 2(E), the light flux 211 is the light flux passing through the pupil region 210, and the light flux 221 is the light flux passing through the pupil region 220.
FIG. 2(C) shows the in-focus state, in which the light fluxes 211 and 221 converge on the image sensor 11. In this state, the parallax amount between the image signal S1 formed by the light flux 211 and the image signal S2 formed by the light flux 221 is 0.
FIG. 2(D) shows a state defocused in the negative z-axis direction on the image side. In this state, the parallax amount between the image signal S1 formed by the light flux 211 and the image signal S2 formed by the light flux 221 is not 0 but takes a negative value.
FIG. 2(E) shows a state defocused in the positive z-axis direction on the image side. In this state, the parallax amount between the image signal S1 formed by the light flux 211 and the image signal S2 formed by the light flux 221 takes a positive value.
A comparison of FIGS. 2(D) and 2(E) shows that the direction of the positional shift reverses according to the sign of the defocus amount. It also shows that a positional shift arises according to the defocus amount, following the imaging relationship (geometric relationship) of the imaging optical system 10. The parallax amount, which represents the positional shift between the image signal S1 and the image signal S2, can be detected by, for example, a region-based matching method that computes a correlation using a matching region and a reference region, as described later.
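As a rough illustration of region-based matching, the sketch below estimates the parallax amount between two one-dimensional signals by sliding a reference window over S2 and comparing it with a fixed matching window on S1. The sum of absolute differences (SAD) is used as the dissimilarity measure; that choice, the function name `estimate_parallax`, its parameters, integer-only shifts, and the sign convention of the result are assumptions of this example, not details taken from the source.

```python
def estimate_parallax(s1, s2, center, half_window, max_shift):
    """Return the integer shift d that best aligns s2[i + d] with s1[i].

    The matching region is s1[center - half_window : center + half_window + 1];
    the reference region is the same window on s2 displaced by d.
    Cost is the sum of absolute differences (SAD); the sign convention of d
    is an assumption of this sketch. The caller must keep the matching
    region inside s1.
    """
    start = center - half_window
    stop = center + half_window + 1
    best_shift, best_cost = 0, float("inf")
    for d in range(-max_shift, max_shift + 1):
        if start + d < 0 or stop + d > len(s2):
            continue  # reference region would fall outside the signal
        cost = sum(abs(s1[i] - s2[i + d]) for i in range(start, stop))
        if cost < best_cost:
            best_shift, best_cost = d, cost
    return best_shift
```

With identical signals the function returns 0, which corresponds to the in-focus case, and swapping the two signals flips the sign of the result, mirroring the sign reversal between the two defocus directions.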

Next, with reference to the flowchart of FIG. 3(A), the flow of processing in the image processing unit 13 of the present embodiment will be described, from the start of data acquisition, through generation of the images used for object detection and object detection using those images, to output of the detection result and integration of the information. In the following description, the processing steps S301 to S307 of the flowchart of FIG. 3(A) are referred to simply as S301 to S307, and the same convention applies to the other flowcharts described later.

In the image processing unit 13 of the present embodiment, the image generation unit 130 can generate a first object detection image and a second object detection image as images used for object detection. As described in detail later, the first object detection image is generated based on the imaging signals from the imaging pixels of the image sensor 11, and has a resolution equivalent to the resolution corresponding to the arrangement pattern of the imaging pixels. The second object detection image is generated for a region that the region setting unit 133 has set as one to be enlarged, in particular because the distance from the imaging device 1 (the subject distance) is long and objects therefore appear small; it is an enlarged image having a resolution equal to or higher than the resolution corresponding to the arrangement pattern of the imaging pixels. In the following description, the first object detection image is referred to simply as the object detection image, and the second object detection image is referred to as the enlarged object detection image.
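How the region to be enlarged is shaped is described later in the document, not in this passage. Purely as an illustration of selecting a region that is farther than a predetermined distance, the following sketch returns the bounding rectangle of all distance-image pixels that exceed a threshold. The function name `far_region`, the `(x, y, width, height)` return layout, the use of `None` for missing measurements, and the single-rectangle output are assumptions of this example.

```python
def far_region(distance_image, threshold):
    """Bounding box (x, y, width, height) of pixels farther than threshold.

    Pixels stored as None are ignored (assumed to mean "no measurement").
    Returns None when no pixel exceeds the threshold.
    """
    far = [
        (col, row)
        for row, line in enumerate(distance_image)
        for col, value in enumerate(line)
        if value is not None and value > threshold
    ]
    if not far:
        return None
    xs = [col for col, _ in far]
    ys = [row for _, row in far]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)
```

The returned rectangle would then be the part of the frame for which the enlarged object detection image is generated, while the rest of the frame is covered by the ordinary object detection image.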

First, in S301 of FIG. 3(A), the imaging device 1 captures an image according to the set focus position, aperture, exposure time, and so on; the output signal from the image sensor 11 is transferred to the image processing unit 13 and, at the same time, recorded in the memory 131.
Next, in S302, the image processing unit 13 generates an object detection image (the first object detection image, corresponding to the resolution of the image sensor 11) and a distance image.

The generation of the object detection image in S302 is described with reference to the flowchart of FIG. 3B and to FIG. 4. The object detection image is generated by the image generation unit 130.
As shown in FIG. 1B, the pixel array of the image sensor 11 is divided into rows of imaging pixels, consisting of R, G, B, and W pixels, and rows of ranging pixels (M pixels). As shown in FIG. 4, the image sensor 11 has Ws × Hs pixels: Ws pixels in the horizontal (x-axis) direction and Hs pixels in the vertical (y-axis) direction.

First, in S311, the image generation unit 130 uses the signals of the R, G, and B pixels of the image sensor 11 to generate a Bayer array image 411 of Wc × Hc pixels, where the horizontal pixel count is Wc = Ws/2 and the vertical pixel count is Hc = Hs/2. Based on this Bayer array image 411, the image generation unit 130 then performs demosaicing to generate an image for each of the R, G, and B colors whose resolution is equivalent to that determined by the arrangement pattern of the R, G, and B imaging pixels, that is, whose number of rows equals the number of imaging-pixel rows.

Next, in S312, the image generation unit 130 rescales the R, G, and B images obtained in S311 to generate an object detection image 412 in which each of the R, G, and B planes consists of Wc × Hc pixels.
In this way, the image generation unit 130 of the present embodiment generates an image whose number of rows equals the number of imaging-pixel rows of the image sensor 11 and rescales that image to produce the object detection image 412.

Although a method that does not use the W pixel signals has been described here, the W pixel signals may also be used to generate the object detection image. Specifically, a luminance image of Ws × Hc pixels is generated from the W pixel signals and used as the object detection image.
The demosaicing method is not particularly limited. For example, the R, G, and B pixel values may be interpolated using the W pixel signals so as to improve the signal-to-noise ratio.

In addition to the processing described above, the image generation unit 130 applies noise removal, luminance signal conversion, aberration correction, white balance adjustment, color correction, and similar processing to the object detection image, and records the processed data in the memory 131.

Next, the generation of the distance image in S302 of FIG. 3A is described. The distance image is generated by the distance generation unit 132.
The image signal 413 read from the M pixels, the ranging pixels of the image sensor 11 shown in FIG. 4, consists of the image signal S1 from the photoelectric conversion unit 115 and the image signal S2 from the photoelectric conversion unit 116 described earlier. The distance generation unit 132 generates a distance image 415 (distance image D) from the image signals S1 and S2.

FIG. 3C is a detailed flowchart of the distance image generation.
In S321, the distance generation unit 132 applies light amount correction to the image signals S1 and S2. At peripheral angles of view of the imaging optical system 10, vignetting makes the shapes of the pupil region 210 and the pupil region 220 differ, which can upset the light amount balance between the image signals S1 and S2. The distance generation unit 132 therefore corrects the light amounts of S1 and S2 using light amount correction values stored in advance in the memory 131.

Next, in S322, the distance generation unit 132 reduces noise originating in the image sensor 11. Specifically, it filters the image signals S1 and S2 to reduce the noise they contain. In general, the higher the spatial frequency, the lower the signal-to-noise ratio and the larger the relative noise component. The distance generation unit 132 therefore uses a low-pass filter whose pass rate decreases as the frequency increases. Note that the light amount correction in S321 often does not match the design values because of, for example, manufacturing errors in the imaging optical system 10. For this reason, it is desirable for the noise reduction in S322 to use a band-pass filter that blocks the DC component and has a low pass rate for high-frequency components.
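As a rough illustration of the band-pass filtering suggested for S322, the sketch below filters a one-dimensional signal with the difference of two box (moving-average) filters. The source does not specify the filter design; the difference-of-box-filters approach, the function names, and the window radii are all assumptions made for this example.

```python
def moving_average(signal, radius):
    """Box low-pass filter; the window is truncated at the signal edges."""
    n = len(signal)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def band_pass(signal, small_radius=1, large_radius=4):
    """Assumed band-pass: light smoothing minus heavy smoothing.

    The light smoothing attenuates high frequencies; subtracting the heavy
    smoothing removes the DC (constant offset) component.
    """
    fine = moving_average(signal, small_radius)
    coarse = moving_average(signal, large_radius)
    return [f - c for f, c in zip(fine, coarse)]
```

Because a constant offset passes through both box filters unchanged, it cancels in the difference, so a residual light amount imbalance between S1 and S2 that behaves like an offset would not affect the filtered signals.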

Next, in S323, the distance generation unit 132 calculates the amount of parallax from the image signals S1 and S2 processed in S321 and S322. Specifically, it first sets, in the image signal S1, a point of interest corresponding to the representative pixel information Isp, and sets a matching area centered on that point. The matching area is, for example, a rectangle centered on the point of interest with sides of a predetermined number of pixels. The distance generation unit 132 also sets a reference point in the image signal S2 and a reference area centered on it; the reference area has the same size and shape as the matching area. It then moves the reference point step by step, calculating at each position the degree of correlation between the portion of S1 inside the matching area and the portion of S2 inside the reference area, and takes the reference point with the highest correlation as the corresponding point for the point of interest. The relative displacement between the point of interest and the corresponding point is the amount of parallax at the point of interest. By repeating this calculation while changing the point of interest according to the representative pixel information Isp, the amount of parallax is obtained at a plurality of pixel positions. The degree of correlation can be calculated with NCC (Normalized Cross-Correlation), SSD (Sum of Squared Differences), SAD (Sum of Absolute Differences), or the like. The distance generation unit 132 obtains the distance image 415 by setting the positions at which the parallax is calculated to match those of the object detection image 412.
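The matching procedure of S323 can be sketched as follows. This is a minimal, one-dimensional simplification that uses SAD, one of the measures named above, where the lowest SAD cost stands for the highest correlation. The function names, the window half-width, and the search range are hypothetical and not taken from the source.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def disparity_at(s1, s2, x, half_window=2, max_shift=4):
    """Return the shift of s2 that best matches the window of s1 centered at x.

    s1 and s2 are one row of the two ranging signals; the caller must keep
    x at least half_window samples away from both ends of s1.
    """
    matching_area = s1[x - half_window : x + half_window + 1]
    best_shift, best_cost = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        start = x + shift - half_window
        end = x + shift + half_window + 1
        if start < 0 or end > len(s2):
            continue  # reference area would fall outside the signal
        cost = sad(matching_area, s2[start:end])
        if cost < best_cost:  # lower SAD means higher correlation
            best_shift, best_cost = shift, cost
    return best_shift
```

Calling `disparity_at` for every point of interest given by the representative pixel information would yield a parallax value per position; a real implementation would work on two-dimensional areas and would typically refine the result to sub-pixel precision.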

The distance generation unit 132 then converts the amount of parallax into a defocus amount, which is the distance from the image sensor 11 to the focal point of the imaging optical system 10, by evaluating Equation (1) with a predetermined conversion coefficient. In Equation (1), K is the predetermined conversion coefficient, ΔL is the defocus amount, and d is the amount of parallax.

ΔL = K × d    Equation (1)

Further, the distance generation unit 132 converts the defocus amount ΔL into an object distance using the lens formula of geometrical optics shown in Equation (2). In Equation (2), Da is the distance from the object plane to the principal point of the imaging optical system 10, Db is the distance from the principal point of the imaging optical system 10 to the image plane, and F is the focal length of the imaging optical system 10.

1/Da + 1/Db = 1/F    Equation (2)

In Equation (2), the distance Db can be calculated from the defocus amount ΔL, and the focal length F is known from the imaging optical system 10, so the distance Da to the object plane can be calculated.
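The conversion chain from parallax to object distance can be illustrated with a short sketch. The source states only that Db is obtained from ΔL; the relation Db = (principal point to sensor distance) + ΔL, its sign convention, the parameter names, and the use of a single length unit throughout are assumptions made here for illustration.

```python
def defocus_from_parallax(parallax, k):
    """Equation (1): defocus amount from parallax and conversion coefficient K."""
    return k * parallax


def object_distance(defocus, focal_length, sensor_distance):
    """Equation (2) solved for Da.

    Assumption: the image plane lies at sensor_distance + defocus from the
    principal point. All lengths use the same unit (for example mm).
    """
    db = sensor_distance + defocus
    # 1/Da = 1/F - 1/Db; undefined when db equals the focal length
    # (object at infinity).
    return 1.0 / (1.0 / focal_length - 1.0 / db)
```

With these hypothetical values, a focal length of 50 mm and an in-focus image plane at 50.5 mm (zero defocus) place the object plane at about 5050 mm.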

The description now returns to the flowchart of FIG. 3A. After S302, the image processing unit 13 advances the processing to S303.
In S303, the image processing unit 13 uses the distance image generated in S302 to set an area to be enlarged. The area to be enlarged is a distant area, and it is set by the area setting unit 133.
The setting of the area to be enlarged is described below with reference to FIG. 5.

In FIG. 5, the image 510 is an example of the object detection image 412 generated in S302, and the image 511 is an example of the distance image 415 generated in the same step. Assume that the object detection image 510 shows a distant object 502 and a nearby object 501.
Let Wmin (in pixels) be the size of the smallest model for which object detection is possible, Wm the real size of the model to be detected, F the focal length of the imaging optical system 10, and P the pixel pitch of the image sensor 11. The first distance Dmin, the distance from the imaging device 1 at which an object appears at the size of the smallest detectable model, is then given by Equation (3).

Dmin = Wm × F / (Wmin × P)    Equation (3)

In other words, an object farther away than the first distance Dmin (for example, the object 502) appears smaller than the minimum model size of Wmin pixels in the image captured by the imaging device 1 and cannot be detected as it is.
Further, assume that an object at or beyond a predetermined second distance Dmax from the imaging device 1 cannot be resolved in the captured image in the first place, so that it cannot be detected even if the image area containing it is enlarged. Conversely, if the distance from the imaging device 1 to the object is less than the second distance Dmax, the object can be detected by, for example, enlarging the captured image.
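To make Equation (3) concrete, the following sketch evaluates it for illustrative values that do not come from the source: a real object width Wm of 0.5 m, a focal length F of 12 mm, a minimum model size Wmin of 24 pixels, and a pixel pitch P of 0.004 mm. The function and parameter names are hypothetical.

```python
def min_model_distance(real_width, focal_length, min_pixels, pixel_pitch):
    """Equation (3): Dmin = Wm * F / (Wmin * P).

    focal_length and pixel_pitch must share a unit (here mm); the result
    is in the unit of real_width (here m).
    """
    return real_width * focal_length / (min_pixels * pixel_pitch)


# Illustrative values only: Wm = 0.5 m, F = 12 mm, Wmin = 24 px, P = 0.004 mm.
d_min = min_model_distance(0.5, 12.0, 24, 0.004)
```

Under these assumed values Dmin comes out at about 62.5 m, so an object of that width farther away than roughly 62.5 m would span fewer than 24 pixels and would need enlargement before detection.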

Based on the above, the area setting unit 133 identifies the area of the distance image that falls within the distance range Drg defined by the first distance Dmin and the second distance Dmax (Drg: from Dmin to Dmax). The area setting unit 133 then sets the area to be enlarged in the object detection image 510 on the basis of the area identified as lying within Drg. In the example of FIG. 5, the area 504 enclosed by a dotted line in the distance image 511 is the area identified as lying within the distance range Drg, and the area 504 enclosed by a dotted line in the object detection image 510 is the area set for enlargement. That is, the area setting unit 133 excludes areas at or beyond the second distance Dmax, such as a distant sky, and sets as the area 504 to be enlarged the area corresponding to the distance range Drg, which is farther than the first distance Dmin and nearer than the second distance Dmax. If the area 504 set in the object detection image 510 is then enlarged, the subsequent object detection processing can detect objects appearing in the enlarged image (512).
In addition, because the present embodiment limits the area to be enlarged, the load of the subsequent object detection processing and the amount of memory used are both lower than when the entire object detection image 510 is enlarged.
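One simple way to realize this area setting is sketched below. It assumes that the distance image is a list of rows of distances and that the area to be enlarged is the bounding rectangle of all pixels whose distance lies between Dmin and Dmax; the source does not state how the rectangle is derived, so the bounding-box rule and the function name are assumptions.

```python
def far_region(distance_image, d_min, d_max):
    """Return (x, y, width, height) of the area to enlarge, or None.

    A pixel qualifies when d_min < distance < d_max, so pixels at or
    beyond d_max (for example, sky) are excluded.
    """
    xs, ys = [], []
    for y, row in enumerate(distance_image):
        for x, d in enumerate(row):
            if d_min < d < d_max:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # no distant area: nothing to enlarge
    x0, y0 = min(xs), min(ys)
    return (x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1)
```

A return value of `None` corresponds to the case in which no area to be enlarged exists, so the enlargement step can be skipped.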

The description again returns to the flowchart of FIG. 3A. After S303, the image processing unit 13 advances the processing to S304.
In S304, the image processing unit 13 determines whether a distant area to be enlarged exists, that is, whether the area setting unit 133 has set any area for enlargement. This determination is made by, for example, the image generation unit 130. If the image generation unit 130 determines in S304 that no distant area to be enlarged exists, the processing advances to S306; if it determines that such an area exists, the processing advances to S305.

In S305, the image generation unit 130 generates an enlarged image of the area set for enlargement in S303 and then advances the processing to S306. In the example of FIG. 5, an enlarged image (512) of the area 504 set for enlargement in the object detection image 510 is generated. In the present embodiment, this enlarged image is the enlarged object detection image 512, that is, the second object detection image, generated as an image whose resolution is equal to or higher than that provided by the imaging pixels of the image sensor 11. Because the imaging pixels and the ranging pixels of the image sensor 11 are arranged row by row in the present embodiment, the image generation unit 130 generates, as the enlarged object detection image, an enlarged image whose number of rows is equal to or greater than the number of imaging-pixel rows of the image sensor 11.

As described above, the enlarged image generation in S305 of the present embodiment enlarges the area 504, set for enlargement in S303 within the object detection image generated in S302, to the resolution of the image sensor 11 or higher. For example, let the upper-left coordinates of the rectangle of the area 504 set in S303 be (xc, yc) and its size be Wt × Ht pixels. From these coordinates and this size, the image generation unit 130 calculates the corresponding position and size on the pixel array of the image sensor 11. With the pixel array of the image sensor 11 of the present embodiment, the size of the rectangle on the pixel array is obtained by doubling Wt and Ht. Let the upper-left coordinates of the rectangle on the pixel array of the image sensor 11 be (xs, ys) and its size be Wts × Hts pixels; then Wts = Wt × 2 and Hts = Ht × 2. Using this rectangle information, the image generation unit 130 generates the enlarged object detection image 512 from the object detection image 510.

The generation of an enlarged object detection image from the rectangular area is described in detail below with reference to FIG. 4.
In FIG. 4, the image 416 corresponds to the image obtained by cutting out, from the pixel array of the image sensor 11, the rectangular area with upper-left coordinates (xs, ys) and a size of Wts × Hts pixels.

The signal of an M pixel, a ranging pixel of the image sensor 11, is treated as a luminance signal M = S1 + S2 obtained by adding the image signals S1 and S2. Because neither the M pixels nor the W pixels are color-separated by a color filter, both can be treated as luminance signals in the same way. The image generation unit 130 therefore generates a luminance image 417 from the W and M pixel signals. Specifically, as in the image 417 of FIG. 4, at each R, G, and B pixel position it uses a pixel value interpolated from the values of the W and M pixels (the underlined W pixels in the figure).

Next, the image generation unit 130 cuts out of the object detection image 412 of FIG. 4 the area set for enlargement in S303 (the area 504 of FIG. 5) and enlarges it to the same size as the luminance image 417, generating the enlarged object detection image 418 of FIG. 4.

The image generation unit 130 then corrects the enlarged object detection image 418 using the luminance image 417. Because the luminance image 417 has the native resolution of the image sensor 11, its resolution is higher than that of the object detection image 412. The image generation unit 130 first calculates a color ratio at the same position as each pixel of the luminance image 417. That is, it obtains from the luminance image 417 the luminance at the positions where the R, G, and B pixels were originally located and calculates the color ratio corresponding to the luminance at each position. It then multiplies each of the R, G, and B planes of the enlarged object detection image 418 by the color ratio calculated at that position. This yields a sharp enlarged object detection image whose resolution has been corrected in each color.
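The source does not spell out how the color ratio is computed. The sketch below shows one possible reading, offered only as an assumption: at each pixel, the ratio of the high-resolution luminance to the luminance of the enlarged color image (approximated here as the mean of R, G, and B) is used as a gain on all three channels, which transfers luminance detail while preserving the ratios between the colors. The function name, the data layout, and the luminance approximation are hypothetical.

```python
def sharpen_with_luminance(rgb, luma_hi, eps=1e-6):
    """Scale each (r, g, b) pixel so that its luminance follows luma_hi.

    rgb:     rows of (r, g, b) tuples from the enlarged color image
    luma_hi: rows of luminance values at sensor resolution, same size
    """
    out = []
    for rgb_row, luma_row in zip(rgb, luma_hi):
        out_row = []
        for (r, g, b), y_hi in zip(rgb_row, luma_row):
            y_lo = (r + g + b) / 3.0      # assumed luminance of the color pixel
            gain = y_hi / max(y_lo, eps)  # eps avoids division by zero
            out_row.append((r * gain, g * gain, b * gain))
        out.append(out_row)
    return out
```

After this correction the mean of each output pixel equals the high-resolution luminance at that position, while the proportions of R, G, and B stay as they were in the enlarged image.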

To detect distant objects, images enlarged in stages may also be generated with the enlarged object detection image 418 as the source image. Although the enlarged object detection image 418 is generated here as an R, G, B color image, only a luminance image need be generated if the subsequent object detection processing also supports luminance images. The above makes it possible to generate an enlarged object detection image with high resolution and a high signal-to-noise ratio.

The description again returns to the flowchart of FIG. 3A.
On advancing to S306 after S305, the image processing unit 13 performs object detection. The object detection is performed by the object detection unit 134.
The object detection unit 134 performs object detection using the object detection image generated in S302 or, when S304 has determined that a distant area to be enlarged exists, the enlarged object detection image generated in S305. The object detection method is not particularly limited; as mentioned earlier, a feature-point-based method, a method using a neural network, or any other method may be used. As the detection result, the object detection unit 134 outputs rectangular area information indicating where in the image an object exists and how large it is. A rectangular area detected in the enlarged object detection image is converted, taking the magnification of the enlargement into account, and finally output as rectangular area information matched to the size of the object detection image.
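The conversion of a rectangle detected in the enlarged image back to the coordinates of the object detection image can be sketched as below. The source says only that the magnification is taken into account; the use of the enlarged area's upper-left corner as an offset, the tuple layout, and the function name are assumptions for illustration.

```python
def to_base_rect(detected, region_origin, magnification):
    """Map (x, y, w, h) from the enlarged image to the object detection image.

    detected:       rectangle in enlarged-image pixels
    region_origin:  upper-left corner of the enlarged area, in object
                    detection image pixels
    magnification:  enlargement factor applied to that area
    """
    x, y, w, h = detected
    ox, oy = region_origin
    return (
        ox + x / magnification,
        oy + y / magnification,
        w / magnification,
        h / magnification,
    )
```

For instance, with a magnification of 2 and an enlarged area whose corner is at (100, 50), a detection at (8, 4) of size 6 × 10 in the enlarged image would correspond to a 3 × 5 rectangle at (104, 52) in the object detection image.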

Next, in S307, the image processing unit 13 integrates the distance image generated in S302 with the object detection result output in S306. The integration is performed by the information integration unit 135. In the distance image generated in S302, the information integration unit 135 sets a rectangular area based on the rectangular area information detected in S306 and sets the distance from the imaging device 1 to the object from the distance information inside that rectangle. The method of setting the distance is not particularly limited; for example, the value of the distance information near the center of the rectangle or the median of the distance information within the rectangle may be used. This makes it possible to output how far the detected object is from the imaging device 1. The information integration unit 135 then sends the integration result to another device via the communication unit 17. One such device is the action planning device 71 mounted on the vehicle 700 of FIG. 7, described later. The action planning device 71 uses the received integration result to judge, for example, the possibility of collision with the object and to formulate an action plan for the vehicle.
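Of the options mentioned for S307, the median variant can be sketched as follows. The data layout (a list of rows of distances, with the rectangle given as integer pixel coordinates) and the function name are assumptions for this example.

```python
from statistics import median


def object_distance_in_rect(distance_image, rect):
    """Median of the distance values inside rect = (x, y, width, height)."""
    x, y, w, h = rect
    values = [d for row in distance_image[y:y + h] for d in row[x:x + w]]
    return median(values)
```

The median is less sensitive than a single central sample to background pixels that fall inside the rectangle, which is one reason it might be preferred when the detected rectangle is not tight around the object.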

As described above, in the present embodiment the image processing unit 13 uses the signals from the ranging pixels in addition to those from the imaging pixels of the image sensor 11, in which imaging pixels and ranging pixels are arranged in a predetermined pattern, and can thereby detect even objects far from the imaging device 1.
In the description above, the area to be enlarged is set using only the distances in the distance image, but this is not a limitation. For example, the object detection image 510 of FIG. 5 may be analyzed to detect a travel path, and an area that is both on the travel path and distant may be set as the area to be enlarged. Doing so reduces the usage of the memory 131, which stores the enlarged object detection image, and shortens the object detection processing on the enlarged object detection image.

In the embodiment described above, the pixels of the image sensor 11 are arranged in a square array, but this is not a limitation; for example, the imaging pixel rows and the ranging pixel rows may be offset from each other by half a pixel. FIG. 6 shows such an example. In the image sensor 61 of FIG. 6, one row of imaging pixels and one row of ranging pixels form a set, and the sets are shifted by half a pixel in the horizontal (x-axis) direction, a so-called staggered arrangement. With a staggered arrangement, the parallax resolution of the ranging pixels improves, allowing more accurate ranging than with the square arrangement. The horizontal resolution of the object detection image and of the enlarged object detection image also improves.

When a pattern such as that of the image sensor 61 of FIG. 6 is used, parts of S302 and S305 in the flowchart of FIG. 3A differ from the processing described above. Only the differences are described below.
When the object detection image is generated from the signals of the image sensor 61, its resolution is half that of the image sensor 61. In this case, in S302 the image generation unit 130 does not use pixel-shift processing, which interpolates on the basis of the half-pixel offset, and instead generates the R, G, B object detection image 412 with the same pixel layout as the Bayer array image 411 of FIG. 4.

To generate the enlarged object detection image, in S305 the image generation unit 130 first generates a luminance image 612, as shown in FIG. 6, from the W pixel signals of the image sensor 61 and from the sum of the image signals S1 and S2 of the M pixels. Specifically, taking into account the half-pixel horizontal offset, the image generation unit 130 treats the image as having twice the horizontal pixel count Ws of the image sensor 61 and first places the W and M pixels, with one row of W pixels and one row of M pixels as a set. No pixel signals exist at the positions of the underlined W and M pixels in the figure, so the image generation unit 130 calculates those pixels by interpolation from the luminance signals of the surrounding pixels, thereby generating the luminance image 612. The enlarged object detection image is thus generated using pixel-shift processing, which interpolates on the basis of the half-pixel offset.

After that, as in S305 described above, the image generation unit 130 enlarges the area of the object detection image that was set for enlargement so that it matches the luminance image 612, thereby generating the enlarged object detection image. As before, correcting the color information of the enlarged object detection image based on the luminance image 612 also makes it possible to generate an enlarged object detection image with twice the resolution of the image sensor 61. As before, the object detection unit 134 may use the luminance image 612 itself as the enlarged object detection image, or may enlarge the enlarged object detection image further before performing object detection. This makes it possible to detect objects at greater subject distances, or smaller objects, than in the example described above.
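
As an illustration of enlarging the set area to the size of the luminance image and then correcting it with that image, the following Python sketch shows one possible reading. Everything specific in it is an assumption rather than the embodiment's procedure: nearest-neighbour enlargement, the mean of R, G, B as the luminance of an enlarged pixel, and a per-pixel gain equal to the luminance-image value divided by that mean. The names `enlarge_nearest` and `correct_with_luminance` are hypothetical.

```python
def enlarge_nearest(region, out_h, out_w):
    """Enlarge a 2-D list of (R, G, B) tuples to out_h x out_w by nearest neighbour."""
    in_h, in_w = len(region), len(region[0])
    return [
        [region[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def correct_with_luminance(region_rgb, luminance):
    """Enlarge region_rgb to the size of `luminance` and rescale each pixel.

    Assumption: each enlarged pixel keeps its colour ratio (R:G:B) while its
    brightness is replaced by the corresponding luminance-image value.
    """
    out_h, out_w = len(luminance), len(luminance[0])
    enlarged = enlarge_nearest(region_rgb, out_h, out_w)
    corrected = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            r, g, b = enlarged[y][x]
            own_luma = (r + g + b) / 3.0  # assumed luminance of the enlarged pixel
            gain = luminance[y][x] / own_luma if own_luma > 0 else 0.0
            row.append((r * gain, g * gain, b * gain))
        corrected.append(row)
    return corrected


if __name__ == "__main__":
    # One low-resolution colour pixel spread over two high-resolution luminance samples.
    print(correct_with_luminance([[(30, 60, 90)]], [[120, 60]]))
```

The high-resolution detail then comes from the luminance image, while the colour comes from the lower-resolution object detection image.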

FIG. 7 is a diagram showing a schematic configuration example in which the imaging device 1 of the present embodiment is mounted on a moving body such as an automobile (vehicle 700) to realize an advanced driving assistance system or the like that recognizes the surrounding environment and moves autonomously.
As shown in FIG. 7, the vehicle 700 includes the imaging device 1, an action planning device 71, an alarm device 72, a control device 73, and a vehicle information acquisition device 74.

In the vehicle 700 of FIG. 7, the imaging device 1 is an imaging device that includes the image processing device according to the present embodiment described above, and it analyzes the acquired image as described above to detect specific objects (cars, pedestrians, and the like).
The vehicle information acquisition device 74 acquires at least one of the vehicle speed, the yaw rate, and the steering angle of the vehicle 700 as vehicle information.
The action planning device 71 comprehensively evaluates the captured image, the ranging image, and the detected-object information from the imaging device 1, together with the vehicle information from the vehicle information acquisition device 74. It recognizes the planned travel route of the vehicle 700 and the environment around the vehicle 700, and formulates an action plan for the vehicle, such as acceleration/deceleration and steering angle.

The control device 73 acquires the action plan from the action planning device 71 and controls the steering wheel, brake, accelerator, and the like so as to achieve the acceleration/deceleration, steering angle, and other values determined in that action plan.
In addition, for example, when the possibility of a collision with the vehicle ahead is high, the alarm device 72 warns the driver by sounding an audible alarm or by displaying a warning on a display device such as a car navigation system or a head-up display.
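
The text does not specify how the possibility of a collision is judged. As a purely illustrative Python sketch, the following uses time-to-collision (distance divided by closing speed), a commonly used criterion that is assumed here and not taken from the embodiment. The function names and the 2.5 s threshold are hypothetical.

```python
import math


def time_to_collision(distance_m, closing_speed_mps):
    """Return the time to collision in seconds, or infinity if the gap is not closing."""
    if closing_speed_mps <= 0:
        return math.inf
    return distance_m / closing_speed_mps


def should_warn(distance_m, own_speed_mps, lead_speed_mps, ttc_threshold_s=2.5):
    """Decide whether to warn the driver (threshold value is an assumption).

    distance_m would come from the ranging result for the detected vehicle ahead,
    own_speed_mps from the vehicle information acquisition device.
    """
    ttc = time_to_collision(distance_m, own_speed_mps - lead_speed_mps)
    return ttc < ttc_threshold_s


if __name__ == "__main__":
    print(should_warn(20.0, 20.0, 10.0))   # gap closes at 10 m/s over 20 m
    print(should_warn(100.0, 20.0, 10.0))  # same closing speed, larger gap
```

A real action planning device would combine such a measure with the planned travel route and the other vehicle information mentioned above.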

The embodiment described above gave an example in which an imaging device including the image processing device is mounted on a vehicle. However, the image processing device of the present embodiment may also be mounted on various portable terminals with a camera function, such as smartphones and tablet terminals, as well as on surveillance cameras, industrial cameras, and the like.

The present embodiment can also be realized by supplying the image processing device with a storage or recording medium that stores the program code of software implementing the functions of the embodiments described above. A computer (or a CPU, MPU, or the like) of the calculation unit reads the program code stored in the storage medium and executes the functions described above. In this case, the program code read from the storage medium itself implements the functions of the embodiments described above, and the program and the storage medium storing it constitute the present invention. Installing the program of the present invention on the computer of an imaging device that includes a predetermined imaging optical system, a predetermined imaging unit, and a computer makes the imaging device capable of high-accuracy distance detection. The program of the present invention can be distributed via the Internet as well as on a storage medium.

Each of the embodiments described above is merely a concrete example of carrying out the present invention, and the technical scope of the present invention must not be interpreted restrictively on their basis. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

1: imaging device, 10: imaging optical system, 11: image sensor, 12: control unit, 13: image processing unit, 130: image generation unit, 131: distance generation unit, 132: memory, 133: area setting unit, 134: object detection unit, 135: information integration unit, 14: storage unit, 15: input unit, 16: display unit, 17: communication unit

Claims

An image processing apparatus for processing an output signal from an imaging element in which imaging pixels and ranging pixels capable of ranging by an imaging-plane phase difference method are arranged in a predetermined arrangement pattern, the image processing apparatus comprising:
first generating means for generating a first image based on an output signal of the imaging pixels;
ranging means for generating distance information based on an output signal of the ranging pixels;
setting means for setting, in the first image, an area that is farther than a predetermined distance based on the distance information;
second generating means for generating, for the area set by the setting means, a second image having a resolution equal to or higher than the resolution corresponding to the arrangement pattern of the imaging pixels, based on the output signals of the imaging pixels and the ranging pixels; and
detecting means for detecting an area of a predetermined object from either the first image or the second image.

The image processing apparatus according to claim 1, wherein the arrangement pattern of the imaging element is a pattern in which the imaging pixels and the ranging pixels are arranged row by row, and
the second generating means generates, for the area set by the setting means, the second image having a number of rows equal to or greater than the number of rows of the imaging pixels.

The image processing apparatus according to claim 1, wherein the first generating means generates the first image having a resolution equivalent to the resolution corresponding to the arrangement pattern of the imaging pixels.

The image processing apparatus according to claim 3, wherein the arrangement pattern of the imaging element is a pattern in which the imaging pixels and the ranging pixels are arranged row by row, and
the first generating means generates an image having a number of rows equal to the number of rows of the imaging pixels, and scales that image to generate the first image.

The image processing apparatus according to claim 1, wherein, in the arrangement pattern of the imaging element, the imaging pixels and the ranging pixels are arranged with a shift of half a pixel for each set consisting of a row of the imaging pixels and a row of the ranging pixels, and
the second generating means generates, as the second image, an image enlarged by performing interpolation according to the half-pixel shift.

The image processing apparatus according to claim 5, wherein the first generating means generates the first image without performing interpolation according to the half-pixel shift.

The image processing apparatus according to claim 1, wherein the second generating means:
generates a luminance signal by adding a plurality of signals read from the ranging pixels of the imaging element, and generates a luminance image having a resolution equal to or higher than the resolution of the imaging pixels; and
generates the second image by correcting, using the luminance image, an image of the area set by the setting means that has been enlarged so as to have the same number of pixels as the luminance image.

The image processing apparatus according to claim 7, wherein the second generating means performs the correction by calculating a color ratio corresponding to the luminance for each pixel of the luminance image and multiplying the corresponding pixel of the enlarged image by the color ratio calculated for that pixel.

The image processing apparatus according to claim 1, wherein the second generating means generates the second image composed of a luminance image.

The image processing apparatus according to claim 1, wherein the setting means sets, as the area farther than the predetermined distance, an area that falls within a distance range from a first distance to a second distance farther than the first distance.

The image processing apparatus according to claim 10, wherein the first distance is a distance set based on a minimum model of an object that can be detected from the first image, and
the second distance is a distance set based on a minimum model of an object that can be detected from the second image.

The image processing apparatus according to claim 1, wherein the setting means sets the area that is farther than the predetermined distance, based on the distance information, within an area of the traveling road included in the first image.

An imaging device comprising:
an imaging element in which imaging pixels and ranging pixels capable of ranging by an imaging-plane phase difference method are arranged in a predetermined arrangement pattern; and
the image processing apparatus according to any one of claims 1 to 12.

A support device comprising:
the imaging device according to claim 13;
an action planning device that generates an action plan for a moving body using the object detection result from the imaging device; and
a control device that controls the operation of the moving body based on the action plan.

An image processing method for processing an output signal from an imaging element in which imaging pixels and ranging pixels capable of ranging by an imaging-plane phase difference method are arranged in a predetermined arrangement pattern, the image processing method comprising:
a first generating step of generating a first image based on an output signal of the imaging pixels;
a ranging step of generating distance information based on an output signal of the ranging pixels;
a setting step of setting, in the first image, an area that is farther than a predetermined distance based on the distance information;
a second generating step of generating, for the area set in the setting step, a second image having a resolution equal to or higher than the resolution corresponding to the arrangement pattern of the imaging pixels, based on the output signals of the imaging pixels and the ranging pixels; and
a detecting step of detecting an area of a predetermined object from either the first image or the second image.

A program for causing a computer to function as each unit of the image processing apparatus according to claim 1.