JP2020030569A

JP2020030569A - Image processing method, image processing device, imaging device, lens device, program, and storage medium

Info

Publication number: JP2020030569A
Application number: JP2018155205A
Authority: JP
Inventors: 法人日浅; Norito Hiasa
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-08-22
Filing date: 2018-08-22
Publication date: 2020-02-27
Anticipated expiration: 2038-08-22
Also published as: JP7191588B2

Abstract

To provide an image processing method capable of acquiring a sharpened image by correcting, with high accuracy, blur in an image that depends on the pupil of an optical system. SOLUTION: An image processing method includes the steps of: acquiring a first image obtained by imaging a subject space through a first pupil of an optical system and a second image obtained by imaging the subject space through a second pupil of the optical system different from the first pupil (S101); and generating, by using a multilayer neural network, a sharpened image in which blur depending on the pupil of the optical system has been corrected on the basis of the first image and the second image (S107). SELECTED DRAWING: Figure 5

Description

The present invention relates to an image processing method that corrects pupil-dependent blur in an image captured with the pupil of an optical system divided, and thereby sharpens the image.

Patent Literature 1 discloses a method of obtaining a sharpened image by correcting blur due to aberration in a captured image through processing based on a Wiener filter. Patent Literature 2 discloses a method of correcting blur due to a focus shift (defocus) in a captured image by using a convolutional neural network.

Patent Literature 1: JP 2011-123589 A
Patent Literature 2: JP 2017-199235 A

However, because the method disclosed in Patent Literature 1 uses processing based on a Wiener filter (linear processing), it cannot correct blur with high accuracy. For example, it cannot restore information on a subject whose spatial frequency spectrum has been reduced by blur to zero or to an intensity comparable to that of noise. In addition, the method disclosed in Patent Literature 1 amplifies noise along with correcting blur.

On the other hand, the convolutional neural network disclosed in Patent Literature 2 performs convolutions with a plurality of filters and nonlinear transformations by activation functions, and can therefore estimate a spatial frequency spectrum of the subject that has been reduced to near zero. By taking noise into account during learning, it can also perform blur correction with suppressed noise amplification. However, even when a convolutional neural network is used for blur correction, a sufficient correction effect cannot be obtained if the blur is large and the spatial frequency spectrum of the image is severely degraded.

Accordingly, an object of the present invention is to provide an image processing method, an image processing device, an imaging device, a lens device, a program, and a storage medium capable of correcting, with high accuracy, blur in an image that depends on the pupil of an optical system and obtaining a sharpened image.

An image processing method according to one aspect of the present invention includes the steps of: acquiring a first image obtained by imaging a subject space through a first pupil of an optical system and a second image obtained by imaging the subject space through a second pupil of the optical system different from the first pupil; and generating, by using a multilayer neural network, a sharpened image in which blur depending on the pupil of the optical system has been corrected, based on the first image and the second image.

An image processing device according to another aspect of the present invention includes: acquisition means for acquiring a first image obtained by imaging a subject space through a first pupil of an optical system and a second image obtained by imaging the subject space through a second pupil of the optical system different from the first pupil; and generation means for generating, by using a multilayer neural network, a sharpened image in which blur depending on the pupil of the optical system has been corrected, based on the first image and the second image.

An imaging device according to another aspect of the present invention includes an image sensor that photoelectrically converts an optical image formed by an optical system, and the image processing device.

A lens device according to another aspect of the present invention is a lens device detachable from an imaging device, and includes an optical system and storage means for storing information on weights to be input to a multilayer neural network. The imaging device includes: acquisition means for acquiring a first image obtained by imaging a subject space through a first pupil of the optical system and a second image obtained by imaging the subject space through a second pupil of the optical system different from the first pupil; and generation means for generating, by using the multilayer neural network, a sharpened image in which blur depending on the pupil of the optical system has been corrected, based on the first image, the second image, and the information on the weights.

A program according to another aspect of the present invention causes a computer to execute the image processing method.

A storage medium according to another aspect of the present invention stores the program.

Other objects and features of the present invention are described in the following embodiments.

According to the present invention, it is possible to provide an image processing method, an image processing device, an imaging device, a lens device, a program, and a storage medium capable of correcting, with high accuracy, blur in an image that depends on the pupil of an optical system and obtaining a sharpened image.

FIG. 1 is a diagram illustrating a network structure for generating a sharpened image in each embodiment.
FIG. 2 is a block diagram of the imaging device according to Embodiment 1.
FIG. 3 is an external view of the imaging device according to Embodiment 1.
FIG. 4 is an explanatory diagram of the imaging unit according to Embodiment 1.
FIG. 5 is a flowchart illustrating the process of generating a sharpened image according to Embodiment 1.
FIG. 6 is a diagram illustrating the relationship among the split pupil, the image height, and vignetting in Embodiment 1.
FIG. 7 is an explanatory diagram of pupil division at each image height and azimuth in Embodiment 1.
FIG. 8 is a flowchart of weight learning in each embodiment.
FIG. 9 is a block diagram of the image processing system according to Embodiment 2.
FIG. 10 is an external view of the image processing system according to Embodiment 2.
FIG. 11 is a configuration diagram of the image sensor according to Embodiment 2.
FIG. 12 is a flowchart illustrating the process of generating a sharpened image according to Embodiment 2.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In each of the drawings, the same members are denoted by the same reference numerals, and redundant description is omitted.

First, the blur depending on the pupil of the optical system, which is the correction target of the present invention, will be described. Blur depending on the pupil of the optical system refers to blur that is influenced by the size and shape (intensity distribution) of the pupil of the optical system. It includes blur due to aberration of the optical system, blur due to diffraction, and blur due to defocus. Aberration decreases as the aperture of the optical system is stopped down (as the pupil becomes smaller). Conversely, blur due to diffraction increases as the pupil becomes smaller. Blur due to defocus takes a shape that reflects the shape of the pupil, and the larger the pupil, the larger the blur becomes for a given defocus amount. On the other hand, other kinds of blur, such as blur due to motion of the subject or shake of the imaging device, are unrelated to the pupil of the optical system and are therefore not correction targets of the present invention. Blur correction means restoring the spatial frequency spectrum degraded by blur, that is, sharpening the image (generating a sharpened image).

Before the specific description of each embodiment, the gist of the present invention is outlined. In the present invention, a sharpened image in which blur depending on the pupil of the optical system has been corrected is obtained by deep learning from a first image captured through the pupil of the optical system (first pupil) and a second image captured through a part of that pupil (second pupil). Because the first image is captured through a large pupil, its blur due to aberration and defocus is large, but its blur due to diffraction is small. The amount of light is also large, so its noise is small. The second image, on the other hand, is captured through a small pupil, so its blur due to aberration and defocus is small, but its blur due to diffraction is large and its noise is also large. By using both the first image and the second image, the useful information in each can be exploited for blur correction, and a sharpened image that is sharp and low in noise can be obtained. In each embodiment the second pupil is a part of the first pupil, but the present invention is not limited to this; it suffices that the size or shape (transmittance distribution) of the second pupil differs from that of the first pupil.

First, the imaging device according to Embodiment 1 of the present invention will be described with reference to FIGS. 2 and 3. FIG. 2 is a block diagram of the imaging device 100. FIG. 3 is an external view of the imaging device 100. The imaging device 100 of this embodiment includes a camera body and a lens device formed integrally with the camera body, but is not limited to this configuration. The present invention is also applicable to an imaging system including a camera body (imaging device body) and a lens device (interchangeable lens) detachable from the camera body. An outline of each unit of the imaging device 100 is given first; the details are described later.

As shown in FIG. 2, the imaging device 100 includes an imaging unit 101 that acquires an image of a subject space as an image (captured image). The imaging unit 101 includes an optical system (imaging optical system) 101a that collects incident light from the subject space, and an image sensor 101b having a plurality of pixels. The image sensor 101b is, for example, a CCD (Charge Coupled Device) sensor or a CMOS (Complementary Metal-Oxide Semiconductor) sensor.

FIG. 4 is an explanatory diagram of the imaging unit 101. FIG. 4A is a cross-sectional view of the imaging unit 101, in which the dash-dotted line represents the axial light beam. FIG. 4B is a top view of the image sensor 101b. The image sensor 101b has a microlens array 122 and a plurality of pixels 121. The microlens array 122 is arranged at a position conjugate with the object plane 120 via the optical system 101a. As shown in FIG. 4B, each of the microlenses 122 constituting the microlens array 122 (only the microlens 122a is labeled; 122b and onward are omitted) corresponds to one of the plurality of pixels 121 (only the pixel 121a is labeled; 121b and onward are omitted). Here, a plurality of parts are referred to collectively by the number alone, and an individual part is referred to by the number followed by a letter such as "a".

Each of the plurality of pixels 121 has a first photoelectric conversion unit 123 and a second photoelectric conversion unit 124 that photoelectrically convert an optical image formed via the optical system 101a. Thus, for example, light incident on the pixel 121a is received separately by the first photoelectric conversion unit 123a and the second photoelectric conversion unit 124a depending on its incident angle (the first photoelectric conversion unit 123a and the second photoelectric conversion unit 124a receive light incident at mutually different angles). The incident angle of the light is determined by the position in the pupil of the optical system 101a through which the light has passed. The pupil of the optical system 101a is therefore divided into two partial pupils by the two photoelectric conversion units, and the two photoelectric conversion units in one pixel acquire information obtained by observing the subject space from mutually different viewpoints (pupil positions). In this embodiment the pupil is divided in the horizontal direction, but the division is not limited to this and may be in another direction, such as the vertical direction or an oblique direction.

The image sensor 101b outputs a signal acquired by the first photoelectric conversion unit 123 (second image, A image) and a sum signal (first image, A+B image) obtained by adding this signal (A image) to a signal acquired by the second photoelectric conversion unit 124 (third image, B image). Thus, in this embodiment, the first image and the second image are images obtained by simultaneously imaging the subject space via the optical system 101a. Also in this embodiment, the first image and the second image are images captured by the same image sensor 101b.
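
The relationship among the three signals can be illustrated with a minimal Python sketch. It assumes linear, unsaturated signals held as NumPy arrays; the function name `recover_b_image` is hypothetical and does not appear in the embodiment. Because the sensor outputs only the A image and the A+B image, the B image (third image), if it is needed, would follow by subtraction.

```python
import numpy as np

def recover_b_image(a_plus_b, a):
    """Recover the B image from the A+B image and the A image.

    Assumption (not stated in the source): both inputs are linear
    (not gamma-encoded) and unsaturated, so the sum signal is exactly A + B.
    """
    a_plus_b = np.asarray(a_plus_b, dtype=np.float64)
    a = np.asarray(a, dtype=np.float64)
    if a_plus_b.shape != a.shape:
        raise ValueError("A+B image and A image must have the same shape")
    # B image = (A+B image) - (A image)
    return a_plus_b - a
```

The embodiment itself feeds only the A+B image and the A image to the later processing, so this subtraction is shown purely to make the signal definitions concrete.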

The A image and the A+B image are output to the image processing unit 102. The image processing unit (image processing device) 102 includes an information acquisition unit (acquisition means) 102a and an image generation unit (generation means) 102b, and executes the image processing method of this embodiment (sharpening processing that corrects blur depending on the pupil of the optical system 101a). In doing so, the image processing unit 102 acquires weight information (information on weights) stored in the storage unit (storage means) 103, and the image generation unit 102b generates a sharpened image by using the weight information. The details of this processing are described later. The generated sharpened image is stored in the recording medium 105. When the user issues an instruction to display a captured image, the stored sharpened image is read out and displayed on the display unit 104. Alternatively, an A image and an A+B image already stored in the recording medium 105 may be read out and a sharpened image generated from them by the image processing unit 102. This series of controls is performed by the system controller 106.

Next, the sharpening processing (generation of a sharpened image) executed by the image processing unit 102 will be described with reference to FIG. 5. In the sharpening processing, the image processing unit 102 uses weight information learned in advance; the details of this learning are described later. FIG. 5 is a flowchart illustrating the method of generating a sharpened image. Each step in FIG. 5 is executed by the image processing unit 102 based on a command from the system controller 106.

First, in step S101, the information acquisition unit 102a acquires an A+B image (first image) 201 and an A image (second image) 202. The A image 202 is an image obtained by imaging the subject space based on a light beam passing through a partial pupil (second pupil) that is a part of the pupil of the optical system 101a. The A+B image 201 is an image obtained by imaging the subject space based on a light beam passing through the pupil (first pupil) of the optical system 101a. In this embodiment, the second pupil is included in the first pupil and is a part of it. Because the second pupil is smaller than the first pupil, the A image has smaller blur due to aberration and due to defocus than the A+B image, but larger blur due to diffraction and larger noise. According to this embodiment, by using both the A+B image and the A image, the useful information in each can be exploited for the blur correction described later, and a sharpened image that is sharp and low in noise can be generated. In addition, with the configuration of the imaging unit 101 shown in FIG. 4, the A+B image and the A image, which have different pupil sizes, can be captured simultaneously. This avoids, for example, misalignment between the images caused by movement of the subject.

Subsequently, in step S102, the image generation unit 102b performs processing to match the brightness of the A+B image and the A image. Because the A image is captured through a smaller pupil than the A+B image, it is darker. In addition, because vignetting occurs at image heights off the optical axis, the brightness ratio (light amount ratio) between the A+B image and the A image varies with the image height and azimuth. This is described with reference to FIG. 6.

FIG. 6 is a diagram illustrating the relationship among the split pupil, the image height, and vignetting. FIG. 6A shows the pupil on the optical axis of the optical system 101a. The broken lines in FIG. 6 represent the lines along which the pupil is divided by the two photoelectric conversion units. FIG. 6B shows the pupil at an image height different from that of FIG. 6A. In FIG. 6A the light amounts of the two split pupils are equal, but in FIG. 6B vignetting biases the light amount ratio between them. FIG. 6C shows the case of the same image height as in FIG. 6B (a position at the same distance from the optical axis in a plane perpendicular to the optical axis) but a different azimuth (the azimuth angle about the optical axis in a plane perpendicular to the optical axis). In this case as well, the light amount ratio of the partial pupils changes. Consequently, if the A+B image and the A image are input as they are to the multilayer neural network described later, the relationship between the brightness of the two images varies with the image height and azimuth within the image, and the accuracy of the generated sharpened image may decrease. In this embodiment, it is therefore preferable to execute processing that matches the brightness of the A+B image and the A image. In this embodiment the brightness of the A image is matched to that of the A+B image, but the reverse is also possible.

Two examples of methods for matching the brightness of the two images are given below. The first is a method of matching the brightness based on the light amount ratio between the first pupil and the second pupil (the ratio of the transmittance distributions of the first pupil and the second pupil). For the pixels of the A image at each image height and azimuth, the light amount ratio of the first pupil to the second pupil is read from the storage unit 103 and multiplied by the pixel values, so that the brightness matches that of the A+B image. The light amount ratio is a value of 1 or more and differs depending on the image height and azimuth. Alternatively, for each image height and azimuth, a first integral value and a second integral value obtained by integrating the transmittance distributions of the first pupil and the second pupil, respectively, may be acquired and used for brightness matching. The brightness can also be matched by multiplying the pixels of the first image at each image height and azimuth by the reciprocal of the corresponding first integral value, and multiplying the pixels of the second image at each image height and azimuth by the reciprocal of the corresponding second integral value.
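
The first method can be sketched in Python as follows. This is only an illustration under stated assumptions: the per-pixel light amount ratio and the two integral values are assumed to have already been expanded from their image-height and azimuth tables into arrays (or scalars) that broadcast against the image, and the function names `match_by_light_ratio` and `match_by_integrals` are hypothetical.

```python
import numpy as np

def match_by_light_ratio(a, ratio_map):
    """Scale the A image by the light amount ratio (first pupil / second pupil).

    ratio_map is assumed to hold, for every pixel, the ratio looked up for
    that pixel's image height and azimuth.
    """
    a = np.asarray(a, dtype=np.float64)
    ratio_map = np.asarray(ratio_map, dtype=np.float64)
    if np.any(ratio_map < 1.0):
        raise ValueError("the light amount ratio is expected to be 1 or more")
    return a * ratio_map

def match_by_integrals(a_plus_b, a, integral_1, integral_2):
    """Normalize each image by the integrated transmittance of its own pupil.

    integral_1 / integral_2 are the first / second integral values
    (scalars or per-pixel arrays).
    """
    first = np.asarray(a_plus_b, dtype=np.float64) / integral_1
    second = np.asarray(a, dtype=np.float64) / integral_2
    return first, second
```

With the first function only the A image changes; with the second, both images are brought to a common scale, which is equivalent up to a shared per-pixel factor.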

The second is a method that uses local average pixel values of the A+B image and the A image. The A+B image and the A image differ in aberration and noise and also have parallax, but because they capture the same subject, the ratio of the average pixel values in a partial region roughly corresponds to the light amount ratio described above. Therefore, for example, a smoothing filter is applied to the A+B image and the A image to obtain an average pixel value for each pixel, the light amount ratio at each position is obtained from the ratio of the average pixel values of the pixels at the same position, and the brightness can then be matched. However, if pixels with luminance saturation are included when the average pixel value is calculated, the value may deviate from the light amount ratio. In this embodiment, it is therefore preferable to calculate the average pixel value excluding luminance-saturated pixels. If the luminance-saturated area is so large that the average pixel value at a position cannot be obtained, the light amount ratio for that position can be calculated by interpolation from the light amount ratios calculated in its surroundings. The size of the partial region is preferably determined based on the size of the blur and the baseline length between the first pupil and the second pupil (the distance between their centroid positions). Step S102 may be executed at any time between step S101 and step S107.
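
The second method can be sketched as follows, as a minimal NumPy illustration rather than the embodiment's implementation. The box-shaped smoothing filter, the window `radius`, the saturation level `sat_level`, and the function names are all assumptions made for this sketch. Positions where no unsaturated pixel falls inside the window are returned as NaN; the interpolation from surrounding ratios described above is left to the caller.

```python
import numpy as np

def box_sum(x, radius):
    """Sum of x over a (2*radius+1)^2 window, with zero padding at the borders."""
    k = 2 * radius + 1
    padded = np.pad(x, radius, mode="constant")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    return ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]

def local_light_ratio(a_plus_b, a, radius=8, sat_level=1.0, eps=1e-6):
    """Estimate the per-pixel light amount ratio (A+B)/A from local averages.

    Luminance-saturated pixels (>= sat_level in either image) are excluded.
    """
    a_plus_b = np.asarray(a_plus_b, dtype=np.float64)
    a = np.asarray(a, dtype=np.float64)
    valid = ((a_plus_b < sat_level) & (a < sat_level)).astype(np.float64)
    count = box_sum(valid, radius)
    # Both sums use the same valid pixels, so their ratio equals the ratio of means.
    sum_ab = box_sum(a_plus_b * valid, radius)
    sum_a = box_sum(a * valid, radius)
    ratio = np.full(a.shape, np.nan)
    ok = (count > 0) & (sum_a > eps)
    ratio[ok] = sum_ab[ok] / sum_a[ok]
    return ratio
```

The A image would then be multiplied by the returned ratio (after filling any NaN entries) to match the brightness of the A+B image.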

Subsequently, in step S103 of FIG. 5, the image generation unit 102b determines whether the F-number (aperture value) of the A+B image or the A image is equal to or smaller than a threshold (a predetermined F-number). In this embodiment, the F-number of the optical system 101a at the time the A+B image was captured is used as the criterion. If the F-number is equal to or smaller than the threshold, the process proceeds to step S104. If the F-number is larger than the threshold, the process proceeds to step S106. When the F-number is smaller than the threshold, the user is considered to have intended to blur everything other than the main subject by defocus, as in a portrait. Conversely, when the F-number is larger than the threshold, the user is considered to have intended to deepen the depth of field and capture an image close to pan-focus. Moreover, when the F-number is large, aberration is sufficiently small, and the degradation of imaging performance is mainly blur due to diffraction. It is therefore preferable to switch the sharpening target so that blur due to diffraction and defocus is mainly corrected when the F-number is larger than the threshold, and aberration is mainly corrected when the F-number is equal to or smaller than the threshold.
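
The branch in step S103 can be summarized with a small Python sketch. The function name `select_preprocessing` and the dictionary keys are hypothetical, and the source does not give a numerical threshold, so it is left as a parameter.

```python
def select_preprocessing(f_number, threshold):
    """Decide the sharpening target and the next step from the F-number.

    f_number:  F-number of the optical system when the A+B image was captured.
    threshold: predetermined F-number (value not specified in the source).
    """
    if f_number <= threshold:
        # Wide aperture: mainly correct aberration; apply the sharpening
        # filter as preprocessing (step S104).
        return {"target": "aberration",
                "use_sharpening_filter": True,
                "next_step": "S104"}
    # Narrow aperture: mainly correct diffraction and defocus blur;
    # skip the sharpening filter and go to step S106.
    return {"target": "diffraction_and_defocus",
            "use_sharpening_filter": False,
            "next_step": "S106"}
```

For example, with an illustrative threshold of 4.0, an image captured at F2.8 would go through step S104, while one captured at F8 would go directly to step S106.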

Subsequently, in step S104, the image generation unit 102b applies a sharpening filter to the A+B image or the A image as preprocessing before the multilayer neural network is used. Here, the sharpening filter is a filter that performs sharpening based on optical characteristics (the optical transfer function or the point spread function) representing the imaging performance of the optical system 101a. An inverse filter such as a Wiener filter is used as the sharpening filter. In this embodiment the sharpening filter is applied to the A+B image, but a sharpening filter may likewise be applied to the A image. In that case, a sharpening filter generated based on the optical transfer function or the point spread function of the image acquired by the first photoelectric conversion unit 123 is used. The sharpening filter is generated for each image height and azimuth by using information on the optical transfer function stored in the storage unit 103. The sharpening filters themselves for each image height and azimuth may instead be stored in the storage unit 103. Because the correction target of the sharpening filter is aberration, correction of blur due to defocus is not included. As an approximation, a sharpening filter obtained from the optical characteristics at the in-focus image plane may be applied to the entire A+B image.

The sharpening filter of this embodiment is a Wiener filter. It therefore cannot restore the spatial frequency spectrum of the A+B image where blur has reduced it to near zero. It also causes adverse effects such as noise amplification and ringing. However, by inputting to the neural network described later the A+B image to which the Wiener filter has been applied, rather than the A+B image still blurred as captured, the correction effect of the neural network (sharpening and suppression of adverse effects) can be made robust against changes in the shape of the blur. In particular, because blur due to aberration can vary greatly with the zoom, focusing distance, image height, and azimuth of the optical system 101a, it is preferable to improve robustness. If robustness is low, a neural network must be trained individually for each aberration shape, and the amount of weight information held in the storage unit 103 increases. Blur due to diffraction, by contrast, hardly changes with the zoom, focusing distance, image height, and azimuth of the optical system 101a; only its size changes with the F-number. In addition, when the F-number is large, the depth of field is deep, so blur due to defocus also varies little. For this reason, in this embodiment, when the F-number is larger than the threshold (predetermined F-number), that is, when blur due to diffraction and defocus is corrected, the need for robustness is relatively low and the sharpening filter is not used. However, the sharpening filter may be applied as preprocessing even when the F-number is large. A sharpening filter other than the Wiener filter can also be used in this embodiment.

Subsequently, in step S105, the image generation unit 102b divides each of the A + B image and the A image along a straight line that is parallel to the axis about which the second pupil for an object point on the optical axis is line-symmetric, and that passes through the reference point (the optical axis or its vicinity) of the respective image. The image generation unit 102b also applies preprocessing that controls inversion (inversion processing) to the divided A + B image and A image, or to the weight information. In this embodiment, the pupil of the optical system 101a is divided into two in the horizontal direction, as shown in FIG. 6(A). The A + B image and the A image are therefore each divided into upper and lower halves, and one of the halves (or the weight information) is inverted. Performing the sharpening process based on the A + B image and A image after the inversion processing (or on the weight information after the inversion processing) makes it possible to generate a sharpened image while reducing the amount of weight information. This is described with reference to FIG. 7.

FIG. 7 is an explanatory diagram of pupil division at each image height and azimuth. FIG. 7 shows the A image; next to each × mark, the divided pupil at the image height and azimuth of that mark is drawn. The broken line in FIG. 7 is the pupil dividing line. As shown in FIG. 7, in this embodiment, inverting either the upper or lower half of the A image about the dash-dot line makes its divided pupils coincide with those of the other half; that is, the pupil division is line-symmetric. The blur due to aberration, diffraction, or defocus is therefore likewise line-symmetric about the dash-dot line. Consequently, if weight information for correcting blur is held for only one of the regions above and below the dash-dot line, the sharpened image for the other region can be estimated by inverting the image or the weight information.

Here, inversion includes reversing the order of reference when the product of the image and the weight information is taken. The A + B image has a circular pupil and therefore has aberration that is rotationally symmetric about the optical axis, which is also line-symmetric about the dash-dot line. The weight information can therefore be reduced in the same way as for the A image. In this embodiment, the pupil is divided in the horizontal direction, so the axis of symmetry is a horizontal straight line. If the pupil were divided in the vertical direction, the axis of symmetry would be a vertical straight line. Stated more generally, the axis about which the relationship between the divided pupils is line-symmetric over the entire image passes through the optical axis and is parallel to the axis about which the second pupil is line-symmetric on the optical axis. For the A + B image and the A image divided along this axis, if weight information is held for only one of the divided regions, the other region can be sharpened with the same weight information by controlling inversion. As with step S104, step S105 may also be executed for blur due to diffraction or defocus. The order of step S104 and step S105 may be reversed.
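
The split-and-invert idea of step S105 can be sketched in Python as follows. This is an illustration under assumptions, not the described implementation: the reference point is assumed to lie at the vertical centre of the image, and `sharpen_upper` is a hypothetical callable standing in for the neural network loaded with the weight information learned for the upper half only.

```python
import numpy as np

def sharpen_with_shared_weights(image, sharpen_upper):
    """Sharpen a whole image with weights learned for the upper half only.

    image: array whose first axis is the vertical direction.
    sharpen_upper: hypothetical function that sharpens an upper-half image.
    """
    mid = image.shape[0] // 2  # horizontal line through the assumed reference point
    upper, lower = image[:mid], image[mid:]

    out_upper = sharpen_upper(upper)
    # Mirror the lower half about the horizontal axis so that its blur matches
    # the upper half, process it, then mirror the result back.
    out_lower = np.flipud(sharpen_upper(np.flipud(lower)))
    return np.concatenate([out_upper, out_lower], axis=0)
```

Inverting the weight information instead of the image, as the text also allows, would flip the filters vertically and leave the image untouched.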

Subsequently, in step S106, the information acquisition unit 102a acquires weight information from the storage unit 103 based on the F-number of the optical system at the time the first image or the second image was captured. As described for step S103, in this embodiment, weight information for correcting blur due to aberration is acquired when the F-number is smaller than the threshold, and weight information for correcting blur due to diffraction and defocus is acquired when the F-number is larger than the threshold. The two sets of weight information are learned from different learning data, as detailed later. Step S106 may be executed at any time between step S103 and step S107.
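
The F-number branching described for steps S104 and S106 might be expressed as in the short Python sketch below. The function name, argument names, and the handling of an F-number exactly equal to the threshold (treated here as the aberration case) are assumptions made for illustration; the source does not give a concrete threshold value.

```python
def select_processing(f_number, threshold, weights_aberration, weights_diffraction):
    """Return (apply_sharpening_filter, weight_info) for a given F-number."""
    if f_number > threshold:
        # Large F-number: diffraction and defocus blur are corrected,
        # and the sharpening pre-filter is skipped.
        return False, weights_diffraction
    # Small F-number: aberration blur is corrected, and the sharpening
    # filter is applied before the neural network.
    return True, weights_aberration
```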

Subsequently, in step S107, the image generation unit 102b generates a sharpened image using the multilayer neural network. The (preprocessed) first image and second image and the weight information are input to the multilayer neural network. In this embodiment, a convolutional neural network (CNN) is used as the multilayer neural network. However, the present invention is not limited to this, and another method such as a GAN (Generative Adversarial Network) may be used.

Here, the process of generating the sharpened image 213 with the CNN is described in detail with reference to FIG. 1. FIG. 1 shows the network structure for generating the sharpened image. The CNN has a plurality of convolution layers. In this embodiment, the input image 201 is the (preprocessed) first image and second image concatenated in the channel direction. When the first image and the second image each have a plurality of color channels, the input image has twice that number of channels. In the first convolution layer 202, the input image 201 is convolved with a plurality of filters and a bias is added. The filter and bias values in each layer are determined by the weight information. The first feature map 203 collects the results calculated for each filter. The first feature map 203 is input to the second convolution layer 204, where it is likewise convolved with a new set of filters and a bias is added. This is repeated, and the result obtained by inputting the (N−1)th feature map 211 to the Nth convolution layer 212 is the sharpened image 213. Here, N is a natural number of 3 or more. In general, a CNN having three or more convolution layers is said to constitute deep learning. In each convolution layer, a nonlinear transformation using an activation function is performed in addition to the convolution. Examples of the activation function include the sigmoid function and the ReLU (Rectified Linear Unit). The first embodiment uses the ReLU expressed by the following equation (1).

$$f(x) = \max(x, 0) \tag{1}$$

In equation (1), max denotes the MAX function, which outputs the largest of its arguments. The nonlinear transformation need not be performed in the final Nth convolution layer. Through the above processing, blur that depends on the pupil of the optical system can be corrected in the image with high accuracy, and a sharpened image can be obtained.
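
To make the forward pass through the network of FIG. 1 concrete, here is a minimal NumPy sketch. It assumes odd square filters, zero padding so that the image size is preserved, and cross-correlation (the usual convention in deep-learning frameworks) in place of flipped-kernel convolution; none of these details are specified in the source, and the function names are illustrative.

```python
import numpy as np

def relu(x):
    # ReLU activation: max(x, 0) applied element-wise.
    return np.maximum(x, 0.0)

def conv_layer(x, filters, bias):
    """x: (C_in, H, W); filters: (C_out, C_in, k, k) with odd k; bias: (C_out,)."""
    c_out, _, k, _ = filters.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))  # zero padding keeps H and W
    h, w = x.shape[1:]
    out = np.zeros((c_out, h, w))
    for i in range(k):
        for j in range(k):
            # Contribution of filter tap (i, j), summed over input channels.
            out += np.einsum("oc,chw->ohw", filters[:, :, i, j],
                             xp[:, i:i + h, j:j + w])
    return out + bias[:, None, None]

def cnn_sharpen(first_image, second_image, weight_info):
    """weight_info: list of (filters, bias) pairs, one per convolution layer."""
    # Concatenate the two input images in the channel direction.
    x = np.concatenate([first_image, second_image], axis=0)
    for n, (filters, bias) in enumerate(weight_info):
        x = conv_layer(x, filters, bias)
        if n < len(weight_info) - 1:  # no activation after the last layer
            x = relu(x)
    return x
```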

Next, learning of the weight information is described with reference to FIG. 8. FIG. 8 is a flowchart of the learning of the weight information. In this embodiment, learning is executed in advance by an image processing device other than the imaging device 100, and the results (a plurality of sets of weight information) are stored in the storage unit 103. However, the present invention is not limited to this, and the imaging device 100 may include a unit that executes learning.

First, in step S201, the image processing device acquires a plurality of learning pairs. A learning pair consists of an A + B image and an A image serving as the input images of the CNN, and the image to be obtained as the output image (sharpened image) of the CNN (the correct image). What the CNN corrects depends on the relationship between the input images and the correct image of the learning pairs. When the F-number is smaller than the threshold, the blur to be corrected is aberration. Because noise should also be suppressed, the input images and the correct image differ in the presence or absence of aberration and noise. Because blur due to defocus is not corrected, the depth of field of the correct image is made the same as that of the A + B image. When the F-number is larger than the threshold, the images differ in the presence or absence of blur due to diffraction and defocus and of noise.

Here, a method of generating the learning pairs is described. First, source data from which the input images (the A + B image and the A image) and the correct image are generated is prepared. The source data is a three-dimensional model or a two-dimensional image having spectral intensity up to sufficiently high spatial frequencies. The three-dimensional model can be generated by CG (computer graphics) or the like. The two-dimensional image may be an image captured with an optical system of sufficiently high imaging performance, or an image whose high-frequency components have been strengthened by reducing its size. The A image and the B image are obtained by simulating the capture of the source data by the imaging unit 101. In the imaging simulation, the blur and noise generated by the optical system 101a and the image sensor 101b are added.

The A + B image is obtained by adding the generated A image and B image. Generating it by addition gives the noise of the A + B image and the A image a realistic relationship (the two noises are not independent of each other). The correct image can be generated by an imaging simulation from which the target of correction has been removed. When the F-number is smaller than the threshold, the correct image is an image of the source data captured with an aberration-free optical system and a noise-free image sensor. When the F-number is larger than the threshold, it is an image of the source data captured without blur due to diffraction and defocus and without noise. When the source data is a two-dimensional image, it is preferable to place the two-dimensional image at various defocus distances, run the imaging simulation for each, and create a plurality of corresponding learning pairs. However, even when learning pairs are created with the two-dimensional image placed only at the focusing distance, a CNN that can approximately correct the desired target can be generated. This is because, when the F-number is smaller than the threshold, the learning pairs contain no defocus blur and the learned CNN does not correct defocus blur either (the CNN corrects aberration on the assumption that every subject is at the focusing distance).
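
The learning-pair generation described above can be sketched roughly as follows for the small-F-number (aberration) case with a two-dimensional source image. Several things here are assumptions made for illustration and are not taken from the source: blur is modeled as circular convolution with one PSF per pupil, noise is additive Gaussian with standard deviation `noise_sigma`, the sum of each PSF stands for the light throughput of that pupil, and the function names are hypothetical.

```python
import numpy as np

def circular_blur(image, psf):
    """Blur `image` with `psf` by circular convolution in the frequency domain."""
    kh, kw = psf.shape
    pad = np.zeros_like(image, dtype=float)
    pad[:kh, :kw] = psf
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(pad)))

def make_learning_pair(source, psf_a, psf_b, noise_sigma, rng):
    """Return ((A+B image, A image), correct image) simulated from `source`."""
    a = circular_blur(source, psf_a) + rng.normal(0.0, noise_sigma, source.shape)
    b = circular_blur(source, psf_b) + rng.normal(0.0, noise_sigma, source.shape)
    # Adding the simulated A and B images makes the noise of the A+B image
    # contain the noise of the A image, so the two are not independent.
    a_plus_b = a + b
    # Correct image: no aberration and no noise, scaled to the brightness of
    # the full pupil (assumed here to be the sum of the two PSF throughputs).
    correct = source * (psf_a.sum() + psf_b.sum())
    return (a_plus_b, a), correct
```

For the large-F-number case, the same structure would apply with diffraction and defocus PSFs in place of aberration PSFs.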

When the F-number is larger than the threshold, creating learning pairs for diffraction blur at various F-numbers trains a CNN whose correction is robust to the size of the blur. This makes it possible to simultaneously correct the small defocus blur in an image with a deep depth of field. When the A + B image and the A image are each divided as in step S105 of FIG. 5, only the blur within the range (image height and azimuth) of the retained part of the divided image needs to be learned. Learning pairs therefore need to be created only for blur at the corresponding image heights and azimuths. When the brightness of the A + B image and the A image is matched as in step S102 of FIG. 5, the same brightness matching is applied to the learning pairs.

Subsequently, in step S202 of FIG. 8, the image processing device learns from the plurality of learning pairs and generates weight information. Learning uses the same network structure as the generation of the sharpened image in step S107. In this embodiment, the A + B image and the A image are input to the network structure shown in FIG. 1, and the error between the output (the estimated sharpened image) and the correct image is calculated. The filters and biases (weight information) used in each layer are updated and optimized, using backpropagation or the like, so that this error is minimized. The initial values of the filters and biases are arbitrary and can be determined, for example, from random numbers. Alternatively, pre-training such as an autoencoder, which learns the initial values layer by layer in advance, may be performed.

A method in which all learning pairs are input to the network structure and the learning information is updated using all of them is called batch learning. The computational load of this method, however, becomes enormous as the number of learning pairs grows. Conversely, a method that uses only one learning pair per update of the learning information, with a different learning pair for each update, is called online learning. This method has the advantage that the amount of computation does not grow with the number of learning pairs, but it is strongly affected by the noise present in a single learning pair. It is therefore preferable to learn with the mini-batch method, which lies between these two. The mini-batch method extracts a small number of learning pairs from the full set and updates the learning information using them. The next update extracts and uses a different small set of learning pairs. Repeating this reduces the drawbacks of both batch learning and online learning.
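
The mini-batch update loop can be illustrated with a deliberately tiny stand-in for the CNN: a single gain `w` fitted so that `w * x` approximates `y`. The toy model, the learning rate, and the function name are assumptions chosen only to keep the example short; the sampling pattern (a fresh small subset of learning pairs for each update) is the point being shown.

```python
import numpy as np

def train_gain(inputs, targets, batch_size=4, num_updates=200, lr=0.1, seed=0):
    """Fit a scalar gain w by mini-batch gradient descent on squared error."""
    rng = np.random.default_rng(seed)
    w = 0.0
    for _ in range(num_updates):
        # Draw a different small subset of learning pairs for every update.
        idx = rng.choice(len(inputs), size=batch_size, replace=False)
        x, y = inputs[idx], targets[idx]
        grad = np.mean(2.0 * (w * x - y) * x)  # d/dw of the mean squared error
        w -= lr * grad
    return w
```

Setting `batch_size` to the number of learning pairs gives batch learning, and setting it to 1 gives online learning in the sense used above.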

Subsequently, in step S203, the image processing device outputs the learned weight information. In this embodiment, the same learning is performed for at least two cases, one in which the F-number is equal to or less than the threshold and one in which it is larger than the threshold, and a plurality of sets of weight information are output. In this embodiment, the weight information is stored in the storage unit 103.

The images handled when learning the weight information and generating the sharpened image may be either RAW images or developed images. When the A + B image and the A image are encoded, they are decoded before learning and generation. When the images used for learning and the input images for sharpened-image generation differ in whether gamma correction has been applied or in gamma value, it is preferable to process the input images to match the learning images. It is also preferable to normalize the signal values of the A + B image and the A image (and, during learning, the correct image) before they are input to the neural network. Without normalization, the sharpened image cannot be estimated correctly if the bit depth differs between learning and sharpened-image generation. Because the scale changes with the bit depth, convergence of the optimization during learning may also be affected. The maximum value the signal can actually take (the luminance saturation value) is used for normalization. For example, even when the A + B image is stored in 16 bits, the luminance saturation value may be a 12-bit value; in that case the signal range does not become 0 to 1 unless normalization uses the 12-bit maximum (4095). It is also preferable to subtract the optical black value during normalization. This brings the range of signals the image can actually take closer to 0 to 1. Specifically, normalization according to the following equation (2) is preferable.

$$s_{\mathrm{nor}} = \frac{s - s_{\mathrm{OB}}}{s_{\mathrm{satu}} - s_{\mathrm{OB}}} \tag{2}$$

In equation (2), $s$ is the signal of the A + B image (or of the A image or the correct image), $s_{\mathrm{OB}}$ is the signal value of optical black (the minimum signal value the image can take), $s_{\mathrm{satu}}$ is the luminance saturation value of the signal, and $s_{\mathrm{nor}}$ is the normalized signal.
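
The normalization of equation (2) can be written in a few lines of Python. The expression used here, (s − s_OB) / (s_satu − s_OB), is read from the symbol definitions above (subtract optical black, then scale so that the saturation value maps to 1); the function name and the example optical black level of 64 are hypothetical.

```python
import numpy as np

def normalize_signal(s, s_ob, s_satu):
    """Map raw signal values so that optical black -> 0 and saturation -> 1."""
    return (np.asarray(s, dtype=float) - s_ob) / (s_satu - s_ob)

# Example: 12-bit saturation value stored in a 16-bit container,
# with an assumed optical black level of 64.
raw = np.array([64, 2048, 4095], dtype=np.uint16)
normalized = normalize_signal(raw, s_ob=64, s_satu=4095)
```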

According to this embodiment, it is possible to provide an image processing method, an image processing device, an imaging device, and a lens device capable of correcting pupil-dependent blur of an optical system in an image with high accuracy and obtaining a sharpened image.

Next, an image processing system according to the second embodiment of the present invention is described. In this embodiment, an image processing device that estimates the sharpened image, an imaging device that acquires the captured images, and a server that performs learning exist separately.

The image processing system of this embodiment is described with reference to FIGS. 9 and 10. FIG. 9 is a block diagram of the image processing system 300. FIG. 10 is an external view of the image processing system 300. As shown in FIGS. 9 and 10, the image processing system 300 includes an imaging device 301, an image processing device 302, a server 306, a display device 309, a recording medium 310, and an output device 311.

The basic configuration of the imaging device 301 is the same as that of the imaging device 100 shown in FIG. 2, except for the image processing unit that generates the sharpened image and for the imaging unit. The lens device (optical system) of the imaging device 301 of this embodiment is interchangeable. The image sensor of the imaging device 301 is configured as shown in FIG. 11. FIG. 11 is a configuration diagram of the image sensor in this embodiment. In FIG. 11, the broken lines indicate microlenses. Each of the pixels 320 (suffixes a, b, and so on are omitted) is provided with four photoelectric conversion units 321, 322, 323, and 324 (suffixes a, b, and so on are omitted), which divide the pupil of the optical system into four in a 2 × 2 arrangement. The images acquired by the photoelectric conversion units 321 to 324 are referred to, in order, as the A image, B image, C image, and D image, and their sum is referred to as the ABCD image. The image sensor outputs two images as captured images: the ABCD image (first image) and the A image (second image).

When the imaging device 301 and the image processing device 302 are connected, the ABCD image and the A image are stored in the storage unit 303. In the image processing device 302, the image generation unit 304 generates a sharpened image from the ABCD image and the A image. At this time, the image processing device 302 accesses the server 306 via the network 305 and reads the weight information used for generation. The weight information is learned in advance by the learning unit 308 and stored in the storage unit 307. The weight information is learned separately for each of a plurality of lenses, focal lengths, F-numbers, and so on, so a plurality of sets of weight information exist.

The image processing device 302 selects the weight information whose conditions match the input ABCD image, loads it into the storage unit 303, and generates the sharpened image. The generated sharpened image is output to at least one of the display device 309, the recording medium 310, and the output device 311. The display device 309 is, for example, a liquid crystal display or a projector. Via the display device 309, the user can work while checking the image being processed. The recording medium 310 is, for example, a semiconductor memory, a hard disk, or a server on a network. The output device 311 is a printer or the like. The image processing device 302 also has a function of performing development processing and other image processing as needed. In this embodiment, the weight information may instead be held in a storage unit in the lens device connected to the imaging device 301 and read out at the time of blur correction.

Next, the sharpening process (the process of generating the sharpened image) executed by the image generation unit 304 of the image processing device 302 is described with reference to FIG. 12. FIG. 12 is a flowchart of the process of generating the sharpened image. Each step in FIG. 12 is executed mainly by the image processing device 302 (the image generation unit 304).

First, in step S301, the image processing device 302 acquires the ABCD image and the A image. In this embodiment, the first image is the ABCD image and the second image is the A image. However, the first image need not correspond to the entire pupil of the optical system, and may be an image obtained by adding at least two of the A image, B image, C image, and D image. Subsequently, in step S302, the image processing device 302 acquires the corresponding weight information based on the type of lens device, focal length, F-number, or focusing distance with which the ABCD image was captured. Subsequently, in step S303, the image processing device 302 generates the sharpened image. The network used for generation in this embodiment is the same as the network described in the first embodiment with reference to FIG. 1.
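
One way the weight lookup of step S302 could be organized is sketched below. The table layout, the field names (`lens_id`, `focal_length`, `f_number`, `weights`), and the nearest-match rule are all hypothetical; the source only states that weight information matching the capture conditions is acquired.

```python
def select_weight_info(table, lens_id, focal_length, f_number):
    """Pick the stored entry for this lens whose conditions are closest.

    table: list of dicts with keys "lens_id", "focal_length", "f_number",
    and "weights" (a hypothetical storage layout).
    """
    candidates = [entry for entry in table if entry["lens_id"] == lens_id]
    if not candidates:
        raise LookupError(f"no weight information for lens {lens_id!r}")
    # Prefer the closest focal length, then the closest F-number.
    return min(
        candidates,
        key=lambda e: (abs(e["focal_length"] - focal_length),
                       abs(e["f_number"] - f_number)),
    )
```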

The learning of the weight information by the learning unit 308 follows the flowchart shown in FIG. 8, as in the first embodiment. Because aberration and vignetting differ from one lens device to another, learning pairs are created and weight information is learned for each type of lens device. When the changes in aberration and vignetting with the imaging conditions (focal length, F-number, focusing distance), image height, or azimuth cannot be ignored, it is preferable to create learning pairs and learn weight information for each of a plurality of imaging conditions, image heights, or azimuths. Although this embodiment uses a single second image, the second image may be a plurality of images (for example, three images: the A image, C image, and D image).

According to this embodiment, it is possible to provide an image processing system capable of correcting pupil-dependent blur of an optical system in an image with high accuracy and obtaining a sharpened image.

(Other Embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

According to each embodiment, it is possible to provide an image processing method, an image processing device, an imaging device, a lens device, a program, and a storage medium capable of correcting pupil-dependent blur of an optical system in an image with high accuracy and obtaining a sharpened image.

Preferred embodiments of the present invention have been described above, but the present invention is not limited to these embodiments, and various modifications and changes are possible within the scope of its gist.

102 Image processing unit (image processing device)
102a Information acquisition unit (acquisition means)
102b Image generation unit (generation means)

Claims

An image processing method comprising:
acquiring a first image obtained by imaging a subject space via a first pupil of an optical system, and a second image obtained by imaging the subject space via a second pupil of the optical system different from the first pupil; and
generating, using a multilayer neural network, a sharpened image in which blur that depends on a pupil of the optical system has been corrected, based on the first image and the second image.

The image processing method according to claim 1, wherein the blur depending on the pupil of the optical system is a blur depending on a transmittance distribution of the pupil.

The image processing method according to claim 2, wherein the transmittance distribution of the first pupil is different from the transmittance distribution of the second pupil.

The image processing method according to claim 1, wherein the blur depending on the pupil of the optical system is a blur depending on aberration, diffraction, or defocus of the optical system.

The image processing method according to claim 1, wherein the second pupil is a part of the first pupil.

The image processing method according to claim 1, wherein the first image and the second image are images obtained by simultaneously capturing the subject space via the optical system.

The image processing method according to claim 1, wherein the first image and the second image are images captured by the same image sensor.

The image processing method according to claim 1, further comprising a step of performing brightness adjustment processing on the first image and the second image,
wherein the step of generating the sharpened image is performed based on the first image and the second image after the brightness adjustment processing.

The image processing method according to claim 8, wherein the step of performing the brightness adjustment processing is performed based on information on the transmittance distributions of the first pupil and the second pupil.

The image processing method according to claim 8, wherein the step of performing the brightness adjustment processing is performed based on an average pixel value calculated for each partial region of the first image and the second image.
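As an illustration only, brightness adjustment based on per-region average pixel values could be sketched as follows. The function name `match_brightness`, the non-overlapping square tiles, the choice to scale the second image toward the first, and the small `eps` guard are all assumptions made for this sketch and are not taken from the claims.

```python
def match_brightness(first, second, tile=2, eps=1e-8):
    """Scale each tile of `second` so that its mean matches the same tile of `first`.

    `first` and `second` are equally sized grayscale images given as lists of
    rows of floats. Tiling scheme and scaling direction are assumptions.
    """
    height, width = len(first), len(first[0])
    adjusted = [row[:] for row in second]  # copy so the input is not modified
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            rows = range(top, min(top + tile, height))
            cols = range(left, min(left + tile, width))
            count = len(rows) * len(cols)
            # Average pixel value of this partial region in each image.
            mean_first = sum(first[y][x] for y in rows for x in cols) / count
            mean_second = sum(second[y][x] for y in rows for x in cols) / count
            gain = mean_first / (mean_second + eps)  # eps avoids division by zero
            for y in rows:
                for x in cols:
                    adjusted[y][x] = second[y][x] * gain
    return adjusted
```

With a single 2x2 tile whose mean is 4.0 in the first image and 2.0 in the second, every pixel of the second image is multiplied by a gain of about 2.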

The image processing method according to any one of claims 1 to 10, wherein the multilayer neural network is configured using information on weights.

The image processing method according to claim 11, further comprising a step of acquiring the information on the weights based on an F value of the optical system at the time of capturing the first image or the second image,
wherein the step of generating the sharpened image is performed based on the first image, the second image, and the information on the weights.

The image processing method according to claim 1, wherein, when the F value of the optical system at the time of capturing the first image or the second image is larger than a predetermined F value, the sharpened image is an image in which blur depending on diffraction or defocus has been corrected.

The image processing method according to claim 1, further comprising a step of applying a sharpening filter to the first image or the second image when the F value of the optical system at the time of capturing the first image or the second image is smaller than a predetermined F value,
wherein the sharpening filter is a filter that sharpens the first image or the second image based on an optical characteristic representing the imaging performance of the optical system.

The image processing method according to claim 1, further comprising a step of dividing each of the first image and the second image by a straight line that is parallel to an axis about which the second pupil is line-symmetric and that passes through a reference point of each of the first image and the second image, and performing inversion processing on the divided first image and second image,
wherein the step of generating the sharpened image is performed based on the first image and the second image after the inversion processing.
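One possible reading of the division and inversion processing is sketched below. It assumes that the symmetry axis of the second pupil is horizontal, that the reference point lies on row `ref_row`, and that the lower part is the one that is mirrored; the function name `split_and_flip` and these choices are hypothetical and are not specified by the claim.

```python
def split_and_flip(image, ref_row):
    """Divide `image` along the horizontal line through `ref_row` and mirror the lower part.

    `image` is a list of rows. After the call, the last row of both returned
    parts is the row adjacent to the dividing line, so the two parts share
    the same orientation relative to that line.
    """
    upper = [row[:] for row in image[:ref_row]]
    lower = [row[:] for row in image[ref_row:]]
    lower_flipped = lower[::-1]  # reflect across the dividing line (reverse row order)
    return upper, lower_flipped


# The same division and inversion would be applied to both pupil images.
first_parts = split_and_flip([[1, 2], [3, 4], [5, 6], [7, 8]], ref_row=2)
```

In this sketch the mirrored part can be fed to the network in the same orientation as the unmirrored part, and the output would need to be mirrored back before the parts are recombined.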

An image processing apparatus comprising:
acquiring means for acquiring a first image obtained by imaging a subject space via a first pupil of an optical system, and a second image obtained by imaging the subject space via a second pupil of the optical system different from the first pupil; and
generating means for generating, using a multilayer neural network, a sharpened image in which blur depending on the pupil of the optical system has been corrected based on the first image and the second image.

An imaging apparatus comprising:
an image sensor that photoelectrically converts an optical image formed by the optical system; and
the image processing apparatus according to claim 16.

The imaging apparatus according to claim 17, wherein the image sensor has a plurality of pixels,
each of the plurality of pixels has a plurality of photoelectric conversion units,
each of the plurality of photoelectric conversion units receives light incident at a different incident angle to generate a plurality of signals, and
the image sensor outputs the first image, which corresponds to an addition signal obtained by adding the plurality of signals, and the second image, which corresponds to one of the plurality of signals or to an addition signal obtained by adding some of the plurality of signals.
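The relationship between the per-pixel signals and the two output images can be illustrated with a minimal sketch. It assumes exactly two photoelectric conversion units per pixel, whose signals are given as two arrays with the hypothetical names `signal_a` and `signal_b`; a sensor with more units per pixel would sum all of them for the first image.

```python
def pupil_images(signal_a, signal_b):
    """Build the first and second images from two per-pixel signals.

    first  = signal_a + signal_b (addition signal, i.e. the whole pupil)
    second = signal_a            (one signal, i.e. a part of the pupil)
    """
    first = [[a + b for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(signal_a, signal_b)]
    second = [row[:] for row in signal_a]  # copy of the single-unit signal
    return first, second


first, second = pupil_images([[1, 2], [3, 4]], [[10, 20], [30, 40]])
# first  == [[11, 22], [33, 44]]
# second == [[1, 2], [3, 4]]
```

Under these assumptions the second image sees only a part of the pupil seen by the first image, which matches the case where the second pupil is a part of the first pupil.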

A lens device detachable from an imaging apparatus, the lens device comprising:
an optical system; and
storage means for storing information on weights to be input to a multilayer neural network,
wherein the imaging apparatus comprises:
acquiring means for acquiring a first image obtained by imaging a subject space via a first pupil of the optical system, and a second image obtained by imaging the subject space via a second pupil of the optical system different from the first pupil; and
generating means for generating, using the multilayer neural network, a sharpened image in which blur depending on the pupil of the optical system has been corrected based on the first image, the second image, and the information on the weights.

A program for causing a computer to execute the image processing method according to claim 1.

A storage medium storing the program according to claim 20.