JP2023009613A

JP2023009613A - Imaging device and model learning system

Info

Publication number: JP2023009613A
Application number: JP2021113039A
Authority: JP
Inventors: 広器佐々木; Hiroki Sasaki
Original assignee: Denso Corp; Toyota Motor Corp; Mirise Technologies Corp
Current assignee: Denso Corp; Toyota Motor Corp; Mirise Technologies Corp
Priority date: 2021-07-07
Filing date: 2021-07-07
Publication date: 2023-01-20

Abstract

To provide an imaging device capable of obtaining a corrected captured image with a small degree of blur.

SOLUTION: An imaging device includes an imaging unit 10 that captures, with a light receiving sensor 12 having a plurality of light receiving elements, a captured image whose blur is symmetric with respect to an optical axis 11a, and outputs a luminance signal indicating the luminance of each light receiving element; and an image processing unit 20 that takes as input data the luminance signal output by the imaging unit 10 and a polar coordinate component indicating the position of the light receiving element that detected the luminance signal, inputs the input data to a learning network model to generate a corrected captured image in which the blur is corrected, and outputs the generated corrected captured image to an arithmetic device 2. The learning network model is trained with learning data that includes the input data and teacher data. The teacher data consists of a teacher luminance signal, which represents a teacher image with less blur than the captured image as luminance signals for each image position corresponding to a light receiving element position, and the polar coordinate component indicating that image position.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an imaging device having an image processing function and to a model learning system.

Non-Patent Document 1 discloses a technique for reducing blur in a captured image: the luminance signals of the individual light receiving elements, which together represent the captured image, are input as input data to a learning network model that has been trained in advance.

"Learned large field-of-view imaging with thin-plate optics", ACM Transactions on Graphics, Volume 38, Issue 6, November 2019, Article No. 219, pp. 1-14

The technique disclosed in Non-Patent Document 1 still does not reduce blur sufficiently. A technique capable of outputting an image with less blur is desired.

The present disclosure has been made in view of this situation. Its object is to provide an imaging device capable of obtaining a corrected captured image with a small degree of blur, and a model learning system that trains the model used by the imaging device.

The above object is achieved by the combination of features recited in the independent claims, and the dependent claims define further advantageous embodiments. The reference signs in parentheses in the claims indicate, as one aspect, the correspondence with the specific means described in the embodiments below, and do not limit the disclosed technical scope.

One disclosure relating to an imaging device for achieving the above object is an imaging device comprising:
an imaging unit (10) that captures, with a light receiving sensor (12) having a plurality of light receiving elements, a captured image whose blur is symmetric with respect to an optical axis (11a), and outputs a luminance signal indicating the luminance of each light receiving element; and
an image processing unit (20, 220) that takes as input data the luminance signal output by the imaging unit and a polar coordinate component indicating the position of the light receiving element that detected the luminance signal, inputs the input data to a learning network model to generate a corrected captured image in which the blur is corrected, and outputs the generated corrected captured image to a predetermined arithmetic device,
wherein the learning network model has been trained with learning data including the input data and teacher data, the teacher data being a teacher luminance signal, which represents a teacher image with less blur than the captured image as luminance signals for each image position corresponding to a light receiving element position, and a polar coordinate component indicating the image position.

In an image captured by the imaging unit, the blur is symmetric with respect to the optical axis. The image processing unit corrects the blur of the captured image using a learning network model. The learning network model takes as input data not only the luminance signals but also polar coordinate components indicating the positions of the light receiving elements that detected those luminance signals.

This learning network model is trained on the basis of a teacher image that has less blur than the captured image. Specifically, the teacher data consists of the luminance signal for each image position in the teacher image corresponding to a light receiving element (that is, the teacher luminance signal) and a polar coordinate component indicating that image position. The learning network model is trained using this teacher data and the input data as learning data.

When the learning network model is trained with this learning data, correction accuracy is higher than with a model trained using only the luminance signal as input data. The reason is as follows. In a captured image whose blur is symmetric with respect to the optical axis, coordinates that are symmetric to each other in a polar coordinate system have similar blur. With the learning data described above, the parameters of the learning network model are therefore learned in a way that exploits the symmetry of the blur, which improves learning efficiency. Improved learning efficiency means that the resulting learning network model can reduce blur further. Because the captured image is corrected using a learning network model trained in this way, a corrected captured image with a small degree of blur is obtained.

Another disclosure for achieving the above object is a model learning system that trains the learning network model provided in an imaging device. The model learning system comprises:
an imaging unit (10) that captures, with a light receiving sensor (12) having a plurality of light receiving elements, a captured image whose blur is symmetric with respect to an optical axis (11a), and outputs a luminance signal indicating the luminance of each light receiving element;
a teacher data generation unit (123, 323) that generates teacher data including a teacher luminance signal, which represents a teacher image with less blur than the captured image as luminance signals for each image position corresponding to the plurality of light receiving elements, and a polar coordinate component indicating the image position; and
a model learning unit (124, 324) that takes as input data the luminance signal output by the imaging unit and a polar coordinate component indicating the position of the light receiving element that detected the luminance signal, and learns the parameters of the learning network model from learning data including the input data and the teacher data generated by the teacher data generation unit.

FIG. 1 is a diagram showing the configuration of the imaging device 1 according to the first embodiment.
FIG. 2 is a diagram explaining the processing executed by the input data generation unit 22 and the image correction unit 23.
FIG. 3 is a configuration diagram of the model learning system 100.
FIG. 4 is a diagram explaining the effect of the first embodiment.
FIG. 5 is a diagram showing the configuration of the imaging device 200 according to the second embodiment.
FIG. 6 is a diagram explaining the processing executed by the coordinate conversion unit 221, the input data generation unit 222, and the image correction unit 223.
FIG. 7 is a configuration diagram of the model learning system 300.
FIG. 8 is a diagram explaining the effect of the second embodiment.

<First embodiment>
Embodiments are described below with reference to the drawings. FIG. 1 is a diagram showing the configuration of the imaging device 1 according to the first embodiment. The imaging device 1 of this embodiment is mounted on a vehicle and captures images of the surroundings of the vehicle.

The imaging device 1 includes an imaging unit 10 and an image processing unit 20. The imaging unit 10 captures an image of the surroundings of the vehicle and outputs digital data representing the captured image (hereinafter, the captured image) to the image processing unit 20. The imaging unit 10 includes a lens 11, a light receiving sensor 12, and an imaging control unit 13.

The lens 11 focuses light entering it from outside the vehicle onto the light receiving surface 12a of the light receiving sensor 12. The lens 11 is an aspherical lens whose first surface is concave and whose second surface is convex, and it is designed for excellent contrast in the central portion. The lens 11 is preferably made of glass in consideration of weather resistance for vehicle use.

The light receiving sensor 12 has a large number of light receiving elements arranged in rows and columns on the light receiving surface 12a. A light receiving element is sometimes called a light receiving pixel or simply a pixel. It converts light intensity into an electrical signal. A CMOS sensor, a CCD sensor, or the like can be used as the light receiving sensor 12. With red, green, and blue color filters placed in front of it in the optical path, the light receiving sensor 12 detects the intensity of red light, green light, and blue light.

To correct astigmatism, the light receiving surface 12a is curved so that it comes closer to the lens 11 with increasing distance from the optical axis 11a of the lens 11. The degree of curvature of the light receiving surface 12a is determined as appropriate according to the characteristics of the lens 11 so that the astigmatism can be corrected.

As described above, the lens 11 has excellent contrast at the center. In other words, in an image captured using the lens 11, the degree of blur increases from the center of the field of view toward its periphery. In addition, since the lens 11 is symmetric about the optical axis 11a, the blur in the captured image is symmetric with respect to the optical axis 11a.

The imaging control unit 13 controls the light receiving elements of the light receiving sensor 12. It can be realized by a configuration including at least one processor. The imaging control unit 13 causes the light receiving sensor 12 to capture images sequentially and sequentially outputs image signals representing the captured images to the image processing unit 20. More specifically, the image signal is the luminance signal of each light receiving element. Since the light receiving elements are arranged in a grid on the light receiving surface 12a, the imaging control unit 13 outputs the luminance signals of the light receiving elements to the image processing unit 20 one line at a time, in an order that is contiguous in the orthogonal coordinate system. From the order in which the luminance signals arrive, the image processing unit 20 can identify which light receiving element detected each luminance signal. Alternatively, the imaging control unit 13 may attach orthogonal coordinates to the luminance signal of each light receiving element before outputting it to the image processing unit 20.
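As an illustration of how the arrival order alone can identify a light receiving element, the following minimal sketch maps a sequence index to a grid position. It assumes row-major output (one row after another, left to right) and a known sensor width; the function name `element_position` and both parameters are hypothetical and do not appear in the source.

```python
def element_position(index, width):
    """Return (row, col) of the element that produced the index-th signal.

    Assumption: signals arrive in row-major order, one full row of
    `width` elements at a time, with index 0 at the top-left element.
    """
    if index < 0 or width <= 0:
        raise ValueError("index must be >= 0 and width must be > 0")
    row, col = divmod(index, width)
    return row, col
```

For a sensor that is 4 elements wide, the sixth signal (index 5) would be attributed to row 1, column 1.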

The image processing unit 20 generates a corrected captured image in which the blur of the captured image is reduced, and outputs image data representing the corrected captured image to the arithmetic device 2. The arithmetic device 2 may be any device that performs computation on image data. For example, the arithmetic device 2 is a device that processes the image data to detect obstacles appearing in the corrected captured image. Another specific example of the arithmetic device 2 is a display that shows the corrected captured image.

The image processing unit 20 can be realized by a configuration including at least one processor. The imaging control unit 13 and the image processing unit 20 may be realized by the same processor. As functions executed by the processor, the image processing unit 20 includes an input data generation unit 22 and an image correction unit 23.

The input data generation unit 22 generates the input data to be input to the image correction unit 23. One unit of input data consists of the red luminance value LR, the green luminance value LG, and the blue luminance value LB for one pixel position, together with the polar coordinates (r, θ) of that pixel position. The luminance values LR, LG, and LB can be values determined from the luminance signal.

The distance r in the polar coordinates of a pixel position is the distance from the optical axis 11a. The angle θ is the angle measured from the initial line. The input data generation unit 22 outputs the generated input data to the image correction unit 23.

The image correction unit 23 inputs the input data to the learning network model to generate a corrected captured image. The image correction unit 23 then outputs the generated corrected captured image to the arithmetic device 2. The image processing unit 20 includes a storage unit 24. The storage unit 24 is non-volatile and stores the trained learning network model used by the image correction unit 23.

The learning network model is a model constructed to perform a deconvolution operation using learned filters. For example, U-net, which is also disclosed in Non-Patent Document 1, can be used. In Non-Patent Document 1, the U-net has six stages. Six stages can also be used in this embodiment, although a different number of stages, such as five, is also possible. Since one set of input data has five parameters, the learning network model used in this embodiment has five input channels. The first-stage conversion produces, for example, 128 channels.

FIG. 2 is a diagram explaining the processing executed by the input data generation unit 22 and the image correction unit 23. In FIG. 2, "Image_RGB" is the image signal output by the imaging control unit 13. The image signal is a group of signals in which three luminance values L (red, green, and blue) are arranged per pixel position in an orthogonal coordinate layout. "Adress_rθ" is the polar coordinate position determined by the distance r and the angle θ. The input data generation unit 22 combines these five parameters into one set of input data, "Input_RGBrθ".
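The combination of the three luminance values with the polar position can be sketched as follows. This is only an illustration under stated assumptions: the optical axis is taken to cross the sensor at the grid center unless `cx` and `cy` are given, the initial line for θ is taken to point along +x, θ is in radians, and no normalization is applied. None of these details, nor the function names, come from the source.

```python
import math


def polar_address(height, width, cx=None, cy=None):
    """Return address[y][x] = (r, theta) for every pixel of the grid.

    Assumption: the optical axis is at (cx, cy), defaulting to the grid
    center, and theta is measured from the +x direction in radians.
    """
    if cx is None:
        cx = (width - 1) / 2.0
    if cy is None:
        cy = (height - 1) / 2.0
    address = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - cx, y - cy
            row.append((math.hypot(dx, dy), math.atan2(dy, dx)))
        address.append(row)
    return address


def make_input_rgbrtheta(image_rgb, cx=None, cy=None):
    """Append (r, theta) to each (LR, LG, LB) pixel: five values per pixel."""
    height, width = len(image_rgb), len(image_rgb[0])
    address = polar_address(height, width, cx, cy)
    return [
        [tuple(image_rgb[y][x]) + address[y][x] for x in range(width)]
        for y in range(height)
    ]
```

Each output pixel then carries five values, matching the five input channels mentioned for the first embodiment.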

The image correction unit 23 sequentially cuts the input data into pieces of a predetermined size. FIG. 2 shows 256 × 256 as an example. Each piece of input data is input to the learning network model to obtain output data. "Output_RGBrθ" denotes the output data obtained by inputting the pieces of input data to the learning network model in sequence. The image correction unit 23 obtains "Image_RGB" by removing the polar coordinate (r, θ) data from this output data. "Image_RGB" on the image correction unit 23 side is image data representing the corrected captured image.
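The cutting step and the removal of the coordinate channels might look like the sketch below. The source does not say how image borders or tile overlap are handled, so this sketch simply truncates edge tiles and uses no overlap; `iter_tiles` and `strip_address` are hypothetical helper names.

```python
def iter_tiles(height, width, tile=256):
    """Yield (top, left, bottom, right) boxes covering the grid.

    Assumption: tiles do not overlap and edge tiles are truncated.
    """
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield top, left, min(top + tile, height), min(left + tile, width)


def strip_address(output_pixels, n_color=3):
    """Keep the first n_color values of each pixel, dropping (r, theta)."""
    return [[tuple(p[:n_color]) for p in row] for row in output_pixels]
```

In this sketch each box would be used to slice the five-channel input before it is passed to the model, and `strip_address` would be applied to the reassembled model output.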

[Learning process]
Next, the learning process for the learning network model is described. FIG. 3 shows the configuration of the model learning system 100 that performs the learning process. The model learning system 100 includes the imaging unit 10, a teacher image capturing device 110, and a model generation device 120.

The teacher image capturing device 110 is a device that captures teacher images. A teacher image is an image that has less blur than the captured image captured by the imaging unit 10. To capture such teacher images, the teacher image capturing device 110 uses, for example, a combination of a plurality of lenses. The teacher image capturing device 110 outputs to the model generation device 120 the teacher luminance signal, which represents the teacher image as luminance signals for each image position corresponding to a light receiving element position, with orthogonal coordinates attached that indicate the image position of each teacher luminance signal.

The imaging unit 10 is the same as the one included in the imaging device 1. During learning, the imaging unit 10 captures the same range as the teacher image capturing device 110. The teacher image may also be created by computer graphics. In that case, the imaging unit 10 captures an image of the teacher image.

The model generation device 120 includes an input data generation unit 122, a teacher data generation unit 123, a model learning unit 124, and a storage unit 125. The input data generation unit 122 generates input data by the same processing as the input data generation unit 22 of the imaging device 1.

The teacher data generation unit 123 acquires the teacher luminance signal from the teacher image capturing device 110 and generates teacher data from it. The teacher data is the input data with the captured image replaced by the teacher image. The teacher data is therefore the luminance signal for each image position with the polar coordinates (r, θ) of that image position added.

The model learning unit 124 learns the parameters of the learning network model using, as learning data, the input data generated by the input data generation unit 122 and the teacher data generated by the teacher data generation unit 123. The model learning unit 124 then stores the learning network model with the learned parameters in the storage unit 125. The learning network model stored in the storage unit 24 of the imaging device 1 is the learning network model trained by this model learning system 100. To store the trained learning network model in the storage unit 24, the imaging device 1 and the model learning system 100 may be configured to communicate with each other. Alternatively, an operator may copy the learning network model from the storage unit 125 to the storage unit 24 using a portable storage medium.

In this way, the learning network model is trained using the teacher luminance signal and the image position expressed in polar coordinates as teacher data. When the learning network model is trained in this way, correction accuracy is higher than with a model trained using only the luminance signal as input data, for the following reason.

The blur in the captured image is symmetric with respect to the optical axis, so when the polar coordinates are learned as part of the input data, the parameters of the learning network model are learned in a way that exploits this symmetry. By correcting the captured image with a learning network model trained in this way, the imaging device 1 obtains a corrected captured image with a small degree of blur.

[Image improvement effect]
FIG. 4 shows the results of verifying the image improvement achieved by the imaging device 1. The horizontal axis of FIG. 4 is HalfFoV (°). HalfFoV is the field of view on one side of the optical axis only, with the optical axis at 0°. The full field of view (Field of View) is twice the HalfFoV. The vertical axis of FIG. 4 is PSNR (Peak Signal-to-Noise Ratio).
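For reference, PSNR is commonly computed from the mean squared error between a reference image and a test image as 10·log10(peak² / MSE). The sketch below uses that standard definition on flat sequences of pixel values; the source does not state the peak value or the exact evaluation procedure behind FIG. 4, so the default `peak=255.0` is an assumption, as is the helper `full_fov`.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences.

    Assumption: values share the same scale and `peak` is their maximum
    possible value (255 for 8-bit data).
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def full_fov(half_fov_deg):
    """Full field of view in degrees, which is twice the HalfFoV."""
    return 2.0 * half_fov_deg
```

A higher PSNR means the corrected image is closer to the reference, which is why FIG. 4 uses it to compare the degree of remaining blur.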

The result labeled Non-Patent Document 1 in FIG. 4 was obtained with a device that has the configuration of the imaging device 1 of the first embodiment, except that the learning network model alone is replaced with the model disclosed in Non-Patent Document 1. The model disclosed in Non-Patent Document 1 uses U-net, but its input data is only the luminance signal of each pixel; no polar coordinate information is added to it.

As FIG. 4 shows, the first embodiment improves PSNR overall compared with using the input data and learning network model disclosed in Non-Patent Document 1.

Non-Patent Document 1 also uses a single lens based on a diffractive optical element (DOE). It discloses a graph showing that, for HalfFoV of around 9° to 13°, PSNR is better than with an aspherical lens. Put another way, the same graph shows that for HalfFoV from 0° to around 9°, an aspherical lens gives better PSNR. For this reason, this embodiment uses the aspherical lens 11. As a result, the resolution in the central portion of the field of view is better than with the configuration disclosed in Non-Patent Document 1.

However, when the aspherical lens 11 is used, astigmatism lowers the PSNR in the peripheral field of view. To correct this astigmatism, the imaging device 1 includes the light receiving sensor 12 whose light receiving surface 12a is curved. This configuration also improves the resolution at the periphery of the field of view.

<Second embodiment>
Next, a second embodiment is described. In the description of the second and subsequent embodiments, elements with the same reference numerals as those already used are the same as the elements with those reference numerals in the earlier embodiments, unless otherwise noted. Where only part of a configuration is described, the previously described embodiments apply to the remaining parts.

FIG. 5 shows the configuration of the imaging device 200 according to the second embodiment. The imaging device 200 includes the same imaging unit 10 as the imaging device 1, together with an image processing unit 220.

The image processing unit 220 includes a coordinate conversion unit 221, an input data generation unit 222, an image correction unit 223, and a storage unit 224. The coordinate conversion unit 221 converts the luminance signals output by the imaging unit 10 into a polar coordinate array.

The input data generation unit 222 generates the input data to be input to the image correction unit 223. In the second embodiment, one unit of input data consists of the red luminance value LR, the green luminance value LG, and the blue luminance value LB for one pixel position, together with the distance r, which is one polar coordinate component of that pixel position. Unlike the first embodiment, the angle θ of the pixel position is not included in the input data. The input data therefore has four channels.
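A minimal sketch of the four-channel input follows, assuming the polar array is laid out with one row per angle index and one column per radius index, and that neighboring radius samples are `r_step` apart. The layout, `r_step`, and the function name are illustrative assumptions; the source only states that r is added and θ is omitted.

```python
def make_input_rgbr(polar_rgb, r_step=1.0):
    """Append r to each (LR, LG, LB) pixel of a polar-ordered image.

    Assumption: polar_rgb[i_theta][i_r] holds the pixel at angle index
    i_theta and radius index i_r, with r = i_r * r_step.
    """
    return [
        [tuple(pixel) + (i_r * r_step,) for i_r, pixel in enumerate(angle_row)]
        for angle_row in polar_rgb
    ]
```

Under this layout the r channel is identical in every angle row, which reflects the point that the position information relevant to the blur no longer depends on θ.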

In the captured image, the degree of blur is symmetric with respect to the optical axis 11a. The degree of blur is therefore the same even when the angle θ changes. For this reason, in the second embodiment, where the array of luminance signals is converted to a polar coordinate system, the angle θ of the pixel position is left out of the input data.

The image correction unit 223 inputs the input data generated by the input data generation unit 222 to the learning network model to generate a corrected captured image. The image correction unit 223 then outputs the generated corrected captured image to the arithmetic device 2. The image processing unit 220 includes the storage unit 224, which stores the trained learning network model used by the image correction unit 223.

FIG. 6 is a diagram explaining the processing executed by the coordinate conversion unit 221, the input data generation unit 222, and the image correction unit 223. In FIG. 6, "Image_RGB" at the coordinate conversion unit 221 is the image signal output by the imaging control unit 13, and "Polar_RGB" is the image signal rearranged in polar coordinate order. "Adress_r" is the distance r, a polar coordinate component of the pixel position, and "Input_RGBr" is the input data.

The image correction unit 223 sequentially cuts the input data into pieces of a predetermined size and inputs each piece to the learning network model to obtain output data. "Output_RGBr" denotes the output data obtained by sequentially inputting the input data to the learning network model. The image correction unit 223 removes the polar coordinate (r) data from this output data to obtain "Polar_RGB". Because this "Polar_RGB" is the corrected captured image in a polar coordinate array, the image correction unit 223 further converts it to an orthogonal coordinate array to obtain "Image_RGB". "Image_RGB" on the image correction unit 223 side is image data representing the corrected captured image in the orthogonal coordinate system.
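The flow from "Input_RGBr" to the corrected "Image_RGB" can be sketched as follows. This is only an illustrative sketch under stated assumptions: the cut-out shape (blocks of polar rows), the nearest-neighbor lookup, the optical axis at the image center, and all function names are hypothetical, and `model` stands in for the trained learning network model as any callable that maps a block of (R, G, B, r) rows to a block of the same shape.

```python
import math

def cut_patches(rows, patch_rows):
    """Split the polar-ordered data into consecutive blocks of patch_rows rows."""
    return [rows[i:i + patch_rows] for i in range(0, len(rows), patch_rows)]

def drop_r(output_rgbr):
    """Output_RGBr -> Polar_RGB: discard the fourth (r) channel."""
    return [[px[:3] for px in row] for row in output_rgbr]

def polar_to_cartesian(polar_rgb, height, width):
    """Polar_RGB -> Image_RGB by nearest-neighbor lookup.

    Pixels farther from the center than the largest sampled radius are
    left black (an assumption of this sketch).
    """
    n_r, n_theta = len(polar_rgb), len(polar_rgb[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    r_max = min(cy, cx)
    image = [[(0, 0, 0)] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            r = math.hypot(y - cy, x - cx)
            if r > r_max:
                continue
            theta = math.atan2(y - cy, x - cx) % (2.0 * math.pi)
            i = int(round(r / r_max * (n_r - 1))) if r_max > 0 else 0
            j = int(round(theta / (2.0 * math.pi) * n_theta)) % n_theta
            image[y][x] = polar_rgb[i][j]
    return image

def correct(input_rgbr, model, patch_rows, height, width):
    """Run the hypothetical model block by block, then rebuild Image_RGB."""
    output_rgbr = []
    for patch in cut_patches(input_rgbr, patch_rows):
        output_rgbr.extend(model(patch))
    return polar_to_cartesian(drop_r(output_rgbr), height, width)
```

For a quick check, an identity function such as `lambda patch: patch` can be passed as `model`; the result is then simply the polar data mapped back onto the Cartesian grid.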

[Learning processing]
Next, the learning processing for training the learning network model in the second embodiment will be described. FIG. 7 shows the configuration of a model learning system 300 that performs the learning processing. The model learning system 300 includes the imaging unit 10, a teacher image capturing device 110, and a model generation device 320. The imaging unit 10 and the teacher image capturing device 110 are the same as those included in the model learning system 100.

The model generation device 320 includes a coordinate conversion unit 321, an input data generation unit 322, a teacher data generation unit 323, a model learning unit 324, and a storage unit 325. The coordinate conversion unit 321 and the input data generation unit 322 execute the same processing as the coordinate conversion unit 221 and the input data generation unit 222 of the imaging device 200, respectively.

The teacher data generation unit 323 acquires the teacher luminance signal from the teacher image capturing device 110. The teacher luminance signal acquired from the teacher image capturing device 110 is arranged in an orthogonal coordinate system. The teacher data generation unit 323 converts the acquired teacher luminance signal into a polar coordinate array. It then adds, to the converted teacher luminance signal for each image position, the distance r of the polar coordinates indicating that image position, and uses the result as the teacher data.

The model learning unit 324 learns the parameters of the learning network model using, as learning data, the input data generated by the input data generation unit 322 and the teacher data generated by the teacher data generation unit 323. The model learning unit 324 then stores the learning network model with the learned parameters in the storage unit 325. The learning network model stored in the storage unit 224 of the imaging device 200 is the learning network model trained by this model learning system 300.
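The source does not specify the network architecture, the loss function, or the optimizer. Purely to illustrate what learning parameters from pairs of input data and teacher data means, the following toy sketch replaces the learning network model with a two-parameter linear model, pred = a * luminance + b * r, fitted by gradient descent on mean squared error. The model, the function name `fit_toy_model`, the hyperparameters, and the synthetic data are all assumptions of this sketch, not part of the disclosure.

```python
def fit_toy_model(inputs, teachers, lr=0.1, steps=2000):
    """Fit pred = a * luminance + b * r by gradient descent on mean squared error.

    inputs:   list of (luminance, r) pairs    (stand-in for the input data)
    teachers: list of target luminance values (stand-in for the teacher data)
    """
    a, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for (lum, r), target in zip(inputs, teachers):
            err = a * lum + b * r - target
            grad_a += 2.0 * err * lum / n
            grad_b += 2.0 * err * r / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

# Synthetic example: the "sharp" value differs from the observed one by a
# term that grows with the distance r from the optical axis.
inputs = [(0.2, 0.0), (0.8, 0.0), (0.2, 1.0), (0.8, 1.0), (0.5, 0.5)]
teachers = [lum + 0.5 * r for lum, r in inputs]
a, b = fit_toy_model(inputs, teachers)
```

Because r is part of the input, the fitted model can apply a correction that depends on radial position, which a model given only the luminance could not express.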

Even when the learning network model is trained in this way, the correction accuracy is higher than that of a model trained using only the luminance signal as input data.

[Image improvement effect]
FIG. 8 shows the result of verifying the image improvement effect of the imaging device 200. As in FIG. 4, the horizontal axis of FIG. 8 is HalfFoV and the vertical axis is PSNR. As FIG. 8 shows, the second embodiment also improves PSNR overall compared with using the input data and learning network model disclosed in Non-Patent Document 1.
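For reference, PSNR is commonly defined as 10·log10(peak² / MSE), where MSE is the mean squared error between a reference image and the image under evaluation. The source does not state how its PSNR values were computed, so the following is a sketch of the standard definition only; the function name `psnr` and the peak value of 255 (8-bit data) are assumptions.

```python
import math

def psnr(reference, corrected, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are rows of (R, G, B) tuples; peak=255 assumes 8-bit values.
    """
    total, count = 0.0, 0
    for ref_row, cor_row in zip(reference, corrected):
        for ref_px, cor_px in zip(ref_row, cor_row):
            for ref_v, cor_v in zip(ref_px, cor_px):
                total += (ref_v - cor_v) ** 2
                count += 1
    mse = total / count
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

A higher PSNR means the corrected image is closer to the reference, so an overall upward shift of the curve corresponds to better blur correction.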

Although embodiments have been described above, the disclosed technology is not limited to those embodiments. The following modifications are also included in the disclosed scope, and various other modifications can be made without departing from the gist.

The imaging devices 1 and 200 need not be mounted on a vehicle. They may be mounted on a moving body other than a vehicle, or installed at a fixed location.

A single system including the imaging devices 1 and 200 and the model learning systems 100 and 300 may also be configured.

The imaging control unit 13, the image processing units 20 and 220, and the model generation devices 120 and 320 may be realized by dedicated hardware logic circuits, or by one or more dedicated computers configured as a combination of a processor that executes a computer program and one or more hardware logic circuits. The hardware logic circuits are, for example, ASICs or FPGAs. The computer program need only be stored in a computer-readable non-transitory tangible recording medium, for example a flash memory.

1: imaging device; 2: arithmetic device; 10: imaging unit; 11: lens; 11a: optical axis; 12: light receiving sensor; 12a: light receiving surface; 13: imaging control unit; 20: image processing unit; 22: input data generation unit; 23: image correction unit; 24: storage unit; 100: model learning system; 110: teacher image capturing device; 120: model generation device; 122: input data generation unit; 123: teacher data generation unit; 124: model learning unit; 125: storage unit; 200: imaging device; 220: image processing unit; 221: coordinate conversion unit; 222: input data generation unit; 223: image correction unit; 224: storage unit; 300: model learning system; 320: model generation device; 321: coordinate conversion unit; 322: input data generation unit; 323: teacher data generation unit; 324: model learning unit; 325: storage unit; r: distance; θ: angle

Claims

An imaging device comprising:
an imaging unit (10) that captures, with a light receiving sensor (12) having a plurality of light receiving elements, a captured image whose blur is symmetric with respect to an optical axis (11a), and outputs a luminance signal indicating the luminance of each of the light receiving elements; and
an image processing unit (20, 220) that uses, as input data, the luminance signal output by the imaging unit and a polar coordinate component indicating the position of the light receiving element that detected the luminance signal, inputs the input data to a learning network model to generate a corrected captured image in which the blur is corrected, and outputs the generated corrected captured image to a predetermined arithmetic device,
wherein the learning network model has been trained with learning data including the input data and teacher data, the teacher data including a teacher luminance signal, which represents a teacher image in which the blur is improved relative to the captured image as a luminance signal for each image position corresponding to the position of the light receiving element, and a polar coordinate component indicating the image position.

The imaging device according to claim 1, wherein
the imaging unit outputs the luminance signal in an array of orthogonal coordinates,
the image processing unit includes:
an input data generation unit (22) that generates the input data by adding, to each of the luminance signals, the distance (r) from the optical axis and the angle (θ) from a starting line, which are the polar coordinate components; and
an image correction unit (23) that inputs the input data to the learning network model to generate the corrected captured image, and
the learning network model has been trained with learning data including the input data and the teacher data, the teacher data being obtained by adding the distance from the optical axis and the angle from the starting line to each of the teacher luminance signals, which represent the teacher image as a luminance signal for each image position corresponding to the position of the light receiving element and are arranged in an orthogonal coordinate system.

The imaging device according to claim 1, wherein
the imaging unit outputs the luminance signal in an array of orthogonal coordinates,
the image processing unit includes:
a coordinate conversion unit (221) that converts the luminance signal output by the imaging unit into a polar coordinate array;
an input data generation unit (222) that generates the input data by adding the distance from the optical axis, which is the polar coordinate component, to each of the luminance signals converted by the coordinate conversion unit; and
an image correction unit (223) that inputs the input data to the learning network model to generate the corrected captured image, and
the learning network model has been trained with learning data including the input data and the teacher data, the teacher data being obtained by adding the distance from the optical axis to each of the teacher luminance signals, which represent the teacher image as a luminance signal for each image position corresponding to the light receiving element and are arranged in a polar coordinate system.

The imaging device according to any one of claims 1 to 3,
The imaging device, wherein the imaging unit includes an aspherical lens (11) whose blur increases with increasing distance from the optical axis.

The imaging device according to claim 4,
wherein the imaging unit includes a light receiving sensor (12) having a light receiving surface (12a) that is curved so as to approach the lens as the distance from the optical axis increases.

The imaging device according to claim 4 or 5,
The imaging device, wherein the lens is made of glass.

A model learning system comprising:
an imaging unit (10) that captures, with a light receiving sensor (12) having a plurality of light receiving elements, a captured image whose blur is symmetric with respect to an optical axis (11a), and outputs a luminance signal indicating the luminance of each of the light receiving elements;
a teacher data generation unit (123, 323) that generates teacher data including a teacher luminance signal, which represents a teacher image in which the blur is improved relative to the captured image as a luminance signal for each image position corresponding to the plurality of light receiving elements, and a polar coordinate component indicating the image position; and
a model learning unit (124, 324) that learns parameters of a learning network model from learning data including input data and the teacher data generated by the teacher data generation unit, the input data including the luminance signal output by the imaging unit and a polar coordinate component indicating the position of the light receiving element that detected the luminance signal.