JP7401246B2

JP7401246B2 - Imaging device, method of controlling the imaging device, and program

Info

Publication number: JP7401246B2
Application number: JP2019185391A
Authority: JP
Inventors: 修野村; 政美加藤
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-10-08
Filing date: 2019-10-08
Publication date: 2023-12-19
Anticipated expiration: 2039-10-08
Also published as: JP2021061546A

Description

The present invention relates to a technique for controlling the imaging parameters used when capturing images.

As a way of controlling imaging parameters so that an imaging device can acquire a desired image, many techniques have been proposed that run recognition processing on an image and control the imaging parameters based on the recognition result. Patent Document 1 discloses a technique that, based on the result of face recognition performed on an image captured by a camera such as a monitoring device, determines at least one imaging parameter among the shooting direction of the camera (vertical, horizontal, rotational, and so on) and the angle of view, so that the person's face becomes easier to confirm.

Patent Document 1: Japanese Patent Application Publication No. 2014-64083

In imaging parameter control based on image recognition results as described above, the image to which recognition processing is applied is one that has already been acquired. Consequently, when the imaging parameters are controlled based on the recognition result, the position of the recognition target in the image and the pixel-level luminance data may already have changed in the image that is newly acquired with those parameters applied. In that case, the imaging parameter values that were controlled to obtain a desired image, such as a face image, can turn out to be inappropriate for the image newly captured with them. In other words, the imaging parameters may not suit the newly captured image.

Therefore, an object of the present invention is to make it possible to determine imaging parameters that are better suited to the image that will be captured using those parameters.

The imaging device of the present invention comprises: imaging means that includes an imaging sensor for which imaging parameters can be set for each pixel or region of an image, and that acquires an image of a moving object; image generation means that generates, based on the acquired image, a predicted image predicting the image that the imaging means will capture after the object has moved; image recognition means that performs image recognition processing on the predicted image generated by the image generation means; parameter determination means that determines the imaging parameters of the imaging sensor for each pixel or region based on the position of the object in the predicted image obtained by the image recognition processing; and parameter setting means that sets the determined imaging parameters in the imaging sensor for each pixel or region, wherein the imaging means acquires an image in accordance with the parameters set by the parameter setting means.

According to the present invention, imaging parameters that are better suited to the image captured using those parameters can be determined.

A diagram illustrating a configuration example of the imaging device according to Embodiment 1.
A processing flowchart according to Embodiment 1.
A diagram illustrating a configuration example for realizing the predicted image generation processing according to Embodiment 1.
A diagram illustrating a configuration example for realizing the image recognition processing according to Embodiment 1.
An explanatory diagram of the image recognition processing according to Embodiment 1.
A processing flowchart according to Embodiment 2.
A diagram illustrating a configuration example for realizing the image recognition processing according to Embodiment 2.
An explanatory diagram of the image recognition processing (object recognition processing) according to Embodiment 2.
An explanatory diagram of the image recognition processing (object posture recognition processing) according to Embodiment 2.
An explanatory diagram of the method for calculating the center position of a human head according to Embodiment 2.
An explanatory diagram of the imaging region according to Embodiment 2.
A processing flowchart according to Embodiment 3.
An explanatory diagram of the image recognition processing according to Embodiment 3.
A diagram illustrating a configuration example of the imaging device according to Embodiment 4.
A processing flowchart according to Embodiment 4.
A diagram illustrating a configuration example of the imaging device according to Embodiment 5.
A processing flowchart according to Embodiment 5.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. Note that the configurations shown in the following embodiments are merely examples, and the present invention is not limited to the illustrated configurations.
<Embodiment 1>
First, Embodiment 1 of the present invention will be described. FIG. 1 is a block diagram showing a configuration example of the imaging device according to this embodiment.

As shown in FIG. 1, the imaging device of this embodiment includes an image sensor unit 10, an image generation unit 20, an image recognition unit 30, a parameter determination unit 40, a parameter setting unit 50, a RAM 80, an image bus 120, and a CPU bus 170. The imaging device further includes a bridge 130, a DMAC (Direct Memory Access Controller) 70, a CPU (Central Processing Unit) 140, a ROM 150, and a RAM 160. The image sensor unit 10, the image generation unit 20, the image recognition unit 30, the parameter determination unit 40, the parameter setting unit 50, and the DMAC 70 are connected to one another via the image bus 120. The CPU 140, the ROM 150, and the RAM 160 are connected to one another via the CPU bus 170. The bridge 130 enables data transfer between the image bus 120 and the CPU bus 170.

The image sensor unit 10 includes an optical system, an imaging sensor, a driver circuit that controls the imaging sensor, an A/D conversion circuit that converts the imaging signal acquired by the imaging sensor into a digital signal, a developing circuit that develops the digitized imaging signal into an image, and the like. The imaging sensor is, for example, a CMOS (Complementary Metal Oxide Semiconductor) sensor.

The image sensor unit 10 has a function of controlling the imaging parameters that determine the operating characteristics of the imaging sensor. This function is handled by, for example, the driver circuit in the image sensor unit 10. The imaging parameters include an exposure parameter, a focus parameter, a dynamic range parameter, a gain parameter, a frame rate parameter, an imaging region parameter, a resolution parameter, and the like. The exposure parameter controls the exposure time and the like of the imaging sensor. The focus parameter controls the operation of the focus pixels and the like on the imaging sensor. The gain parameter controls the gain applied to the signal captured by the imaging sensor. The frame rate parameter controls the frame rate at which the imaging sensor captures images. The imaging region parameter controls the region of the imaging sensor that is captured. The resolution parameter controls the resolution of the imaging sensor. Specific examples of control using the imaging parameters will be described later.
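The parameter set listed above can be pictured as a record kept per sensor region. The following is a minimal sketch rather than the implementation of the embodiment: the class name `ImagingParams`, the helper `set_region_params`, the field names, the units, and the default values are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple

# Hypothetical container for the imaging parameters named in the description.
# Field names, units, and defaults are illustrative assumptions.
@dataclass(frozen=True)
class ImagingParams:
    exposure_time_ms: float = 10.0      # exposure parameter
    focus_pixels_enabled: bool = False  # focus parameter
    dynamic_range_bits: int = 10        # dynamic range parameter
    gain_db: float = 0.0                # gain parameter
    frame_rate_fps: float = 30.0        # frame rate parameter
    readout_enabled: bool = True        # imaging region parameter
    resolution_scale: float = 1.0       # resolution parameter

# A region is identified here by (x, y, width, height) in sensor pixels.
Region = Tuple[int, int, int, int]

def set_region_params(table: Dict[Region, ImagingParams],
                      region: Region, **changes) -> None:
    """Update the parameters of one region, leaving other regions untouched."""
    current = table.get(region, ImagingParams())
    table[region] = replace(current, **changes)

table: Dict[Region, ImagingParams] = {}
set_region_params(table, (0, 0, 64, 64), exposure_time_ms=2.0, gain_db=6.0)
```

Keeping the record immutable and replacing it per region mirrors the idea that a parameter setting step writes a complete, consistent set of values for each region.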

The image generation unit 20 has a function of generating, from at least one frame of image data acquired by the image sensor unit 10, a predicted image that forecasts a future image. Details of the predicted image generation function of the image generation unit 20 will be described later.
The image recognition unit 30 has a function of performing predetermined image recognition processing on the predicted image generated by the image generation unit 20. In Embodiment 1, a case will be described in which the image recognition unit 30 executes region segmentation processing as the image recognition processing. Details of the image recognition processing function (region segmentation processing function) of the image recognition unit 30 will be described later.

The parameter determination unit 40 determines, based on the recognition result from the image recognition unit 30, the types and values of the imaging parameters used to control the imaging sensor of the image sensor unit 10. Details of the imaging parameter determination function of the parameter determination unit 40 will be described later.
The parameter setting unit 50 sets the imaging parameter values determined by the parameter determination unit 40 in the image sensor unit 10. Details of the imaging parameter setting function of the parameter setting unit 50 will be described later.

The DMAC 70 manages data transfer among the image sensor unit 10, the image generation unit 20, the image recognition unit 30, the parameter determination unit 40, the parameter setting unit 50, and the CPU bus 170.
The ROM (Read Only Memory) 150 stores programs executed by the CPU 140, instructions that define the operation of the CPU 140, parameter data, weight coefficients, and the like.
The CPU 140 controls the overall operation of the imaging device and performs various computations while reading these programs, instructions, parameter data, and the like from the ROM 150. The CPU 140 can also access the RAM 80 on the image bus 120 via the bridge 130.
The RAM 160 is used as a work area of the CPU 140 when the CPU 140 controls the imaging device.

Next, the operation of the imaging device having the configuration shown in FIG. 1 will be described.
FIG. 2 is a flowchart showing the operation of the imaging device according to this embodiment. The overall processing flow shown in FIG. 2 is controlled by the CPU 140 based on a preset program.

First, in step S180, the CPU 140 executes various initialization processes in the imaging device before imaging starts. For example, the CPU 140 transfers the weight coefficients needed for the operation of the image recognition unit 30 and the image generation unit 20 from the ROM 150 to the RAM 160, and makes the register settings that define the operation of the image recognition unit 30 and the image generation unit 20. Specifically, the CPU 140 sets predetermined values in a plurality of registers located in the control sections of the image recognition unit 30 and the image generation unit 20. Similarly, the CPU 140 writes the values needed for operation to the image sensor unit 10. At this point, the imaging parameters that control the imaging sensor of the image sensor unit 10 are set either to predetermined initial values or to initial values specified by the user.

Next, in step S181, the image sensor unit 10 executes imaging processing and acquires captured image data.
Subsequently, in step S182, the image generation unit 20 stores the image data acquired by the image sensor unit 10 in its internal frame buffer frame by frame. In this embodiment, the image generation unit 20 starts generating and outputting one frame of predicted image only after five frames of image data have been input. The image generation unit 20 therefore acquires image data from the image sensor unit 10 frame by frame and determines in step S183 whether to start generating a predicted image. If five frames of image data have been acquired, the image generation unit 20 starts generation in step S183 and generates and outputs one frame of predicted image. If five frames have not yet been acquired, it does not start generation, and processing returns to the image acquisition of step S181 by the image sensor unit 10. In this embodiment, until the image data of the next frame is acquired, the image generation unit 20 executes predetermined processing (feature extraction and other processing needed for predicted image generation) on the image data already stored in the frame buffer. Details of the predicted image generation processing in the image generation unit 20 will be described later.

When the predicted image generation processing is started in step S183 and a predicted image has been generated, the CPU 140 advances the processing to step S184, which is performed by the image recognition unit 30.
In step S184, the image recognition unit 30 performs image recognition processing on the predicted image generated by the image generation unit 20. In this embodiment, the image recognition unit 30 executes the image recognition processing and outputs recognition results in units of segmented regions obtained by dividing the predicted image into a plurality of regions. As the result of the image recognition processing, the image recognition unit 30 outputs the coordinate values of the segmented regions on the predicted image, that is, coordinate values indicating pixel positions on the image.

Next, in the imaging parameter determination processing of step S185, the parameter determination unit 40 determines the values of the imaging parameters described above, which determine the operating characteristics of the image sensor unit 10. In this embodiment, the parameter determination unit 40 determines the imaging parameter values based on the coordinate values of the segmented regions on the image, calculated as the recognition result of the image recognition processing in step S184, and on the predicted image generated by the image generation processing in step S182. A specific example of how the imaging parameter values are determined will be described later.

Next, in the imaging parameter setting processing of step S186, the parameter setting unit 50 sets the imaging parameter values determined in step S185 in the image sensor unit 10.
When the setting of the imaging parameters in step S186 is complete, the CPU 140 checks in step S187 whether an end notification has been issued (for example, an interrupt signal to the CPU 140 triggered by an end instruction from the user). If no end notification has been issued, the CPU 140 returns the processing of the imaging device to the image acquisition of step S181 and continues with step S181 and the subsequent steps. If an end notification has been issued, the CPU 140 ends the processing of the flowchart of FIG. 2.
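The flow of steps S180 to S187 can be sketched as a loop around a five-frame buffer. This is an illustrative outline only: the function `run_capture_loop`, its callback arguments, and the use of an iteration limit in place of the end notification are assumptions, and the callbacks stand in for the image sensor unit, image generation unit, image recognition unit, and parameter determination unit.

```python
from collections import deque
from typing import Callable, Deque, List

def run_capture_loop(capture: Callable[[object], object],
                     predict: Callable[[List[object]], object],
                     recognize: Callable[[object], object],
                     decide: Callable[[object, object], object],
                     initial_params: object,
                     max_iterations: int,
                     n_frames: int = 5) -> List[object]:
    """Return the parameter values that were in effect at each capture."""
    params = initial_params                       # S180: initialization
    frames: Deque[object] = deque(maxlen=n_frames)
    applied: List[object] = []
    for _ in range(max_iterations):               # S187: stand-in for the end check
        applied.append(params)
        frames.append(capture(params))            # S181 capture, S182 buffer
        if len(frames) < n_frames:                # S183: not enough frames yet
            continue
        predicted = predict(list(frames))         # S183: predicted image
        regions = recognize(predicted)            # S184: recognition on prediction
        params = decide(regions, predicted)       # S185 determine, S186 set
    return applied

# Toy stand-ins: a "frame" is one brightness value that rises by 1 per capture.
counter = iter(range(100))
history = run_capture_loop(
    capture=lambda p: next(counter),
    predict=lambda fs: fs[-1] + (fs[-1] - fs[-2]),   # linear extrapolation
    recognize=lambda img: img,
    decide=lambda regions, img: {"gain": -img},
    initial_params={"gain": 0},
    max_iterations=7,
)
```

In this toy run the initial parameters stay in effect for the first five captures, after which each capture uses parameters derived from the predicted next frame rather than from the last observed one.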

In the above description, the image generation unit 20 starts outputting one frame of predicted image after five frames of image data have been input, but the invention is not limited to this example. For example, the image generation unit 20 may start outputting one frame of predicted image after any n past frames of image data have been input, where n is 1 or more. When n is 1, the image generation unit 20 generates the predicted image from a single past frame of image data. Also, although the image generation unit 20 outputs one frame of predicted image in this embodiment, it may generate and output multiple frames of predicted images. The multiple predicted images may be predicted images of several frames in chronological order, or several predicted images corresponding to the same frame time. Several predicted images for the same frame time may be, for example, multiple candidate predictions ranked by the probability predicted from the captured images (for example, in descending order of predicted probability).

Next, the image generation processing executed by the image generation unit 20 will be described in detail with reference to FIG. 3. FIG. 3 schematically shows the configuration of a neural network (hereinafter, NN) that implements the image generation processing for generating a predicted image.
The NN disclosed in detail in Reference 1 below can be used as the NN in this embodiment, and the image generation unit 20 has a function of generating a predicted image from multiple frames of image data by applying this NN. This embodiment assumes that the method disclosed in Reference 1 is used to implement the image generation processing, but it is not limited to that method. For example, a method such as the one disclosed in Reference 2 below may be used. The image generation processing of this embodiment may also be implemented with methods other than those disclosed in References 1 and 2.

Reference 1: Xingjian Shi, Zhourong Chen, Hao Wang, Dit-Yan Yeung, "Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting", arXiv 2015

Reference 2: William Lotter, Gabriel Kreiman, David Cox, "Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning", arXiv 2017

The image generation unit 20 temporarily stores the image data X_t input from the image sensor unit 10 in its internal frame buffer and then inputs it to the convolutional LSTM module, as shown in FIG. 3. The convolutional LSTM module is an NN that processes time-series data while taking the temporal information into account. It extends the widely used LSTM (long short-term memory) to time-series image data. Reference 1 describes the convolutional LSTM in detail.

Processing in the convolutional LSTM is expressed by the following equation (1). As noted above, the convolutional LSTM is an NN that can produce a desired output after being trained in advance on time-series image data. In this embodiment, training is performed in advance so that, based on equation (1), the predicted image X_5 of the next frame is generated from the time-series image data X_0 to X_4 of the past five frames.
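For reference, the convolutional LSTM of Reference 1, whose symbols match those defined below, is usually written as follows. This is a sketch under the assumption that equation (1) follows Reference 1 term for term; here \sigma denotes the logistic sigmoid, * the convolution operation, and \circ the Hadamard product.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```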

The symbols in equation (1) have the following meanings. The Hadamard product is indicated within equation (1).
X: input data
C: cell output data
H: hidden layer state
i: input gate
f: forget gate
o: output gate
t: time (frame)
W_xo: input weight matrix (output gate)
W_ho: state weight matrix (output gate)
W_co: cell output weight matrix (output gate)
b_o: bias (output gate)
W_xc: input weight matrix (cell output)
W_hc: state weight matrix (cell output)
b_c: bias (cell output)
W_xf: input weight matrix (forget gate)
W_hf: state weight matrix (forget gate)
W_cf: cell output weight matrix (forget gate)
b_f: bias (forget gate)
W_xi: input weight matrix (input gate)
W_hi: state weight matrix (input gate)
W_ci: cell output weight matrix (input gate)
b_i: bias (input gate)
*: convolution operation

Although the image generation unit 20 of this embodiment is configured using the convolutional LSTM as described above, it is not limited to this. Various methods of generating a predicted image have been proposed besides the one described above, such as that of Non-Patent Document 2, and other methods may also be used.

In the NN configuration shown in FIG. 3, the convolutional LSTM outputs one frame of predicted image based on the images of the past five frames. In the example of FIG. 3 as well, the image generation processing may output one frame of predicted image based on the images of any n past frames, or may output multiple frames of predicted images. The image generated by the convolutional LSTM is temporarily stored in a frame buffer and then input to the image recognition processing of step S184 in FIG. 2.
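One time step of a convolutional LSTM in the style of Reference 1 can be sketched in plain Python as follows. This is a single-channel toy under stated assumptions, not the network of the embodiment: the helper names `conv_same`, `sigmoid`, and `convlstm_step`, the parameter dictionary layout, the scalar biases, and the zero padding are all choices made for illustration, and the convolution is computed without flipping the kernel, as is common in NN implementations.

```python
import math
from typing import Any, Dict, List, Tuple

Grid = List[List[float]]

def conv_same(x: Grid, w: Grid) -> Grid:
    """Zero-padded 2-D cross-correlation; the output has the same size as x."""
    rows, cols = len(x), len(x[0])
    kh, kw = len(w), len(w[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for k in range(kh):
                for l in range(kw):
                    rr, cc = r + k - kh // 2, c + l - kw // 2
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[r][c] += w[k][l] * x[rr][cc]
    return out

def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))

def convlstm_step(x: Grid, h_prev: Grid, c_prev: Grid,
                  p: Dict[str, Any]) -> Tuple[Grid, Grid]:
    """One time step of a single-channel ConvLSTM.

    p maps "W_x?" and "W_h?" to convolution kernels, "W_c?" to peephole
    weights with the same shape as x, and "b_?" to scalar biases.
    """
    rows, cols = len(x), len(x[0])
    # Convolutional contributions of the input and the previous hidden state.
    z = {g: [[a + b for a, b in zip(ra, rb)]
             for ra, rb in zip(conv_same(x, p["W_x" + g]),
                               conv_same(h_prev, p["W_h" + g]))]
         for g in "ifco"}
    h_t = [[0.0] * cols for _ in range(rows)]
    c_t = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            i = sigmoid(z["i"][r][c] + p["W_ci"][r][c] * c_prev[r][c] + p["b_i"])
            f = sigmoid(z["f"][r][c] + p["W_cf"][r][c] * c_prev[r][c] + p["b_f"])
            c_t[r][c] = f * c_prev[r][c] + i * math.tanh(z["c"][r][c] + p["b_c"])
            o = sigmoid(z["o"][r][c] + p["W_co"][r][c] * c_t[r][c] + p["b_o"])
            h_t[r][c] = o * math.tanh(c_t[r][c])
    return h_t, c_t

# Usage with all-zero weights: every gate evaluates to 0.5.
zeros3 = [[0.0] * 3 for _ in range(3)]
zeros2 = [[0.0] * 2 for _ in range(2)]
params: Dict[str, Any] = {"W_x" + g: zeros3 for g in "ifco"}
params.update({"W_h" + g: zeros3 for g in "ifco"})
params.update({"W_c" + g: zeros2 for g in "ifo"})
params.update({"b_" + g: 0.0 for g in "ifco"})
h1, c1 = convlstm_step(zeros2, zeros2, [[2.0, 2.0], [2.0, 2.0]], params)
```

Feeding five frames through such a step in sequence and decoding the final hidden state is one way to obtain a next-frame prediction; the decoding stage is omitted here.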

Next, the image recognition processing of step S184 executed by the image recognition unit 30 will be described in detail with reference to FIG. 4. FIG. 4 schematically shows a configuration example of an NN that performs predetermined image recognition processing on an input predicted image.
The NN shown in FIG. 4 is composed of a hierarchical convolutional NN module 190 and a hierarchical deconvolutional NN module 191, and each layer is composed of convolution operation elements 401 to 408. An NN of the kind shown in FIG. 4 is disclosed in detail in Reference 3 below. Using an NN such as that of FIG. 4, the image recognition unit 30 has a function of executing, on the input predicted image data, image recognition processing for each object in the image (processing that yields segmented regions as the recognition result).

Reference 3: Hyeonwoo Noh, Seunghoon Hong, Bohyung Han, "Learning Deconvolution Network for Semantic Segmentation", ICCV 2015

The image recognition unit 30 temporarily stores the input predicted image data in its internal frame buffer, inputs it to the hierarchical convolutional NN module 190 shown in FIG. 4, and then inputs the result to the hierarchical deconvolutional NN module 191. NNs composed of a hierarchical convolutional NN module 190 and a hierarchical deconvolutional NN module 191 are widely used in image processing. Their application to region segmentation, as in this embodiment, is described in detail in Reference 3.

The processing in the hierarchical convolutional NN module 190 is expressed by the following equation (2).

The symbols in equation (2) have the following meanings.
x: input data
y: output data
w: convolution filter
v: height of the convolution filter
h: width of the convolution filter
m: feature plane index of the preceding layer
n: feature plane index of the subsequent layer
i: y-coordinate of the calculation position in the preceding layer
j: x-coordinate of the calculation position in the preceding layer
i': y-coordinate of the calculation position in the subsequent layer
j': x-coordinate of the calculation position in the subsequent layer
k: y-coordinate within the convolution filter
l(t): x-coordinate within the convolution filter
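A convolution layer over feature planes, using the index names from the symbol list above, can be sketched as follows. The function name `conv_layer`, the absence of padding, the stride of 1, and the omission of bias and activation are assumptions made for this illustration; the embodiment's equation (2) may differ in those details.

```python
from typing import List

Plane = List[List[float]]

def conv_layer(x: List[Plane], w: List[List[Plane]]) -> List[Plane]:
    """x[m][i][j] are input feature planes, w[n][m][k][l] are filters.

    Returns the output planes y[n], computed without padding, with stride 1.
    """
    v, h = len(w[0][0]), len(w[0][0][0])          # filter height and width
    out_rows = len(x[0]) - v + 1
    out_cols = len(x[0][0]) - h + 1
    y = []
    for n in range(len(w)):                        # each output feature plane
        plane = [[0.0] * out_cols for _ in range(out_rows)]
        for i in range(out_rows):
            for j in range(out_cols):
                # Sum over input planes m and filter coordinates (k, l).
                plane[i][j] = sum(
                    w[n][m][k][l] * x[m][i + k][j + l]
                    for m in range(len(x))
                    for k in range(v)
                    for l in range(h))
        y.append(plane)
    return y

x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]      # one 3x3 input plane
w = [[[[1, 0], [0, 1]]]]                      # one output plane, one 2x2 filter
y = conv_layer(x, w)
```

With this diagonal filter each output value is the sum of an input pixel and its lower-right neighbour, so the 3x3 input shrinks to a 2x2 output.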

The processing in the hierarchical deconvolutional NN module 191 is expressed by equation (3).

The symbols in equation (3) have the following meanings.
x: input data (the feature plane of the preceding layer, enlarged by inserting zero values)
y: output data
w: deconvolution filter
v: height of the deconvolution filter
h: width of the deconvolution filter
m: feature plane index of the preceding layer
n: feature plane index of the subsequent layer
i: y-coordinate of the calculation position in the preceding layer
j: x-coordinate of the calculation position in the preceding layer
i': y-coordinate of the calculation position in the subsequent layer
j': x-coordinate of the calculation position in the subsequent layer
k: y-coordinate within the deconvolution filter
l(t): x-coordinate within the deconvolution filter
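The note that the input is a feature plane enlarged by zero insertion suggests the usual two-stage view of deconvolution: spread the values apart with zeros, then convolve. The sketch below shows that idea for a single plane. The names `zero_insert`, `conv_same`, and `deconv_layer`, the stride of 2, and the zero padding are assumptions for illustration and are not taken from equation (3).

```python
from typing import List

Plane = List[List[float]]

def zero_insert(plane: Plane, stride: int = 2) -> Plane:
    """Enlarge a plane by placing each value on a grid spaced `stride` apart."""
    rows, cols = len(plane), len(plane[0])
    out = [[0.0] * (cols * stride) for _ in range(rows * stride)]
    for r in range(rows):
        for c in range(cols):
            out[r * stride][c * stride] = plane[r][c]
    return out

def conv_same(x: Plane, w: Plane) -> Plane:
    """Zero-padded 2-D cross-correlation; the output has the same size as x."""
    rows, cols = len(x), len(x[0])
    kh, kw = len(w), len(w[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for k in range(kh):
                for l in range(kw):
                    rr, cc = r + k - kh // 2, c + l - kw // 2
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[r][c] += w[k][l] * x[rr][cc]
    return out

def deconv_layer(plane: Plane, w: Plane, stride: int = 2) -> Plane:
    """Zero-insertion upsampling followed by a filtering pass."""
    return conv_same(zero_insert(plane, stride), w)

up = deconv_layer([[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]])
```

With a 2x2 filter of ones, each input value is copied into a 2x2 block, so the layer reduces to nearest-neighbour upsampling; a trained filter would instead learn how to fill in the inserted zeros.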

本実施形態では、これら階層型畳み込みＮＮモジュール１９０と階層型逆畳み込みＮＮモジュール１９１から構成されるＮＮに対し、領域分割に関する教師データを有する画像データを用いた事前学習が行われることにより、所望の出力を得ることが可能となる。学習処理に関しては、一般的な学習手法を適用することが可能であり、詳細な学習手法に関する説明は省略する。 In this embodiment, the NN composed of the hierarchical convolutional NN module 190 and the hierarchical deconvolutional NN module 191 is trained in advance using image data accompanied by training data for region segmentation, which makes it possible to obtain the desired output. A general learning method can be applied to the learning process, so a detailed description of the learning method is omitted.

なお本実施形態における画像認識部３０は、前述したように階層型畳み込みＮＮモジュール１９０と階層型逆畳み込みＮＮモジュール１９１とから構成されるＮＮを用いて構成するものとしたが、この例に限られるものではない。領域分割を伴う画像認識処理を実現する手法に関しては、これ以外にも参考文献３を含む様々な手法が提案されており、その他の手法を用いたものであっても構わない。 Although the image recognition unit 30 in this embodiment is configured using the NN composed of the hierarchical convolutional NN module 190 and the hierarchical deconvolutional NN module 191 as described above, it is not limited to this example. Various other methods for realizing image recognition processing that involves region segmentation have been proposed, including that of Reference 3, and any of those methods may be used instead.

前述した図２のステップＳ１８４の画像認識処理において、画像認識部３０は、入力された予測画像に対して領域分割処理を実行し、予測画像内の物体毎に定義される分割領域を認識結果として算出する処理を行う。具体的には、画像認識部３０は、図５に示すように、予測画像５００の中に含まれる物体の各領域の境界線５１０に相当する画素の座標を出力する。 In the image recognition process of step S184 in FIG. 2 described above, the image recognition unit 30 executes region segmentation on the input predicted image and calculates, as the recognition result, the segmented regions defined for each object in the predicted image. Specifically, as shown in FIG. 5, the image recognition unit 30 outputs the coordinates of the pixels corresponding to the boundary lines 510 of the regions of the objects contained in the predicted image 500.
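One way such boundary coordinates could be derived is sketched below. It assumes the NN output has already been converted into a per-pixel integer label map; that map and the function name are hypothetical, since the text only states that boundary pixel coordinates are output.

```python
import numpy as np

def boundary_pixels(labels):
    """Return (row, col) coordinates of pixels whose right or lower neighbour has a different label."""
    edge = np.zeros(labels.shape, dtype=bool)
    # Horizontal transitions: compare each pixel with its right neighbour.
    edge[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    # Vertical transitions: compare each pixel with its lower neighbour.
    edge[:-1, :] |= labels[:-1, :] != labels[1:, :]
    return np.argwhere(edge)
```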

次に、前述したパラメータ決定部４０で実行されるステップＳ１８５の撮像パラメータ決定処理に関して説明する。 Next, the imaging parameter determination process of step S185, executed by the parameter determination unit 40 described above, will be described.
パラメータ決定部４０は、まず、予測画像内の物体毎に定義された領域情報と、予測画像のデータとを基にして、図５に示す予測画像５００の中の各領域内の輝度値の平均値を算出する。 The parameter determination unit 40 first calculates the average luminance value within each region of the predicted image 500 shown in FIG. 5, based on the region information defined for each object in the predicted image and on the predicted image data.

ここで撮像パラメータとして例えば露光パラメータを決定する場合、パラメータ決定部４０は、各領域内の輝度値の平均値のレンジに対応して事前に決定した適切な露光時間の値をＬＵＴ（ルックアップテーブル）として保持している。パラメータ決定部４０は、このＬＵＴを用い、算出された輝度値の平均値を基にして、画像センサ部１０の撮像センサの各領域に対応する画素回路毎の露光時間を決める露光パラメータを決定する。 When, for example, an exposure parameter is determined as the imaging parameter, the parameter determination unit 40 holds, as an LUT (lookup table), appropriate exposure time values determined in advance for each range of the average luminance value within a region. Using this LUT and the calculated average luminance values, the parameter determination unit 40 determines the exposure parameter, which sets the exposure time of each pixel circuit corresponding to each region of the image sensor of the image sensor unit 10.

また撮像パラメータとして例えばダイナミックレンジパラメータを決定する場合、パラメータ決定部４０は、前述のように算出した輝度値の平均値を基にして、画像センサ部１０の撮像センサの各領域に対応する画素回路毎のダイナミックレンジを決定する。この場合、パラメータ決定部４０は、各領域内の輝度値の平均値のレンジに対応して事前に決定した適切なダイナミックレンジの値をＬＵＴとして保持している。そして、パラメータ決定部４０は、そのＬＵＴを用い、算出された輝度値の平均値を基にして、画像センサ部１０の撮像センサの各領域に対応する画素回路毎のダイナミックレンジを決めるダイナミックレンジパラメータを決定する。 When, for example, a dynamic range parameter is determined as the imaging parameter, the parameter determination unit 40 determines the dynamic range of each pixel circuit corresponding to each region of the image sensor of the image sensor unit 10, based on the average luminance values calculated as described above. In this case, the parameter determination unit 40 holds, as an LUT, appropriate dynamic range values determined in advance for each range of the average luminance value within a region. Using that LUT and the calculated average luminance values, the parameter determination unit 40 determines the dynamic range parameter, which sets the dynamic range of each pixel circuit corresponding to each region of the image sensor of the image sensor unit 10.

また撮像パラメータとして例えばゲインパラメータを決定する場合、パラメータ決定部４０は、前述のように算出した輝度値の平均値を基にして、画像センサ部１０の撮像センサの出力値に対して各領域に対応する画素毎にゲイン値を決定する。この場合、パラメータ決定部４０は、各領域内の輝度値の平均値のレンジに対応して事前に決定した適切なゲイン値をＬＵＴとして保持している。そして、パラメータ決定部４０は、そのＬＵＴを用い、算出された輝度値の平均値を基にして、画像センサ部１０の撮像センサの各領域に対応する画素毎にゲイン値を決めるゲインパラメータを決定する。 When, for example, a gain parameter is determined as the imaging parameter, the parameter determination unit 40 determines, based on the average luminance values calculated as described above, a gain value to be applied to the output of the image sensor of the image sensor unit 10 for each pixel corresponding to each region. In this case, the parameter determination unit 40 holds, as an LUT, appropriate gain values determined in advance for each range of the average luminance value within a region. Using that LUT and the calculated average luminance values, the parameter determination unit 40 determines the gain parameter, which sets the gain value of each pixel corresponding to each region of the image sensor of the image sensor unit 10.
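The per-region averaging and LUT lookup described above can be sketched as follows. The luminance range edges, the exposure times, and the function names are hypothetical placeholders, since the text does not disclose the LUT contents; the same lookup pattern would apply to a dynamic range table or a gain table.

```python
import numpy as np

# Hypothetical LUT: boundaries between mean-luminance ranges (8-bit scale)
# and the exposure time in milliseconds assigned to each of the four ranges.
LUMA_EDGES = np.array([64, 128, 192])
EXPOSURE_MS = np.array([16.0, 8.0, 4.0, 2.0])

def region_means(luma, labels):
    """Mean luminance of every labelled region, returned as {label: mean}."""
    return {int(r): float(luma[labels == r].mean()) for r in np.unique(labels)}

def exposure_for(mean_luma):
    """Look up the exposure time for one region's mean luminance."""
    return float(EXPOSURE_MS[np.digitize(mean_luma, LUMA_EDGES)])

def exposure_parameters(luma, labels):
    """Exposure time per region, returned as {label: exposure_ms}."""
    return {r: exposure_for(m) for r, m in region_means(luma, labels).items()}
```

In this sketch a dark region (low mean luminance) is given a longer exposure than a bright one, which is the kind of mapping a pre-computed LUT would encode.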

その後、ステップＳ１８５の撮像パラメータ決定処理において、パラメータ決定部４０は、前述したように決定した撮像パラメータをステップＳ１８６の撮像パラメータ設定処理に対して出力する。 Then, in the imaging parameter determination process of step S185, the parameter determination unit 40 outputs the imaging parameters determined as described above to the imaging parameter setting process of step S186.
なお、フォーカスパラメータ、フレームレートパラメータ、撮像領域パラメータ、解像度パラメータについては後述する他の実施形態において説明する。 Note that the focus parameter, frame rate parameter, imaging region parameter, and resolution parameter are described in other embodiments below.

また本実施形態では、前述したように分割領域毎に撮像パラメータを決定する例を挙げて説明したが、この例に限定されるものではない。撮像パラメータ決定処理は、特定の領域より決定された撮像パラメータを全領域に対して使用する撮像パラメータとして決定する処理であっても良い。例えば図５において、予測画像５００の中の自動車の領域から決定された撮像パラメータを全領域に対する撮像パラメータとするものであっても良い。また、前述した分割領域毎に撮像パラメータを決定する場合と、全領域について使用する撮像パラメータを決定する場合とを切り替えても良い。例えば、所定のフレーム周期のみで全領域について使用する撮像パラメータを決定し、それ以外では分割領域毎に撮像パラメータを決定しても良い。 Although this embodiment has been described using an example in which imaging parameters are determined for each segmented region, it is not limited to this example. The imaging parameter determination process may instead adopt the imaging parameters determined from one specific region as the imaging parameters for the entire image. For example, in FIG. 5, the imaging parameters determined from the automobile region in the predicted image 500 may be used for the entire image. It is also possible to switch between determining imaging parameters for each segmented region and determining imaging parameters for the entire image. For example, imaging parameters for the entire image may be determined only at a predetermined frame period, with per-region imaging parameters determined for all other frames.
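The switching between per-region and whole-image parameters could look like the following minimal sketch. The period of 30 frames, the dictionary representation, and the function name are assumptions made for illustration only.

```python
def parameters_for_frame(frame_index, per_region, key_region, global_period=30):
    """Return {region: value} for one frame.

    On every `global_period`-th frame the value determined from `key_region`
    is applied to all regions; on other frames each region keeps its own value.
    """
    if frame_index % global_period == 0:
        return {region: per_region[key_region] for region in per_region}
    return dict(per_region)
```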

次に、パラメータ設定部５０で実行されるステップＳ１８６の撮像パラメータ設定処理に関して説明する。 Next, the imaging parameter setting process of step S186, executed by the parameter setting unit 50, will be described.
パラメータ設定部５０は、パラメータ決定部４０から入力された撮像パラメータを、画像センサ部１０に対して設定するような撮像パラメータ設定機能を有する。 The parameter setting unit 50 has an imaging parameter setting function that sets the imaging parameters input from the parameter determination unit 40 in the image sensor unit 10.
撮像パラメータとして露光パラメータが入力された場合、パラメータ設定部５０は、露光パラメータによって領域毎に決定された露光時間を基にして、ドライバー回路の、撮像センサの画素毎の露光時間制御に関わるレジスタに、露光時間を設定する。 When an exposure parameter is input as the imaging parameter, the parameter setting unit 50 writes the exposure time, determined for each region by the exposure parameter, into the registers of the driver circuit that control the exposure time of each pixel of the image sensor.

またダイナミックレンジパラメータが入力された場合、パラメータ設定部５０は、領域毎に決定されたダイナミックレンジを基にして、ドライバー回路の、撮像センサの画素毎のダイナミックレンジ制御に関わるレジスタに、ダイナミックレンジの値を設定する。 When a dynamic range parameter is input, the parameter setting unit 50 writes the dynamic range value, determined for each region, into the registers of the driver circuit that control the dynamic range of each pixel of the image sensor.

またゲインパラメータが入力された場合、パラメータ設定部５０は、領域毎に決定されたゲイン値を基にして、ドライバー回路の、画像センサの画素毎のゲイン値制御に関わるレジスタに、ゲイン値を設定する。 When a gain parameter is input, the parameter setting unit 50 writes the gain value, determined for each region, into the registers of the driver circuit that control the gain of each pixel of the image sensor.

そして画像センサ部１０は、前述したようにレジスタに設定された値（露光時間を示す値、ダイナミックレンジの値、およびゲイン値）に基づいて、ドライバー回路が撮像センサの動作を制御する。 In the image sensor unit 10, the driver circuit then controls the operation of the image sensor based on the values set in the registers as described above (the exposure time value, the dynamic range value, and the gain value).
なお本実施形態における撮像センサは、画素毎に露光時間、ダイナミックレンジ、およびゲイン値を制御可能な構成を有しており、それら露光時間、ダイナミックレンジ、およびゲイン値の設定を行うことができるとする。すなわち本実施形態における撮像センサは、画素毎にフォトダイオードのリセットタイミングを設定可能な構成を取ることにより、画素毎に露光時間を制御することが可能である。また、撮像センサは、画素毎に独立したＡ／Ｄ変換回路を構成することにより、ダイナミックレンジおよびゲイン値を画素毎に制御することが可能である。 Note that the image sensor in this embodiment is assumed to have a configuration in which the exposure time, dynamic range, and gain value can be controlled and set for each pixel. Specifically, the image sensor can control the exposure time per pixel because the reset timing of the photodiode can be set for each pixel. In addition, the image sensor can control the dynamic range and gain value per pixel because an independent A/D conversion circuit is provided for each pixel.
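Because the sensor is described as accepting per-pixel settings, region-wise values have to be expanded to pixel granularity before they are written. The following is a minimal sketch of that expansion, assuming a per-pixel label map and a {label: value} dictionary; the register layout of the driver circuit is not specified in the text and is not modelled here.

```python
import numpy as np

def per_pixel_map(labels, region_values, default=0.0):
    """Expand {label: value} into an array holding one value per pixel.

    Pixels whose label has no entry in `region_values` keep `default`.
    """
    out = np.full(labels.shape, default, dtype=float)
    for label, value in region_values.items():
        out[labels == label] = value
    return out
```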

また撮像パラメータ決定処理において、例えば特定の領域より決定された撮像パラメータを全領域に対して使用する撮像パラメータとして決定された場合、パラメータ設定部５０は、入力された撮像パラメータを撮像センサの全領域に対して設定する。具体的には、ドライバー回路の、撮像センサの全領域の制御に関わるレジスタに、それぞれの撮像パラメータを設定する。 When the imaging parameter determination process has adopted, for example, the imaging parameters determined from a specific region as the imaging parameters for the entire image, the parameter setting unit 50 sets the input imaging parameters for the entire area of the image sensor. Specifically, it writes each imaging parameter into the registers of the driver circuit that control the entire area of the image sensor.

以上説明したように、本実施形態の撮像装置は、過去の撮像画像から予測画像を生成し、その予測画像に対する画像認識処理結果を基にして撮像パラメータを制御する。これにより、実施形態の撮像装置によれば、過去の画像から推測した画像認識処理結果を基にして、将来の画像に対する撮像パラメータを制御する場合に生じ得る、撮像パラメータの不適合を防止することができる。すなわち本実施形態によれば、将来の画像に適合した撮像パラメータの制御を実現することが可能である。例えば、画像内の物体毎の領域が画像フレーム毎に変化する場合、過去の画像から得られる物体毎の領域は、将来の画像においてずれてしまう可能性があり、結果として撮像パラメータの調整領域もずれてしまうことが有り得る。これに対し、本実施形態の撮像装置では、予測画像の画像認識処理結果を基にして物体毎の領域の撮像パラメータを決定するため、画像内で物体毎の領域が画像フレーム毎に変化する場合でも、より適切な撮像パラメータを設定することが可能となる。 As described above, the imaging apparatus of this embodiment generates a predicted image from past captured images and controls the imaging parameters based on the result of image recognition processing on that predicted image. The imaging apparatus of this embodiment can thereby prevent the mismatch of imaging parameters that may arise when imaging parameters for a future image are controlled based on image recognition results inferred from past images. That is, this embodiment makes it possible to control the imaging parameters so that they suit the future image. For example, when the region of each object in the image changes from frame to frame, the per-object regions obtained from a past image may be displaced in a future image, and as a result the regions in which the imaging parameters are adjusted may be displaced as well. In contrast, the imaging apparatus of this embodiment determines the imaging parameters for each object region based on the image recognition result for the predicted image, so more appropriate imaging parameters can be set even when the per-object regions change from frame to frame.

なお前述したように、実施形態１では、予測画像に対する認識結果を基にして撮像センサの撮像パラメータを制御するが、予測画像の生成方法、予測画像に対する認識処理手法、認識処理結果に基づく撮像パラメータの決定および設定手法に特に限定はない。特に、認識処理結果に基づく撮像パラメータの決定および設定手法に関しては、下記の参考文献４のように認識処理結果に基づく手法が多数開示されており、それらは本実施形態における予測画像に対する認識処理結果に基づく手法に対しても同様に適用可能である。 As described above, Embodiment 1 controls the imaging parameters of the image sensor based on the recognition result for the predicted image, but there is no particular limitation on the method of generating the predicted image, the recognition processing method applied to the predicted image, or the method of determining and setting imaging parameters based on the recognition result. In particular, many methods of determining and setting imaging parameters based on recognition results have been disclosed, as in Reference 4 below, and they are equally applicable to the approach of this embodiment, which is based on the recognition result for the predicted image.

参考文献４：特開２０１１－１３００３１号公報 Reference 4: Japanese Patent Application Publication No. 2011-130031

＜実施形態２＞ <Embodiment 2>
次に、実施形態２について説明する。 Next, Embodiment 2 will be described.
実施形態１では、画像認識処理の結果を基に分割領域（又は画像全体の領域）に対する撮像パラメータを設定する例を挙げた。これに対し、実施形態２の場合は、図１に示した画像認識部３０が、画像認識処理として物体認識処理を実行する点と、パラメータ決定部４０が、物体認識処理で決定された認識対象物体の領域に対して所定の撮像パラメータを決定する点が、実施形態１と相違する。以下、実施形態１と相違する部分についてのみ説明を行い、その他の部分に関しては実施形態１と同様であるため説明を省略する。 Embodiment 1 gave an example in which imaging parameters for segmented regions (or for the entire image) are set based on the result of image recognition processing. Embodiment 2 differs from Embodiment 1 in that the image recognition unit 30 shown in FIG. 1 executes object recognition processing as the image recognition processing, and in that the parameter determination unit 40 determines predetermined imaging parameters for the region of the recognition target object determined by the object recognition processing. Only the differences from Embodiment 1 are described below; the remaining parts are the same as in Embodiment 1, and their description is omitted.

本実施形態の撮像装置において、図１の画像認識部３０は、画像生成部２０で生成された予測画像から、事前に設定された特定の物体を認識するような画像認識処理機能を有する。事前に設定された特定物体は、本実施形態では人体とするが、その他、自動車、自転車または信号機等、事前に学習した任意の物体を認識することが可能である。 In the imaging device of this embodiment, the image recognition unit 30 in FIG. 1 has an image recognition processing function that recognizes a specific object set in advance from the predicted image generated by the image generation unit 20. The specific object set in advance is a human body in this embodiment, but any other object that has been learned in advance, such as a car, a bicycle, or a traffic light, can be recognized.

図６は、本実施形態における撮像装置の動作を示すフローチャートである。 FIG. 6 is a flowchart showing the operation of the imaging apparatus in this embodiment.
図６において、ステップＳ１８０からステップＳ１８３の処理は、前述した実施形態１において述べた処理と同様であるため、それらの詳細な説明は省略する。 In FIG. 6, the processes of steps S180 to S183 are the same as those described in Embodiment 1, so their detailed description is omitted.

実施形態２の場合、ステップＳ１８３で予測画像の生成処理が開始されて予測画像が生成されると、ＣＰＵ１４０は、画像認識部３０にて行われるステップＳ１８８に処理を進める。 In Embodiment 2, when the predicted image generation process is started in step S183 and a predicted image is generated, the CPU 140 advances the process to step S188, which is performed by the image recognition unit 30.
ステップＳ１８８の物体認識処理に進むと、画像認識部３０は、画像生成部２０にて生成された予測画像に対して、人体の認識処理を実行する。そして、画像認識部３０は、画像認識処理の結果として、認識された人体に対して推定された矩形領域の画像上の座標値（左上画素位置と右下画素位置）を出力する。予測画像から人体を認識する処理の詳細は後述する。ステップＳ１８８の処理後は、ステップＳ１８５に進む。 In the object recognition process of step S188, the image recognition unit 30 executes human body recognition on the predicted image generated by the image generation unit 20. As the result of the image recognition processing, the image recognition unit 30 outputs the on-image coordinate values (upper-left and lower-right pixel positions) of the rectangular region estimated for the recognized human body. The details of recognizing a human body from the predicted image are described later. After step S188, the process advances to step S185.

ステップＳ１８５の撮像パラメータ決定処理に進むと、パラメータ決定部４０は、ステップＳ１８８の物体認識処理で得られた人体の矩形領域の座標値の情報と、ステップＳ１８２の画像生成処理で生成された予測画像とを基に、撮像パラメータ値を決定する。撮像パラメータ値の決定方法の詳細は後述する。 Proceeding to the imaging parameter determination process in step S185, the parameter determination unit 40 uses information on the coordinate values of the rectangular region of the human body obtained in the object recognition process in step S188 and the predicted image generated in the image generation process in step S182. Based on this, the imaging parameter values are determined. Details of the method for determining the imaging parameter values will be described later.

続いて、実施形態２の画像認識部３０で実行されるステップＳ１８８の画像認識処理に関して、図７を用いて詳細に説明する。図７は、入力された予測画像（Ｘ）に対して、所定の画像認識処理を実現するＮＮの構成例を模式的に示した図である。 Next, the image recognition process in step S188 executed by the image recognition unit 30 of the second embodiment will be described in detail using FIG. 7. FIG. 7 is a diagram schematically showing a configuration example of a NN that implements a predetermined image recognition process on an input predicted image (X).

本実施形態における画像認識処理は、認識処理技術として広く応用されている階層型畳み込みＮＮにより構成されており、前述の実施形態１で説明したＮＮに含まれる図４の階層型畳み込みＮＮモジュール１９０と同様の演算処理を実行する。階層型畳み込みＮＮモジュールの各層は畳み込み演算素子７０１～７０４により構成されている。 The image recognition process in this embodiment is configured by a hierarchical convolutional neural network that is widely applied as a recognition processing technology, and includes the hierarchical convolutional neural network module 190 in FIG. 4 included in the neural network described in the first embodiment above. Executes similar arithmetic processing. Each layer of the hierarchical convolution NN module is composed of convolution operation elements 701 to 704.

画像認識部３０は、入力された予測画像データを内部のフレームバッファに一旦格納した後、階層型畳み込みＮＮモジュールに入力し、人体を認識する処理を実行する。ここで、階層型畳み込みＮＮモジュールにおける処理は、一般的な物体認識処理と同様に、画像内の所定領域を順に走査することにより実行される。 The image recognition unit 30 temporarily stores the input predicted image data in an internal frame buffer, and then inputs it to a hierarchical convolutional neural network module to perform a process of recognizing a human body. Here, the processing in the hierarchical convolutional NN module is executed by sequentially scanning a predetermined region within an image, similar to general object recognition processing.

階層型畳み込みＮＮにおける処理は、実施形態１における階層型畳み込みＮＮにおける処理と同様に式（２）で表される。実施形態２の場合、人体の画像上の位置およびサイズに関する教師データを有する画像データを使用して事前に学習することにより、画像中の人体を認識することが可能である。 Processing in the hierarchical convolutional NN is expressed by equation (2) similarly to the processing in the hierarchical convolutional NN in the first embodiment. In the case of the second embodiment, it is possible to recognize a human body in an image by performing prior learning using image data having training data regarding the position and size of the human body on the image.

なお本実施形態における画像認識部３０は、階層型畳み込みＮＮモジュールを用いて構成するものとしたがこれに限るものではなく、特定の物体を認識する手法は、これ以外にも様々な手法が提案されており、他の手法が用いられても良い。 Although the image recognition unit 30 in this embodiment is configured using a hierarchical convolutional neural network module, the present invention is not limited to this, and various other methods have been proposed for recognizing a specific object. However, other methods may also be used.

図６のステップＳ１８８の物体認識処理において、画像認識部３０は、入力された予測画像に対して人体を認識する処理を実行し、その予測画像内の人体位置・サイズを推定する。具体的には、画像認識部３０は、図８に示すように、予測画像８００から、認識された人体に対して推定された矩形領域８１０の画像上の座標値（左上画素位置８１１と右下画素位置８１２）を出力する。 In the object recognition process of step S188 in FIG. 6, the image recognition unit 30 executes human body recognition on the input predicted image and estimates the position and size of the human body in that predicted image. Specifically, as shown in FIG. 8, the image recognition unit 30 outputs, from the predicted image 800, the on-image coordinate values (upper-left pixel position 811 and lower-right pixel position 812) of the rectangular region 810 estimated for the recognized human body.

また、画像認識部３０は、ステップＳ１８８の画像認識処理として、人体認識処理に加えて、さらに人体姿勢認識処理を実行することも可能である。人体姿勢認識処理は、前述した階層型畳み込みＮＮモジュールによって演算することが可能である。この場合、ステップＳ１８８の物体認識処理において、画像認識部３０は、前述のように推定した矩形領域の画像上の座標値に加えて、図９に示すように予測画像９００から人体各部位の位置９１０～９１５を示す各座標値を算出する。図９の例では、予測画像９００の人体の頭部中心位置９１１、胴体中心位置９１０、左手位置９１２、右手位置９１３、左足位置９１４、右足位置９１５のそれぞれの座標値が出力される。なお、認識する対象は人体の姿勢に限るものではなく、任意の物体の姿勢を認識する物体姿勢認識処理が実行されても良い。例えば自動車の姿勢を認識する場合は、ヘッドランプ位置、タイヤ位置、ルーフ位置等の各部位を認識することができる。 In addition to human body recognition, the image recognition unit 30 can also execute human body posture recognition as the image recognition processing of step S188. The posture recognition can be computed by the hierarchical convolutional NN module described above. In this case, in the object recognition process of step S188, the image recognition unit 30 calculates, in addition to the on-image coordinate values of the rectangular region estimated as described above, the coordinate values indicating the positions 910 to 915 of the parts of the human body from the predicted image 900, as shown in FIG. 9. In the example of FIG. 9, the coordinate values of the head center position 911, torso center position 910, left hand position 912, right hand position 913, left foot position 914, and right foot position 915 of the human body in the predicted image 900 are output. The recognition target is not limited to the posture of a human body; object posture recognition that recognizes the posture of any object may be executed. For example, when recognizing the posture of an automobile, parts such as the headlamp positions, tire positions, and roof position can be recognized.

次に、実施形態２においてパラメータ決定部４０で実行されるステップＳ１８５の撮像パラメータ決定処理に関して説明する。 Next, the imaging parameter determination process of step S185 executed by the parameter determination unit 40 in Embodiment 2 will be described.
パラメータ決定部４０は、図８の予測画像８００内の人体に関して推定された矩形領域８１０の画像上の座標値（左上画素位置８１１と右下画素位置８１２）の情報と、予測画像８００のデータを基にして、各領域内の輝度値の平均値を算出する。 The parameter determination unit 40 calculates the average luminance value within each region, based on the on-image coordinate values (upper-left pixel position 811 and lower-right pixel position 812) of the rectangular region 810 estimated for the human body in the predicted image 800 of FIG. 8 and on the data of the predicted image 800.
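For a rectangular region, the averaging step reduces to a mean over an array slice. A minimal sketch follows, assuming the luminance image is a 2-D array indexed as [row, column] and that both corner pixels are inclusive; the function name and the (x, y) corner convention are assumptions.

```python
import numpy as np

def rect_mean(luma, top_left, bottom_right):
    """Mean luminance inside an inclusive rectangle given as (x, y) corner pixels."""
    (x0, y0), (x1, y1) = top_left, bottom_right
    # Rows are indexed by y and columns by x; +1 makes the far corner inclusive.
    return float(luma[y0:y1 + 1, x0:x1 + 1].mean())
```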

露光パラメータを決定する場合、パラメータ決定部４０は、各領域内の輝度値の平均値のレンジに対応して事前に決定した適切な露光時間をＬＵＴとして保持している。そして、パラメータ決定部４０は、そのＬＵＴを用い、算出した輝度値の平均値を基にして、画像センサ部１０の撮像センサの各領域に対応する画素回路毎の露光時間を決めることで露光パラメータを決定する。 When determining the exposure parameter, the parameter determination unit 40 holds, as an LUT, appropriate exposure times determined in advance for each range of the average luminance value within a region. Using that LUT and the calculated average luminance values, the parameter determination unit 40 determines the exposure parameter by setting the exposure time of each pixel circuit corresponding to each region of the image sensor of the image sensor unit 10.

またダイナミックレンジパラメータを決定する場合、パラメータ決定部４０は、前述のように算出した輝度値の平均値を基にして、画像センサ部１０の撮像センサの各領域に対応する画素回路毎のダイナミックレンジを決定する。この場合、パラメータ決定部４０は、各領域内の輝度値の平均値のレンジに対応して事前に決定した適切なダイナミックレンジをＬＵＴとして保持している。そして、パラメータ決定部４０は、そのＬＵＴを用い、算出した輝度値の平均値を基にして、画像センサ部１０の撮像センサの各領域に対応する画素回路毎のダイナミックレンジを決めることでダイナミックレンジパラメータを決定する。 When determining the dynamic range parameter, the parameter determination unit 40 determines the dynamic range of each pixel circuit corresponding to each region of the image sensor of the image sensor unit 10, based on the average luminance values calculated as described above. In this case, the parameter determination unit 40 holds, as an LUT, appropriate dynamic ranges determined in advance for each range of the average luminance value within a region. Using that LUT and the calculated average luminance values, the parameter determination unit 40 determines the dynamic range parameter by setting the dynamic range of each pixel circuit corresponding to each region of the image sensor of the image sensor unit 10.

またゲインパラメータを決定する場合、パラメータ決定部４０は、前述のように算出した輝度値の平均値を基にして、画像センサ部１０の撮像センサの出力値に対して、各領域に対応する画素毎にゲイン値を決定する。この場合、パラメータ決定部４０は、各領域内の輝度値の平均値のレンジに対応して事前に決定した適切なゲイン値をＬＵＴとして保持している。パラメータ決定部４０は、そのＬＵＴを用い、算出した輝度値の平均値を基にして、画像センサ部１０の撮像センサの各領域に対応する画素毎にゲイン値を決めることでゲインパラメータを決定する。 When determining the gain parameter, the parameter determination unit 40 determines, based on the average luminance values calculated as described above, a gain value to be applied to the output of the image sensor of the image sensor unit 10 for each pixel corresponding to each region. In this case, the parameter determination unit 40 holds, as an LUT, appropriate gain values determined in advance for each range of the average luminance value within a region. Using that LUT and the calculated average luminance values, the parameter determination unit 40 determines the gain parameter by setting the gain value of each pixel corresponding to each region of the image sensor of the image sensor unit 10.

実施形態２においても、パラメータ決定部４０は、前述したようにして決定された各撮像パラメータを、ステップＳ１８６の撮像パラメータ設定処理として、パラメータ設定部５０に送る。 Also in the second embodiment, the parameter determining unit 40 sends each imaging parameter determined as described above to the parameter setting unit 50 as the imaging parameter setting process in step S186.

また実施形態２の場合、パラメータ決定部４０は、人体に対して推定された矩形領域の画像上の座標値（左上画素位置と右下画素位置）を基に、撮像パラメータとして、撮像領域パラメータ（撮像センサで撮像する領域）を決定することも出来る。そして、パラメータ設定部５０によって撮像領域パラメータが設定された場合、画像センサ部１０は、後述するように、その撮像領域パラメータにより定義される領域の画像のみを取得することになる。 In Embodiment 2, the parameter determination unit 40 can also determine, as an imaging parameter, an imaging region parameter (the region to be captured by the image sensor) based on the on-image coordinate values (upper-left and lower-right pixel positions) of the rectangular region estimated for the human body. When the imaging region parameter has been set by the parameter setting unit 50, the image sensor unit 10 acquires only the image of the region defined by that parameter, as described later.

また実施形態２において、パラメータ決定部４０は、人体に対して推定された矩形領域の画像上の座標値（左上画素位置と右下画素位置）を基にして、フォーカスを合わせる画素位置を算出することもできる。例えば、パラメータ決定部４０は、人体頭部の中心位置、例えば図１０に示すように人体に対して推定された矩形領域１０１１の上部から１／１０で且つ左右の中心位置１０１０を算出し、それを基に、フォーカス位置を決めるフォーカスパラメータを決定する。そして、パラメータ設定部５０によってフォーカスパラメータが設定された場合、画像センサ部１０は、そのフォーカスパラメータにより設定されるフォーカス位置でフォーカス合わせを行うことが可能となる。 In Embodiment 2, the parameter determination unit 40 can also calculate the pixel position on which to focus, based on the on-image coordinate values (upper-left and lower-right pixel positions) of the rectangular region estimated for the human body. For example, the parameter determination unit 40 calculates the center position of the human head, for example the position 1010 that lies one tenth of the way down from the top of the rectangular region 1011 estimated for the human body and at its horizontal center, as shown in FIG. 10, and based on it determines the focus parameter that sets the focus position. When the focus parameter has been set by the parameter setting unit 50, the image sensor unit 10 can focus at the focus position specified by that parameter.
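The head-position estimate described here is simple arithmetic on the two corner pixels. A minimal sketch, assuming (x, y) pixel coordinates with y increasing downward and integer rounding toward zero; the function name is hypothetical.

```python
def focus_position(top_left, bottom_right):
    """Pixel at the horizontal centre of the box, one tenth of its height below the top edge."""
    (x0, y0), (x1, y1) = top_left, bottom_right
    x = (x0 + x1) // 2
    y = y0 + (y1 - y0) // 10
    return x, y
```

For a box from (100, 50) to (200, 250), this gives the focus pixel (150, 70): halfway across the 100-pixel width and 20 pixels (one tenth of the 200-pixel height) below the top edge.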

実施形態２において、画像認識部３０は、人体認識処理に加えてさらに人体姿勢認識処理を実行した場合、事前に設定された部位、またはユーザに指定された部位に基づいて、入力された人体各部位の座標値の中から一つを選択することもできる。この場合、パラメータ決定部４０は、事前に設定された部位、またはユーザに指定された部位（例えば人体頭部中心位置）を、フォーカス位置を決めるフォーカスパラメータとして決定することも可能である。 In the second embodiment, when performing human body posture recognition processing in addition to human body recognition processing, the image recognition unit 30 recognizes each input human body based on a preset part or a part specified by the user. It is also possible to select one of the coordinate values of the part. In this case, the parameter determination unit 40 can also determine a preset region or a region specified by the user (for example, the center position of the human body head) as the focus parameter for determining the focus position.
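Selecting one body part as the focus target can be sketched as a dictionary lookup. The part names and the default of `"head_center"` are hypothetical labels for the six positions output by the posture recognition; the text does not define identifiers for them.

```python
def focus_from_pose(keypoints, part="head_center"):
    """Return the (x, y) coordinate of the preset or user-specified body part."""
    if part not in keypoints:
        raise KeyError(f"unknown body part: {part}")
    return keypoints[part]
```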

実施形態２の場合も、パラメータ設定部５０は、図６のステップＳ１８６の撮像パラメータ設定処理として、パラメータ決定部４０にて決定された撮像パラメータを、画像センサ部１０に対して設定する。 In the case of the second embodiment as well, the parameter setting section 50 sets the imaging parameters determined by the parameter determining section 40 for the image sensor section 10 as the imaging parameter setting process of step S186 in FIG.

例えば撮像領域パラメータを設定する場合、パラメータ設定部５０は、パラメータ決定部４０にて決定された撮像領域パラメータに基づき、画像センサ部１０のドライバー回路の、撮像センサの撮像領域制御に関わるレジスタに、撮像領域（ＲＯＩ）を設定する。 For example, when setting the imaging region parameter, the parameter setting unit 50 writes the imaging region (ROI), based on the imaging region parameter determined by the parameter determination unit 40, into the registers of the driver circuit of the image sensor unit 10 that control the imaging region of the image sensor.
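Before an ROI is written to the sensor, the recognized rectangle typically has to be expressed in the form the hardware expects and kept within the sensor bounds. The sketch below assumes an (x, y, width, height) ROI format and inclusive corner pixels; both are illustrative assumptions, as the text does not describe the register format.

```python
def roi_from_box(top_left, bottom_right, sensor_width, sensor_height):
    """Clamp an inclusive (x, y) box to the sensor and return (x, y, width, height)."""
    x0 = max(0, min(top_left[0], sensor_width - 1))
    y0 = max(0, min(top_left[1], sensor_height - 1))
    x1 = max(x0, min(bottom_right[0], sensor_width - 1))
    y1 = max(y0, min(bottom_right[1], sensor_height - 1))
    return x0, y0, x1 - x0 + 1, y1 - y0 + 1
```

Clamping matters here because a rectangle estimated on a predicted image can extend past the frame edge when the subject is moving out of view.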

またフォーカスパラメータを設定する場合、パラメータ設定部５０は、パラメータ決定部４０にて決定されたフォーカスパラメータに基づき、ドライバー回路の撮像センサのオートフォーカス制御に関わるレジスタに、フォーカスパラメータを設定する。 When setting the focus parameter, the parameter setting unit 50 writes the focus parameter determined by the parameter determination unit 40 into the registers of the driver circuit that control autofocus of the image sensor.
なお、パラメータ設定部５０で実行される、これら以外の撮像パラメータ設定処理に関しては、前述の実施形態１と同様であるため、それらの説明は省略する。 The other imaging parameter setting processes executed by the parameter setting unit 50 are the same as in Embodiment 1 described above, so their description is omitted.

そして画像センサ部１０では、前述の実施形態１と同様に、レジスタに設定された値に基づいて、ドライバー回路が撮像センサの動作を制御する。 In the image sensor unit 10, as in Embodiment 1 described above, the driver circuit controls the operation of the image sensor based on the values set in the registers.
なお、本実施形態における撮像センサは、前述同様に、画素毎に露光時間、ダイナミックレンジ、およびゲイン値を制御することが可能な構成を有するものとする。また、本実施形態における撮像センサは、設定された撮像領域に基づき、当該領域のみ撮像する機能を有する。例えば図１１に示すように、実施形態２における撮像センサは、ドライバー回路の制御に基づき、予測画像から認識した人体領域（図１１中の黒色矩形で囲まれた領域１１１０）のみを撮像し、画像データとして取得する。また本実施形態の撮像センサは、設定されたフォーカス位置に基づき、光学系の制御を含んだオートフォーカス処理を実現する機能を有する。 As before, the image sensor in this embodiment is assumed to have a configuration in which the exposure time, dynamic range, and gain value can be controlled for each pixel. The image sensor in this embodiment also has a function of capturing only the set imaging region. For example, as shown in FIG. 11, the image sensor in Embodiment 2 captures, under the control of the driver circuit, only the human body region recognized from the predicted image (the region 1110 enclosed by the black rectangle in FIG. 11) and acquires it as image data. The image sensor of this embodiment further has a function of performing autofocus processing, including control of the optical system, based on the set focus position.

以上説明したように、実施形態２の撮像装置は、過去の画像から生成した予測画像に対する画像認識結果を基に、将来の画像に対する撮像パラメータを制御する。これにより、前述の実施形態１同様に、過去の画像における認識結果を基にして将来の画像に対する撮像パラメータを制御する場合に生じ得る、撮像パラメータの不適合を防止することができる。実施形態２のように人物の画像認識を行う場合、人体等の物体の位置・サイズが画像フレーム毎に変化すると、過去の画像から推測される人体等の領域は、将来の画像においてずれてしまう可能性があり、撮像パラメータの調整範囲もずれる可能性がある。また、画像中の人体等のサイズおよび姿勢が変化する場合は、過去の画像から推定した領域に基づく特定部位の位置、および物体の姿勢に基づく特定部位の位置も、将来の画像においてずれてしまう可能性がある。これに対し、本実施形態では、予測画像を基にして将来の画像における人体等の領域および姿勢を推定できるため、人体等の位置・サイズおよび姿勢が画像フレーム毎に変化する場合でも、領域および姿勢に対して適切な撮像パラメータを設定可能となる。 As described above, the imaging apparatus of Embodiment 2 controls the imaging parameters for a future image based on the image recognition result for a predicted image generated from past images. As in Embodiment 1, this prevents the mismatch of imaging parameters that may arise when imaging parameters for a future image are controlled based on recognition results for past images. When a person is recognized as in Embodiment 2 and the position and size of an object such as a human body change from frame to frame, the region of the human body inferred from a past image may be displaced in a future image, and the range over which the imaging parameters are adjusted may be displaced as well. Likewise, when the size and posture of a human body or other object in the image change, the position of a specific part derived from a region estimated from past images, and the position of a specific part derived from the posture of the object, may also be displaced in a future image. In contrast, this embodiment can estimate the region and posture of the human body or other object in a future image from the predicted image, so appropriate imaging parameters can be set for the region and posture even when the position, size, and posture change from frame to frame.

Note that in this embodiment the imaging parameters of the image sensor unit are controlled based on the recognition results for the predicted image, but, as in Embodiment 1, the method for generating the predicted image, the recognition processing method applied to the predicted image, and the method for determining and setting imaging parameters based on the recognition results are not limited. As for determining and setting imaging parameters, the various methods that are based on recognition results can also be applied to the method of this embodiment, which is based on the recognition results for the predicted image.

<Embodiment 3>
Next, Embodiment 3 will be described. Embodiment 3 differs from Embodiment 2 in two respects: the image recognition unit 30 executes, as its recognition processing, scene recognition processing that recognizes what kind of scene is present, and the parameter determination unit 40 determines predetermined imaging parameters according to the scene recognized by the scene recognition processing. Only the differences from Embodiment 2 are described below. The remaining configuration is the same as in Embodiment 2, and its description is omitted.

In the imaging device of Embodiment 3, the image recognition unit 30 in FIG. 1 recognizes a specific scene set in advance from the predicted image generated by the image generation unit 20. In this embodiment the specific scene set in advance is, for example, "pedestrian jumping out", but it may be any other scene learned in advance, such as "car jumping out", "bicycle jumping out", or "pedestrian falling down". The image recognition unit 30 performs image recognition processing that recognizes these specific scenes.

FIG. 12 is a flowchart showing the operation of the imaging device in this embodiment.
The processing from step S180 to step S183 and from step S185 to step S187 in FIG. 12 is the same as in Embodiment 2, so a detailed description is omitted.

In Embodiment 3, when the predicted image generation processing is started in step S183 and the predicted image has been generated, the CPU 140 advances the processing to step S189, which is performed by the image recognition unit 30.
In the scene recognition processing of step S189, the image recognition unit 30 executes recognition processing for the "pedestrian jumping out" scene on the predicted image generated by the image generation unit 20. As the result of the image recognition processing, the image recognition unit 30 outputs a true/false flag indicating whether the "pedestrian jumping out" scene was recognized, and the on-image coordinate values (upper-left pixel position and lower-right pixel position) of the rectangular region estimated for the recognized scene. After step S189, the processing advances to step S185.

In the imaging parameter determination processing of step S185, the parameter determination unit 40 determines the imaging parameter value based on the recognition result for the specific scene (the true/false flag) calculated in the scene recognition processing of step S189. The imaging parameter value in this embodiment is the frame rate of the images acquired by the image sensor unit 10. The method for determining the frame rate parameter value is described in detail later.

Next, the scene recognition processing executed by the image recognition unit 30 will be described in detail.
The scene recognition processing in this embodiment is performed by a hierarchical convolutional neural network (NN), which is widely used as a recognition technique, and is executed with the same arithmetic processing as the hierarchical convolutional NN shown in FIG. 7 and described in Embodiment 2.

The image recognition unit 30 temporarily stores the input predicted image data in a frame buffer, then inputs the data to the hierarchical convolutional NN module and executes processing that recognizes the "pedestrian jumping out" scene. As with the hierarchical convolutional NN in Embodiment 2, the processing in the hierarchical convolutional NN module is expressed by equation (2). In Embodiment 3, the module is trained in advance on image data that carries training labels for the "pedestrian jumping out" scene, which makes it possible to recognize that scene in an image.

Note that although the image recognition unit 30 in this embodiment is configured using a hierarchical convolutional NN module, it is not limited to this example, and various other methods may be used to recognize a specific scene.

As described above, in the scene recognition processing of step S189 the image recognition unit 30 executes processing that recognizes the "pedestrian jumping out" scene on the input predicted image and estimates whether the scene is present in the predicted image. As shown in FIG. 13, the image recognition unit 30 outputs a true/false flag indicating whether the "pedestrian jumping out" scene is present in the predicted image 1300, and the on-image coordinate values (upper-left pixel position 1311 and lower-right pixel position 1312) of the rectangular region 1310 estimated for the recognized scene.

Next, the imaging parameter determination processing executed by the parameter determination unit 40 will be described.
The parameter determination unit 40 determines the frame rate of the images acquired by the image sensor unit 10 based on the true/false flag estimated for the presence or absence of the "pedestrian jumping out" scene in the predicted image, and on the on-image coordinate values of the rectangular region estimated for the recognized scene.

The parameter determination unit 40 holds, as a lookup table (LUT), frame rates determined in advance for the presence or absence (true/false flag) of the "pedestrian jumping out" scene, and determines the frame rate of the imaging sensor of the image sensor unit 10 based on whether the scene is present. For example, when the "pedestrian jumping out" scene is recognized in the predicted image (a true flag is input), the parameter determination unit 40 decides to change the frame rate of the image sensor unit 10 from its normal-operation value (60 fps) to 240 fps, and determines the changed frame rate as the frame rate parameter.
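The LUT described above can be pictured with a minimal sketch. The 60 fps and 240 fps values are the ones given as an example in the text; the names `FRAME_RATE_LUT` and `determine_frame_rate` are hypothetical and are not taken from the patent.

```python
# Hypothetical LUT: scene flag -> frame rate in fps.
# 60 fps (normal operation) and 240 fps (scene recognized) follow the
# example in the text; the identifiers are illustrative assumptions.
FRAME_RATE_LUT = {False: 60, True: 240}


def determine_frame_rate(scene_detected: bool) -> int:
    """Return the frame rate to set for the next captured frames."""
    return FRAME_RATE_LUT[bool(scene_detected)]
```

A table keeps the decision trivial to extend if further scenes, each with its own predetermined frame rate, are learned later.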

The parameter setting unit 50 sets the imaging parameter determined as described above (the frame rate parameter) in the image sensor unit 10 in the imaging parameter setting processing of step S186.
Note that in this embodiment the imaging parameter determined in the imaging parameter determination processing is the frame rate of the imaging sensor of the image sensor unit 10, but the parameter is not limited to this. For example, based on the on-image coordinate values of the rectangular region estimated for the "pedestrian jumping out" scene, the exposure time, dynamic range, and gain value may be determined for each pixel circuit corresponding to each region of the imaging sensor of the image sensor unit 10, as in Embodiment 2.
The imaging parameter setting processing executed by the parameter setting unit in step S186 of FIG. 12 is the same as in Embodiment 2, so its description is omitted.
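As an illustration of the per-region variant mentioned above, the following sketch builds a per-pixel parameter map from the rectangle output by scene recognition. It assumes the lower-right pixel position is inclusive, and the `PixelParams` type, its fields, and the numeric values used with it are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelParams:
    """Hypothetical per-pixel settings (exposure in microseconds, gain)."""
    exposure_us: int
    gain: float


def build_parameter_map(width, height, top_left, bottom_right, inside, outside):
    """Assign `inside` to pixels in the rectangle and `outside` elsewhere.

    The rectangle is given by (x, y) upper-left and lower-right pixel
    positions; the lower-right position is assumed to be inclusive.
    """
    x0, y0 = max(0, top_left[0]), max(0, top_left[1])
    x1, y1 = min(width - 1, bottom_right[0]), min(height - 1, bottom_right[1])
    return [
        [inside if (x0 <= x <= x1 and y0 <= y <= y1) else outside
         for x in range(width)]
        for y in range(height)
    ]
```

For example, `build_parameter_map(4, 3, (1, 1), (2, 2), PixelParams(500, 2.0), PixelParams(1000, 1.0))` gives the four pixels inside the rectangle the first setting and every other pixel the second.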

With scene recognition processing as well, the presence or absence of a scene and the region in which it is recognized, as estimated from past images, may differ from those in the future image, and the adjusted imaging parameters may then no longer fit the actual scene. In this embodiment, by contrast, the presence or absence of a specific scene and the region of the scene are determined from the predicted image. Imaging parameters can therefore be set for a more appropriate scene decision and region even when the scene changes from frame to frame. The parameter determination unit 40 may also perform parameter determination processing that controls the value of a specific imaging parameter, among the imaging parameters, according to the recognized scene. For example, when the "pedestrian jumping out" scene is recognized, in addition to determining a frame rate parameter that raises the frame rate as described above, an imaging area parameter that reads out the pedestrian's region may be determined. As a further example, a resolution parameter that raises the resolution of the imaging sensor may be determined when the "pedestrian jumping out" scene is recognized.

Note that in Embodiment 3 as well, as stated above, the method for generating the predicted image, the recognition processing method applied to the predicted image, and the method for determining and setting imaging parameters based on the recognition results are not limited. In particular, as in Embodiment 2, various methods can be applied both for determining and setting imaging parameters based on recognition results and for basing that determination on the recognition results for the predicted image.

<Embodiment 4>
Next, Embodiment 4 will be described.
FIG. 14 is a block diagram showing a configuration example of the imaging device according to Embodiment 4. As a comparison with the configuration example of FIG. 1 shows, the imaging device of Embodiment 4 shown in FIG. 14 adds a vector calculation unit 200 to the configuration of FIG. 1.

The vector calculation unit 200 has a motion vector detection function that detects (calculates) motion vectors from the images captured by the image sensor unit 10. The image generation unit 210 of Embodiment 4 has a function of generating a predicted image based on the input image data and the motion vectors that the vector calculation unit 200 detected (calculated) from that image data. The units other than the vector calculation unit 200 and the image generation unit 210 have the same functions as in Embodiment 1, and their detailed description is omitted.

Next, the operation of the imaging device of Embodiment 4, which has the configuration shown in FIG. 14, will be described. FIG. 15 is a flowchart showing the operation of the imaging device according to this embodiment.
Steps S180 and S181 in FIG. 15, and the processing from step S183 to step S187, are the same as in Embodiment 1, so a detailed description is omitted.
In Embodiment 4, after the processing of step S181, the CPU 140 advances the processing to step S201, which is performed by the vector calculation unit 200.

In step S201, the vector calculation unit 200 executes motion vector calculation processing, which calculates motion vectors from two temporally consecutive frames of image data acquired by the image sensor unit 10. As the result of this processing, the vector calculation unit 200 outputs a motion vector (containing the direction and the magnitude of the motion) for each pixel position of the image.
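The text does not specify how the motion vectors are computed. As one possible illustration, the sketch below uses exhaustive block matching with the sum of absolute differences (SAD), which is a substituted, well-known technique and not necessarily the method used by the vector calculation unit 200. It returns one vector per block rather than per pixel; a per-pixel field could be obtained by assigning each block's vector to its pixels. The function name, block size, and search radius are hypothetical.

```python
def block_motion_vectors(prev, curr, block=2, radius=1):
    """Estimate one (dx, dy) vector per block between two frames.

    `prev` and `curr` are equally sized 2-D lists of pixel intensities.
    For each block of `prev`, the displacement within +/- `radius` pixels
    that minimizes the SAD against `curr` is chosen.
    """
    height, width = len(prev), len(prev[0])
    vectors = {}
    for by in range(0, height - block + 1, block):
        for bx in range(0, width - block + 1, block):
            best = None
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    y0, x0 = by + dy, bx + dx
                    # Skip candidates that fall outside the frame.
                    if y0 < 0 or x0 < 0 or y0 + block > height or x0 + block > width:
                        continue
                    sad = sum(
                        abs(prev[by + j][bx + i] - curr[y0 + j][x0 + i])
                        for j in range(block)
                        for i in range(block)
                    )
                    # Prefer the smaller SAD; break ties toward the shorter vector.
                    key = (sad, abs(dx) + abs(dy))
                    if best is None or key < best[0]:
                        best = (key, (dx, dy))
            vectors[(bx, by)] = best[1]
    return vectors
```

The dictionary is keyed by the upper-left corner `(bx, by)` of each block, so the direction and magnitude of the motion at a block can be read directly from its `(dx, dy)` entry.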

Next, in step S202, the image generation unit 210 first stores the image data acquired by the image sensor unit 10 and the motion vector data calculated by the vector calculation unit 200 in an internal frame buffer, frame by frame. The image generation unit 210 then generates a predicted image based on the image data and the motion vector data.

The image generation processing of step S202 executed by the image generation unit 210 is described below. The image generation unit 210 differs from Embodiments 1 to 3 in that it generates the predicted image based on both the image data input from the image sensor unit 10 and the motion vector data calculated by the vector calculation unit 200.

The image generation processing in the image generation unit 210 of Embodiment 4 is easily realized by extending the input terminals of the NN module that performs the convolutional LSTM described in Embodiments 1 to 3. The arithmetic and learning processing executed by the image generation unit 210 is the same as in Embodiments 1 to 3. In this embodiment the method disclosed in Reference 1 above is assumed as the way of realizing the image generation processing, but, as in Embodiments 1 to 3, the processing is not limited to that method.
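One common way to "extend the input" of a convolutional network is to append the extra data as additional input channels. The sketch below illustrates only that idea, under the assumption that the motion vector field has the same height and width as the image; the patent does not state that the extension is done this way, and `stack_input_channels` is a hypothetical name.

```python
def stack_input_channels(image, motion):
    """Combine an H x W image with an H x W field of (dx, dy) vectors.

    Returns an H x W grid of (intensity, dx, dy) tuples, i.e. a
    channels-last layout with three channels per pixel position.
    """
    if len(image) != len(motion) or any(
        len(img_row) != len(mv_row) for img_row, mv_row in zip(image, motion)
    ):
        raise ValueError("image and motion field must have the same size")
    return [
        [(pixel, dx, dy) for pixel, (dx, dy) in zip(img_row, mv_row)]
        for img_row, mv_row in zip(image, motion)
    ]
```

With this layout the first layer of the network sees three values per pixel instead of one, while the rest of the computation is unchanged, which is consistent with the statement that the arithmetic and learning processing stay the same.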

In Embodiment 4, a predicted image is generated from past images and motion vectors, and the imaging parameters for a future image are controlled based on the recognition results for the predicted image. Embodiment 4 therefore also prevents the mismatch of imaging parameters that can occur when the imaging parameters for a future image are controlled based on recognition results generated from past images.

Note that this embodiment has been described using an example in which a vector calculation unit (motion vector calculation processing) is added to the configuration of Embodiment 1, but a vector calculation unit (motion vector calculation processing) may likewise be added to Embodiments 2 and 3. These examples can be derived easily from this embodiment, and their detailed description is omitted.

This embodiment has also been described using an example in which the vector calculation unit 200 calculates the motion vector data and supplies it as input data to the image generation unit 210, but the image generation unit 210 can also predict (calculate) the motion vector data of a future image. Reference 5 discloses a method for calculating predicted motion vector data for a future image, and the image generation unit of this embodiment can also be realized using that method.

Reference 5: Xiaodan Liang, Lisa Lee, Wei Dai, Eric P. Xing, "Dual Motion GAN for Future-Flow Embedded Video Prediction", ICCV 2017

As a modification of this embodiment, the predicted motion vector information can also be input to the image recognition unit 30 together with the predicted image. This is easily realized by further extending the input terminals of the image recognition unit 30 described in Embodiments 1 to 3 so that they accept the predicted motion vector information.

According to this embodiment, generating the predicted image from motion vector data in addition to image data improves the accuracy of the predicted image, particularly for images that contain motion, so imaging parameters that better fit the future image can be set.

<Embodiment 5>
Next, Embodiment 5 will be described. FIG. 16 is a block diagram showing a configuration example of the imaging device according to Embodiment 5.
As a comparison with the configuration example of FIG. 1 shows, the imaging device of Embodiment 5 shown in FIG. 16 is provided with a plurality of the image recognition units 30 and RAMs 80 of Embodiments 1 to 3. The image recognition unit 30A in FIG. 16 handles the image recognition processing described in Embodiment 1 (region division processing), the image recognition unit 30B handles the image recognition processing of Embodiment 2 (object recognition processing), and the image recognition unit 30C handles the image recognition processing of Embodiment 3 (scene recognition processing).

FIG. 17 is a flowchart showing the operation of the imaging device according to Embodiment 5. The flowchart in FIG. 17 includes all of the processing described as image recognition processing in Embodiments 1 to 3: the region division processing of step S184, the object recognition processing of step S188, and the scene recognition processing of step S189.

In FIG. 17, when the predicted image has been generated in step S183, the CPU 140 advances the processing to the image recognition processing of step S184 (region division processing), of step S188 (object recognition processing), and of step S189 (scene recognition processing).

In step S184 the image recognition unit 30A executes image recognition processing (region division processing), in step S188 the image recognition unit 30B executes image recognition processing (object recognition processing), and in step S189 the image recognition unit 30C executes image recognition processing (scene recognition processing). After steps S184, S188, and S189, the processing advances to step S185, which is performed by the parameter determination unit 40. The image recognition processing in steps S184, S188, and S189 is the same as described in the respective embodiments above, so its description is omitted. As in Embodiment 2, object posture recognition processing can also be executed in Embodiment 5 in addition to the object recognition processing of step S188.

The imaging parameter determination processing of step S185 and the imaging parameter setting processing of step S186 in FIG. 17 may execute a suitable subset of the processing described in Embodiments 1 to 3, or may combine several of those processes. For example, for the dynamic range of a background region (such as a sky region), the imaging parameter value may be determined by the processing described in Embodiment 1. For the dynamic range of a human body region, the imaging parameter value may be determined by the processing described in Embodiment 2. For the frame rate, the imaging parameter value may be determined by the method described in Embodiment 3. It is of course also possible to add a vector calculation unit to the imaging device of Embodiment 5 and execute the processing based on motion vector data, as in Embodiment 4.
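The combination described above can be sketched as a single function that merges the three recognition results into one parameter set. This is a simplified illustration only: regions are represented as rectangles (the region division of Embodiment 1 is not necessarily rectangular), the dynamic range settings are placeholder strings, and all names are hypothetical. Only the 60 fps and 240 fps values are taken from the example in Embodiment 3.

```python
def determine_parameters(sky_box, body_box, scene_detected):
    """Merge three recognition results into one hypothetical parameter set.

    `sky_box` and `body_box` are ((x0, y0), (x1, y1)) rectangles or None.
    """
    params = {
        # Embodiment 3 style: the scene flag selects the frame rate.
        "frame_rate_fps": 240 if scene_detected else 60,
        "regions": [],
    }
    # Embodiment 1 style: background (e.g. sky) region from region division.
    if sky_box is not None:
        params["regions"].append({"box": sky_box, "dynamic_range": "sky_setting"})
    # Embodiment 2 style: human body region from object recognition.
    if body_box is not None:
        params["regions"].append({"box": body_box, "dynamic_range": "body_setting"})
    return params
```

Keeping each recognizer's contribution in its own branch mirrors the text: any subset of the three processes can be used, and each one governs only the parameter it was described with.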

In this way, in the imaging device according to Embodiment 5, the image generation unit 20 first generates a predicted image, and the image recognition units 30A, 30B, and 30C each execute image recognition processing on that predicted image. The parameter determination unit 40 then determines the imaging parameters according to the results of that image recognition processing. In Embodiment 5, multiple types of processing are executed as the image recognition processing. Moreover, once a predicted image has been generated, the image recognition processing applied to it is not restricted to a single type as it is in each of Embodiments 1 to 4. Embodiment 5 can therefore also execute processing other than the image recognition processes of Embodiments 1 to 4 and of this embodiment. For example, as recognition processing different from the image recognition processing described above, movement behavior detection processing that detects abnormal behavior of a person, an automobile, or the like can be executed. In Embodiment 5 as well, the imaging parameters described in Embodiments 1 to 4, or other imaging parameters, can be controlled in accordance with the image recognition processing that is executed. Here too, the imaging parameters described in Embodiments 1 to 4 are merely examples, and the present invention does not particularly limit the types of imaging parameters.

The configuration of the image processing device of each embodiment described above, and the processing of each flowchart, may be realized by a hardware configuration, or by a software configuration in which, for example, a CPU executes a program according to the embodiment. Part may also be realized by hardware and the rest by software. The program for the software configuration need not be prepared in advance. It may also be obtained from a recording medium such as an external memory (not shown) or via a network or the like (not shown).

A program that realizes one or more functions of the control processing according to the present invention can be supplied to a system or device via a network or a storage medium, and the functions can be realized by one or more processors of a computer in that system or device reading and executing the program.
The embodiments described above are merely concrete examples of implementing the present invention, and the technical scope of the present invention should not be interpreted restrictively on their basis. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

10: Image sensor unit, 20: Image generation unit, 30: Image recognition unit, 40: Parameter determination unit, 50: Parameter setting unit, 80, 160: RAM, 140: CPU, 150: ROM

Claims

An imaging device comprising:
imaging means that is equipped with an imaging sensor capable of setting imaging parameters for each pixel or region of an image, and that captures an image of a moving object;
image generation means for generating, based on the acquired image, a predicted image that predicts the image the imaging means will capture after the object moves;
image recognition means for performing image recognition processing on the predicted image generated by the image generation means;
parameter determining means for determining the imaging parameters of the imaging sensor for each pixel or region based on the position of the object in the predicted image obtained by the image recognition processing; and
parameter setting means for setting the determined imaging parameters for each pixel or region in the imaging sensor,
wherein the imaging means acquires an image according to the imaging parameters set by the parameter setting means.

The imaging device according to claim 1 , wherein the image recognition means executes object recognition processing to recognize a predetermined object in the predicted image.

The image recognition means executes scene recognition processing to recognize a specific scene in which the object makes a specific movement based on the predicted image ,
The parameter determining means determines the imaging parameters of the imaging sensor for each pixel or region based on the position of the object in the predicted image in the specific scene recognized by the scene recognition process. The imaging device according to claim 1 or 2.

The imaging apparatus according to any one of claims 1 to 3 , wherein the image recognition means executes a plurality of different image recognition processes.

The imaging device according to any one of claims 1 to 4, wherein the image generation means generates the predicted image from a plurality of images acquired by the imaging means.

The imaging device according to any one of claims 1 to 5 , wherein the image generation means generates a plurality of predicted images.

The imaging device according to any one of claims 1 to 6 , wherein the image generation means generates the predicted image using a neural network.

8. The imaging apparatus according to claim 1 , wherein the image recognition means executes the image recognition process using a neural network.

The imaging parameter determined by the parameter determining means is one or more of an exposure parameter, a focus parameter, a dynamic range parameter, a gain parameter, a frame rate parameter, an imaging area parameter, and a resolution parameter. The imaging device according to any one of claims 1 to 8 .

A method for controlling an imaging device, the method comprising:
an imaging step of acquiring an image of a moving object using an imaging sensor capable of setting imaging parameters for each pixel or region of an image;
an image generation step of generating, based on the acquired image, a predicted image that predicts the image the imaging sensor will capture after the object moves;
an image recognition step of performing image recognition processing on the predicted image generated in the image generation step;
a parameter determining step of determining the imaging parameters of the imaging sensor for each pixel or region based on the position of the object in the predicted image obtained by the image recognition processing; and
a parameter setting step of setting the determined imaging parameters for each pixel or region in the imaging sensor,
wherein, in the imaging step, an image is acquired according to the imaging parameters set in the parameter setting step.

A program for causing a computer to function as each means included in the imaging device according to any one of claims 1 to 9 .