JP2012164026A - Image recognition device and display device for vehicle

Info

Publication number: JP2012164026A
Application number: JP2011022060A
Authority: JP
Inventors: Hironori Sato; 弘規佐藤; Katsuyuki Imanishi; 勝之今西; Hiroaki Takeda; 浩章竹田; Naoyuki Aoki; 直之青木; Nobuhiko Inoue; 展彦井上
Original assignee: Denso Corp; Nippon Soken Inc
Current assignee: Denso Corp; Soken Inc
Priority date: 2011-02-03
Filing date: 2011-02-03
Publication date: 2012-08-30

Abstract

PROBLEM TO BE SOLVED: To provide a technology capable of performing recognition processing at high speed while maintaining high accuracy in recognizing an image of a recognition object from a captured image. SOLUTION: A night view device 100 is provided with an image recognition circuit 20 which includes: an image acquisition part 21 for acquiring a forward image 50 generated by a near-infrared camera 10; a cutout part 23 for sequentially cutting out cutout images 56 from the forward image 50; a first identification part 25 for determining, based on a first learning model 66, whether or not each of the plurality of cutout images 56 possibly includes a pedestrian image 51; a second identification part 27 for determining, based on a second learning model 77, whether or not the pedestrian image 51 is included in each cutout image 56 determined to possibly include it; and a recognition part 29 for recognizing the position in the forward image 50 of a cutout image 56 determined to include the pedestrian image 51 as the position where the pedestrian is photographed.

Description

The present invention relates to an image recognition device that recognizes an image representing a recognition object in a captured image generated by an imaging means, and to a vehicle display device including the image recognition device.

Conventionally, techniques are known for recognizing images of recognition objects, such as pedestrians and vehicles, that appear in a captured image generated by an imaging means such as a camera. For example, in the technique disclosed in Non-Patent Document 1, images of a predetermined size are cut out in sequence from an input image generated by an imaging means. Then, based on a learning model created from a plurality of sample images prepared as a reference for learning (see Figure 5 of Non-Patent Document 1), it is identified whether each cut-out image contains an image of a pedestrian or other object assumed as the recognition object. The position in the input image of an image identified as containing a pedestrian image is then recognized as the position where the pedestrian image appears (see Figure 4 and elsewhere in Non-Patent Document 1).
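The sequential cutting-out step described above can be sketched as a sliding window. This is a minimal illustration, not the implementation of Non-Patent Document 1 or of this publication: the function name `cut_out_windows`, the `step` parameter, and the list-of-rows image representation are all assumptions made for the example.

```python
from typing import Iterator, List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values (assumed format)


def cut_out_windows(image: Image, win_w: int, win_h: int,
                    step: int = 1) -> Iterator[Tuple[int, int, Image]]:
    """Yield (x, y, window) for every win_w x win_h window, scanning row by row."""
    height = len(image)
    width = len(image[0]) if height else 0
    for y in range(0, height - win_h + 1, step):
        for x in range(0, width - win_w + 1, step):
            # Copy the sub-image whose top-left corner is (x, y).
            window = [row[x:x + win_w] for row in image[y:y + win_h]]
            yield x, y, window


# Example: a 4x3 image scanned with a 2x2 window gives 3 x 2 = 6 windows.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11]]
windows = list(cut_out_windows(image, win_w=2, win_h=2))
```

Each yielded position (x, y) is what would later be reported as the location of the recognition object if the corresponding window is identified as containing it.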

In the technique disclosed in Non-Patent Document 1, the learning model serving as the classifier used for identification repeats a process of selecting the feature most common to the plurality of sample images and superimposes the selected features, thereby learning a combination of features capable of identifying the recognition object. Various methods have been disclosed for constructing a learning model by learning the features of a recognition object through repetition of such a process.

As such methods, for example, Patent Document 1 discloses a pattern recognition device that identifies a recognition object based on a learning model created by learning the features of sample images, such as numerals, using principal component analysis. Patent Document 2 discloses an image recognition device that recognizes an abnormal state of a monitored target based on a learning model created by learning the features of reference images in a normal state using a neural network. Patent Document 3 discloses a vehicle image recognition device that recognizes vehicle images based on an SVM classifier created by learning the features of learning samples, such as vehicle images, using a support vector machine (SVM). Further, Patent Document 4 discloses a pattern identification device that identifies a face image in an input image based on pattern identification parameters created by learning the features of face images and non-face images prepared as sample images.

Patent Document 1: Japanese Patent Laid-Open No. 05-40852
Patent Document 2: Japanese Patent Laid-Open No. 11-7536
Patent Document 3: JP 2007-235951 A
Patent Document 4: JP 2009-86749 A

Non-Patent Document 1: Paul Viola, Michael J. Jones, Daniel Snow, "Detecting Pedestrians Using Patterns of Motion and Appearance", in ICCV 2003 (Nice, France, 2003), pp. 734-741

In an input image supplied to the pattern identification device disclosed in Patent Document 4, the difference between an image that contains some object such as a face or torso, and may therefore contain a face image, and a background or the like that contains only non-faces clearly different from a face is distinct. These can therefore be distinguished even with simple pattern identification parameters that have learned only a small number of features. However, the difference between a face image, which contains an image of a face, and a non-face image, which contains no face but contains an image resembling one, is ambiguous. Detailed features must therefore be learned, so the pattern identification parameters inevitably become complex.

Consequently, when the pattern identification device described in Patent Document 4 is used to recognize pedestrians with the features disclosed in Non-Patent Document 1, the pattern identification parameters must be applied to every cutout image cut out in sequence from the captured image to identify whether it contains a pedestrian. As with the face recognition described above, pattern identification parameters that have learned the ambiguous difference between pedestrians and pedestrian-like non-pedestrians are complex. Identification based on those parameters therefore applies a large number of features even to cutout images that could be identified with only a few, making the processing long and redundant. As a result, the total amount of processing required to recognize an image of a recognition object such as a pedestrian in a captured image can become enormous.

The present invention has been made in view of the above problems, and its object is to provide an image recognition device capable of performing recognition processing at high speed while maintaining high accuracy in recognizing an image of a recognition object from a captured image, and a vehicle display device including the image recognition device.

To achieve the above object, the invention according to claim 1 is an image recognition device comprising: image acquisition means for acquiring a captured image generated by imaging means; cutout means for sequentially cutting out cutout images of a preset size from the captured image acquired by the image acquisition means; first identification means for identifying, for the plurality of cutout images cut out by the cutout means and based on a first learning model created from a first sample image group serving as a reference for learning, whether or not each cutout image possibly contains an image representing a preset recognition object; second identification means for identifying, for each cutout image identified by the first identification means as possibly containing an image of the recognition object and based on a second learning model created from a second sample image group serving as a reference for learning, whether or not the cutout image contains an image of the recognition object; and recognition means for recognizing the position in the captured image of a cutout image identified by the second identification means as containing an image of the recognition object as the position where the recognition object appears.

In general, among the cutout images cut out in sequence from the captured image by the cutout means, the difference between those that may contain an image of the preset recognition object and those that contain only non-recognition objects clearly different from the recognition object is more distinct than the difference between those that contain an image of the recognition object and those that contain a non-recognition object resembling it. The first learning model, which identifies whether a cutout image possibly contains an image representing the recognition object, only has to distinguish the recognition object from non-recognition objects clearly different from it, and so can be kept simple while maintaining high accuracy. The second learning model, by contrast, identifies whether a cutout image actually contains an image of the recognition object. Because it must distinguish the recognition object from non-recognition objects that resemble it, it inevitably becomes complex if high accuracy is to be maintained.

According to the invention of claim 1, the cutout images subjected to identification by the second identification means are narrowed down, through identification by the first identification means, to those that may contain an image of the recognition object. In this identification by the first identification means, cutout images that may contain an image of the recognition object are distinguished with high accuracy from those that clearly contain only non-recognition objects. Therefore, even though the cutout images are narrowed down by the first identification means, the accuracy of the overall process of recognizing an image of the recognition object from the captured image can be kept high.

In addition, identification by the first identification means based on the simple first learning model can be performed faster than identification by the second identification means based on the complex second learning model. Narrowing down the cutout images through the first identification means therefore reduces the number of cutout images subjected to the lengthy identification processing of the second identification means. As a result, the overall processing from the captured image to recognition of the image of the recognition object can be sped up.
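The speed-up argument above can be made concrete with a simple cost model. This is an illustrative sketch rather than anything stated in the source: the symbols N (number of cutout images), c_1 and c_2 (per-cutout cost of the first and second identification), and r (fraction of cutout images passed on by the first identification) are assumptions introduced here.

```latex
T_{\text{single}} = N c_2, \qquad
T_{\text{two-stage}} = N c_1 + r N c_2, \qquad
T_{\text{two-stage}} < T_{\text{single}} \iff c_1 < (1 - r)\, c_2 .
```

For example, with the purely illustrative values N = 1000, c_1 = 1, c_2 = 10 and r = 0.1, the single-stage cost is 10000 while the two-stage cost is 1000 + 1000 = 2000. The two-stage arrangement pays off whenever the first identification is cheap and rejects a large share of the cutout images.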

Accordingly, the image recognition device can perform recognition processing at high speed while maintaining high accuracy in recognizing an image of a recognition object from a captured image.
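The two-stage flow of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions: each cutout image is reduced to a single scalar feature, and the two learned models are replaced by hypothetical threshold functions (`maybe_object`, `is_object`); none of these names or values come from the source.

```python
from typing import Callable, Iterable, List, Tuple

Cutout = Tuple[int, int, float]  # (x, y, feature); the scalar feature is a stand-in


def recognize(cutouts: Iterable[Cutout],
              first_stage: Callable[[float], bool],
              second_stage: Callable[[float], bool]) -> List[Tuple[int, int]]:
    """Return the positions of cutouts accepted by both identification stages."""
    positions = []
    for x, y, feature in cutouts:
        if not first_stage(feature):   # cheap filter: clearly not the object
            continue
        if second_stage(feature):      # costly check, run on candidates only
            positions.append((x, y))
    return positions


# Hypothetical thresholds standing in for the first and second learning models.
def maybe_object(feature: float) -> bool:
    return feature > 0.3


def is_object(feature: float) -> bool:
    return feature > 0.8


cutouts = [(0, 0, 0.1), (8, 0, 0.5), (16, 0, 0.9), (24, 0, 0.2)]
found = recognize(cutouts, maybe_object, is_object)  # only (16, 0) passes both
```

In this example the second stage runs on two of the four cutouts, since the first stage has already discarded the other two.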

In the invention according to claim 2, the first sample image group includes a plurality of positive sample images showing an assumed object, that is, an object assumed as the recognition object, and a plurality of negative sample images showing non-assumed objects other than the assumed object. The first learning model is created by learning, as positive examples, the plurality of positive sample images together with those negative sample images that show a non-assumed object resembling the assumed object. The first identification means identifies whether each cutout image possibly contains an image of the recognition object according to whether the cutout image corresponds to the positive examples learned by the first learning model.

In this invention, within the first sample image group, the difference between positive sample images showing the assumed object and negative sample images showing a non-assumed object resembling the assumed object is ambiguous. Among the negative sample images, on the other hand, the difference between those showing a non-assumed object resembling the assumed object and those showing a non-assumed object that does not resemble it, such as background, is distinct. Therefore, by learning the negative sample images showing look-alike non-assumed objects as positive examples together with the positive sample images, the first learning model can be a simple learning model that contributes to faster processing by the first identification means.

Moreover, because the negative sample images showing a non-assumed object resembling the assumed object are included in the positive examples of the first learning model, cutout images containing an image of an object resembling the recognition object can correspond to the positive examples, in addition to cutout images containing an image of the recognition object itself. These are precisely the cutout images that may contain an image of the recognition object. By identifying the cutout images that correspond to the positive examples, the first identification means can therefore accurately narrow down the cutout images that may contain an image of the recognition object.

The second identification means then identifies, for the cutout images narrowed down by the first identification means, whether each contains an image of the recognition object, and can thereby exclude cutout images showing objects that merely resemble the recognition object. In this way, cutout images containing an image of the recognition object are identified accurately through the identification processing of the first and second identification means. The image recognition device can therefore reliably maintain high accuracy in recognizing an image of the recognition object from a captured image even though the recognition processing is sped up.
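The labeling scheme of claim 2 can be sketched as a relabeling step applied before the first learning model is trained. The sample tags ("positive", "similar_negative", "plain_negative"), the function name, and the example sample names are hypothetical and introduced only for illustration; how similarity to the assumed object is judged is not specified in this section.

```python
from typing import List, Tuple

# Each sample is (name, kind), where kind is one of the hypothetical tags
# "positive", "similar_negative", or "plain_negative".
Sample = Tuple[str, str]


def first_model_labels(samples: List[Sample]) -> List[Tuple[str, int]]:
    """Label samples for the first model: look-alike negatives count as positive (1)."""
    return [(name, 1 if kind in ("positive", "similar_negative") else 0)
            for name, kind in samples]


samples = [("pedestrian_a", "positive"),
           ("lookalike_b", "similar_negative"),
           ("background_c", "plain_negative")]
labels = first_model_labels(samples)
```

With this relabeling the first model only has to separate "object or look-alike" from "clearly something else", which is the distinct boundary described above.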

In the invention according to claim 3, the first sample image group includes a plurality of positive sample images showing an assumed object assumed as the recognition object, and a plurality of negative sample images showing non-assumed objects other than the assumed object. The first learning model is created by learning, as positive examples, the image group obtained by excluding from the plurality of positive sample images those showing an assumed object that resembles a non-assumed object, and by learning, as negative examples, the image group obtained by excluding from the plurality of negative sample images those showing a non-assumed object that resembles the assumed object. The first identification means identifies whether each cutout image possibly contains an image of the recognition object according to whether the cutout image is related to the positive examples learned by the first learning model.

As in this invention, within the first sample image group, the difference between images of an assumed object resembling a non-assumed object and images of a non-assumed object resembling the assumed object is ambiguous. Positive sample images showing an assumed object that resembles a non-assumed object and negative sample images showing a non-assumed object that resembles the assumed object are therefore excluded from the first sample image group. By creating the first learning model from such a first sample image group, the first learning model can reliably be a simple learning model that contributes to faster processing by the first identification means.

Moreover, because the negative sample images showing a non-assumed object resembling the assumed object are excluded from the negative examples of the first learning model, cutout images containing an image of an object resembling the recognition object can qualify as cutout images related to the positive examples, in addition to cutout images containing an image of the recognition object itself. By identifying the cutout images related to the positive examples, the first identification means can therefore accurately narrow down the cutout images that may contain an image of the recognition object.
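The sample selection of claim 3 can be sketched as a filter that drops the ambiguous samples on both sides before training. The tags ("positive", "ambiguous_positive", "similar_negative", "plain_negative"), the function name, and the sample names are hypothetical; the source does not state how ambiguity is determined.

```python
from typing import List, Tuple

Sample = Tuple[str, str]  # (name, kind) with hypothetical kind tags


def first_model_training_set(samples: List[Sample]) -> Tuple[List[str], List[str]]:
    """Return (positives, negatives) with the ambiguous samples on both sides removed."""
    positives = [name for name, kind in samples if kind == "positive"]
    negatives = [name for name, kind in samples if kind == "plain_negative"]
    return positives, negatives


samples = [("pedestrian_a", "positive"),
           ("pedestrian_unclear", "ambiguous_positive"),
           ("lookalike_b", "similar_negative"),
           ("background_c", "plain_negative")]
positives, negatives = first_model_training_set(samples)
```

Because look-alike negatives are never learned as negative examples, a model trained on this set has no reason to reject look-alike cutouts, so they remain candidates for the second identification.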

In the invention according to claim 4, the first sample image group includes a plurality of positive sample images showing an assumed object assumed as the recognition object, and a plurality of negative sample images showing non-assumed objects other than the assumed object. The first learning model learns, as positive examples, the plurality of positive sample images in the first sample image group. The first identification means identifies whether each cutout image possibly contains an image of the recognition object according to whether the cutout image is related to the positive examples learned by the first learning model.

In this invention, the first learning model learns, as positive examples, the plurality of positive sample images in the first sample image group. Because such a first learning model has not learned the ambiguous difference between assumed and non-assumed objects, it can reliably be a simple learning model that contributes to faster processing by the first identification means.

Such a first learning model has not learned, as negative examples, the negative sample images showing a non-assumed object resembling the assumed object. Cutout images containing an image of an object resembling the recognition object can therefore qualify as cutout images related to the positive examples, in addition to cutout images containing an image of the recognition object itself. By identifying the cutout images related to the positive examples, the first identification means can thus accurately narrow down the cutout images that may contain an image of the recognition object.
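Learning from positive examples only, as in claim 4, can be sketched with a very simple one-class model. The centroid-and-radius rule below is a stand-in chosen for illustration and is not the learning method of the source; the feature vectors, function names, and values are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

Model = Tuple[List[float], float]  # (centroid, radius)


def fit_positive_only(positives: List[Sequence[float]]) -> Model:
    """Fit a centroid and the largest positive-to-centroid distance as the radius."""
    dim = len(positives[0])
    centroid = [sum(p[i] for p in positives) / len(positives) for i in range(dim)]
    radius = max(math.dist(p, centroid) for p in positives)
    return centroid, radius


def may_contain_object(feature: Sequence[float], model: Model) -> bool:
    """A cutout is a candidate if its feature lies within the learned radius."""
    centroid, radius = model
    return math.dist(feature, centroid) <= radius


# Hypothetical 2-D features of positive sample images.
model = fit_positive_only([(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)])
near = may_contain_object((1.5, 1.0), model)   # close to the positives
far = may_contain_object((5.0, 5.0), model)    # far from every positive
```

Anything that looks like the positives, including look-alike objects, falls inside the radius and is handed to the second identification, which matches the behavior described above.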

In the invention according to claim 5, the second sample image group includes a plurality of positive sample images showing an assumed object assumed as the recognition object, and a plurality of negative sample images showing non-assumed objects other than the assumed object. The second learning model is created by learning the plurality of positive sample images as positive examples and the negative sample images as negative examples. The second identification means identifies whether a cutout image contains an image of the recognition object according to whether the cutout image corresponds to the positive examples learned by the second learning model.

According to this invention, the second learning model is created from a second sample image group that includes a plurality of positive sample images showing the assumed object and a plurality of negative sample images showing non-assumed objects, so it has learned the ambiguous difference between assumed and non-assumed objects in detail. By identifying the cutout images that correspond to the positive examples of the second learning model, the second identification means can therefore accurately exclude, from among the cutout images that may contain an image of the recognition object, those containing an image of an object that merely resembles it. Cutout images containing an image of the recognition object are thus identified accurately through the identification processing of the first and second identification means.

In the invention according to claim 6, the first identification means identifies whether each of the plurality of cutout images possibly contains an image of a pedestrian, the pedestrian being the recognition object, and the second identification means identifies, for each cutout image identified by the first identification means as possibly containing a pedestrian image, whether the cutout image contains the pedestrian image.

As in this invention, pedestrian images appearing in a captured image take various forms depending on the pedestrian's direction of movement and position relative to the imaging means. To identify accurately whether an image is a pedestrian image, the learning model must therefore learn the features of pedestrian images from a very large number of sample images. The learning model then becomes complex, and the identification processing may become even more redundant.

However, the image recognition device described above reduces the overall amount of processing needed to recognize the recognition object in the captured image by having the first identification means narrow down the cutout images subjected to identification by the second identification means. Even if the second learning model used by the second identification means becomes complex, the image recognition device can therefore recognize pedestrians at high speed while maintaining high accuracy. The configuration in which the first identification means narrows down the cutout images is thus particularly suitable for an image recognition device that recognizes pedestrian images as images of the recognition object.

The invention according to claim 7 is a vehicle display device that recognizes, from a captured image generated by imaging means that photographs the area around a vehicle, an image of a recognition object present in that area, and displays an emphasis image that emphasizes the image of the recognition object superimposed on the captured image, the vehicle display device comprising: the image recognition device according to any one of claims 1 to 6, which acquires the captured image from the imaging means; and display means for displaying the emphasis image superimposed on the captured image at the position where the recognition object recognized by the image recognition device appears.

As in this invention, a vehicle display device must promptly inform the operator of the vehicle of the presence of a recognition object located in the area around the vehicle. An image recognition device that recognizes an image of a recognition object in a captured image is therefore strongly required to perform recognition processing at high speed. Accordingly, an image recognition device that achieves high-speed recognition by narrowing down the cutout images through the identification processing of the first identification means, based on a simple first learning model, is particularly suitable for use in a vehicle display device to recognize recognition objects in the area around the vehicle.
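Superimposing an emphasis image at the recognized position, as in claim 7, can be sketched as drawing a rectangular outline onto a copy of the captured image. The list-of-rows image format, the function name `overlay_frame`, and the pixel value 255 are assumptions for illustration, and the sketch assumes the frame lies entirely inside the image.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values (assumed format)


def overlay_frame(image: Image, x: int, y: int, w: int, h: int,
                  value: int = 255) -> Image:
    """Return a copy of image with a w x h rectangular outline drawn at (x, y)."""
    out = [row[:] for row in image]        # leave the captured image untouched
    for col in range(x, x + w):
        out[y][col] = value                # top edge
        out[y + h - 1][col] = value        # bottom edge
    for row in range(y, y + h):
        out[row][x] = value                # left edge
        out[row][x + w - 1] = value        # right edge
    return out


# Example: outline a 3x3 region at (1, 1) in a blank 5x5 image.
blank = [[0] * 5 for _ in range(5)]
framed = overlay_frame(blank, x=1, y=1, w=3, h=3)
```

Only the outline pixels change, so the recognized object itself stays visible inside the frame.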

FIG. 1 is a configuration diagram schematically showing the configuration of a night view device according to a first embodiment of the present invention.
FIG. 2 is a diagram showing a forward image displayed on a liquid crystal display.
FIG. 3 is a diagram for explaining a first learning model according to the first embodiment, in which (a) explains the details of a first sample image group, and (b) explains how the first learning model is constructed by inputting the first sample image group into a learning model generation device.
FIG. 4 is a diagram for explaining a second learning model according to the first embodiment, in which (a) explains the details of a second sample image group, and (b) explains how the second learning model is constructed by inputting the second sample image group into the learning model generation device.
FIG. 5 is a flowchart showing the processing of an image recognition circuit that recognizes a pedestrian image from a forward image.
FIG. 6 is a diagram comparing the differences in processing amount and recognition rate for recognition with and without narrowing down of the cutout images by a first identification part.
FIG. 7 is a diagram for explaining a first learning model according to a second embodiment, showing a modification of FIG. 3.
FIG. 8 is a diagram for explaining a first learning model according to a third embodiment, showing another modification of FIG. 3.

Hereinafter, a first embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a configuration diagram schematically showing the configuration of a night view device 100 according to the first embodiment. The night view device 100 is mounted on a vehicle and recognizes an image of a recognition object, such as a pedestrian, present in the region ahead of the vehicle (hereinafter, pedestrian image 51). As shown in FIG. 2, the night view device 100 then displays a frame image 53 for emphasizing the pedestrian image 51, superimposed on the front image 50. The configuration of the night view device 100 is described in detail below with reference to FIGS. 1 and 2.

The night view device 100 is connected to a near-infrared camera 10. The night view device 100 includes a night view control circuit 15, which acquires the front image 50 generated by the near-infrared camera 10, and a liquid crystal display 40, which displays the front image 50 output from the night view control circuit 15.

The near-infrared camera 10 is installed facing the traveling direction of the vehicle and photographs, of the regions around the vehicle, the region ahead of it in particular. By detecting near-infrared light with wavelengths of roughly 0.7 to 2.5 micrometers, the near-infrared camera 10 can photograph the front region even in environments with little visible light. The vehicle carrying the near-infrared camera 10 is also fitted with a near-infrared projector 11, which is formed integrally with, for example, a headlight module and projects near-infrared light toward the front region. The near-infrared camera 10 is connected to the near-infrared projector 11 and, when it starts photographing the front region, causes the near-infrared projector 11 to start projecting. The near-infrared camera 10 detects the near-infrared light reflected by objects in the front region and thereby generates the front image 50 as a captured image. It sequentially outputs the generated front images 50 to the night view control circuit 15.

The night view control circuit 15 is connected to the near-infrared camera 10 and the liquid crystal display 40. It comprises an image recognition circuit 20, a drawing circuit 30, and other components. The image recognition circuit 20 and the drawing circuit 30 each comprise a processor that performs the various arithmetic operations for control and drawing, a flash memory that stores the programs and other data used in those operations, a RAM that serves as the work area for the operations, and the like.

By causing its processor to execute a predetermined program, the image recognition circuit 20 provides an image acquisition unit 21, a cutout unit 23, a first identification unit 25, a second identification unit 27, and a recognition unit 29 as functional blocks. The image acquisition unit 21 acquires the front image 50 generated by the near-infrared camera 10 and outputs it to the cutout unit 23.

The cutout unit 23 uses a cutout window 55 to cut out cutout images 56 of a preset size from the front image 50 one after another. It moves the cutout window 55 step by step in the horizontal and vertical directions of the front image 50 (see the dashed arrows in FIG. 2) and cuts out the image within the area enclosed by the window as a cutout image 56. When the cutout window 55 reaches the lower right, the cutout unit 23 changes the size of the cutout window 55 and again cuts out cutout images 56 from the front image 50. Regions of the front image 50 that mainly show the sky may be excluded from the range over which the cutout window 55 moves.
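The scanning performed by the cutout unit 23 can be sketched as a multi-scale sliding window. The following is only a minimal illustration: the function name `iter_windows`, the window sizes, the step width, and the `top_skip` parameter (standing in for the optional exclusion of the sky region) are assumptions and do not appear in the specification.

```python
from typing import Iterator, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def iter_windows(img_w: int, img_h: int,
                 sizes: Sequence[Tuple[int, int]] = ((32, 64), (48, 96)),
                 step: int = 8,
                 top_skip: int = 0) -> Iterator[Window]:
    """Yield cutout-window positions for every window size.

    For each size the window moves left to right, then top to bottom,
    until it reaches the lower right; then the next size is used.
    Rows above `top_skip` (hypothetical sky region) are never scanned.
    """
    for w, h in sizes:
        for y in range(top_skip, img_h - h + 1, step):
            for x in range(0, img_w - w + 1, step):
                yield (x, y, w, h)
```

Each yielded tuple identifies one cutout image 56; the same tuple can later serve as the position reported by the recognition unit 29.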

The first identification unit 25 determines, based on a first learning model 66, whether each of the cutout images 56 cut out by the cutout unit 23 may contain a pedestrian image 51. In this way the first identification unit 25 narrows down the cutout images 56 that will be subjected to identification by the second identification unit 27. First, the first identification unit 25 calculates Haar-like feature values for each cutout image 56. A Haar-like feature is a combination of a white rectangular region and a black rectangular region (see Figure 1 of Non-Patent Document 1). Various shapes are defined as Haar-like features. A Haar-like feature value is a value based on the difference between the average luminance of the white rectangular region and the average luminance of the black rectangular region of the feature.
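A Haar-like feature value of the kind described above can be sketched as follows. This is an illustrative example only: it uses one simple two-rectangle shape (white left half, black right half) out of the many possible shapes, and it computes rectangle sums with a summed-area table, a common technique that the specification does not itself mention. All function names are hypothetical.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2D list of luminances."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def rect_mean(sat, x, y, w, h):
    """Average luminance of the rectangle with top-left (x, y) and size w x h."""
    total = sat[y + h][x + w] - sat[y][x + w] - sat[y + h][x] + sat[y][x]
    return total / (w * h)


def haar_two_rect(sat, x, y, w, h):
    """Mean luminance of the white (left) half minus the black (right) half."""
    half = w // 2
    white = rect_mean(sat, x, y, half, h)
    black = rect_mean(sat, x + half, y, half, h)
    return white - black
```

With the table built once per image, each rectangle average costs four lookups regardless of rectangle size, which matters when many features are evaluated per cutout image.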

The first learning model 66 has learned the shapes, positions, and orientations of the Haar-like features that are effective for identifying images that may contain a pedestrian image 51, together with the thresholds used for identification. Based on the first learning model 66, the first identification unit 25 calculates the Haar-like feature value in a specific region of the cutout image 56 and compares it with the learned threshold. By repeating this feature-based identification, the first identification unit 25 determines whether each cutout image 56 corresponds to a positive example learned by the first learning model 66. It then identifies a cutout image 56 that corresponds to a positive example as possibly containing a pedestrian image 51.

For each cutout image 56 that the first identification unit 25 identified as possibly containing a pedestrian image 51, the second identification unit 27 determines, based on a second learning model 77, whether a pedestrian image 51 is actually contained. In this way the second identification unit 27 excludes, from the cutout images 56 identified as possibly containing a pedestrian image 51, those that show an object resembling a pedestrian. First, the second identification unit 27 calculates Haar-like feature values for each cutout image 56 that passed the narrowing by the first identification unit 25. The second learning model 77 has learned the shapes, positions, and orientations of the Haar-like features that are effective for identifying images containing a pedestrian image 51, together with the thresholds used for identification. Based on the second learning model 77, the second identification unit 27 calculates the Haar-like feature value in a specific region of the cutout image 56 and compares it with the learned threshold. By repeating this feature-based determination, the second identification unit 27 determines whether each cutout image 56 corresponds to a positive example learned by the second learning model 77. It then identifies a cutout image 56 that corresponds to a positive example as containing a pedestrian image 51.

The recognition unit 29 recognizes the position in the front image 50 of each cutout image 56 that the second identification unit 27 identified as containing a pedestrian image 51 as a position at which a pedestrian appears in the front image 50. Because the identification by the first identification unit 25 and the second identification unit 27 is repeated, the recognition unit 29 can recognize every position at which a pedestrian appears in a single front image 50. The recognition unit 29 outputs the position information of each recognized pedestrian image 51 in the front image 50 to the drawing circuit 30.

When a pedestrian image 51 has been recognized in the front image 50, the drawing circuit 30 performs compositing that superimposes the frame image 53 on the front image 50. Specifically, the drawing circuit 30 draws the frame image 53 on a predetermined layer based on the position information of the pedestrian image 51 acquired from the recognition unit 29. It then superimposes the layer on which the frame image 53 is drawn onto the layer on which the front image 50 is drawn, so that the frame image 53 surrounds the pedestrian image 51. The drawing circuit 30 sequentially outputs the front image 50 with the superimposed frame image 53 to the liquid crystal display 40. When the image recognition circuit 20 has not recognized a pedestrian image 51, the drawing circuit 30 sequentially outputs the front image 50 to the liquid crystal display 40 without compositing the frame image 53.
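The layer-based compositing described above can be pictured with the following minimal sketch. It assumes a grayscale image stored as a 2D list, a frame layer in which `None` marks transparent pixels, and positions given as hypothetical `(x, y, width, height)` tuples; none of these representations come from the specification.

```python
def draw_frame_layer(width, height, positions, value=255):
    """Draw a one-pixel rectangular frame for each position on an empty layer."""
    layer = [[None] * width for _ in range(height)]
    for x, y, w, h in positions:
        for cx in range(x, x + w):
            layer[y][cx] = value          # top edge
            layer[y + h - 1][cx] = value  # bottom edge
        for cy in range(y, y + h):
            layer[cy][x] = value          # left edge
            layer[cy][x + w - 1] = value  # right edge
    return layer


def composite(front_image, frame_layer):
    """Overlay the frame layer on the front image; None pixels stay transparent."""
    return [[pixel if frame is None else frame
             for pixel, frame in zip(img_row, layer_row)]
            for img_row, layer_row in zip(front_image, frame_layer)]
```

When `positions` is empty the layer is fully transparent, so the composited output equals the front image, which mirrors the case in which no pedestrian image is recognized.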

The liquid crystal display 40 is a dot-matrix display device capable of color display by controlling a plurality of pixels arrayed on its display surface. The liquid crystal display 40 is arranged inside the instrument panel in the vehicle cabin, with the front side shown in FIG. 2 facing the driver's seat. By continuously displaying the front images 50 that it sequentially acquires from the drawing circuit 30, the liquid crystal display 40 can show the viewer the state of the front region as a moving image.

Next, the first learning model 66 used by the first identification unit 25 and the second learning model 77 used by the second identification unit 27 are described in more detail with reference to FIGS. 1 to 4.

The first learning model 66 is a learning model for identifying whether a cutout image 56 may contain a pedestrian image 51. It is created by having a learning model generation device 80 learn a first sample image group 60 that serves as the reference for learning. The second learning model 77 is a learning model for identifying whether a cutout image 56 contains a pedestrian image 51. It is created by having the learning model generation device 80 learn a second sample image group 70 that serves as the reference for learning.

The first sample image group 60 and the second sample image group 70 each contain a plurality of positive sample images 61, 71 that show a pedestrian, the assumed recognition object, and a plurality of negative sample images 63, 73 that show something other than a pedestrian, such as a vehicle, a sign, or background (hereinafter, non-pedestrian). The positive sample images 61, 71 include positive sample images 61a, 71a in which a pedestrian can be clearly recognized, and positive sample images 61b, 71b that show a pedestrian resembling a non-pedestrian. Similarly, the negative sample images 63, 73 include negative sample images 63a, 73a in which it is clear that no pedestrian is shown, and negative sample images 63b, 73b that show a non-pedestrian resembling a pedestrian. The second sample image group 70 may be the same image group as the first sample image group 60 or a different one.

The learning model generation device 80 comprises a processor that performs various arithmetic operations, a storage medium such as a hard disk drive that stores the programs and other data used in those operations, a RAM that serves as the work area for the operations, and the like. When the first learning model 66 and the second learning model 77 are created, the learning model generation device 80 receives the first sample image group 60 or the second sample image group 70 together with a teacher signal indicating whether each sample image is a positive or a negative example. For each input sample image, the learning model generation device 80 calculates feature values while varying the shape, position, and orientation of the Haar-like feature, and accumulates the results. It then uses a boosting algorithm such as AdaBoost to learn the Haar-like features that are common to the sample images associated with a positive teacher signal and not shared by the sample images associated with a negative teacher signal.
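The boosting step can be illustrated with a compact discrete AdaBoost over threshold classifiers (decision stumps). This is a generic textbook-style sketch, not the procedure of the learning model generation device 80: it assumes that the Haar-like feature values have already been computed into a list per sample, that teacher signals are encoded as +1 and -1, and all names are hypothetical.

```python
import math


def stump(sample, index, threshold, polarity):
    """Weak classifier: vote `polarity` if the feature reaches the threshold."""
    return polarity if sample[index] >= threshold else -polarity


def train_adaboost(features, labels, rounds=3):
    """Return a list of (feature_index, threshold, polarity, alpha) tuples.

    features: one list of feature values per sample.
    labels:   +1 for a positive teacher signal, -1 for a negative one.
    """
    n = len(features)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustively search for the stump with the lowest weighted error.
        for j in range(len(features[0])):
            for thr in sorted({f[j] for f in features}):
                for pol in (1, -1):
                    err = sum(w for f, y, w in zip(features, labels, weights)
                              if stump(f, j, thr, pol) != y)
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol)
        err, j, thr, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((j, thr, pol, alpha))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * y * stump(f, j, thr, pol))
                   for f, y, w in zip(features, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, sample):
    """Weighted vote of all stumps: +1 (positive example) or -1 (negative)."""
    score = sum(alpha * stump(sample, j, thr, pol)
                for j, thr, pol, alpha in model)
    return 1 if score >= 0 else -1
```

Each selected stump corresponds loosely to one learned feature and threshold; a model that needs more rounds to separate its samples is, in the terms used later in this description, the more complex one.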

A characteristic of the first learning model 66 is that, among the negative sample images 63, the negative sample images 63b showing a non-pedestrian that resembles a pedestrian are learned as positive examples together with the positive sample images 61. Specifically, when a negative sample image 63b showing a pedestrian-like non-pedestrian is input to the learning model generation device 80, it is associated with a teacher signal indicating a positive example. As a result, in the identification by the first identification unit 25 based on the first learning model 66, not only cutout images 56 containing a pedestrian image 51 but also cutout images 56 showing an object that resembles a pedestrian are identified as corresponding to a positive example. Cutout images 56 containing a pedestrian image 51 and cutout images 56 showing a pedestrian-like object are, in other words, the cutout images 56 that may contain a pedestrian image 51. Therefore, by identifying the cutout images 56 that correspond to the positive examples learned by the first learning model 66, the first identification unit 25 can accurately narrow the candidates down to the cutout images 56 that may contain a pedestrian image 51.

In the second learning model 77, the positive sample images 71 are associated with teacher signals indicating positive examples. Unlike in the first learning model 66, the negative sample images 73 are all associated with teacher signals indicating negative examples. Therefore, in the identification by the second identification unit 27 based on the second learning model 77, cutout images 56 showing an object that resembles a pedestrian are excluded, and the cutout images 56 that correspond to a positive example, that is, those containing a pedestrian image 51, are identified.
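The difference between the teacher signals of the two learning models can be summarized in a short sketch. The enum names and the `"first"` / `"second"` model labels are hypothetical; only the labeling rule itself follows the description above.

```python
from enum import Enum


class SampleKind(Enum):
    CLEAR_PEDESTRIAN = "61a/71a"               # clearly a pedestrian
    AMBIGUOUS_PEDESTRIAN = "61b/71b"           # pedestrian resembling a non-pedestrian
    CLEAR_NON_PEDESTRIAN = "63a/73a"           # clearly no pedestrian
    PEDESTRIAN_LIKE_NON_PEDESTRIAN = "63b/73b" # non-pedestrian resembling a pedestrian


def teacher_signal(kind: SampleKind, model: str) -> int:
    """Return +1 (positive example) or -1 (negative example).

    model: "first" for the first learning model, "second" for the second.
    """
    if kind in (SampleKind.CLEAR_PEDESTRIAN, SampleKind.AMBIGUOUS_PEDESTRIAN):
        return 1
    # Only the first model treats pedestrian-like non-pedestrians as positive.
    if kind is SampleKind.PEDESTRIAN_LIKE_NON_PEDESTRIAN and model == "first":
        return 1
    return -1
```

The only sample category whose label differs between the two models is the pedestrian-like non-pedestrian, which is what makes the first stage permissive and the second stage strict.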

When the first learning model 66 and the second learning model 77 are compared, the second learning model 77 requires more searches of Haar-like features for identification and is the more complex model, for the following reason. The difference between an image showing a pedestrian that resembles a non-pedestrian and an image showing a non-pedestrian that resembles a pedestrian is ambiguous, so these images are difficult to tell apart. By contrast, the difference between an image that may contain a pedestrian and any other image, for example a background-only image containing nothing but non-pedestrians that are plainly different from a pedestrian, is relatively clear, so identification is easy. Because the first learning model 66 learns the negative sample images 63b, which show a pedestrian-like non-pedestrian, as positive examples, it is a learning model that distinguishes pedestrians from non-pedestrians that are plainly different from pedestrians. The first learning model 66 can therefore be kept simple, which contributes to faster identification by the first identification unit 25, while still maintaining high accuracy. The second learning model 77, on the other hand, must distinguish images showing a pedestrian that resembles a non-pedestrian from images showing a non-pedestrian that resembles a pedestrian. Because it is a learning model that distinguishes pedestrians from pedestrian-like non-pedestrians, it inevitably becomes complex if high accuracy is to be maintained.

The processing by which the night view device 100 described so far displays the front image 50 with the superimposed frame image 53 on the liquid crystal display 40 is described in detail with reference to FIG. 5. The processing described below is started by the image recognition circuit 20 when the user operates the switch for starting and stopping the night view device 100. The image recognition circuit 20 repeats the processing until operation of the night view device 100 is stopped by a user operation.

In S101, the front image 50 is acquired from the near-infrared camera 10, and the process proceeds to S102. In S102, a cutout image 56 is cut out from the front image 50 acquired in S101 using the cutout window 55, and the process proceeds to S103.

In S103, it is determined, based on the first learning model 66, whether the cutout image 56 cut out in S102 may contain a pedestrian image 51. If it is determined in S103 that the cutout image cannot contain a pedestrian image 51, the process proceeds to S106. If it is determined in S103 that the cutout image may contain a pedestrian image 51, the process proceeds to S104.

In S104, it is determined, based on the second learning model 77, whether the cutout image 56 identified in S103 as possibly containing a pedestrian image 51 actually contains a pedestrian. If it is determined in S104 that the cutout image does not contain a pedestrian image 51, the process proceeds to S106. If it is determined in S104 that the cutout image contains a pedestrian image 51, the process proceeds to S105.

In S105, the position at which the cutout image 56 was cut out in S102 is recognized as a position at which a pedestrian appears in the front image 50, and the process proceeds to S106. In S106, it is determined whether identification has been completed for all cutout images 56 to be cut out from the front image 50 acquired in S101. If it is determined in S106 that identification has not been completed for all cutout images 56, the process returns to S102 and the processing of S103 to S105 is performed on the next cutout image 56. If it is determined in S106 that identification has been completed for all cutout images 56, the process proceeds to S107.

In S107, the position information in the front image 50 of the pedestrian images 51 recognized in S105 is output to the drawing circuit 30, and the process returns to S101. The pedestrian recognition processing of S102 to S106 is then performed on the next front image 50 acquired. The drawing circuit 30, having acquired the position information of the pedestrian images 51 output in S107, superimposes the frame image 53 on the front image 50 according to that position information. The drawing circuit 30 then continuously outputs the front images 50 with the superimposed frame image 53 to the liquid crystal display 40. As a result, the state of the front region is shown on the liquid crystal display 40 as a moving image.
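The per-frame loop of S102 to S107 can be sketched as below. The two classifiers are passed in as plain callables, so this sketch shows only the control flow of the flowchart and not the learning models themselves; the function names, the 2D-list image format, and the `(x, y, width, height)` window tuples are assumptions.

```python
def crop(image, window):
    """Return the sub-image (list of rows) enclosed by window = (x, y, w, h)."""
    x, y, w, h = window
    return [row[x:x + w] for row in image[y:y + h]]


def recognize_pedestrians(front_image, windows, first_stage, second_stage):
    """Return the window positions judged to contain a pedestrian.

    first_stage(cutout)  -> True if the cutout may contain a pedestrian (S103)
    second_stage(cutout) -> True if the cutout contains a pedestrian (S104)
    """
    positions = []
    for window in windows:                 # S102, repeated until S106 is satisfied
        cutout = crop(front_image, window)
        if not first_stage(cutout):        # S103: cheap filter rejects most cutouts
            continue
        if not second_stage(cutout):       # S104: expensive check on survivors only
            continue
        positions.append(window)           # S105: record the position
    return positions                       # S107: handed to the drawing circuit
```

Because `second_stage` is called only for cutouts that pass `first_stage`, the number of expensive evaluations per frame equals the number of first-stage survivors rather than the number of windows.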

In the first embodiment described so far, the cutout images 56 subjected to identification by the second identification unit 27 are narrowed down, through identification by the first identification unit 25, to the cutout images 56 that may contain a pedestrian image 51. Because the first identification unit 25 distinguishes between images whose difference is clear, cutout images that may contain a pedestrian image 51 are separated with high accuracy from those that plainly contain only non-pedestrian images. Therefore, even though the cutout images 56 are narrowed down by the first identification unit 25, the accuracy of the overall process of recognizing pedestrian images 51 in the front image 50 can remain high.

In addition, identification by the first identification unit 25, which is based on the simple first learning model 66, can be performed faster than identification by the second identification unit 27, which is based on the complex second learning model 77. Narrowing down the cutout images 56 with the first identification unit 25 therefore reduces the number of cutout images 56 subjected to the lengthy identification processing of the second identification unit 27. As a result, even though some cutout images 56 are identified twice, once by the first identification unit 25 and once by the second identification unit 27, the overall process of recognizing pedestrian images 51 in the front image 50 can be made faster.

FIG. 6 compares the case in which the cutout images 56 are narrowed down as in the first embodiment with the case in which all cutout images 56 are subjected to identification by the second identification unit 27 without narrowing by the first identification unit 25. Narrowing down the cutout images 56 with the first identification unit 25 reduces the number of Haar-like feature searches performed by the first identification unit 25 and the second identification unit 27 by about 75 percent. In addition, the recognition rate remains high.
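How a cheap first stage can reduce the total number of feature searches may be illustrated with a simple cost model. The per-cutout costs and the pass rate used in the usage note below are purely hypothetical values chosen for illustration; they are not the data behind FIG. 6, and the function name is an assumption.

```python
def search_reduction(n_cutouts, cost_first, cost_second, pass_rate):
    """Fractional reduction in feature searches from adding a first stage.

    n_cutouts:   number of cutout images per front image
    cost_first:  feature searches per cutout in the first stage
    cost_second: feature searches per cutout in the second stage
    pass_rate:   fraction of cutouts that the first stage lets through
    """
    baseline = n_cutouts * cost_second                 # second stage on everything
    two_stage = (n_cutouts * cost_first                # first stage on everything
                 + n_cutouts * pass_rate * cost_second)  # second stage on survivors
    return 1 - two_stage / baseline
```

For example, under the assumed values `search_reduction(1000, 10, 100, 0.15)`, the two-stage arrangement needs 25,000 searches instead of 100,000, a reduction of about 75 percent. The model also shows that the benefit disappears when `cost_first / cost_second + pass_rate` approaches 1.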

Therefore, the image recognition circuit 20 of the night view device 100 can perform the recognition processing at high speed while maintaining high accuracy in recognizing pedestrian images 51 in the front image 50.

In addition, the first learning model 66 of the first embodiment learns the negative sample images 63b, which show a non-pedestrian resembling a pedestrian, as positive examples. Therefore, by identifying the cutout images 56 that correspond to a positive example, the first identification unit 25, which performs identification based on the first learning model 66, can accurately narrow the candidates down to the cutout images 56 that may contain a pedestrian image 51.

Furthermore, the second learning model 77 learns the positive sample images 71b, which show a pedestrian resembling a non-pedestrian, as positive examples, and learns the negative sample images 73b, which show a non-pedestrian resembling a pedestrian, as negative examples. The second learning model 77 has thus learned the ambiguous differences between pedestrians and non-pedestrians in detail. Accordingly, by identifying the cutout images 56 that correspond to a positive example of the second learning model 77, the second identification unit 27 can accurately exclude the cutout images 56 showing a pedestrian-like object from among the cutout images 56 that may contain a pedestrian image 51. The image recognition circuit 20 can therefore reliably maintain high accuracy in recognizing pedestrian images 51 in the front image 50 even though the recognition processing is accelerated.

In the first embodiment, the pedestrian images 51 contained in the front image 50 take various forms depending on, for example, the actual direction in which the pedestrian is moving and the position of the pedestrian relative to the near-infrared camera 10. To identify accurately whether an image is a pedestrian image 51, a learning model must therefore learn the features of pedestrian images 51 from a very large number of sample images. The learning model then becomes complex, and the identification processing may become even more time-consuming.

However, the image recognition circuit 20 of the first embodiment reduces the overall amount of processing needed to recognize pedestrian images 51 in the front image 50 by having the first identification unit 25 narrow down the cutout images 56 subjected to identification by the second identification unit 27. Therefore, even if the second learning model 77 used by the second identification unit 27 becomes complex, the image recognition circuit 20 can recognize pedestrian images 51 at high speed while maintaining high accuracy. The configuration in which the first identification unit 25 narrows down the cutout images 56 is thus particularly well suited to an image recognition circuit 20 that recognizes pedestrian images 51 as the images of the recognition object.

Furthermore, a night view device 100 such as that of the first embodiment must promptly inform the operator of the vehicle of the presence of a pedestrian in the area surrounding the vehicle. The image recognition circuit 20, which recognizes the pedestrian image 51 from the front image 50, is therefore strongly required to perform the recognition processing at high speed. Accordingly, the image recognition circuit 20, which achieves high-speed recognition by narrowing down the cutout images 56 through the identification processing of the first identification unit 25 based on the simple first learning model 66, is particularly suitable as a configuration used in the night view device 100 to recognize pedestrians in the area surrounding the vehicle.

In the first embodiment, the near-infrared camera 10 corresponds to the "imaging means" recited in the claims; the image recognition circuit 20 corresponds to the "image recognition device"; the image acquisition unit 21 corresponds to the "image acquisition means"; the cutout unit 23 corresponds to the "cutout means"; the first identification unit 25 corresponds to the "first identification means"; the second identification unit 27 corresponds to the "second identification means"; the recognition unit 29 corresponds to the "recognition means"; the drawing circuit 30 and the liquid crystal display 40 correspond to the "display means"; the front image 50 corresponds to the "captured image"; the pedestrian image 51 corresponds to the "image of the recognition object" and the "image of the assumed object"; the frame image 53 corresponds to the "emphasized image"; the pedestrian corresponds to the "recognition object" and the "assumed object"; and the night view device 100 corresponds to the "vehicle display device".

(Second Embodiment)
The second embodiment of the present invention is a modification of the first embodiment. In the second embodiment, the first identification unit 25 performs identification processing based on a first learning model 266 instead of the first learning model 66 used in the first embodiment. The first learning model 266 used in the second embodiment, and the identification processing of the first identification unit 25 based on the first learning model 266, are described in detail below with reference to FIGS. 7, 1, and 2.

When the first learning model 266 is created, positive example sample images 61a associated with a teacher signal indicating a positive example and negative example sample images 63a associated with a teacher signal indicating a negative example are input to the learning model generation device 80 (see FIG. 7(b)). More specifically, the first learning model 266 learns, as positive examples, the image group obtained by excluding, from the plurality of positive example sample images 61 in the first sample image group 60, the positive example sample images 61b showing pedestrians that resemble non-pedestrians. In addition, the first learning model 266 learns, as negative examples, the image group obtained by excluding, from the plurality of negative example sample images 63, the negative example sample images 63b showing non-pedestrians that resemble pedestrians.

As described above, the difference between an image showing a pedestrian that resembles a non-pedestrian and an image showing a non-pedestrian that resembles a pedestrian is ambiguous. Therefore, because the positive example sample images 61b showing pedestrians that resemble non-pedestrians and the negative example sample images 63b showing non-pedestrians that resemble pedestrians are excluded from the first sample image group 60, the first learning model 266 does not learn these ambiguous differences in detail. As a result, the first learning model 266 can reliably be a simple learning model that contributes to speeding up the processing by the first identification unit 25.
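The exclusion of ambiguous samples can be illustrated with the short Python sketch below. The `Sample` record and its `ambiguous` flag are hypothetical; the description does not state how the 61b and 63b samples are identified, so the flag is assumed to have been assigned beforehand (for example by manual annotation).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Sample:
    name: str
    is_pedestrian: bool   # ground truth: positive example (61) or negative example (63)
    ambiguous: bool       # hypothetical flag marking 61b / 63b style samples


def build_first_training_set(samples: List[Sample]) -> List[Tuple[Sample, int]]:
    """Drop ambiguous samples and label the rest: 1 = positive example, 0 = negative example."""
    return [
        (s, 1 if s.is_pedestrian else 0)
        for s in samples
        if not s.ambiguous
    ]
```

Only the unambiguous samples (corresponding to 61a and 63a) reach the learner in this sketch, so the resulting model has no occasion to fit the fine boundary between look-alike pedestrians and non-pedestrians.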

In the second embodiment, the first identification unit 25 identifies whether each cutout image 56 may contain a pedestrian image 51 according to whether the cutout image relates to the positive examples learned by the first learning model 266. Because the negative example sample images 63b of non-pedestrians that resemble pedestrians are excluded from the negative examples, cutout images 56 containing an image of an object that resembles a pedestrian, in addition to cutout images 56 containing a pedestrian image 51, can qualify as cutout images 56 related to the positive examples. By identifying the cutout images 56 related to the positive examples of the first learning model 266, the first identification unit 25 can therefore narrow down, quickly and accurately, the cutout images 56 that may contain a pedestrian image 51.

The second identification unit 27 then identifies whether a pedestrian image 51 is contained in each of the cutout images 56 narrowed down by the first identification unit 25, thereby excluding the cutout images 56 that contain an image of an object resembling a pedestrian. In this way, the cutout images 56 containing a pedestrian image 51 are identified with high accuracy by the identification processing of the first identification unit 25 and the second identification unit 27. Therefore, even when the first learning model 266 of the second embodiment is used, the image recognition circuit 20 can reliably maintain high accuracy in recognizing the pedestrian image 51 from the front image 50 while speeding up the recognition processing.

(Third Embodiment)
The third embodiment of the present invention is another modification of the first embodiment. In the third embodiment, the first identification unit 25 performs identification processing based on a first learning model 366 instead of the first learning model 66 used in the first embodiment. The first learning model 366 used in the third embodiment, and the identification processing of the first identification unit 25 based on the first learning model 366, are described in detail below with reference to FIGS. 8, 1, and 2.

When the first learning model 366 is created, only the plurality of positive example sample images 61 in the first sample image group 60 are input to the learning model generation device 80, in association with a teacher signal indicating a positive example (see FIG. 8(b)). The first learning model 366 learns only the positive example sample images 61 and does not learn the negative example sample images 63. The first learning model 366, constructed in this way from the positive example sample images 61, does not learn in detail the ambiguous differences between pedestrians and non-pedestrians. The first learning model 366 can therefore reliably be a simple learning model that contributes to speeding up the processing by the first identification unit 25.
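The description does not state what kind of model is built from positive examples alone. Purely as an assumed stand-in, the sketch below uses a centroid-and-radius model over feature vectors: a cutout is treated as possibly containing the target when its features lie near the mean of the positive examples. The class name, the `margin` parameter, and the choice of Euclidean distance are all assumptions of this sketch.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


class PositiveOnlyModel:
    """Toy one-class model fitted to positive-example feature vectors only."""

    def __init__(self, positives: List[Vector], margin: float = 1.5) -> None:
        if not positives:
            raise ValueError("at least one positive example is required")
        dim = len(positives[0])
        # Centroid of the positive examples.
        self.center = [sum(v[i] for v in positives) / len(positives) for i in range(dim)]
        # Acceptance radius: farthest positive example, widened by the assumed margin.
        self.radius = margin * max(math.dist(v, self.center) for v in positives)

    def may_contain_target(self, features: Vector) -> bool:
        """True when the features are close enough to the positive examples."""
        return math.dist(features, self.center) <= self.radius
```

Because no negative examples shape the boundary, objects whose features merely resemble a pedestrian also fall inside the radius, which matches the behavior the third embodiment expects of the first stage.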

The first identification unit 25 of the third embodiment identifies whether each cutout image 56 may contain a pedestrian image 51 according to whether the cutout image relates to the positive examples learned by the first learning model 366. Because the first learning model 366 has not learned the negative example sample images 63 showing non-pedestrians as negative examples, cutout images 56 containing an image of an object that resembles a pedestrian, in addition to cutout images 56 containing a pedestrian image 51, can qualify as cutout images 56 related to the positive examples. By identifying the cutout images 56 related to the positive examples of the first learning model 366, the first identification unit 25 can therefore narrow down, quickly and accurately, the cutout images 56 that may contain a pedestrian image 51.

The second identification unit 27 then identifies whether a pedestrian image 51 is contained in each of the cutout images 56 narrowed down by the first identification unit 25, thereby excluding the cutout images 56 that contain an image of an object resembling a pedestrian. In this way, the cutout images 56 containing a pedestrian image 51 are identified with high accuracy by the identification processing of the first identification unit 25 and the second identification unit 27. Therefore, even when the first learning model 366 of the third embodiment is used, the image recognition circuit 20 can reliably maintain high accuracy in recognizing the pedestrian image 51 from the front image 50 while speeding up the recognition processing.

(Other Embodiments)
Although a plurality of embodiments of the present invention have been described above, the present invention is not to be construed as limited to those embodiments, and can be applied to various embodiments and combinations without departing from the gist of the present invention.

In the first embodiment, the first learning model 66 is constructed as a simple learning model by learning, as positive examples, the negative example sample images 63b showing non-pedestrians that resemble pedestrians. In the second embodiment, the first learning model 266 is constructed as a simple learning model by excluding from the first sample image group 60 the positive example sample images 61b showing pedestrians that resemble non-pedestrians and the negative example sample images 63b showing non-pedestrians that resemble pedestrians. However, as long as a simple learning model can be constructed, the method of constructing the first learning model is not limited to the methods of the above embodiments. For example, the first learning model may learn, as negative examples together with the plurality of negative example sample images 63, the positive example sample images 61b showing pedestrians that resemble non-pedestrians among the plurality of positive example sample images 61. Such a first learning model does not learn in detail the ambiguous differences between pedestrians and non-pedestrians, and can therefore be a simple learning model that contributes to speeding up the processing by the first identification unit 25. When such a first learning model is used, the first identification unit can accurately narrow down the cutout images 56 that may contain a pedestrian image 51 by setting a low threshold for identification as a positive example.
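The effect of lowering the threshold in this variant can be illustrated as follows. The sketch assumes that the first stage outputs a numeric score per cutout image, with higher values meaning closer to the learned positive examples; the scores, labels, and function name are hypothetical.

```python
from typing import Dict, List


def select_candidates(scores: Dict[str, float], threshold: float) -> List[str]:
    """Keep every cutout whose first-stage score reaches the threshold."""
    return [name for name, score in scores.items() if score >= threshold]
```

With a model that has learned ambiguous pedestrians as negative examples, such pedestrians would tend to score low; a lower threshold keeps them among the candidates so that the second stage can still examine them.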

In the above embodiments, Haar-like features are used when the learning model generation device 80 learns the features of the pedestrian image 51 and when the image recognition circuit 20 recognizes the pedestrian image 51. However, features other than Haar-like features may be used for image recognition. For example, the luminance distribution of the positive example sample images, or wavelet coefficients that appear frequently when the positive example sample images are wavelet-transformed, may be used as features for learning and identification. Alternatively, Histograms of Oriented Gradients (HOG) features, which are calculated from luminance gradients between adjacent pixels, Joint HOG features, Joint Haar-like features, and the like may be used as features for learning and identification.
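For reference, a Haar-like feature is commonly computed as the difference between pixel sums of adjacent rectangles, with an integral image making each rectangle sum a constant-time lookup. The sketch below shows one two-rectangle feature in plain Python. It follows the general textbook technique; the function names are assumptions, and the particular feature shapes used by the embodiments are not stated in this section.

```python
from typing import List, Sequence


def integral_image(image: Sequence[Sequence[int]]) -> List[List[int]]:
    """ii[y][x] holds the sum of all pixels above and to the left of (x, y), exclusive."""
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Sum of the w x h rectangle whose top-left pixel is (x, y), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Left half minus right half of the rectangle at (x, y); w is assumed to be even."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

A large magnitude indicates a strong vertical luminance edge inside the rectangle, which is the kind of contrast pattern such features respond to.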

In the above embodiments, AdaBoost is used by the learning model generation device 80 to learn the Haar-like features. However, the learning method is not limited to AdaBoost. For example, when the luminance distribution and wavelet coefficients are learned as features, the learning model generation device 80 can use a neural network, a support vector machine (SVM), or the like as the learning method. When HOG features, Joint HOG features, or Joint Haar-like features are learned as features, the learning model generation device 80 can use Real AdaBoost or the like as the learning method.
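As background on the learning method named above, the following is a minimal textbook sketch of discrete AdaBoost with threshold stumps over feature vectors, with labels +1 and -1. It is not the procedure of the learning model generation device 80, whose details are not given here; the names, the exhaustive stump search, and the error clamping constant are assumptions of this sketch.

```python
import math
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, int, float]  # (feature index, threshold, polarity, weight alpha)


def stump_predict(x: Sequence[float], feature: int, threshold: float, polarity: int) -> int:
    """Weak classifier: +1 on one side of the threshold, -1 on the other."""
    return 1 if polarity * (x[feature] - threshold) >= 0 else -1


def train_adaboost(X: List[Sequence[float]], y: List[int], rounds: int) -> List[Stump]:
    n = len(X)
    weights = [1.0 / n] * n
    model: List[Stump] = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for f in range(len(X[0])):
            for t in sorted({x[f] for x in X}):
                for p in (1, -1):
                    err = sum(w for w, x, label in zip(weights, X, y)
                              if stump_predict(x, f, t, p) != label)
                    if best is None or err < best[0]:
                        best = (err, f, t, p)
        err, f, t, p = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # clamp to avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((f, t, p, alpha))
        # Raise the weight of misclassified samples, lower the rest, then renormalize.
        weights = [w * math.exp(-alpha * label * stump_predict(x, f, t, p))
                   for w, x, label in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def adaboost_score(model: List[Stump], x: Sequence[float]) -> float:
    """Weighted vote of all stumps; positive means the positive class."""
    return sum(alpha * stump_predict(x, f, t, p) for f, t, p, alpha in model)
```

Using fewer rounds yields a smaller and faster ensemble, which is one plausible way a "simple" first-stage model could differ from a more elaborate second-stage model.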

In the above embodiments, Haar-like features are used to construct both the first learning model 66 and the second learning model 77. However, the features used to construct the first learning model may differ from those used to construct the second learning model. For example, Haar-like features, with which the learning model is relatively easy to simplify, are used to construct the first learning model for narrowing down the cutout images 56. Joint HOG features, with which high identification accuracy can be expected although the learning model becomes complex, are used to construct the second learning model 77 for distinguishing pedestrians from non-pedestrians. Constructing the first and second learning models in this way can make the processing for recognizing the pedestrian image from the front image even faster and more accurate. Note that the features used when a learning model is constructed must correspond to the features used by the identification unit when it performs identification processing based on that learning model. Thus, when the first learning model is constructed using Haar-like features, the first identification unit performs identification processing based on the Haar-like features of the cutout image. Likewise, when the second learning model is constructed using Joint HOG features, the second identification unit performs identification processing based on the Joint HOG features of the cutout image. As long as the features used to construct a learning model correspond to the features used in the identification processing of each identification unit, the image recognition circuit can use a variety of features for image recognition.

In the above embodiments, the near-infrared camera 10 is used as the imaging means for capturing the front image 50. However, as long as the front image 50 can be captured, the imaging means is not limited to the near-infrared camera 10. The imaging means may be, for example, a visible light camera that detects visible light and generates an image, or a far-infrared camera that detects far-infrared rays and generates an image. In addition, the night view device 100 of the above embodiments is configured to acquire the front image 50 from the near-infrared camera 10. However, the night view device may itself include the imaging means.

In the above embodiments, the image recognition circuit 20 and the drawing circuit 30 are each configured as a circuit including a processor and the like. However, the night view control circuit may instead have functional blocks corresponding to the image recognition circuit 20 and the drawing circuit 30 that are realized by executing a program. The functional blocks 21, 23, 25, 27, and 29 included in the image recognition circuit 20 may also each be configured as an independent circuit. Furthermore, analog circuits that do not rely on program execution may be formed in the night view control circuit as the configurations corresponding to the respective functional blocks.

In the above embodiments, the night view device 100 displays the front image 50 with the frame image 53 superimposed on it on the liquid crystal display 40. However, the display means of the night view device is not limited to the liquid crystal display 40. For example, a head-up display device that projects the front image 50 with the superimposed frame image 53 onto the windscreen of the vehicle may serve as the display means of the night view device. Alternatively, the display means of the night view device may be an output unit that outputs the signal of the front image 50 to, for example, a liquid crystal display of a navigation system.
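Superimposing a frame image at a recognized position can be sketched as below, assuming a grayscale image held as a list of rows and a recognized window given as (x, y, width, height). The function name, the one-pixel frame width, and the frame brightness are assumptions of this sketch; the drawing circuit 30 is not described at this level of detail.

```python
from typing import List, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def superimpose_frame(image: Sequence[Sequence[int]], window: Window,
                      value: int = 255) -> List[List[int]]:
    """Return a copy of the image with a one-pixel rectangular frame drawn on the window border."""
    x, y, w, h = window
    out = [list(row) for row in image]     # copy so the captured image is left untouched
    for cx in range(x, x + w):             # top and bottom edges
        out[y][cx] = value
        out[y + h - 1][cx] = value
    for cy in range(y, y + h):             # left and right edges
        out[cy][x] = value
        out[cy][x + w - 1] = value
    return out
```

The same output could be handed to any display means mentioned above, since the sketch only produces pixel data.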

In the above embodiments, the night view device 100 acquires the front image 50 from the near-infrared camera 10, which captures the area ahead of the vehicle. However, the night view device is not limited to the area ahead of the vehicle and may recognize an image of a recognition object such as a pedestrian from an image of the area behind or beside the vehicle. In addition, in the above embodiments, the image recognition circuit 20 recognizes a pedestrian as the recognition object. However, the recognition object recognized by the image recognition circuit 20 is not limited to a pedestrian. For example, a vehicle, a person riding a bicycle or a motorcycle, or an animal such as a deer or a wild boar may be set in advance as the recognition object.

Furthermore, the present invention is not limited to the night view device 100 for vehicles and can be applied to image recognition devices in general that recognize a recognition object. For example, the image recognition device may be connected to a surveillance camera installed in the elevator hall of an apartment building, in a shop, or the like, and used as part of a monitoring device that recognizes a person as the recognition object.

DESCRIPTION OF SYMBOLS: 10 near-infrared camera (imaging means), 11 near-infrared projector, 15 night view control circuit, 20 image recognition circuit (image recognition device), 21 image acquisition unit (image acquisition means), 23 cutout unit (cutout means), 25 first identification unit (first identification means), 27 second identification unit (second identification means), 29 recognition unit (recognition means), 30 drawing circuit (display means), 40 liquid crystal display (display means), 50 front image (captured image), 51 pedestrian image (image of the recognition object), 53 frame image (emphasized image), 55 cutout window, 56 cutout image, 60 first sample image group, 70 second sample image group, 61, 61a, 61b, 71, 71a, 71b positive example sample images, 63, 63a, 63b, 73, 73a, 73b negative example sample images, 66, 266, 366 first learning model, 77 second learning model, 80 learning model generation device, 100 night view device (vehicle display device)

Claims

An image recognition device comprising:
image acquisition means for acquiring a captured image generated by imaging means;
cutout means for sequentially cutting out cutout images of a preset size from the captured image acquired by the image acquisition means;
first identification means for identifying, for each of the plurality of cutout images cut out by the cutout means, whether the cutout image may contain an image representing a preset recognition object, based on a first learning model created from a first sample image group serving as a learning reference;
second identification means for identifying, for each cutout image identified by the first identification means as possibly containing the image of the recognition object, whether the cutout image contains the image of the recognition object, based on a second learning model created from a second sample image group serving as a learning reference; and
recognition means for recognizing the position, in the captured image, of the cutout image identified by the second identification means as containing the image of the recognition object as the position where the recognition object appears.

The image recognition device according to claim 1, wherein
the first sample image group includes a plurality of positive example sample images showing an assumed object assumed as the recognition object, and a plurality of negative example sample images showing a non-assumed object other than the assumed object,
the first learning model is created by learning, as positive examples together with the plurality of positive example sample images, those negative example sample images among the plurality of negative example sample images that show a non-assumed object resembling the assumed object, and
the first identification means identifies whether each of the cutout images may contain the image of the recognition object according to whether the cutout image corresponds to the positive examples learned by the first learning model.

The image recognition device according to claim 1, wherein
the first sample image group includes a plurality of positive example sample images showing an assumed object assumed as the recognition object, and a plurality of negative example sample images showing a non-assumed object other than the assumed object,
the first learning model is created by learning, as positive examples, an image group obtained by excluding from the plurality of positive example sample images those positive example sample images that show an assumed object resembling the non-assumed object, and by learning, as negative examples, an image group obtained by excluding from the plurality of negative example sample images those negative example sample images that show a non-assumed object resembling the assumed object, and
the first identification means identifies whether each of the cutout images may contain the image of the recognition object according to whether the cutout image relates to the positive examples learned by the first learning model.

The image recognition device according to claim 1, wherein
the first sample image group includes a plurality of positive example sample images showing an assumed object assumed as the recognition object, and a plurality of negative example sample images showing a non-assumed object other than the assumed object,
the first learning model is created by learning, as positive examples, the plurality of positive example sample images in the first sample image group, and
the first identification means identifies whether each of the cutout images may contain the image of the recognition object according to whether the cutout image relates to the positive examples learned by the first learning model.

The image recognition device according to any one of claims 1 to 4, wherein
the second sample image group includes a plurality of positive example sample images showing an assumed object assumed as the recognition object, and a plurality of negative example sample images showing a non-assumed object other than the assumed object,
the second learning model is created by learning the plurality of positive example sample images as positive examples and learning the plurality of negative example sample images as negative examples, and
the second identification means identifies whether the cutout image contains the image of the recognition object according to whether the cutout image corresponds to the positive examples learned by the second learning model.

The image recognition device according to claim 1, wherein
the first identification means identifies whether each of the plurality of cutout images may contain an image of a pedestrian, the pedestrian being the recognition object, and
the second identification means identifies whether the image of the pedestrian is contained in each cutout image identified by the first identification means as possibly containing the image of the pedestrian.

A vehicle display device that recognizes, from a captured image generated by imaging means that captures an area surrounding a vehicle, an image of a recognition object present in the surrounding area, and that displays an emphasized image, which emphasizes the image of the recognition object, superimposed on the captured image, the vehicle display device comprising:
the image recognition device according to any one of claims 1 to 6, which acquires the captured image from the imaging means; and
display means for displaying the emphasized image superimposed at the position in the captured image where the recognition object recognized by the image recognition device appears.