JP2006178857A

JP2006178857A - Image processor

Info

Publication number: JP2006178857A
Application number: JP2004373452A
Authority: JP
Inventors: Tomoharu Nagao; 智晴長尾; Motoya Ogawa; 原也小川; Katsuyuki Kise; 勝之喜瀬; Takeshi Torii; 毅鳥居
Original assignee: Yokohama National University NUC; Fuji Heavy Industries Ltd
Current assignee: Subaru Corp; Yokohama National University NUC
Priority date: 2004-12-24
Filing date: 2004-12-24
Publication date: 2006-07-06

Abstract

PROBLEM TO BE SOLVED: To provide an image processor capable of extracting a particular object from a moving image, especially a particular object accompanied by temporal change and displacement, on the basis of a processing program obtained by combining image filters in a tree structure, and further to provide a highly versatile image processor with which such a processing program can be obtained easily.

SOLUTION: The image processor 1, which performs image processing on an image picked up by an image pickup device 21 to extract a particular object in the image, is provided with an image processing part 3 that performs image processing on a plurality of types of input images t-1 to t-k picked up by the image pickup device 21, on the basis of the processing program obtained by combining the image filters in a tree structure, to form an output image in which the particular object is extracted.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to an image processing apparatus, and more particularly to an image processing apparatus capable of extracting a specific target from an image.

In recent years, there has been active research on image processing apparatuses and image processing methods that capture a subject, a landscape, or the like with image input means such as a TV camera or a CCD camera, apply image processing to the resulting moving image, and extract a specific target from it, for example an object moving through the environment or its motion (see, for example, Patent Documents 1 to 5).

Such image processing apparatuses are being researched and put to practical use in various fields. In the automotive field, for example, a CCD camera or the like mounted on a vehicle captures the scene ahead, and pedestrians and other vehicles are extracted from the moving image to help avoid accidents such as collisions with them (see Patent Documents 1 to 3). In the field of artificial-intelligence robots, a robot observes its environment with an on-board camera, detects and measures other moving objects, and decides how to act in response (see Patent Document 4).

To extract a specific target from a moving image, such apparatuses adopt image processing methods such as the following: applying image processing to input images obtained by, for example, two CCD cameras separated in the horizontal direction to extract the contour of the specific target; calculating optical flow from the input images to extract the specific target; or estimating the specific target by pattern matching against models registered in a database.

However, these methods usually take a great deal of effort to construct a processing program, and a separate processing program must be constructed for each specific target. There has therefore long been a demand for an image processing method and an image processing apparatus with which a processing program can be constructed easily and a highly versatile processing program can be obtained.

Meanwhile, in the field of image processing for still images, an image processing technique (ACTIT) has recently been proposed that applies image processing to an input image based on a processing program in which various image filters are combined in a tree structure as shown in FIG. 19 (see Non-Patent Document 1).

Specifically, this technique performs image processing such as the following: applying a processing program like that shown in FIG. 19 to an input image of a document containing both printed and handwritten characters and outputting an image in which only the printed characters are extracted; or extracting only the mesh pattern of the cell boundaries from a microscope image of contiguous corneal endothelial cells to obtain an output image.

Non-Patent Document 1 further proposes adopting the technique of genetic programming (abbreviated as GP) to automatically optimize the combination of these image filters.

Patent Document 1: JP-A-5-265547
Patent Document 2: JP-A-10-11585
Patent Document 3: JP-A-2002-83297
Patent Document 4: JP-A-2001-84383
Patent Document 5: JP-A-9-271014
Non-Patent Document 1: Shinya Aoki and one other, "Automatic construction method of tree-structured image conversion ACTIT", Journal of the Institute of Image Information and Television Engineers, The Institute of Image Information and Television Engineers, 1999, Vol. 53, No. 6, pp. 888-894

It is therefore expected that the image processing technique described in Non-Patent Document 1 can be applied to the problem described above, namely extracting a specific target such as a moving object from a moving image.

However, as described above, this image processing technique operates on still images. Specifically, it presupposes that the same still image is supplied repeatedly as the "input image" of the processing program shown in FIG. 19.

Accordingly, an object of the present invention is to extend this image processing technique so that it can also be applied to moving images, and to provide an image processing apparatus that can extract a specific target from a moving image based on a processing program in which various image filters are combined in a tree structure, and that in particular can extract a specific target involving temporal change or displacement. A further object of the present invention is to provide a highly versatile image processing apparatus with which such a processing program can be obtained easily.

To solve the above problems, the image processing apparatus according to claim 1 is an image processing apparatus that applies image processing to an image captured by an imaging device and extracts a specific target from the image, characterized by comprising an image processing unit that applies image processing to a plurality of types of images captured by the imaging device, based on a processing program in which image filters are combined in a tree structure, to form an output image in which the specific target is extracted.

According to the invention of claim 1, the tree-structured processing program of the image processing unit does not have only one and the same still image as its terminal symbols, as in the prior art, but instead takes a plurality of types of images as terminal symbols.

The invention of claim 2 is the image processing apparatus according to claim 1, characterized in that the plurality of types of images are a plurality of images captured by the imaging device at time intervals.

According to the invention of claim 2, the tree-structured processing program is configured so that a plurality of moving-image frames captured at time intervals are used as the plurality of types of input images.

The invention of claim 3 is the image processing apparatus according to claim 1, characterized in that the plurality of types of images are inter-frame difference images or edge-extracted images of a plurality of images captured by the imaging device at time intervals.

According to the invention of claim 3, the tree-structured processing program is configured so that inter-frame difference images or edge-extracted images of a plurality of images captured at time intervals are used as the plurality of types of input images.

The invention of claim 4 is the image processing apparatus according to any one of claims 1 to 3, further comprising a processing program forming unit for forming the processing program, characterized in that the processing program forming unit is configured to form the processing program by genetic programming using the plurality of types of images, a target image, and a weight image.

According to the invention of claim 4, the processing program forming unit of the image processing apparatus automatically forms the processing program by genetic programming using three kinds of data: the moving image captured by the imaging device; a target image, given as the image that the formed processing program should output when that moving image is input; and a weight image for weighting the pixel-by-pixel difference (more precisely, the absolute value of the difference) between the luminance values of the output image and those of the target image.

The invention of claim 5 is the image processing apparatus according to claim 4, characterized in that the weight image is set so that the ratio of the weight of its extraction region to the weight of its non-extraction region equals the reciprocal of the ratio of the area of the extraction region to the area of the non-extraction region.

According to the invention of claim 5, the weighting given by the weight image differs between the extraction region and the non-extraction region of the image, and in particular the ratio of the weights is set to the reciprocal of the ratio of the areas of the extraction region and the non-extraction region.

The invention of claim 6 is the image processing apparatus according to claim 4 or claim 5, characterized in that the processing program forming unit forms the processing program using a plurality of learning sets, each consisting of the plurality of types of images, a target image, and a weight image.

According to the invention of claim 6, a single processing program is formed using not just one learning set consisting of a plurality of types of images, a target image, and a weight image, but a plurality of learning sets, for example sets taken from different scenes.

The invention of claim 7 is the image processing apparatus according to any one of claims 4 to 6, characterized in that the fitness used for genetic programming in the processing program forming unit is calculated so as to take a smaller value as the number of nodes in the processing program becomes larger.

According to the invention of claim 7, processing programs with a large number of nodes are made more likely to be eliminated by selection in genetic programming.

The invention of claim 8 is the image processing apparatus according to claim 7, characterized in that the proportion in which the number of nodes contributes to the fitness is configured to change according to the number of generations of the evolution process in genetic programming.

According to the invention of claim 8, the rate at which the fitness is reduced as the number of nodes increases in the invention of claim 7 is changed according to the evolution process in genetic programming.

The invention of claim 9 is the image processing apparatus according to any one of claims 4 to 8, characterized in that the fitness used for genetic programming in the processing program forming unit is calculated so as to take a larger value as the number of two-input image filter nodes in the processing program becomes larger.

According to the invention of claim 9, in genetic programming, the fitness of a processing program with a large number of two-input image filter nodes is increased so that it is less likely to be eliminated by selection.

The invention of claim 10 is the image processing apparatus according to claim 9, characterized in that the proportion in which the number of two-input image filter nodes contributes to the fitness is configured to change according to the number of generations of the evolution process in genetic programming.

According to the invention of claim 10, the rate at which the fitness is increased as the number of two-input image filter nodes increases in the invention of claim 9 is changed according to the evolution process in genetic programming.

The invention of claim 11 is the image processing apparatus according to any one of claims 1 to 10, characterized in that the processing program is formed by combining a plurality of processing programs.

According to the invention of claim 11, a large-scale processing program is formed by combining a plurality of processing programs, whether constructed by hand or formed by genetic programming.

The invention of claim 12 is the image processing apparatus according to claim 11, characterized in that the output image is formed by a nonlinear superposition of the processing performed by the plurality of processing programs.

According to the invention of claim 12, in the image output by the large-scale processing program of claim 11, pixels for which many of the constituent processing programs produce an output luminance value receive a large output luminance value, while pixels for which only a few produce an output luminance value receive a small output luminance value or zero.

The invention of claim 13 is the image processing apparatus according to any one of claims 1 to 12, characterized in that the image filters include a mask filter.

According to the invention of claim 13, a mask filter can be used as a non-terminal symbol of the tree-structured processing program, or to combine the individual processing programs within a large-scale processing program composed of a plurality of processing programs.

The invention of claim 14 is the image processing apparatus according to any one of claims 1 to 13, further comprising a display unit for displaying images, characterized in that the output image formed based on the processing program is displayed superimposed on the input image shown on the display unit.

According to the invention of claim 14, the display unit of the image processing apparatus displays, for example, the output image colored red superimposed on the input image displayed as a black-and-white image.

According to the invention of claim 1, the tree-structured processing program of the image processing unit does not have only one and the same still image as its terminal symbols, as in the prior art, but instead takes a plurality of types of images as terminal symbols. The conventional ACTIT image processing technique (see Non-Patent Document 1) is thereby extended so that it can also be applied to moving images, in which the image differs from frame to frame.

At the same time, comparing the input images with one another, for example by difference processing or logical-AND processing, enables image processing that takes into account factors such as the shift in position of the specific target between images. This makes it possible to extract a specific target that involves temporal change or spatial displacement in the image.
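
As a minimal sketch of this idea, the following Python fragment evaluates a tree-structured program whose terminals are different frames rather than one still image. The filter names (`abs_diff`, `logical_and`, `binarize`), the nested-tuple representation, and the frame ordering are illustrative assumptions, not the filter set or data structure of the embodiment.

```python
from typing import List, Sequence, Union

Image = List[List[int]]  # grayscale image: rows of luminance values 0..255


def abs_diff(a: Image, b: Image) -> Image:
    """Two-input filter (assumed): pixel-wise absolute difference."""
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def logical_and(a: Image, b: Image) -> Image:
    """Two-input filter (assumed): pixel-wise minimum as a grayscale AND."""
    return [[min(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def binarize(a: Image, threshold: int = 32) -> Image:
    """One-input filter (assumed): threshold to 0 or 255."""
    return [[255 if x >= threshold else 0 for x in row] for row in a]


# A program is a terminal (an index into the frame list) or a tuple
# (filter_function, child_program, ...).
Program = Union[int, tuple]


def evaluate(program: Program, frames: Sequence[Image]) -> Image:
    """Recursively evaluate the tree; terminals select one of the frames."""
    if isinstance(program, int):
        return frames[program]
    func, *children = program
    return func(*(evaluate(child, frames) for child in children))


# Hypothetical ordering: frames[0] = t, frames[1] = t-1, frames[2] = t-2.
frames = [
    [[10, 10, 200, 10]],  # t: bright object at x = 2
    [[10, 200, 10, 10]],  # t-1: object at x = 1
    [[200, 10, 10, 10]],  # t-2: object at x = 0
]
program = (binarize, (logical_and, (abs_diff, 0, 1), (abs_diff, 0, 2)))
output = evaluate(program, frames)
```

In this toy case the AND of the two difference images keeps only the object's position in the current frame, which is the kind of temporal reasoning a single still-image terminal cannot express.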

According to the invention of claim 2, the tree-structured processing program is configured so that a plurality of moving-image frames captured at time intervals are used as the plurality of types of input images. Performing operations such as difference processing between the frames of the moving image therefore makes it possible to extract more accurately a specific target that involves temporal change or displacement in particular.

According to the invention of claim 3, the tree-structured processing program is configured so that inter-frame difference images or edge-extracted images of a plurality of images captured at time intervals are used as the plurality of types of input images. The processing program can therefore extract a specific target involving temporal change or the like directly from data images that act as a time derivative of the moving-image frames (that is, inter-frame difference images) and similar images.
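
The preprocessing described here can be sketched as follows. The two helper functions are simplified assumptions (an absolute difference between consecutive frames and a horizontal-neighbor edge measure); the text does not specify which difference or edge operators are actually used.

```python
from typing import Dict, List

Image = List[List[int]]


def frame_difference(current: Image, previous: Image) -> Image:
    """Inter-frame difference image: |frame(t) - frame(t-1)| per pixel."""
    return [[abs(c - p) for c, p in zip(rc, rp)]
            for rc, rp in zip(current, previous)]


def horizontal_edges(img: Image) -> Image:
    """Crude edge image (assumed operator): difference to the right neighbor."""
    return [[abs(row[x + 1] - row[x]) if x + 1 < len(row) else 0
             for x in range(len(row))]
            for row in img]


frame_t = [[0, 0, 100, 100]]
frame_t1 = [[0, 100, 100, 0]]

# Derived images that could serve as terminal symbols of the tree.
terminals: Dict[str, Image] = {
    "diff": frame_difference(frame_t, frame_t1),
    "edge": horizontal_edges(frame_t),
}
```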

According to the invention of claim 4, the processing program forming unit of the image processing apparatus automatically forms the processing program by genetic programming using three kinds of data: the moving image captured by the imaging device; a target image, given as the image that the formed processing program should output when that moving image is input; and a weight image for weighting the pixel-by-pixel difference (more precisely, the absolute value of the difference) between the luminance values of the output image and those of the target image. Therefore, in addition to the effects of the inventions of the preceding claims, a processing program can be obtained very easily by supplying an appropriate target image and weight image.

Furthermore, the specific target to be extracted can easily be changed by changing the target image and the weight image. That is, there is no need to construct an extraction program by hand each time the specific target changes, as in the prior art. Simply changing the target image and the weight image allows a processing program to be constructed by the same procedure, using the genetic programming method described above as it is.
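
One way to write a fitness of this kind is shown below as a sketch. Only the weighted absolute difference between output and target luminance comes from the text above; the normalization by the maximum luminance $V_{\max}$ and the total weight, and the convention that 1 means a perfect match, are assumptions made for illustration. Here $O(p)$, $T(p)$, and $W(p)$ denote the output, target, and weight images at pixel $p$.

```latex
f \;=\; 1 \;-\; \frac{\displaystyle\sum_{p} W(p)\,\bigl|\,O(p) - T(p)\,\bigr|}
                     {\displaystyle V_{\max}\sum_{p} W(p)},
\qquad 0 \le f \le 1 .
```

Under this assumed form, a program whose output equals the target everywhere scores $f = 1$, and changing $T$ and $W$ changes what the search is rewarded for extracting without touching the search procedure itself.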

According to the invention of claim 5, in the invention of claim 4, the weighting given by the weight image differs between the extraction region and the non-extraction region of the image, and in particular the ratio of the weights is set to the reciprocal of the ratio of the areas of the extraction region and the non-extraction region. It is also possible to use the same weight for every pixel of the weight image, but in that case the result tends to favor agreement in whichever of the two regions has the larger area. Setting the weight ratio to the reciprocal of the area ratio makes it possible to obtain a processing program in which agreement in the extraction region and agreement in the non-extraction region are given equal importance.
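
A minimal sketch of such a weight image follows. It assumes that nonzero pixels of the target image mark the extraction region, and it picks the particular scaling w = 1/area for each region; only the ratio between the two weights is taken from the text.

```python
from typing import List

Image = List[List[int]]


def make_weight_image(target: Image) -> List[List[float]]:
    """Weight each region by the reciprocal of its area (scaling assumed)."""
    flat = [p for row in target for p in row]
    area_extract = sum(1 for p in flat if p > 0)   # extraction region
    area_other = len(flat) - area_extract          # non-extraction region
    if area_extract == 0 or area_other == 0:
        raise ValueError("target must contain both regions")
    w_extract = 1.0 / area_extract
    w_other = 1.0 / area_other
    return [[w_extract if p > 0 else w_other for p in row] for row in target]


# One extraction pixel against three background pixels: weights are 3 : 1.
target = [[255, 0, 0, 0]]
weights = make_weight_image(target)
```

With this choice the weights of each region sum to the same total, so a small pedestrian region and a large background contribute equally to the weighted error.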

According to the invention of claim 6, a single processing program is formed using not just one learning set consisting of a plurality of types of images, a target image, and a weight image, but a plurality of learning sets, for example sets taken from different scenes. Therefore, in addition to the effects of the inventions of the preceding claims, this avoids a phenomenon that tends to occur when only one learning set is used, namely that the program learns to extract whatever lies at a particular position in the image. It also allows the specific target to be extracted more accurately from images that were not used in the learning sets for genetic programming, so that a highly versatile processing program that accurately extracts only persons from a moving image can be obtained.
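
The effect of several learning sets can be illustrated with a small sketch. The scoring function, the simple averaging over sets, and the two toy "programs" are all assumptions introduced for illustration; the text only states that a plurality of learning sets is used.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]
# (input frames, target image, weight image)
LearningSet = Tuple[Sequence[Image], Image, Image]


def score(output: Image, target: Image, weight: Image, v_max: int = 255) -> float:
    """Assumed fitness: 1 minus the normalized weighted absolute error."""
    num = sum(w * abs(o - t)
              for ro, rt, rw in zip(output, target, weight)
              for o, t, w in zip(ro, rt, rw))
    den = v_max * sum(w for rw in weight for w in rw)
    return 1.0 - num / den


def mean_fitness(program: Callable[[Sequence[Image]], Image],
                 learning_sets: Sequence[LearningSet]) -> float:
    """Average the per-set fitness over all learning sets (assumed rule)."""
    total = sum(score(program(frames), target, weight)
                for frames, target, weight in learning_sets)
    return total / len(learning_sets)


def fixed_position(frames: Sequence[Image]) -> Image:
    """Degenerate program: always marks the left pixel, ignoring its input."""
    return [[255, 0]]


def moving_pixels(frames: Sequence[Image]) -> Image:
    """Program that marks pixels which changed between the two frames."""
    current, previous = frames
    return [[255 if abs(c - p) >= 32 else 0 for c, p in zip(rc, rp)]
            for rc, rp in zip(current, previous)]


ones = [[1, 1]]
set_a: LearningSet = ([[[200, 0]], [[0, 0]]], [[255, 0]], ones)  # target on the left
set_b: LearningSet = ([[[0, 200]], [[0, 0]]], [[0, 255]], ones)  # target on the right
```

With `set_a` alone the position-memorizing program looks perfect; adding `set_b` halves its score while the motion-based program keeps a perfect score.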

According to the invention of claim 7, processing programs with a large number of nodes are made more likely to be eliminated by selection in genetic programming. In addition to the effects of the inventions of the preceding claims, this makes it possible to keep the processing program from becoming over-trained. That is, as the number of nodes increases, a processing program may come to extract a more narrowly defined target (for example, pedestrians wearing dark clothes rather than pedestrians in general), and the formation of such an over-trained processing program can be effectively prevented.

According to the invention of claim 8, the rate at which the fitness is reduced as the number of nodes increases in the invention of claim 7 is changed according to the evolution process in genetic programming, so that one can freely decide at which stage of the evolution process processing programs with a large number of nodes become more likely to be eliminated. In addition, as evolution in genetic programming proceeds, the maximum fitness may stop increasing even as the number of generations grows, and the search stagnates. Changing this rate when such stagnation occurs raises the likelihood of obtaining a better optimized processing program.
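
A sketch of a generation-dependent node penalty is given below. The linear schedule, the subtractive form of the penalty, and the default coefficients are hypothetical choices; the text states only that fitness decreases with node count and that the rate changes over the generations.

```python
def penalty_rate(generation: int, total_generations: int,
                 start: float = 0.0, end: float = 0.002) -> float:
    """Per-node penalty, interpolated linearly over the run (assumed schedule)."""
    progress = min(1.0, generation / max(1, total_generations))
    return start + (end - start) * progress


def penalized_fitness(raw_fitness: float, node_count: int,
                      generation: int, total_generations: int) -> float:
    """Subtract a penalty that grows with the node count (assumed form)."""
    return raw_fitness - penalty_rate(generation, total_generations) * node_count


early = penalized_fitness(0.9, 50, generation=0, total_generations=100)
late = penalized_fitness(0.9, 50, generation=100, total_generations=100)
```

With these assumed settings, large trees are left alone early in the run, when exploration matters, and are pushed out later. A schedule that reacts to stagnation of the best fitness would be another option consistent with the description.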

According to the invention of claim 9, in genetic programming, the fitness of a processing program with a large number of two-input image filter nodes is increased so that it is less likely to be eliminated by selection. In addition to the effects of the inventions of the preceding claims, this increases the variety of images input to the processing program and makes it possible to obtain a processing program that extracts the specific target more accurately.

According to the invention of claim 10, the rate at which the fitness is increased as the number of two-input image filter nodes increases in the invention of claim 9 is changed according to the evolution process in genetic programming, so that one can freely decide at which stage of the evolution process processing programs with many two-input image filter nodes become more likely to survive. In addition, as evolution in genetic programming proceeds, the maximum fitness may stop increasing even as the number of generations grows, and the search stagnates. Changing this rate when such stagnation occurs raises the likelihood of obtaining a better optimized processing program.
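
The node counts that drive these adjustments can be obtained by walking the tree, as in the sketch below. The nested-tuple tree format, the decision to count only filter nodes (not terminals), and the coefficients `a` and `b` are assumptions for illustration.

```python
from typing import Tuple, Union

# A tree is a terminal name (str) or a tuple (filter_name, child, ...).
Tree = Union[str, tuple]


def count_nodes(tree: Tree) -> Tuple[int, int]:
    """Return (number of filter nodes, number of two-input filter nodes)."""
    if not isinstance(tree, tuple):
        return (0, 0)  # terminals are not counted here (assumption)
    _, *children = tree
    total = 1
    two_input = 1 if len(children) == 2 else 0
    for child in children:
        t, b = count_nodes(child)
        total += t
        two_input += b
    return (total, two_input)


def adjusted_fitness(raw: float, tree: Tree,
                     a: float = 0.001, b: float = 0.002) -> float:
    """Penalize all nodes by a, reward two-input nodes by b (assumed form)."""
    total, two_input = count_nodes(tree)
    return raw - a * total + b * two_input


multi_frame = ("binarize", ("and", ("diff", "t", "t-1"), ("diff", "t", "t-2")))
single_frame = ("binarize", ("smooth", ("smooth", ("smooth", "t"))))
```

Both example trees have four filter nodes, but only the first combines several frames through two-input filters, so it receives the higher adjusted fitness at equal raw fitness.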

According to the invention of claim 11, a large-scale processing program is formed by combining a plurality of processing programs, whether constructed by hand or formed by genetic programming. In addition to the effects of the inventions of the preceding claims, this makes it possible to extract the intended specific target more accurately. Moreover, when a processing program is sought by genetic programming, the search space of solution programs usually grows, so to speak, exponentially as the number of image filters (non-terminal symbols) making up the program increases. An enormous search becomes necessary, and the search is more likely to fall into a local optimum. Combining a plurality of processing programs in this way makes it possible to obtain, more easily, a highly versatile processing program that extracts the specific target accurately.

According to the invention of claim 12, in the image output by the large-scale processing program of claim 11, pixels for which many of the constituent processing programs produce an output luminance value receive a large output luminance value, while pixels for which only a few produce an output luminance value receive a small output luminance value or zero. The output image of the large-scale processing program therefore becomes sharper, and noise and the like that should not be extracted can be removed effectively.
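
The behavior described here can be sketched with one possible nonlinear rule. The specific rule below (the mean luminance scaled by the fraction of programs that responded, with an optional minimum vote count) is an assumption; the text does not specify the function used for the superposition.

```python
from typing import List, Sequence

Image = List[List[int]]


def superpose(outputs: Sequence[Image], min_votes: int = 1) -> Image:
    """Combine program outputs so that agreement is rewarded nonlinearly."""
    k = len(outputs)
    height, width = len(outputs[0]), len(outputs[0][0])
    result: Image = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [img[y][x] for img in outputs]
            votes = sum(1 for v in values if v > 0)
            if votes < min_votes:
                row.append(0)  # too few programs responded: suppress
            else:
                # mean luminance (sum / k) scaled by the vote fraction (votes / k)
                row.append(sum(values) * votes // (k * k))
        result.append(row)
    return result


p1 = [[255, 255, 0]]
p2 = [[255, 0, 0]]
p3 = [[255, 0, 0]]
soft = superpose([p1, p2, p3])               # isolated response is dimmed
hard = superpose([p1, p2, p3], min_votes=2)  # isolated response is removed
```

A plain average would give the middle pixel a luminance of 85; the nonlinear rule pushes it down to 28 or to zero, which matches the noise-suppressing behavior described above.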

According to the invention of claim 13, a mask filter can be used as a non-terminal symbol of the tree-structured processing program, or to combine the individual processing programs within a large-scale processing program composed of a plurality of processing programs. Therefore, in addition to the effects of the inventions of the preceding claims, unnecessary noise can be removed from the output image of a processing program or a large-scale processing program, or the output image can be divided into regions and the output images of different processing programs assigned to the respective regions to form the final output image.
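
Both uses of a mask can be sketched in a few lines. The function names and the convention that a nonzero mask pixel means "keep" are assumptions for illustration.

```python
from typing import List

Image = List[List[int]]


def apply_mask(image: Image, mask: Image) -> Image:
    """Keep pixels where the mask is nonzero and zero out the rest."""
    return [[p if m else 0 for p, m in zip(rp, rm)]
            for rp, rm in zip(image, mask)]


def compose(image_a: Image, image_b: Image, mask: Image) -> Image:
    """Take image_a where the mask is nonzero and image_b elsewhere."""
    return [[a if m else b for a, b, m in zip(ra, rb, rm)]
            for ra, rb, rm in zip(image_a, image_b, mask)]


out_a = [[200, 0, 90, 0]]   # output of one processing program
out_b = [[0, 50, 0, 255]]   # output of another processing program
left_half = [[1, 1, 0, 0]]  # hypothetical mask selecting the left region

denoised = apply_mask(out_a, left_half)
combined = compose(out_a, out_b, left_half)
```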

According to the invention of claim 14, the image processing apparatus includes a display unit on which, for example, the output image colored in red is displayed superimposed on the input image displayed as a black-and-white image. Therefore, in addition to the effects of the inventions described in the preceding claims, a specific object in the input image can be displayed clearly.

Hereinafter, embodiments of the image processing apparatus of the present invention will be described with reference to the drawings.

This embodiment describes an image processing apparatus that is mounted on an automobile and extracts pedestrians from a landscape image of the scene ahead of the vehicle.

FIG. 1 is a block diagram showing the configuration of the image processing apparatus of this embodiment. The image processing apparatus 1 includes an image input unit 2, an image processing unit 3, a display unit 4, a memory 5, a processing program forming unit 6, and an input unit 7. In this embodiment, a computer in which a CPU, RAM, ROM, an input/output interface and the like are connected by a bus can be used as the image processing apparatus 1.

The image input unit 2 includes an imaging device 21 capable of converting a captured image into an electrical signal. For example, a CCD camera using a solid-state image sensor such as a charge-coupled device (CCD) is used as the imaging device 21. In this embodiment, the imaging device 21 of the image input unit 2 is mounted on the inside of the windshield near the rearview mirror of an automobile (not shown) so that it can image the scene ahead. As with ordinary television images, it captures the scene ahead of the vehicle every 1/30 second and transmits the input image to the image processing unit 3.

In this embodiment, the unit of input image transmitted at each of these fixed time intervals is called one frame. That is, the image processing unit 3 receives 30 frames of input images per second from the image input unit 2.

A display unit 4 having a monitor and a memory 5 are connected to the image processing unit 3. The image processing unit 3 sends each input image received from the image input unit 2 to the display unit 4 for display on the monitor and, at the same time, temporarily stores the input images in the memory 5 in sequence.

The image processing unit 3 also stores a processing program in which various image filters are combined in a tree structure, and performs image processing according to this processing program to form an output image.

The structure of the processing program is as follows. As shown in FIG. 2, the processing program combines various image filters (see FIG. 3) in a tree structure. It applies the image processing of each image filter to a plurality of input images t, t-1, ..., t-k (k is an integer of 1 or more), that is, to k+1 mutually different input images, and forms an output image.
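The tree structure described above can be sketched as follows. This is a minimal illustration under assumptions: the nested-tuple representation, the function names, and the two filters `invert` and `abs_diff` are hypothetical stand-ins, since the actual filter set of FIG. 3 is not listed in this text.

```python
# Sketch of a tree-structured processing program (all names are hypothetical).
# An image is a list of rows of integer luminance values in 0..255.

def invert(img):
    """Illustrative 1-input filter: invert the luminance of every pixel."""
    return [[255 - v for v in row] for row in img]

def abs_diff(a, b):
    """Illustrative 2-input filter: per-pixel absolute difference."""
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

FILTERS = {"invert": invert, "abs_diff": abs_diff}

def evaluate(node, inputs):
    """Evaluate a tree.

    A terminal symbol is a string key such as "t" or "t-1" (an input image).
    A non-terminal symbol is a tuple (filter_name, child, ...).
    """
    if isinstance(node, str):
        return inputs[node]
    name, *children = node
    return FILTERS[name](*(evaluate(child, inputs) for child in children))

# Two tiny 1x2 input images standing in for frames t and t-1.
inputs = {"t": [[10, 200]], "t-1": [[10, 50]]}
program = ("invert", ("abs_diff", "t", "t-1"))
output = evaluate(program, inputs)
```

Here the root filter produces the output image, the inner nodes are image filters, and the leaves are input images, which mirrors the roles of non-terminal and terminal symbols in the description.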

In this embodiment, the input image t at the current time t and the input images taken at every M frames before it are input to the processing program as the input images t, t-1, ..., t-k, with k = 3 and M = 1. That is, as shown in FIG. 4(A), the input image t at the current time t and the image of each preceding frame, namely the four consecutive frames of input images t, t-1, t-2, t-3 captured at 1/30-second intervals going back from the current time t, are read from the memory 5 and input to the processing program.

The values of k and M can be set as appropriate. For example, if k = 2 and M = 3, then, as shown in FIG. 4(B), a total of three images are read out and input to the processing program: the input image t at the current time t and the input images t-1 and t-2 taken at every third frame before it. It is also possible to select a plurality of mutually different input images by another selection method and input them to the processing program.
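The roles of k and M can be illustrated with a short sketch. The function name and the representation of the frame buffer as a list ordered from oldest to newest are assumptions made for illustration only.

```python
def select_input_frames(frames, k, M):
    """Pick the k+1 input images for the processing program.

    frames: buffered frames ordered oldest to newest (assumption); the last
            element is the frame at the current time t.
    Returns the current frame followed by every M-th earlier frame,
    i.e. the images the description labels t, t-1, ..., t-k.
    """
    needed = k * M + 1
    if len(frames) < needed:
        raise ValueError(f"need at least {needed} buffered frames")
    last = len(frames) - 1
    return [frames[last - i * M] for i in range(k + 1)]

frames = list(range(10))  # frame ids 0..9, where 9 is the current frame

consecutive = select_input_frames(frames, k=3, M=1)  # four consecutive frames
spaced = select_input_frames(frames, k=2, M=3)       # three frames, 3 apart
```

Note that with M = 3 the label "t-1" refers to the image three frames before the current one, not the immediately preceding frame.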

In the processing program of this embodiment, general-purpose image filters such as those shown in FIG. 3 are used in order to improve calculation speed, but image filters with functions specialized for the purpose can also be added.

As described above, the image processing apparatus 1 of this embodiment extracts pedestrians as the specific target from a landscape image ahead of the vehicle, and the processing program is likewise configured to extract pedestrians from the input image. That is, when the four consecutive frames of input images t, t-1, t-2, t-3 described above (see FIG. 5, which shows only the input image t (A) and the input image t-3 (B)) are input to the processing program, the processing program performs image processing with each image filter and forms an output image in which specific targets including the pedestrian are extracted, as shown in FIG. 6. In the output image of FIG. 6, the luminance value of the pixels in the hatched portion is 0.

Furthermore, in this embodiment, the output image formed in this way is displayed superimposed on the input image t shown on the monitor of the display unit 4. That is, as described above, the input image t transmitted from the image processing unit 3 is displayed on the monitor of the display unit 4 and, as shown in FIG. 7, the output image formed by the processing program is superimposed on that input image t. The input image t is displayed as a black-and-white image, and the pixel portions of the output image that have a positive luminance value (the hatched portions in the figure) are displayed colored in red.
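The superimposed display can be sketched as below. The source states only that pixels with a positive output luminance are colored red over a black-and-white input; replacing those pixels with pure red (rather than blending) and the function name `overlay` are assumptions.

```python
def overlay(input_gray, output):
    """Return an RGB image: gray input, with extracted pixels shown in red.

    input_gray, output: lists of rows of luminance values in 0..255.
    A pixel is treated as extracted when its output luminance is positive.
    """
    result = []
    for in_row, out_row in zip(input_gray, output):
        row = []
        for gray, extracted in zip(in_row, out_row):
            if extracted > 0:
                row.append((255, 0, 0))           # red marks the extracted target
            else:
                row.append((gray, gray, gray))    # unchanged black-and-white pixel
        result.append(row)
    return result

gray = [[10, 20], [30, 40]]
out = [[0, 255], [0, 0]]
shown = overlay(gray, out)
```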

At that point, the output image of the processing program can also be processed with a mask filter such as those shown in FIG. 8. For example, the red colored portion F at the front of the vehicle and the colored portion T on the trees in the upper part of the image in FIG. 7 are unnecessary, so a mask filter can be applied so that the coloring of those portions is not displayed.

The processing program may also be constructed manually and supplied to the image processing unit 3. For a tree-structured processing program such as that shown in FIG. 2, it is also possible, for example, to perform an exhaustive search over all tree-structured processing programs obtained by arbitrarily combining up to 40 one-input or two-input image filters such as those shown in FIG. 3 and arbitrarily combining the input images t, t-1, t-2, t-3 with them.

In this embodiment, the processing program is formed automatically by genetic programming in the processing program forming unit 6 connected to the image processing unit 3. FIG. 9 is a block diagram showing the configuration of the processing program forming unit. The processing program forming unit 6 includes initial individual generation means 61, fitness evaluation means 62 and 66, parent selection means 63, crossover means 64, mutation means 65, and end determination means 67.

In response to a processing program formation instruction from the input unit 7 (see FIG. 1), which consists of a keyboard, a mouse and the like, the initial individual generation means 61 randomly generates a fixed number (100 individuals in this embodiment) of tree-structured processing programs such as that shown in FIG. 2, within the range of the set values of k and M.

As a rule for randomly generating processing programs, this embodiment requires that the number of image filters (that is, non-terminal symbols) among the nodes of a tree-structured processing program never exceed 40, not only for the initial individuals but throughout the evolutionary process until an optimized processing program is obtained. The image filters are selected at random from those shown in FIG. 3. In addition to the various image filters shown in FIG. 3, mask filters such as those shown in FIG. 8 can also be included among the selectable image filters.
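A minimal sketch of random initial-individual generation under the 40-filter limit follows. The filter names, the stopping probability 0.3, the even split between one-input and two-input filters, and the budget-splitting scheme are all assumptions; the source specifies only the population size (100), the limit (40), and random selection of filters.

```python
import random

MAX_FILTERS = 40                           # limit on non-terminal symbols (from the text)
ONE_INPUT = ["f1", "f2"]                   # hypothetical 1-input filter names
TWO_INPUT = ["g1", "g2"]                   # hypothetical 2-input filter names
TERMINALS = ["t", "t-1", "t-2", "t-3"]     # input images for k = 3, M = 1

def count_filters(node):
    """Number of image filters (non-terminal symbols) in a tree."""
    if isinstance(node, str):
        return 0
    return 1 + sum(count_filters(child) for child in node[1:])

def random_tree(rng, budget):
    """Random tree that uses at most `budget` filter nodes."""
    if budget == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    if rng.random() < 0.5:
        # Two-input filter: share the remaining budget between both children.
        left = rng.randint(0, budget - 1)
        return (rng.choice(TWO_INPUT),
                random_tree(rng, left),
                random_tree(rng, budget - 1 - left))
    return (rng.choice(ONE_INPUT), random_tree(rng, budget - 1))

rng = random.Random(0)
population = [random_tree(rng, MAX_FILTERS) for _ in range(100)]
```

Because each node passes on only the budget that remains after counting itself, no generated tree can exceed the limit. This simple generator may also return a bare input image with no filter at all; a practical version would likely force the root to be a filter.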

As described above, k = 3 and M = 1 in this embodiment, and the input images to be input to the processing program are selected arbitrarily from among the four consecutive frames of input images t, t-1, t-2, t-3 captured at 1/30-second intervals going back from a given time t. That is, not all four input images t, t-1, t-2, t-3 need be used as inputs to a processing program; the initial individuals may include processing programs that use, for example, only the two input images t and t-2, or only the input image t-3.

The fitness evaluation means 62 is connected to the initial individual generation means 61, and the initial individuals of the processing programs generated by the initial individual generation means 61 are sent to the fitness evaluation means 62.

In the fitness evaluation means 62, a simulation is run for each processing program in which the input images are input and an output image is obtained. The output image obtained by the simulation is compared with a target image, and the fitness E of each processing program is calculated on the basis of the following equation (1).
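Equation (1) itself is not reproduced in this text. A form consistent with the terms the description quotes from it (the per-learning-set quantity ΣW·|O-T| / ΣW·256, the number of learning sets N, the output image O, the target image T and the weight image W) would be the following. The leading "1 -" and the averaging over N are assumptions, chosen so that a larger E means a better match, in line with the selection of the program with the maximum fitness.

```latex
E = 1 - \frac{1}{N} \sum_{N} \left(
      \frac{\sum_{\mathrm{pixels}} W \cdot \lvert O - T \rvert}
           {\sum_{\mathrm{pixels}} W \cdot 256}
    \right)
```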

Here, the target image is the image that the optimized processing program should output. Since the purpose of the processing program in this embodiment is to extract pedestrians from a landscape image ahead of the vehicle, the target image sent to the fitness evaluation means 62 is an image (see FIG. 10) in which only the pedestrian in the input image t (see, for example, FIG. 5(A)) is shown as a white extraction region (luminance value 255) and everything else is a non-extraction region (luminance value 0).

The weight image is an image in which a weight for weighting the distance |O-T| between the output image (O) and the target image (T) is defined for each pixel. The weight of each pixel is determined as appropriate according to the purpose of the processing program to be constructed. Normally, a large value is set in pixel regions where the output image is strongly required to match the target image, and a small value is set in pixel regions where a match is not strongly required.

In this embodiment, the aim is to extract pedestrians and nothing else, so the output image is strongly required to match the target image in both the extraction region and the non-extraction region of the target image. However, if the weight is set to the same value over the whole image, the area occupied by the pedestrian in the output image (the extraction region) is small compared with the area of the remaining pixels (the non-extraction region), the area ratio being 12:256, and the contribution of the match in the non-extraction region to the fitness evaluation may become too large.

Therefore, in this embodiment, the weight image is, as shown in FIG. 11, an image similar to the target image (see FIG. 10), and the weights (W) are set to 1/12 in the extraction region and 1/256 in the non-extraction region, so that the ratio of the extraction-region weight to the non-extraction-region weight equals the ratio of the reciprocals of the respective areas. The weight image is sent to the fitness evaluation means 62 together with the target image and the input images t, t-1, t-2, t-3 read from the memory 5.

The fitness evaluation means 62 calculates the fitness E of each processing program using the plural input images t, t-1, t-2, t-3, the target image and the weight image described above. In this embodiment, moreover, the simulation of each processing program uses not just one but several sets, each combining input images, a target image and a weight image (hereinafter called learning sets).

That is, as shown for example in FIG. 12, a total of three sets are sent to the fitness evaluation means 62 (in which case N = 3 in equation (1)): a first learning set consisting of the plural input images t, t-1, t-2, t-3 at time t and the corresponding target image and weight image; a similar second learning set built from the input images at a time ta before time t (see FIG. 13(A)); and a similar third learning set built from the input images at a time tb after time t (see FIG. 13(B)). For each processing program the simulation is run three times, once per set; ΣW·|O-T| / ΣW·256 in equation (1) is calculated for each learning set, and the fitness E is obtained from equation (1).
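The weighted comparison over several learning sets can be sketched as follows. The per-set term ΣW·|O-T| / ΣW·256 and the weights 1/12 and 1/256 come from the description; combining the sets as one minus their average is an assumption, and all function names are hypothetical.

```python
def weight_image(target, w_extract=1 / 12, w_other=1 / 256):
    """Weight image: 1/12 in the extraction region (255), 1/256 elsewhere."""
    return [[w_extract if t == 255 else w_other for t in row] for row in target]

def set_error(output, target, weight):
    """Per-learning-set term: sum(W * |O - T|) / sum(W * 256)."""
    numerator = 0.0
    denominator = 0.0
    for o_row, t_row, w_row in zip(output, target, weight):
        for o, t, w in zip(o_row, t_row, w_row):
            numerator += w * abs(o - t)
            denominator += w * 256
    return numerator / denominator

def fitness(results):
    """Fitness over N learning sets (assumed form: 1 - mean per-set error).

    results: list of (output, target, weight) triples, one per learning set.
    """
    n = len(results)
    return 1.0 - sum(set_error(o, t, w) for o, t, w in results) / n

target = [[255, 0]]                      # one pedestrian pixel, one background pixel
weight = weight_image(target)
perfect = fitness([(target, target, weight)])       # output equals target
blank = fitness([([[0, 0]], target, weight)])       # pedestrian missed entirely
```

With these weights, missing the single pedestrian pixel costs far more than an equal error on a background pixel would, which is the balancing effect the weight image is meant to provide.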

The parent selection means 63 is connected to the fitness evaluation means 62, and each processing program whose fitness E has been calculated by the fitness evaluation means 62 is sent to the parent selection means 63.

On the basis of the fitness E, the parent selection means 63 selects from among the processing programs the 100 individuals to be carried into the next generation, using a method such as roulette selection, expected-value selection, ranking selection or tournament selection, and reproduces those processing programs. In this embodiment, 100 individuals are selected by tournament selection and, at the same time, the processing program with the highest fitness E is preserved as the elite.
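Tournament selection with elite preservation might look like the sketch below. The tournament size of 2 and the function name are assumptions; the source gives only the population size and the fact that the best individual is kept.

```python
import random

def select_parents(population, fitnesses, rng, size=100, tournament=2):
    """Tournament selection plus elite preservation.

    Returns (selected, elite): `size` individuals chosen by repeated
    tournaments, and the single individual with the highest fitness.
    """
    best = max(range(len(population)), key=lambda i: fitnesses[i])
    elite = population[best]
    selected = []
    for _ in range(size):
        # Draw distinct contenders and keep the fittest of them.
        contenders = rng.sample(range(len(population)), tournament)
        winner = max(contenders, key=lambda i: fitnesses[i])
        selected.append(population[winner])
    return selected, elite

rng = random.Random(1)
selected, elite = select_parents(["a", "b", "c", "d"], [0.1, 0.9, 0.5, 0.3], rng)
```

Because contenders are drawn without replacement, the least fit individual can never win a tournament, while fitter individuals may be selected several times, which corresponds to the "reproduction" mentioned in the text.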

The 100 processing programs selected and reproduced by the parent selection means 63 are sent to the crossover means 64.

As shown in FIG. 14, the crossover means 64 pairs up the processing programs sent from the parent selection means 63 two at a time (called parent programs 1 and 2) and, for each pair, exchanges randomly chosen crossover portions (the portions enclosed by dotted lines in parent programs 1 and 2 of FIG. 14) at a predetermined rate to generate child programs 1 and 2. If either of the two child programs would contain more than 40 non-terminal symbols, that crossover is cancelled, and crossover portions are again chosen at random in the original parent programs 1 and 2 and the crossover is retried.
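A sketch of subtree crossover with the 40-filter limit follows, using nested tuples for trees. The helper names, the filter names "f" and "g", and the `max_tries` fallback that returns the parents unchanged are assumptions; the source describes retrying without stating a retry limit.

```python
import random

MAX_FILTERS = 40  # limit on non-terminal symbols (from the text)

def count_filters(node):
    """Number of image filters (non-terminal symbols) in a tree."""
    return 0 if isinstance(node, str) else 1 + sum(count_filters(c) for c in node[1:])

def paths(node, prefix=()):
    """All node positions in a tree, each as a tuple of child indices."""
    found = [prefix]
    if not isinstance(node, str):
        for i, child in enumerate(node[1:], start=1):
            found.extend(paths(child, prefix + (i,)))
    return found

def get(node, path):
    """Subtree at the given position."""
    for i in path:
        node = node[i]
    return node

def replace(node, path, new):
    """Copy of the tree with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return node[:i] + (replace(node[i], path[1:], new),) + node[i + 1:]

def crossover(parent1, parent2, rng, max_tries=100):
    """Swap random subtrees; retry when a child exceeds MAX_FILTERS."""
    for _ in range(max_tries):
        a = rng.choice(paths(parent1))
        b = rng.choice(paths(parent2))
        child1 = replace(parent1, a, get(parent2, b))
        child2 = replace(parent2, b, get(parent1, a))
        if count_filters(child1) <= MAX_FILTERS and count_filters(child2) <= MAX_FILTERS:
            return child1, child2
    return parent1, parent2  # assumption: give up and keep the parents

p1 = ("f", ("g", "t", "t-1"))
p2 = ("g", "t-2", ("f", "t-3"))
c1, c2 = crossover(p1, p2, random.Random(0))
```

Swapping subtrees conserves the total number of filters across the pair, so only the distribution between the two children has to be checked against the limit.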

In this embodiment, the crossover means 64 performs one-point crossover as shown in FIG. 14, but it can also be configured to use other methods such as multi-point crossover or uniform crossover.

The 100 processing programs generated as child programs by the crossover means 64 are sent to the mutation means 65.

The mutation means 65 causes node mutation, insertion, deletion and the like at a predetermined rate for each processing program. If inserting a node would cause the number of non-terminal symbols in the processing program to exceed 40, the insertion is not performed. Mutation between a terminal symbol (that is, an input image t or the like) and a non-terminal symbol (that is, an image filter) is prohibited. Other mutations such as translocation or duplication can also be performed, in which case appropriate restrictions are imposed as needed.
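The restriction that terminals and non-terminals never mutate into each other can be sketched with a point mutation. Only node mutation is shown here; insertion and deletion are omitted. The filter names, the rule that a filter is replaced only by one of the same number of inputs, and the per-node rate are assumptions.

```python
import random

ONE_INPUT = ["f1", "f2"]                   # hypothetical 1-input filter names
TWO_INPUT = ["g1", "g2"]                   # hypothetical 2-input filter names
TERMINALS = ["t", "t-1", "t-2", "t-3"]     # input images

def mutate(node, rng, rate):
    """Point mutation that keeps the tree shape.

    A terminal may only become another terminal, and a filter may only
    become another filter with the same number of inputs, so terminal and
    non-terminal symbols are never exchanged.
    """
    if isinstance(node, str):
        return rng.choice(TERMINALS) if rng.random() < rate else node
    name, *children = node
    if rng.random() < rate:
        pool = ONE_INPUT if len(children) == 1 else TWO_INPUT
        name = rng.choice(pool)
    return (name, *(mutate(child, rng, rate) for child in children))

tree = ("g1", ("f1", "t"), "t-2")
unchanged = mutate(tree, random.Random(0), 0.0)   # rate 0: nothing mutates
mutated = mutate(tree, random.Random(0), 1.0)     # rate 1: every node is redrawn
```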

The fitness evaluation means 66 is connected to the mutation means 65, and the 100 processing programs generated by the mutation means 65 are sent to the fitness evaluation means 66. The fitness evaluation means 66 performs the same processing as the fitness evaluation means 62 described above: using the same first to third learning sets as the fitness evaluation means 62, it runs a simulation for each processing program and calculates the fitness E of each on the basis of equation (1).

The end determination means 67 is connected to the fitness evaluation means 66. The processing programs whose fitness E has been calculated by the fitness evaluation means 66, together with the previous generation's highest-fitness processing program preserved as the elite by the parent selection means 63, are sent to the end determination means 67, which determines whether to end the formation of the processing program in the processing program forming unit 6.

In this embodiment, the end determination means 67 determines whether the number of generations of the evolutionary process has reached a preset end generation number Ge. If it has, the end determination means 67 outputs the processing program with the highest fitness E at that point to the image processing unit 3 as the solution and ends program formation. If the number of generations has not reached Ge, the end determination means 67 sends the processing programs to the parent selection means 63, and the procedure described above is repeated.

Alternatively, the end determination means 67 may, for example, determine whether any processing program has reached a preset target fitness Eq and, if so, output that processing program to the image processing unit 3 as the solution. The end determination means 67 may also be configured to store the maximum fitness of the processing programs and, if that maximum does not change over a predetermined number of generations, that is, if the maximum fitness has stagnated, to end the procedure at that generation and output the processing program with the maximum fitness to the image processing unit 3 as the solution.
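The three termination criteria described above can be combined in one sketch. The function name, the argument names, and the representation of the stored maxima as a list with one entry per generation are assumptions.

```python
def should_stop(generation, best_history, end_generation,
                target_fitness=None, stall_generations=None):
    """Decide whether to end program formation.

    generation:        current generation number
    best_history:      maximum fitness of each generation so far, oldest first
    end_generation:    preset end generation number (Ge in the text)
    target_fitness:    optional target fitness (Eq in the text)
    stall_generations: optional number of generations without change that
                       counts as stagnation
    """
    if generation >= end_generation:
        return True
    if target_fitness is not None and best_history and best_history[-1] >= target_fitness:
        return True
    if stall_generations is not None and len(best_history) > stall_generations:
        recent = best_history[-(stall_generations + 1):]
        if max(recent) == min(recent):   # maximum fitness has not moved
            return True
    return False
```

For example, `should_stop(10, [0.7, 0.7, 0.7, 0.7], 50, stall_generations=3)` reports stagnation, because the maximum fitness is unchanged across the last three generation steps.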

The processing program forming unit 6 forms an optimized processing program through the evolutionary process described above, but the resulting processing program may exhibit the phenomenon known as overlearning. In this embodiment, that means a processing program may be obtained that extracts not pedestrians in general but a more limited target, for example only pedestrians wearing dark clothes and not pedestrians wearing whitish clothes.

To avoid such overlearning, in this embodiment the fitness evaluation in the fitness evaluation means 62 and 66 additionally calculates, from the fitness E obtained with equation (1), a fitness E' that takes an overlearning restriction into account, on the basis of the following equation (2). Accordingly, in this embodiment the parent selection means 63 and the end determination means 67 compare and refer to this fitness E'.
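Equation (2) itself is not reproduced in this text. The description that follows states that the coefficients a and b are positive, that E' decreases as the number of nodes n(Node) grows, and that E' increases as the number of two-input image filter nodes m(2input-node) grows. A linear form consistent with those statements would be the following; the linearity is an assumption.

```latex
E' = E - a \cdot n(\mathrm{Node}) + b \cdot m(\text{2input-node})
```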

Here, the coefficients a and b both take positive values. According to equation (2), the fitness E' that takes the overlearning restriction into account becomes smaller as the number of nodes n(Node) in the processing program increases, and becomes larger as the number of two-input image filter nodes m(2input-node) increases.

The fitness E' is constructed as in equation (2) because the larger the number of nodes in the tree structure of a processing program, the more narrowly limited the extracted target becomes and the more readily overlearning occurs, whereas the smaller the number of nodes, the more general the target that can be extracted (pedestrians in general in this embodiment), which improves versatility.

However, if the fitness E' is made smaller as the number of nodes grows, the proportion of two-input image filters in the tree structure of the processing program in turn decreases. Even if four kinds of input images (t, t-1, t-2, t-3) are allowed as inputs, as in this embodiment, there is then a stronger tendency to obtain processing programs that actually use fewer kinds of input images. For this reason, the fitness E' is made to take a larger value as the number of two-input image filter nodes increases.

The coefficients a and b express, respectively, how strongly the number of nodes and the number of two-input image filter nodes contribute to the fitness E'. These coefficients can also be made to vary with the generation number of the evolutionary process of genetic programming in the processing program forming unit 6.

For example, if a and b both take large values while the generation number is small and decrease with the generations, processing programs with many nodes are more readily weeded out (the effect of a) and processing programs containing many two-input image filters are more likely to survive (the effect of b). Conversely, if a and b are increased with the generations, processing that became specialized to the learning sets early in evolution can be reduced to a simpler form in the latter half of evolution.

Furthermore, when evolution has progressed and the maximum fitness has stagnated, manually changing the values of the coefficients a and b increases the likelihood of obtaining a better optimized processing program.

The processing program formed in the processing program forming unit 6 as described above is sent to the image processing unit 3. Furthermore, in this embodiment, as shown in FIG. 15, a plurality of formed processing programs 1 to n are combined into one larger-scale processing program for use.

As a method of combination, it is possible, for example, to take the logical OR of the corresponding pixels of the n output images obtained by the processing programs 1 to n and use the resulting binarized image as the output image of the large-scale processing program. It is also possible to use the mask filters shown in FIG. 8(A) and FIG. 8(B) to construct a large-scale processing program such as that shown in FIG. 16, in which the results of image processing by processing program 1 and processing program 2 are displayed in the lower half and the upper half of the output image, respectively, to form a single output image.
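The two combination methods can be sketched as follows. Treating any positive luminance as "on" for the logical OR, splitting the image at the middle row, and the function names are assumptions; the actual mask shapes of FIG. 8 are not given in this text.

```python
def combine_or(outputs):
    """Pixelwise logical OR of several output images, binarized to 0 / 255."""
    height, width = len(outputs[0]), len(outputs[0][0])
    return [[255 if any(img[y][x] > 0 for img in outputs) else 0
             for x in range(width)]
            for y in range(height)]

def combine_halves(lower_src, upper_src):
    """One image whose upper half comes from upper_src and lower half from lower_src."""
    middle = len(lower_src) // 2
    return [list(row) for row in upper_src[:middle]] + \
           [list(row) for row in lower_src[middle:]]

a = [[0, 255], [0, 0]]
b = [[0, 0], [255, 0]]
merged = combine_or([a, b])

program1_out = [[1, 1], [2, 2]]   # shown in the lower half
program2_out = [[3, 3], [4, 4]]   # shown in the upper half
split = combine_halves(lower_src=program1_out, upper_src=program2_out)
```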

In this embodiment, a large-scale processing program is constructed by combining six processing programs obtained by genetic programming in the processing program forming unit 6. In this large-scale processing program, in order to remove noise from the output image and to display red more strongly at pixels that many of the processing programs extract, the output luminance value D at each pixel of the output image is determined by the nonlinear superposition shown in the following equation (3), where di is the output of the i-th processing program at that pixel.

In this embodiment, n = 6 and p is set to 2. The threshold K is a constant, set to 127 in this embodiment. The values of p and K can be set arbitrarily; increasing p displays the pixels where an image is extracted with greater emphasis.
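Equation (3) itself is not reproduced in this text, so its exact form is unknown here. The sketch below uses a thresholded power mean purely as an assumed stand-in, chosen because it reproduces the behaviour the description attributes to the equation: pixels extracted by many programs become bright, pixels extracted by few fall to 0, and a larger p gives stronger emphasis. It should not be read as the patent's actual formula.

```python
def combine_nonlinear(outputs, p=2, K=127):
    """Assumed stand-in for equation (3): power mean of d_i with threshold K.

    outputs: list of n output images (rows of luminance values 0..255).
    For each pixel, D = (sum(d_i ** p) / n) ** (1 / p) if that value is at
    least K, otherwise 0. This form is an assumption, not the source's.
    """
    n = len(outputs)
    height, width = len(outputs[0]), len(outputs[0][0])
    combined = []
    for y in range(height):
        row = []
        for x in range(width):
            mean_p = sum(img[y][x] ** p for img in outputs) / n
            value = mean_p ** (1.0 / p)
            row.append(value if value >= K else 0)
        combined.append(row)
    return combined

on, off = [[255]], [[0]]
one_of_six = combine_nonlinear([on] + [off] * 5)      # a single program fires
two_of_six = combine_nonlinear([on] * 2 + [off] * 4)  # two programs agree
all_six = combine_nonlinear([on] * 6)                 # all programs agree
```

Under this assumed form with n = 6, p = 2 and K = 127, a pixel extracted by only one program is suppressed as noise, while agreement between two or more programs survives the threshold.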

Next, the operation of the image processing apparatus of this embodiment will be described.

The imaging device 2 (see FIG. 1) of the image processing apparatus 1, mounted on the inside of the windshield of an automobile, captures the scene ahead of the vehicle and transmits the image (see FIG. 5(A)) to the image processing unit 3 as input image t. The imaging device 2 repeats this operation every 1/30 second.

On receiving input image t from the imaging device 2, the image processing unit 3 sends it to the display unit 4 for display on the monitor and stores it temporarily in the memory 5 in sequence. At the same time, it reads input images t-1, t-2, and t-3 stored in the memory 5 and inputs input images t, t-1, t-2, and t-3 to the processing program in which image filters are combined in a tree structure to form an output image. The output image, colored red, is then superimposed on input image t, which is displayed as a monochrome image on the monitor of the display unit 4.
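The buffering and overlay steps above can be sketched as follows. The class `FrameBuffer`, the function `overlay_red`, and the particular red-tinting rule are hypothetical and chosen only for illustration; the source states that the output image is colored red and superimposed, without specifying how the colors are blended.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Image = List[List[int]]          # grayscale image, 0..255
RGB = Tuple[int, int, int]


class FrameBuffer:
    """Keeps the newest frame t together with the previous `history` frames."""

    def __init__(self, history: int = 3) -> None:
        self._size = history + 1
        self._frames: Deque[Image] = deque(maxlen=self._size)

    def push(self, frame: Image) -> Optional[List[Image]]:
        """Store frame t; return [t, t-1, ..., t-history] once enough exist."""
        self._frames.appendleft(frame)  # the oldest frame drops off the right
        if len(self._frames) < self._size:
            return None
        return list(self._frames)


def overlay_red(gray: Image, output: Image) -> List[List[RGB]]:
    """Superimpose a red-tinted output image on a monochrome input image.

    Assumed blending rule: the red channel is raised to the output value and
    the green and blue channels are dimmed in proportion to it.
    """
    return [
        [(max(g, o), g * (255 - o) // 255, g * (255 - o) // 255)
         for g, o in zip(gray_row, out_row)]
        for gray_row, out_row in zip(gray, output)
    ]
```

At 30 frames per second, a buffer of four frames covers roughly a tenth of a second of motion, which is the window the processing program sees in this embodiment.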

Here, as described above, the processing program may be constructed manually, but it may also be formed in advance by genetic programming in the processing program forming unit 6.

The procedure for forming a processing program in the processing program forming unit 6 is as described above. FIG. 17 shows an example of a processing program obtained as a solution by genetic programming in the processing program forming unit 6. In this example, the number of non-terminal symbols, that is, image filters, is 40; among the terminal symbols, the number of input images is 15 and the number of output images is 1.
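To make the terms concrete, a tree-structured processing program can be modeled with terminal nodes that refer to input frames and non-terminal nodes that apply image filters. The sketch below is illustrative only: the node classes, the two example filters `abs_difference` and `binarize`, and the helper `count_nodes` are assumptions, not the filter set or data structures of the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union

Image = List[List[int]]


@dataclass(frozen=True)
class Terminal:
    frame: int  # 0 selects input image t, 1 selects t-1, and so on


@dataclass(frozen=True)
class Filter:
    fn: Callable[..., Image]        # one-input or two-input image filter
    children: Tuple["Node", ...]


Node = Union[Terminal, Filter]


def evaluate(node: Node, frames: List[Image]) -> Image:
    """Evaluate the tree bottom-up to obtain the single output image."""
    if isinstance(node, Terminal):
        return frames[node.frame]
    return node.fn(*(evaluate(child, frames) for child in node.children))


def count_nodes(node: Node) -> Tuple[int, int]:
    """Return (number of image filters, number of input-image terminals)."""
    if isinstance(node, Terminal):
        return (0, 1)
    filters, terminals = 1, 0
    for child in node.children:
        f, t = count_nodes(child)
        filters, terminals = filters + f, terminals + t
    return (filters, terminals)


def abs_difference(a: Image, b: Image) -> Image:
    """Example two-input filter: absolute pixel-wise difference."""
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def binarize(a: Image, threshold: int = 128) -> Image:
    """Example one-input filter: threshold to 0 or 255."""
    return [[255 if x >= threshold else 0 for x in row] for row in a]
```

For instance, `Filter(binarize, (Filter(abs_difference, (Terminal(0), Terminal(1))),))` is a program with two filters and two input terminals; the program in FIG. 17 has the same shape at a larger scale.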

A large-scale processing program can also be formed by combining the processing program shown in FIG. 17 with processing programs obtained in the same way, based on equation (3) above. When, for example, input images t, t-1, t-2, and t-3 (see FIG. 5) are input to this large-scale processing program, an output image such as that shown in FIG. 6 is obtained, in which specific targets including pedestrians are extracted. Superimposing this output image, colored red, on input image t displayed as a monochrome image on the monitor of the display unit 4 yields an image such as that shown in FIG. 7.

Examining the processing programs formed by the processing program forming unit 6, it is often observed that they apply a difference filter at an early stage of the image filter processing of the input images. This is thought to be because the purpose of the processing program in this embodiment is to capture the scene ahead from a moving vehicle and extract moving or stationary pedestrians from the image, that is, to extract pedestrians from a time series of input images in which the pedestrians' positions change slightly and gradually.

Therefore, instead of inputting the full images shown in FIG. 5 as input images t, t-1, t-2, and t-3, inter-frame difference images (that is, temporal derivative data) of a plurality of images captured at time intervals by the imaging device may be input. Edge-extracted images of each of input images t, t-1, t-2, and t-3 (that is, spatial derivative data within each image) may also be input.
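The two alternative inputs can be sketched as follows. The source does not specify which difference or edge operator is used; the absolute difference and the simple forward-difference gradient below, as well as the function names, are assumptions for illustration.

```python
from typing import List

Image = List[List[int]]


def frame_difference(current: Image, previous: Image) -> Image:
    """Inter-frame difference image: absolute change between two frames."""
    return [
        [abs(c - p) for c, p in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(current, previous)
    ]


def edge_image(img: Image) -> Image:
    """Edge-extracted image from forward differences in x and y.

    Uses |dx| + |dy| clipped to 255; differences across the right and
    bottom borders are treated as zero.
    """
    height, width = len(img), len(img[0])
    result: Image = []
    for y in range(height):
        row = []
        for x in range(width):
            dx = img[y][x + 1] - img[y][x] if x + 1 < width else 0
            dy = img[y + 1][x] - img[y][x] if y + 1 < height else 0
            row.append(min(255, abs(dx) + abs(dy)))
        result.append(row)
    return result
```

Feeding such preprocessed images to the processing program hands it the temporal or spatial derivative directly, instead of leaving it to evolve a difference filter near the leaves of the tree.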

As described above, according to the image processing apparatus of this embodiment, a plurality of types of input images t, t-1, ..., t-k of the scene ahead of the vehicle, captured at time intervals, can be input to a processing program in which image filters are combined in a tree structure. A plurality of frames of a moving image can thus be input to the processing program, so that the moving image can be processed.

Further, the frames of the moving image (for example, the scene ahead of the vehicle) are compared and subjected to image processing such as differencing by the various image filters, such as difference filters, that make up the tree-structured processing program. This makes it possible to effectively form an output image in which a specific target accompanied by temporal change or displacement (a pedestrian in this embodiment) is extracted from the moving image.

Furthermore, by having the processing program forming unit form the processing program automatically by genetic programming, a processing program can be obtained easily. The specific target to be extracted can also be changed easily by changing the target image and the weight image. That is, instead of manually constructing a processing program for extraction every time the specific target changes, as in the past, a processing program can be constructed by the same procedure, using the genetic programming method described above unchanged, simply by changing the target image and the weight image.

In forming the processing program, if learning is performed using only one learning set, for example the combination of the input images of FIG. 5, the target image of FIG. 10, and the weight image of FIG. 11, the resulting program may end up extracting mainly only the person on the left side of the image, as in the superimposed image of FIG. 7. When this happens, even if an image such as that in FIG. 13(B) is input to the processing program, only the person on the left side of the image is extracted and the pedestrian on the right side cannot be extracted.

However, by using a plurality of learning sets, each combining input images, a target image, and a weight image, in forming the processing program, as in this embodiment, this phenomenon can be avoided. Moreover, as shown in FIG. 18, persons can be extracted more accurately even from scenes that were not used in the learning sets for genetic programming, so that a highly versatile processing program that accurately extracts only persons from a moving image can be obtained.
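The effect of using several learning sets can be illustrated with a fitness evaluation that averages over the sets. The fitness formula is defined elsewhere in the document and is not reproduced here; the sketch assumes a weighted absolute error between output and target, normalized by the weights and the maximum luminance, and averaged over all sets. The names `set_fitness` and `fitness` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]
Weight = List[List[float]]
LearningSet = Tuple[List[Image], Image, Weight]  # (inputs, target, weight)


def set_fitness(output: Image, target: Image, weight: Weight,
                v_max: int = 255) -> float:
    """Assumed fitness for one learning set, in the range 0..1 (1 is best)."""
    error = 0.0
    norm = 0.0
    for out_row, tgt_row, w_row in zip(output, target, weight):
        for o, t, w in zip(out_row, tgt_row, w_row):
            error += w * abs(o - t)
            norm += w * v_max
    return 1.0 - error / norm


def fitness(program: Callable[[List[Image]], Image],
            learning_sets: Sequence[LearningSet]) -> float:
    """Average the per-set fitness so that no single scene dominates."""
    scores = [
        set_fitness(program(inputs), target, weight)
        for inputs, target, weight in learning_sets
    ]
    return sum(scores) / len(scores)
```

Under this assumed form, a program that only matches the target of the first scene scores well on that set but is penalized on the others, which pushes the search toward programs that generalize across scenes.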

Combining a plurality of processing programs obtained in this way into a large-scale processing program makes these effects even more pronounced.

Furthermore, when a processing program is sought by genetic programming, the search space of solution programs normally grows roughly exponentially as the number of image filters (non-terminal symbols) making up the processing program increases, and an enormous search becomes necessary. However, by combining a plurality of processing programs formed using different learning sets, as in this embodiment, a highly versatile processing program that can extract the specific target accurately can be obtained more easily.

Although the purpose of this embodiment is to capture the scene ahead from a moving vehicle and extract pedestrians from the image, the apparatus may also be configured, for example, to extract vehicles from the scene ahead, to extract moving objects in general such as vehicles and pedestrians, or to extract the boundary between the roadway and the sidewalk. These may also be combined, so that the boundaries between the roadway and the sidewalk are extracted and vehicles and pedestrians moving on the roadway between them are extracted.

This embodiment describes the case where the output image is superimposed on input image t for display. By combining the image processing apparatus of this embodiment with another apparatus, however, the specific target extracted by the image processing apparatus can be sent to the other apparatus for monitoring, or the other apparatus can measure the distance to the target.

For example, by combining the image processing apparatus of this embodiment with a distance measuring device, pedestrians can be identified by the image processing apparatus and the distance to them measured by the distance measuring device, making it possible to issue an automatic warning on approach or to avoid a collision through travel control. The distance measuring device also no longer needs to measure the distance to objects across the entire area ahead of the vehicle, which reduces its load.

Furthermore, the image processing apparatus of this embodiment can be mounted not only on a vehicle but also, for example, on an artificially intelligent robot. It can be used, for example, for the robot to find and measure other moving objects while observing its environment with an on-board camera, and to decide how to act in response.

FIG. 1 is a block diagram showing the configuration of the image processing apparatus of this embodiment.
FIG. 2 is a diagram explaining the structure of the processing program of this embodiment.
FIG. 3 is a list of the image filters used in this embodiment.
FIG. 4 is a diagram explaining the method of selecting the input images that are input to the processing program.
FIG. 5 is a diagram explaining input images for four consecutive frames; (A) shows input image t and (B) shows input image t-3.
FIG. 6 is a diagram explaining the output image based on the input images of FIG. 5.
FIG. 7 is a diagram showing the input image of FIG. 5(A) and the output image of FIG. 6 superimposed and displayed.
FIG. 8 is a diagram showing examples of the mask filters used in this embodiment.
FIG. 9 is a block diagram showing the configuration of the processing program forming unit of this embodiment.
FIG. 10 is a diagram explaining the target image used by the fitness evaluation means.
FIG. 11 is a diagram explaining the weight image used by the fitness evaluation means.
FIG. 12 is a diagram explaining the three learning sets used in this embodiment.
FIG. 13 is a diagram showing the input images used in the learning sets of FIG. 12; (A) shows the input image used for the second learning set and (B) shows the input image used for the third learning set.
FIG. 14 is a diagram explaining the crossover of processing programs in the crossover means.
FIG. 15 is a diagram explaining a large-scale processing program formed by combining processing programs.
FIG. 16 is a diagram explaining a large-scale processing program that uses mask filters to divide the image for display.
FIG. 17 is a diagram showing an example of a processing program formed by genetic programming.
FIG. 18 is a diagram showing persons extracted from an input image that was not used in the learning sets.
FIG. 19 is a diagram explaining the structure of a conventional processing program.

Explanation of symbols

1 Image processing apparatus
21 Imaging device
3 Image processing unit
4 Display unit
6 Processing program forming unit

Claims

An image processing apparatus that performs image processing on an image captured by an imaging device and extracts a specific target from the image, the apparatus comprising an image processing unit that performs image processing on a plurality of types of images captured by the imaging device, based on a processing program in which image filters are combined in a tree structure, and forms an output image in which the specific target is extracted.

The image processing apparatus according to claim 1, wherein the plurality of types of images are a plurality of images captured at time intervals by the imaging apparatus.

The image processing apparatus according to claim 1, wherein the plurality of types of images are inter-frame difference images or edge extraction images of a plurality of images captured at time intervals by the imaging apparatus.

The image processing apparatus according to claim 1, further comprising a processing program forming unit for forming the processing program, wherein the processing program forming unit is configured to form the processing program by genetic programming using the plurality of types of images, a target image, and a weight image.

The image processing apparatus according to claim 4, wherein the weight image is set so that the ratio of the weight of the extraction region to the weight of the non-extraction region is the reciprocal of the ratio of the area of the extraction region to the area of the non-extraction region.

The image processing apparatus according to claim 4, wherein the processing program forming unit forms the processing program using a plurality of learning sets, each comprising the plurality of types of images, a target image, and a weight image.

The image processing apparatus according to one of the preceding claims, wherein the fitness used for genetic programming in the processing program forming unit is calculated so as to take a smaller value as the number of nodes in the processing program increases.

The image processing apparatus according to claim 7, wherein the ratio of the number of nodes to the fitness is configured to change according to the number of generations of evolutionary processes in genetic programming.

The image processing apparatus according to one of the preceding claims, wherein the fitness used for genetic programming in the processing program forming unit is calculated so as to take a larger value as the number of two-input image filter nodes in the processing program increases.

The image processing apparatus according to claim 9, wherein the ratio of the number of nodes of the two-input image filter to the fitness is changed according to the number of generations of the evolution process in genetic programming.

The image processing apparatus according to any one of claims 1 to 10, wherein the processing program is formed by combining a plurality of processing programs.

The image processing apparatus according to claim 11, wherein an output image is formed by nonlinear superposition of processing by the plurality of processing programs.

The image processing apparatus according to claim 1, wherein the image filter includes a mask filter.

The image processing apparatus according to any one of claims 1 to 13, further comprising a display unit for displaying an image, wherein the output image formed based on the processing program is superimposed on the input image displayed on the display unit.