JP2008040724A

JP2008040724A - Image processing device and image processing method

Info

Publication number: JP2008040724A
Application number: JP2006213115A
Authority: JP
Inventors: Ryusuke Miyamoto; 龍介宮本; Tetsuya Kimata; 哲也木全
Original assignee: Sumitomo Electric Industries Ltd; SYNTHESIS Corp
Current assignee: Sumitomo Electric Industries Ltd; SYNTHESIS Corp
Priority date: 2006-08-04
Filing date: 2006-08-04
Publication date: 2008-02-21

Abstract

PROBLEM TO BE SOLVED: To provide an image processing device and an image processing method capable of dividing an image into regions with high precision even when the image carries little information.

SOLUTION: The image processing device acquires data of images captured by an imaging device that images its surroundings frame by frame in time series, and divides each image into partial regions. The image processing device extracts feature quantities from the image data acquired from the imaging device, and acquires the on-screen coordinate values of each feature region whose extracted feature quantity meets a predetermined condition. For each pixel corresponding to the feature region, it acquires movement information concerning the amount and direction of on-screen movement per unit time, as well as color information and/or luminance information. Based on the on-screen coordinate values of the feature region, the movement information, and the color information and/or luminance information, the image processing device generates a feature space made up of a plurality of dimensions, divides the feature space into partial spaces that meet a predetermined condition, identifies the on-screen coordinate values corresponding to each partial space, and divides the image into a plurality of partial regions.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to an image processing device and an image processing method capable of accurately dividing an image into regions based on image data captured by an imaging device that images its surroundings.

When an object to be recognized is detected in an image captured by an imaging device capable of imaging its surroundings, a common way to improve recognition accuracy is to divide the image according to specific conditions and to run object detection on the divided regions. Images are divided into a plurality of regions based on region movement information (motion amount), color or luminance information, tracking of initial edges, and the like.

For example, as disclosed in Non-Patent Document 1, a method has been developed that determines the inter-frame motion amount of each pixel by calculating, pixel by pixel, the deviation between its position in the previous frame and its position in the current frame, and that extracts regions whose motion amount exceeds a predetermined amount as partial regions.

In addition, as disclosed in Non-Patent Document 2, a method has been developed that acquires, for each pixel, a luminance value or, in the case of a color image, Y, U, V signal values (or R, G, B signal values), and that extracts candidate regions by grouping pixels whose signal values fall within a predetermined range.

Non-Patent Document 1: N. Vasconcelos and one other, "Empirical Bayesian motion segmentation", IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 217-221, February 2001
Non-Patent Document 2: D. Comaniciu and one other, "Mean Shift: A Robust Approach Toward Feature Space Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 603-619, May 2002

However, the region division method disclosed in Non-Patent Document 1, which relies on the inter-frame motion amount of each pixel, has a problem: when the image carries little information, as with an image captured by a far-infrared imaging device, the motion amount cannot be determined accurately, and the image may not be divided into regions with high accuracy.

Likewise, the region division method disclosed in Non-Patent Document 2, which extracts a group of pixels with similar luminance values or the like as one region, has a problem: pixels that do not actually belong to a region may be included in it merely because their luminance values or the like are similar, which may result in incorrect region division.

The present invention has been made in view of these circumstances. Its object is to provide an image processing device and an image processing method that can divide an image into regions with high accuracy even when the image carries little information, as with an image captured by a far-infrared imaging device.

To achieve the above object, the image processing device according to the first invention acquires image data captured by an imaging device that images its surroundings frame by frame in time series, and detects a partial region of the image in which an object is present. The device is characterized by comprising: movement information acquisition means for acquiring, for each pixel, movement information concerning the amount and direction of on-screen movement per unit time, based on the image data acquired from the imaging device; color/luminance information acquisition means for acquiring, for each pixel, color information and/or luminance information, based on the image data acquired from the imaging device; feature space generation means for generating a feature space composed of a plurality of dimensions, based on the on-screen coordinate values, the movement information, and the color information and/or luminance information; and image division means for dividing the feature space generated by the feature space generation means into partial spaces that satisfy a predetermined condition, identifying the on-screen coordinate values corresponding to each partial space, and dividing the image into a plurality of partial regions.

The image processing device according to the second invention is characterized in that, in the first invention, it further comprises: means for calculating, for the plurality of partial regions produced by the image division means, the similarity between adjacent partial regions; means for determining whether the calculated similarity is greater than a predetermined value; and means for integrating the adjacent partial regions into one partial region when the similarity is determined to be greater.

The image processing method according to the third invention acquires image data captured by an imaging device that images its surroundings frame by frame in time series, and detects a partial region of the image in which an object is present. The method is characterized by: acquiring, for each pixel, movement information concerning the amount and direction of on-screen movement per unit time, based on the image data acquired from the imaging device; acquiring, for each pixel, color information and/or luminance information, based on the image data acquired from the imaging device; generating a feature space composed of a plurality of dimensions, based on the on-screen coordinate values, the movement information, and the color information and/or luminance information; dividing the generated feature space into partial spaces that satisfy a predetermined condition; and identifying the on-screen coordinate values corresponding to each partial space to divide the image into a plurality of partial regions.

In the first and third inventions, image data captured by an imaging device that images its surroundings frame by frame in time series is acquired, and a partial region of the image in which an object is present is detected. Based on the image data acquired from the imaging device, movement information concerning the amount and direction of on-screen movement per unit time is acquired for each pixel. Color information and/or luminance information is likewise acquired for each pixel from the same image data. A feature space composed of a plurality of dimensions is generated from the on-screen coordinate values, the movement information, and the color information and/or luminance information. The generated feature space is divided into partial spaces that satisfy a predetermined condition, the on-screen coordinate values corresponding to each partial space are identified, and the image is divided into a plurality of partial regions. Because the multi-dimensional feature space takes into account not only the inter-frame movement information of each pixel but also its color information and/or luminance information, the two kinds of information can be used in a complementary manner, and the image can be divided into regions with high accuracy even when the movement information or the color and/or luminance information is unreliable on its own.

In the second invention, the similarity between adjacent partial regions is calculated for the plurality of divided partial regions, and when the calculated similarity is greater than a predetermined value, the adjacent partial regions are integrated into one partial region. This makes it possible to update partial regions that have been divided too finely to an appropriate size, for example a size close to the physical size of the object to be detected.

According to the first and third inventions, a multi-dimensional feature space is generated that takes into account not only the inter-frame movement information of each pixel but also its color information and/or luminance information. By using these in a complementary manner, the image can be divided into regions with high accuracy even when the movement information or the color and/or luminance information is unreliable on its own.

According to the second invention, partial regions that have been divided too finely can be updated to an appropriate size, for example a size close to the physical size of the object to be detected.

Embodiments of the present invention are described in detail below with reference to the drawings. The following embodiment takes as an example the case in which an image captured by a far-infrared imaging device while a vehicle is traveling is divided into partial regions, and the divided partial regions are used to detect obstacles in front of the vehicle, such as pedestrians and bicycles.

FIG. 1 is a block diagram showing the configuration of the image processing device 1 according to the embodiment of the present invention. The image processing device 1 comprises at least an LSI 11, a RAM 12, an image memory 13, a video input unit 14, a video output unit 15, and an alarm output unit 16.

The LSI 11 reads the image data stored in the image memory 13 frame by frame and extracts a feature quantity for each pixel. For each region whose extracted feature quantity satisfies a predetermined condition, the LSI 11 acquires the on-screen coordinate values x, y, the per-frame motion amounts Vx, Vy, and, as color information, for example the L*u*v* signal values of the L*u*v* color system, and generates a seven-dimensional feature space. The seven-dimensional case is described below. When a gray image is processed and luminance information is used instead of color, one-dimensional luminance information replaces L*u*v*, and the feature space becomes five-dimensional. The LSI 11 divides the feature space into a plurality of partial spaces by grouping the points corresponding to pixels in the feature space by predetermined distance range. Based on the divided partial spaces, the on-screen area is divided into a plurality of partial regions.

The RAM 12 is an SRAM, a flash memory, or the like, and stores data generated in the course of arithmetic processing as well as time-series position data of regions whose feature quantities satisfy a predetermined condition, that is, movement amounts, movement vectors, and the like. The image memory 13 is an SRAM, a flash memory, an SDRAM, or the like, and stores image data input from the far-infrared imaging devices 2, 2 via the video input unit 14.

The video input unit 14 receives the image data captured by the far-infrared imaging devices 2, 2 via a video cable 7 that supports an analog video format such as NTSC or a digital video format. The image data acquired by the video input unit 14 is stored in the image memory 13, synchronized for example frame by frame.

The video output unit 15 outputs image data to a display device 4 such as a liquid crystal display via a cable 8 that supports a video format such as NTSC, VGA, or DVI. The alarm output unit 16 transmits an output signal such as a synthesized sound, via a CAN-compliant in-vehicle LAN cable 6, to an alarm device 5 that issues audible warnings by voice, sound effects, or the like.

In this embodiment, the far-infrared imaging devices 2, 2, which capture images of the vehicle's surroundings, are mounted inside the front grille near the center of the front of the vehicle. The far-infrared imaging devices 2, 2 use infrared light with a wavelength of 7 to 14 micrometers. The imaging device is not limited to the far-infrared imaging devices 2, 2; it may be a near-infrared imaging device using infrared light with a wavelength of 0.8 to 3 micrometers, or a visible-light imaging device.

FIG. 2 is a block diagram showing the configuration of the far-infrared imaging device 2 used with the image processing device 1 according to the embodiment of the present invention. The image capturing unit 21 has imaging elements, which convert optical signals into electrical signals, arranged in a matrix. The imaging elements for infrared light are infrared sensors such as a vanadium oxide bolometer type made with micromachining technology or a BST (barium strontium titanate) pyroelectric type. The image capturing unit 21 reads the infrared image of the vehicle's surroundings as a luminance signal and transmits the luminance signal to the signal processing unit 22.

The signal processing unit 22 is an LSI. It converts the luminance signal received from the image capturing unit 21 into a digital signal such as an L*u*v* signal of the L*u*v* color system, performs processing to correct variation among the imaging elements, correction of defective elements, gain control, and the like, and stores the result in the image memory 23 as image data. Temporarily storing the image data in the image memory 23 is not essential; the data may instead be transmitted directly to the image processing device 1 via the video output unit 24.

The video output unit 24 is an LSI, and outputs video data to the image processing device 1 via the video cable 7, which supports an analog video format such as NTSC or a digital video format.

The processing performed by the LSI 11 of the image processing device 1 is described in detail below. FIG. 3 is a flowchart showing the processing procedure of the LSI 11 of the image processing device 1 according to the embodiment of the present invention.

The LSI 11 reads the image data acquired from the far-infrared imaging devices 2, 2 and stored in the image memory 13 (step S301).

Based on the image data stored in the image memory 13, the LSI 11 calculates, for each pixel (that is, for each coordinate value (x, y)), the motion amount d(x, y, t) from the previous frame (step S302). The method of calculating the per-frame motion amount of each pixel is not particularly limited; for example, the well-known Kanade-Lucas-Tomasi tracker (hereinafter, KLT) is used.

In KLT, when the brightness at coordinate values (x, y) at time t is I(x, y, t), the inter-frame motion amount d(x, y, t) can be expressed by (Equation 1). In (Equation 1), τ is the frame interval and n is the noise caused by movement of the imaging device and of objects in the image.

$$I(x, y, t + \tau) = I\bigl(x - d(x, y, t),\; y - d(x, y, t),\; t\bigr) + n(x, y, t) \qquad \text{(Equation 1)}$$

Therefore, feature points corresponding to the coordinate values (x, y) are matched between frames by searching for the motion amount d(x, y, t) for which the noise n(x, y, t) in (Equation 1) is smaller than a predetermined threshold.
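The search described here can be illustrated with a minimal sketch. It is not the gradient-based KLT iteration itself: it swaps in an exhaustive block-matching search, which minimizes the same residual n from (Equation 1) over a small patch. The function name, the separate horizontal and vertical displacements `(dx, dy)`, and the parameters `search`, `half`, and `max_residual` are all assumptions made for illustration.

```python
def estimate_motion(prev, curr, x, y, search=2, half=1, max_residual=1.0):
    """Return the (dx, dy) that minimizes the mean squared residual of
    Equation 1 around pixel (x, y), or None if no displacement is good enough.

    prev, curr: brightness images as lists of rows (hypothetical input format).
    """
    h, w = len(curr), len(curr[0])
    best, best_err = None, float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            err, count = 0.0, 0
            for v in range(-half, half + 1):
                for u in range(-half, half + 1):
                    cx, cy = x + u, y + v        # pixel in the current frame
                    px, py = cx - dx, cy - dy    # candidate match in the previous frame
                    if 0 <= cx < w and 0 <= cy < h and 0 <= px < w and 0 <= py < h:
                        err += (curr[cy][cx] - prev[py][px]) ** 2
                        count += 1
            if count == 0:
                continue
            err /= count
            if err < best_err:
                best, best_err = (dx, dy), err
    # Accept the match only if the residual is below the threshold.
    return best if best_err <= max_residual else None
```

A point for which the function returns `None` corresponds to a pixel that could not be matched between frames.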

However, motion estimation with KLT yields motion amounts only for the feature points that were matched. To obtain motion amounts for all pixels, this embodiment fills in the motion amount of each unmatched point by assuming that it equals the motion amount of the spatially nearest matched feature point. In this way, the motion amount can also be estimated for pixels at which no feature point was extracted.
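The nearest-feature-point fill described above could look roughly like the following sketch. The data layout (a dictionary from tracked pixel coordinates to motion vectors) and the function name are assumptions, and the brute-force nearest-neighbor search is chosen for clarity rather than speed.

```python
def fill_motion(width, height, tracked):
    """Assign every pixel the motion of its spatially nearest tracked point.

    tracked: {(x, y): (vx, vy)} for matched feature points (hypothetical format).
    Returns a height x width grid of (vx, vy) tuples.
    """
    if not tracked:
        raise ValueError("at least one tracked feature point is required")
    field = []
    for y in range(height):
        row = []
        for x in range(width):
            # Squared Euclidean distance is enough to rank the candidates.
            nearest = min(tracked, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)
            row.append(tracked[nearest])
        field.append(row)
    return field
```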

FIG. 4 shows an example of the motion amounts extracted by the image processing device 1 according to the embodiment of the present invention. As FIG. 4 shows, movement vectors with the same direction and motion amount are distributed, for example, in the vicinity of feature point 41. It is therefore possible to divide the image into regions from the motion amount alone, by grouping pixels with similar movement vectors. However, in an image (frame) in which the motion amount is difficult to detect, the motion amount is unreliable, so the regions cannot be divided accurately.

Next, based on the image data stored in the image memory 13, the LSI 11 acquires color information and/or luminance information for each pixel (for each coordinate value (x, y)) (step S303). When the acquired image is a color image, it acquires as color information, for example, the L*u*v* signal values of the L*u*v* color system; when the image is monochrome, it acquires the luminance value as luminance information.

The LSI 11 generates a feature space composed of a plurality of dimensions, based on the on-screen coordinate values (x, y) of the extracted feature region, the motion amount d(x, y, t), and the color information and/or luminance information (step S304). For example, when the image data stored in the image memory 13 is color image data, it generates a seven-dimensional feature space consisting of the on-screen coordinate values (x, y) of the extracted feature region, the motion amount (Vx, Vy), and the L*u*v* signal values (l, u, v) as color information.

FIG. 5 schematically shows how the feature space is generated. As shown in FIG. 5(a), the motion amount (Vx, Vy) and the L*u*v* signal values (l, u, v) as color information are calculated for the coordinate values (x, y) of an arbitrary pixel P of the image data stored in the image memory 13. Using these values, a corresponding point P' is plotted in the seven-dimensional feature space (x', y', Vx', Vy', l', u', v') of normalized values, as shown in FIG. 5(b). In this way, the image data is mapped into the seven-dimensional feature space. Each corresponding point in FIG. 5(b) corresponds to the coordinate values (x, y) of one pixel of the image data stored in the image memory 13, and for each corresponding point the spatial distance from the origin of the feature space can be calculated.

The normalized seven dimensions (x', y', Vx', Vy', l', u', v') are obtained by dividing the coordinate values (x, y), the motion amount (Vx, Vy), and the L*u*v* signal values (l, u, v) of the image data stored in the image memory 13 by their respective weighting coefficients.
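As a minimal sketch of this normalization, each raw component is divided by its weighting coefficient. The weight values used in the usage note are purely illustrative; the source does not state them. The same helper covers the five-dimensional gray-image case by passing five components.

```python
def normalize_feature(raw, weights):
    """Divide each raw feature component by its weighting coefficient.

    raw:     (x, y, Vx, Vy, l, u, v) for color, or (x, y, Vx, Vy, luminance) for gray.
    weights: one positive coefficient per component (hypothetical values).
    """
    if len(raw) != len(weights):
        raise ValueError("raw and weights must have the same number of dimensions")
    return tuple(value / weight for value, weight in zip(raw, weights))
```

For example, `normalize_feature((10, 20, 2, 4, 50, 5, 10), (10, 10, 2, 2, 50, 5, 5))` yields the point `(1.0, 2.0, 1.0, 2.0, 1.0, 1.0, 2.0)`. Larger weights shrink a dimension and so reduce its influence on distances in the feature space.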

The LSI 11 divides the feature space into one or more partial spaces according to whether a predetermined condition is satisfied within it, for example whether the distance in the feature space is shorter than a predetermined distance (step S305). Specifically, the LSI 11 calculates the spatial distance in the feature space for each corresponding point, that is, for each point corresponding to the coordinate values (x, y) of a pixel, and identifies each set of corresponding points that lie close together in the feature space as one group forming a partial space. Using the coordinate values (x, y) contained in each partial space, the LSI 11 then identifies the corresponding partial region on the screen (step S306).
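Steps S305 and S306 can be sketched as follows. The text does not specify the grouping algorithm, so this sketch uses simple threshold-linked grouping (points chained together by distances no greater than `max_dist`) as a stand-in. The function names and the parallel `pixels` list are assumptions.

```python
import math


def split_feature_space(points, max_dist):
    """Label points so that points linked by steps of at most max_dist share a label."""
    labels = [-1] * len(points)
    groups = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        labels[seed] = groups
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in range(len(points)):
                # Euclidean distance in the normalized feature space.
                if labels[j] == -1 and math.dist(points[i], points[j]) <= max_dist:
                    labels[j] = groups
                    stack.append(j)
        groups += 1
    return labels


def regions_from_labels(pixels, labels):
    """Map each partial-space label back to the on-screen pixels it contains."""
    regions = {}
    for (x, y), label in zip(pixels, labels):
        regions.setdefault(label, []).append((x, y))
    return regions
```

Because every feature point carries its own on-screen coordinates, mapping a partial space back to a partial region only requires collecting the pixels that share a label.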

FIG. 6 schematically shows how the partial regions are divided. As shown in FIG. 6(b), when the feature space is divided into a plurality of partial spaces 61, 61, ..., the coordinate values in each partial space 61 are seven-dimensional. Because these include the two-dimensional coordinate values (x, y) that indicate the on-screen position, the partial spaces 61, 61, ... can be mapped one-to-one onto the on-screen pixel groups 62, 62, ..., as shown in FIG. 6(a). The areas formed by the pixel groups 62, 62, ... can therefore be separated as partial regions.

With the inter-frame pixel motion amount alone, partial regions cannot be divided accurately when the image carries little information, as with images captured by the far-infrared imaging devices 2, 2. A method that groups pixels with similar color information into partial regions, on the other hand, may wrongly assign pixels to the same partial region merely because their luminance values or the like are similar, even though they belong to different objects. In contrast, as shown in FIG. 6(b), one feature space containing both the motion amount and the color information is generated and divided into partial spaces according to the spatial distances within it. Using the two kinds of information in a complementary manner allows the image to be divided into regions with high accuracy even when the motion amount or the color and/or luminance information is unreliable on its own.

FIG. 7 shows an example in which the image is divided into partial regions using only color information and/or luminance information. Because the object is difficult to distinguish from the background, the image is not divided into partial regions in which the object can be detected. FIG. 8 shows the result of applying the image processing of this embodiment to the same image data. As FIG. 8 shows, generating one feature space that contains both the motion amount and the color information, and dividing it into partial spaces according to the spatial distances within it, makes it possible to accurately identify partial regions in which the object can be detected.

For each divided partial region, the LSI 11 determines by a well-known method whether an object such as a pedestrian or a bicycle is present and, depending on the result, transmits warning display data to the display device 4 or an alarm signal to the alarm device 5.

When partial regions are obtained from a multi-dimensional feature space, they may be subdivided too finely, depending on the partial-space division algorithm or the size of the threshold. The LSI 11 therefore calculates the similarity between divided partial regions and integrates the partial regions judged to be similar into a new partial region.

Specifically, for partial regions that are adjacent on the image, the LSI 11 calculates the mean square of the differences in the motion amount (Vx, Vy) and in the L*u*v* signal values (l, u, v), which constitute the color information. The LSI 11 then determines whether the calculated mean square is smaller than a predetermined threshold. If it is, the LSI 11 judges the two partial regions to be highly similar and integrates them.

In this way, excessive subdivision of the partial regions can be prevented, and the partial regions best suited to detecting an object present in the image can be identified.
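The similarity-based integration described above can be sketched as follows. The sketch assumes that each partial region is summarized by one representative vector (Vx, Vy, l, u, v), for example its mean, and that the mean square is taken over those five components; the text does not state either detail. The function names, the input layout, and the union-find bookkeeping are illustrative, and the representative vectors are not recomputed after a merge.

```python
def mean_square_difference(a, b):
    """Mean of the squared component-wise differences of two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def merge_similar_regions(region_features, adjacency, threshold):
    """Integrate adjacent regions whose mean square difference is below
    the threshold.

    region_features: dict region_id -> (vx, vy, l, u, v)  (assumed layout)
    adjacency:       iterable of (region_id_a, region_id_b) pairs
    Returns a dict mapping each region_id to the id of its merged region.
    """
    parent = {r: r for r in region_features}

    def find(r):
        # Follow parent links to the representative, shortening the path.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in adjacency:
        if mean_square_difference(region_features[a], region_features[b]) < threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the smaller id as the representative of the merged region.
                parent[max(ra, rb)] = min(ra, rb)
    return {r: find(r) for r in region_features}
```

For three regions in a row where the first two differ only slightly in motion and lightness and the third differs strongly in color, a threshold of 1.0 merges the first two and leaves the third as its own region.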

Note that integration is not limited to the similarity-based method; separated partial regions may instead be integrated using, for example, morphological processing. The morphological processing uses, for example, Opening.
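For reference, morphological opening on a binary region mask is an erosion followed by a dilation, which removes fragments smaller than the structuring element while largely preserving larger regions. The following minimal sketch assumes a 3x3 square structuring element and a mask stored as a list of rows of 0/1 values; both choices, and the clipping of the window at the image border, are assumptions of this sketch and are not taken from the text.

```python
def _filter_3x3(mask, combine):
    """Apply `combine` (min or max) over the 3x3 neighbourhood of each pixel.

    The window is clipped at the image border, so pixels outside the
    image are simply ignored.
    """
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                mask[yy][xx]
                for yy in range(max(0, y - 1), min(height, y + 2))
                for xx in range(max(0, x - 1), min(width, x + 2))
            ]
            out[y][x] = combine(window)
    return out


def erode(mask):
    """A pixel stays 1 only if every pixel in its neighbourhood is 1."""
    return _filter_3x3(mask, min)


def dilate(mask):
    """A pixel becomes 1 if any pixel in its neighbourhood is 1."""
    return _filter_3x3(mask, max)


def opening(mask):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(mask))
```

Applied to a mask containing a 3x3 block and one isolated pixel, `opening` keeps the block and removes the isolated pixel.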

As described above, according to the present embodiment, a multi-dimensional feature space is generated that takes into account not only the movement information of each pixel between frames but also the color information and/or luminance information of each pixel. Even when the movement information or the color information and/or luminance information is acquired with low reliability, using them in a complementary manner makes it possible to divide the image into regions with high accuracy.

FIG. 1 is a block diagram showing the configuration of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of a far-infrared imaging device used with the image processing apparatus according to the embodiment of the present invention.
FIG. 3 is a flowchart showing the processing procedure of the LSI of the image processing apparatus according to the embodiment of the present invention.
FIG. 4 is a diagram showing an example of a motion amount extraction result of the image processing apparatus according to the embodiment of the present invention.
FIG. 5 is a diagram schematically showing a method of generating the feature space.
FIG. 6 is a diagram schematically showing a method of dividing into partial regions.
FIG. 7 is a diagram showing an example of dividing an image into partial regions using only color information and/or luminance information.
FIG. 8 is a diagram showing the result of applying the image processing according to the present embodiment.

Explanation of symbols

1 Image processing apparatus
2 Far-infrared imaging device
4 Display device
5 Alarm device
6 In-vehicle LAN cable
7 Video cable
8 Cable
11 LSI
12 RAM
13 Image memory
14 Video input unit
15 Video output unit
16 Alarm output unit

Claims

An image processing apparatus that acquires image data captured by an imaging device that images its surroundings in time series in units of frames, and that detects a partial region of the image in which an object exists, the apparatus comprising:
movement information acquisition means for acquiring, for each pixel and based on the image data acquired from the imaging device, movement information regarding a movement amount per unit time and a movement direction on the screen;
color/luminance information acquisition means for acquiring color information and/or luminance information for each pixel based on the image data acquired from the imaging device;
feature space generating means for generating a feature space composed of a plurality of dimensions based on coordinate values on the screen, the movement information, and the color information and/or luminance information; and
image dividing means for dividing the feature space generated by the feature space generating means into partial spaces satisfying a predetermined condition, specifying the coordinate values on the screen corresponding to the divided partial spaces, and dividing the image into a plurality of partial regions.

The image processing apparatus according to claim 1, further comprising:
means for calculating a similarity between adjacent partial regions among the plurality of partial regions divided by the image dividing means;
means for determining whether the calculated similarity is greater than a predetermined value; and
means for integrating the adjacent partial regions into one partial region when the similarity is determined to be greater than the predetermined value.

An image processing method for acquiring image data captured by an imaging device that images its surroundings in time series in units of frames, and for detecting a partial region of the image in which an object exists, the method comprising:
acquiring, for each pixel and based on the image data acquired from the imaging device, movement information regarding a movement amount per unit time and a movement direction on the screen;
acquiring color information and/or luminance information for each pixel based on the image data acquired from the imaging device;
generating a feature space composed of a plurality of dimensions based on coordinate values on the screen, the movement information, and the color information and/or luminance information; and
dividing the generated feature space into partial spaces satisfying a predetermined condition, specifying the coordinate values on the screen corresponding to the divided partial spaces, and dividing the image into a plurality of partial regions.