JP2000078564A

JP2000078564A - System for tracing person on television camera monitor

Info

Publication number: JP2000078564A
Application number: JP10243306A
Authority: JP
Inventors: Kazuhiro Yasuda; 和弘安田
Original assignee: Aiphone Co Ltd
Current assignee: Aiphone Co Ltd
Priority date: 1998-08-28
Filing date: 1998-08-28
Publication date: 2000-03-14

Abstract

PROBLEM TO BE SOLVED: To display an image of a person captured by a television camera in the middle of a television monitor screen with high accuracy. SOLUTION: The television camera of a doorphone slave unit captures a person in at least three consecutive images: the current-frame image Sn, the previous-frame image Sn-1, and the second-previous-frame image Sn-2. The person tracking processing device 22 of the doorphone master unit applies difference processing 34 and binarization processing 35 to the current-frame and previous-frame images, and to the previous-frame and second-previous-frame images, to obtain two difference images S1 and S2. It then applies dilation processing 36, erosion processing 37, and AND processing 39 to the two difference images, followed by labeling processing 40. Person discrimination processing 42 is applied to the largest labeled region, and cutout processing 44 extracts the image around the person based on the center of gravity of that region, so that an image of a prescribed range around the person is displayed in the middle of the television monitor screen with high accuracy.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a television camera monitor person tracking system, and more particularly to a person tracking system that displays, with high accuracy, an image of a person captured by the television camera of a television doorphone as an image of a fixed range centered on the person (the vicinity of the person) in the middle of the television monitor screen.

[0002]

2. Description of the Related Art

An automatic field-of-view adjusting television doorphone has previously been proposed, based on the automatic field-of-view adjustment method and apparatus described in Japanese Examined Patent Publication (Kokoku) No. H7-71288. It adjusts the field of view automatically and with high accuracy by displaying the image of a person captured by a television camera in the middle of the screen of a display device such as a television monitor. In the person tracking algorithm of that publication, the difference information in the image is searched line by line from the top, and the position of the first difference pixel found at or above a threshold is judged to be the position of the face.

[0003] However, the automatic person tracking system based on this conventional algorithm does not work at all when a moving object other than a person is present in the upper part of the screen. In addition, when two or more people are in front of the doorphone camera, the image is cut out only for the taller person, whereas for a doorphone the person nearer the doorphone is considered more important. The conventional method is therefore poorly suited to doorphones.

SUMMARY OF THE INVENTION

[0004] The present invention has been made to solve the above difficulties. Its object is to provide a television camera monitor person tracking system that, in the varied environments where video equipment such as a television doorphone is installed, automatically adjusts the field of view for the image of a person captured by a television camera and displays, with high accuracy, an image of a fixed range centered on the person (the vicinity of the person) in the middle of the television monitor screen.

[0005]

[Means for Solving the Problem]

To achieve this object, the television camera monitor person tracking system according to the present invention operates as follows when an image of a person captured by a television camera is displayed on a television monitor. For at least three consecutive input images, namely the current-frame image, the previous-frame image, and the second-previous-frame image (the frame before the previous one), the absolute value of the difference is taken between the current-frame and previous-frame images and between the previous-frame and second-previous-frame images, and each result is binarized to obtain two difference images. The two difference images are subjected to a logical product (AND) and to dilation and erosion processing, after which labeling processing divides the image into regions. The largest labeled region is then judged as to whether it is a person, and an image of the vicinity of the person is cut out around the center of gravity of that region and displayed on the television monitor.

[0006] With this television camera monitor person tracking system, the television camera of the doorphone slave unit captures the person in at least three consecutive images: the current-frame image, the previous-frame image, and the second-previous-frame image. The person tracking processing device of the doorphone master unit applies difference processing and binarization processing to the current-frame and previous-frame images, and to the previous-frame and second-previous-frame images, to obtain two difference images. After dilation, erosion, and logical product processing of the two difference images, labeling processing is performed, person determination is applied to the largest labeled region, and an image of the vicinity of the person is cut out around the center of gravity of that region. As a result, an image of a fixed range centered on the person (the vicinity of the person) can be displayed with high accuracy in the middle of the television monitor screen.

[0007]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

A preferred embodiment of the television camera monitor person tracking system according to the present invention is described below with reference to the drawings. FIG. 2 is a block diagram showing the configuration of a television doorphone to which an embodiment of the system is applied. The doorphone consists of a doorphone slave unit 1, which is normally installed at the entrance of a dwelling and has image capture and call functions, and a doorphone master unit 2, which is connected to the doorphone slave unit 1 via a transmission line L1, is installed in a room of the dwelling, and has call functions and a function for reproducing the images captured by the doorphone slave unit 1.

[0008] The doorphone slave unit 1 includes a television camera 10, which captures an image combining the person in front of the camera with any moving object other than the person and outputs it as a color image signal with a resolution of 640×480, frame by frame; a slave unit microphone 11 and a slave unit speaker 12, which the person using the doorphone slave unit 1 uses to talk with the person using the doorphone master unit 2; and a multiplexing circuit 13. The transmission line L1 is connected to the multiplexing circuit 13 via a slave unit line connection terminal T1.

[0009] The doorphone master unit 2 includes a multiplexing circuit 20; a video signal control circuit 21, which sequentially outputs the per-frame color image signals captured by the television camera 10 of the doorphone slave unit 1 to a person tracking processing device 22, and which outputs the person coordinates of the image processed by the person tracking processing device 22 (described later) as a video signal after appropriate signal processing (not detailed here); a television monitor 23, which uses the video signal output from the video signal control circuit 21 to display the image of the person captured by the television camera 10 in the middle of the screen; a handset 24, which the person using the doorphone master unit 2 uses to talk with the person using the doorphone slave unit 1; and an audio signal control circuit 25, which applies appropriate signal processing (not detailed here) to the audio signals sent and received by the handset 24. The transmission line L1 is connected to the multiplexing circuit 20 via a master unit line connection terminal T2.

[0010] As shown in the block diagram of FIG. 1, the person tracking processing device 22 includes: an image memory 30, which stores, frame by frame, the multi-frame 640×480 color image signals captured by the television camera 10 of the doorphone slave unit 1; a monochrome conversion processing unit 31, which converts the per-frame color image signals into monochrome image signals; an image reduction processing unit 32, which reduces the resolution of the multi-frame monochrome image signals from the monochrome conversion processing unit 31 to 80×60; an image memory 33, which stores, frame by frame, the reduced monochrome image signals from the image reduction processing unit 32; a difference processing unit 34, which takes the pixel differences of the multi-frame reduced monochrome image signals supplied via the image reduction processing unit 32 and the image memory 33; a binarization processing unit 35, which converts the absolute values of the pixel differences into a binarized difference image signal; a dilation processing unit 36, which performs dilation on the difference image signal from the binarization processing unit 35; an erosion processing unit 37, which likewise performs erosion; an image memory 38, which stores the binarized image of the difference image signal from the erosion processing unit 37; a logical product processing unit 39, which performs a logical product (AND) of the binarized images; a labeling processing unit 40, which divides the connected components of the binarized image from the logical product processing unit 39 into regions; a feature extraction processing unit 41, which extracts the features (region features) of the regions divided by the labeling processing unit 40; a person determination processing unit 42, which determines whether the extracted region features correspond to a person; a centroid extraction processing unit 43, which extracts the center of gravity of the region; and an image cutout processing unit 44, which cuts out an image from an arbitrary position in the image.

[0011] For the television camera monitor person tracking system configured in this way, the signal processing operation of the person tracking processing device shown in the block diagram of FIG. 1, as applied to the television doorphone shown in the block diagram of FIG. 2, is described below with reference to the flowchart of FIG. 3, the schematic diagram of intermediate images in FIG. 4, the processing operation diagram of FIG. 5, the image reduction diagram of FIG. 6, the difference processing diagram of FIG. 7, the dilation and erosion diagram of FIG. 8, and the logical product diagram of FIG. 9.

[0012] As shown in the schematic diagram of FIG. 4, the color image captured by the television camera 10 of the doorphone slave unit 1 in FIG. 2 has a resolution of 640×480, contains RGB luminance information, and combines a color image S101 of the person in front of the television camera 10 with a color image S102 of a moving object other than the person. The television camera 10 outputs this color image (color image signal) sequentially, frame by frame. The multi-frame color image signals are transmitted in sequence to the person tracking processing device 22 via the multiplexing circuit 13, the slave unit line connection terminal T1, the transmission line L1, the master unit line connection terminal T2 of the doorphone master unit 2, the multiplexing circuit 20, and the video signal control circuit 21. Here, the signal processing performed by the person tracking processing device 22 of the doorphone master unit 2 (described later) is assumed to operate on three frames, that is, on at least three consecutive images: the color image signal Sn of the current frame, the color image signal Sn-1 of the previous frame, and the color image signal Sn-2 of the second-previous frame.

[0013] The three frames of color image signals Sn, Sn-1, and Sn-2 input to the person tracking processing device 22 are stored in sequence in the image memory 30 shown in FIG. 1 and are also passed to the monochrome conversion processing unit 31. To reduce noise in the input color image signals Sn, Sn-1, and Sn-2, to fill holes in regions, and to limit the consumption of the image memory 33 described later, the monochrome conversion processing unit 31 calculates the luminance value {(R + 2G + B) / 4} of each pixel and sequentially outputs the converted three frames to the image reduction processing unit 32 as monochrome image signals Sn', Sn-1', and Sn-2' (steps ST1, ST2).
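The luminance formula in paragraph [0013] can be illustrated with a minimal sketch. The function name `to_luminance`, the nested-list frame layout, and the use of integer division are assumptions made for illustration; the patent does not specify data types or rounding.

```python
def to_luminance(rgb_frame):
    """Convert rows of (R, G, B) tuples to luminance with (R + 2G + B) / 4.

    Integer division is an assumption; the source does not state the rounding.
    """
    return [[(r + 2 * g + b) // 4 for (r, g, b) in row] for row in rgb_frame]


# Tiny 2x2 frame: white, black, pure red, pure green.
frame = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 255, 0)],
]
gray = to_luminance(frame)
print(gray)  # [[255, 0], [63, 127]]
```

Green is weighted twice as heavily as red or blue, so a pure green pixel comes out about twice as bright as a pure red one.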

[0014] The image reduction processing unit 32 reduces the three frames of monochrome image signals Sn', Sn-1', and Sn-2' as illustrated in FIG. 6, lowering the resolution from 640×480 to 80×60, that is, to 1/8 in each dimension. The reduced three frames, shown in FIG. 5 as reduced monochrome image signals Sn'', Sn-1'', and Sn-2'', are stored in sequence in the image memory 33 (steps ST3, ST4). The resolution change is performed by taking the average of each 8×8 block as one pixel of the reduced image.
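The 8×8 block averaging described in paragraph [0014] could be sketched as follows. The function name `shrink_by_block_average` and the integer rounding are illustrative assumptions, not details from the patent.

```python
def shrink_by_block_average(image, block=8):
    """Reduce a grayscale image by replacing each block x block tile with its mean.

    Integer division for the mean is an assumption made for this sketch.
    """
    height, width = len(image), len(image[0])
    reduced = []
    for top in range(0, height - block + 1, block):
        row = []
        for left in range(0, width - block + 1, block):
            total = sum(
                image[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            row.append(total // (block * block))
        reduced.append(row)
    return reduced


# Synthetic 640x480 frame whose value is constant inside each 8-pixel-wide column band.
full = [[(x // 8) * 10 for x in range(640)] for _ in range(480)]
small = shrink_by_block_average(full)
print(len(small[0]), len(small))  # 80 60
```

Averaging over a tile also smooths pixel-level noise, which is consistent with the noise-reduction motive the description gives for working at low resolution.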

[0015] From the three frames of reduced monochrome image signals Sn'', Sn-1'', and Sn-2'' stored frame by frame in the image memory 33, the difference processing unit 34 detects pixels that have moved because of a change in the position or shape of the subject between two temporally adjacent images (adjacent frames), as illustrated in FIG. 7. It calculates the pixel difference between the current-frame signal Sn'' and the previous-frame signal Sn-1'', and the pixel difference between the previous-frame signal Sn-1'' and the second-previous-frame signal Sn-2'', and outputs the two resulting difference signals, the current-frame difference signal S11 and the previous-frame difference signal S12, to the binarization processing unit 35 (step ST5).

[0016] To obtain the shape of the moving object from the arrangement of pixel differences in the two input difference signals S11 and S12, the binarization processing unit 35 takes the absolute value of the positive and negative regions of the pixel differences and binarizes the result, as illustrated in FIG. 7. It thereby calculates the two difference image signals shown in FIG. 5, the current-frame difference image signal S1 and the previous-frame difference image signal S2, and outputs them to the dilation processing unit 36 (step ST6).
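Paragraphs [0015] and [0016] describe a signed frame difference followed by an absolute value and a binarization. A minimal sketch, assuming a fixed threshold (the value 20 is hypothetical; the patent gives no number) and hypothetical function names:

```python
def frame_difference(newer, older):
    """Signed per-pixel difference between two grayscale frames of equal size."""
    return [[a - b for a, b in zip(row_new, row_old)]
            for row_new, row_old in zip(newer, older)]


def binarize(diff, threshold):
    """Mark a pixel 1 when the absolute difference reaches the threshold."""
    return [[1 if abs(d) >= threshold else 0 for d in row] for row in diff]


THRESHOLD = 20  # hypothetical value, chosen only for this example

# One-row frames: a bright object moves one pixel to the right per frame.
s_n2 = [[200, 10, 10, 10]]   # second-previous frame
s_n1 = [[10, 200, 10, 10]]   # previous frame
s_n = [[10, 10, 200, 10]]    # current frame

s11 = frame_difference(s_n, s_n1)   # current-frame difference signal
s12 = frame_difference(s_n1, s_n2)  # previous-frame difference signal
s1 = binarize(s11, THRESHOLD)
s2 = binarize(s12, THRESHOLD)
print(s11, s1)  # [[0, -190, 190, 0]] [[0, 1, 1, 0]]
print(s12, s2)  # [[-190, 190, 0, 0]] [[1, 1, 0, 0]]
```

Each binarized image marks both the place the object left and the place it arrived, which is why a single difference image alone overstates the object's extent.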

[0017] The dilation processing unit 36 dilates the two input difference image signals S1 and S2 several times, as illustrated in FIG. 8, each dilation thickening the boundary of the connected components by one layer. The erosion processing unit 37 then erodes them several times to return them to their original size. The results, in which noise is reduced and holes in the difference regions are filled, are the two signals shown in FIG. 5: the current-frame dilated and eroded difference image signal S1' and the previous-frame dilated and eroded difference image signal S2'. These are stored in sequence in the image memory 38 (step ST7).
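Dilation followed by the same number of erosions, as in paragraph [0017], is commonly called morphological closing. The sketch below assumes a 3×3 neighborhood and ignores pixels outside the image; the patent specifies neither, and the iteration count is left as a parameter because the source only says "several times".

```python
def _neighbors(img, y, x):
    """Values in the 3x3 neighborhood of (y, x), clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[ny][nx]
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))]


def dilate(img):
    """A pixel becomes 1 if any neighbor is 1."""
    return [[int(any(_neighbors(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]


def erode(img):
    """A pixel stays 1 only if every in-bounds neighbor is 1."""
    return [[int(all(_neighbors(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]


def close_binary(img, iterations=1):
    """Dilate `iterations` times, then erode the same number of times."""
    for _ in range(iterations):
        img = dilate(img)
    for _ in range(iterations):
        img = erode(img)
    return img


# 7x7 image containing a 3x3 ring with a one-pixel hole at its center.
ring = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        ring[y][x] = 1
ring[3][3] = 0

closed = close_binary(ring, iterations=1)
print(closed[3][3])  # 1
```

After closing, the region has the same 3×3 outline as before, but the hole at its center is filled.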

[0018] The logical product processing unit 39 calculates the logical product (AND) of the current-frame signal S1' and the previous-frame signal S2' read from the image memory 38, as illustrated in FIG. 9. This extracts the accurate moving-object region of the current-frame image, and the resulting logical product image signal S3, shown in FIG. 4 and FIG. 5, is output to the labeling processing unit 40 (step ST8).
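The logical product in paragraph [0018] keeps only pixels marked as moving in both difference images. A minimal sketch with a hypothetical function name and made-up input masks:

```python
def logical_and(mask_a, mask_b):
    """Pixel-wise AND of two binary images of equal size."""
    return [[p & q for p, q in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


# Illustrative masks standing in for S1' and S2'.
s1_closed = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
s2_closed = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
s3 = logical_and(s1_closed, s2_closed)
print(s3)  # [[0, 1, 0, 0], [0, 1, 0, 0]]
```

Pixels that appear in only one of the two masks are discarded, so the result is narrower than either input.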

[0019] The labeling processing unit 40 assigns the same label (number) to all pixels of the input logical product image signal S3 that belong to the same connected component, and different labels to different connected components, in order to extract the largest connected component. It divides the connected components of S3 into regions and extracts the region features S4 shown in FIG. 4 and FIG. 5, which include the region area (the number of pixels in the region) and the shape complexity, calculated as the square of the perimeter divided by the region area. The region features S4 are output to the feature extraction processing unit 41 (step ST9).
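The labeling and the two features named in paragraph [0019] (area, and complexity as perimeter squared over area) can be sketched as below. Several details are assumptions: 8-connectivity, a breadth-first flood fill, and a perimeter measured as the number of pixel edges facing outside the region. The patent does not define any of these.

```python
from collections import deque


def label_regions(binary):
    """Return {label: [(y, x), ...]} for each 8-connected component of 1-pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    regions = {}
    next_label = 1
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions[next_label] = pixels
            next_label += 1
    return regions


def perimeter(pixels):
    """Count pixel edges that face outside the region (assumed definition)."""
    inside = set(pixels)
    return sum((y + dy, x + dx) not in inside
               for y, x in pixels
               for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)))


def complexity(pixels):
    """Perimeter squared divided by area, as stated in the description."""
    return perimeter(pixels) ** 2 / len(pixels)


binary = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
regions = label_regions(binary)
largest = max(regions.values(), key=len)
print(len(regions), len(largest), complexity(largest))  # 2 4 16.0
```

Selecting the region with the most pixels corresponds to the "largest labeled region" that the later steps operate on.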

[0020] As shown in FIG. 5, the feature extraction processing unit 41 extracts the features of the largest labeled region from the input region features S4, such as its complexity, area, and image, and outputs them to the person determination processing unit 42 as a person candidate region S5. Because cutting out the largest labeled region removes small regions that are not people, the image becomes robust against noise. The person determination processing unit 42 determines whether the input person candidate region S5 is the region of a person using four conditions: the area of the region is at or above a threshold of, for example, 400 pixels at a resolution of 80×60; the complexity is at or below a threshold of, for example, 35 (this value is 1 for a small region); the region of the person is present on the bottom line of the image; and the principal axis direction is approximately 90° to the X axis (for example, within −45° to +45°). When all four conditions are satisfied, the person determination processing unit 42 judges the person candidate region S5 to be a person and outputs it to the centroid extraction processing unit 43 as a confirmed person region S6, as shown in FIG. 4 and FIG. 5. When even one of the four conditions is not satisfied, the person determination processing unit 42 outputs the region to the centroid extraction processing unit 43 as an unconfirmed person region S7, shown in FIG. 4 (steps ST11, ST12). The thresholds are to be determined by experiment in a doorphone environment and are made adjustable as appropriate.
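The four-condition test in paragraph [0020] might be sketched as follows. The default thresholds (400 pixels, complexity 35) come from the description; everything else is an assumption for illustration: the function names, computing the principal axis from second-order central moments, and reading the angle condition as "within 45° of vertical". The complexity value is passed in rather than recomputed to keep the example short.

```python
import math


def principal_axis_deg(pixels):
    """Angle of the region's major axis relative to the X axis, in degrees.

    Uses second-order central moments; this method is an assumption, since the
    patent does not say how the principal axis is obtained.
    """
    n = len(pixels)
    cy = sum(y for y, _ in pixels) / n
    cx = sum(x for _, x in pixels) / n
    mu20 = sum((x - cx) ** 2 for _, x in pixels)
    mu02 = sum((y - cy) ** 2 for y, _ in pixels)
    mu11 = sum((x - cx) * (y - cy) for y, x in pixels)
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))


def is_person(pixels, image_height, complexity,
              min_area=400, max_complexity=35.0, max_tilt_deg=45.0):
    """Apply the four conditions; all must hold for the region to count as a person."""
    touches_bottom = any(y == image_height - 1 for y, _ in pixels)
    tilt_from_vertical = 90.0 - abs(principal_axis_deg(pixels))
    return (len(pixels) >= min_area
            and complexity <= max_complexity
            and touches_bottom
            and tilt_from_vertical <= max_tilt_deg)


# Upright 10x45 rectangle reaching the bottom row of an 80x60 image.
tall = [(y, x) for y in range(15, 60) for x in range(30, 40)]
# Same area lying on its side along the bottom rows.
wide = [(y, x) for y in range(50, 60) for x in range(20, 65)]
rect_complexity = 110 ** 2 / 450  # perimeter 2*(10+45), area 450

print(is_person(tall, 60, rect_complexity))  # True
print(is_person(wide, 60, rect_complexity))  # False
```

The two rectangles differ only in orientation, so the example isolates the principal-axis condition; the description notes that the real thresholds are meant to be tuned experimentally.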

[0021] From the input confirmed person region S6, the centroid extraction processing unit 43 extracts the center of gravity, obtained by averaging the X and Y coordinates of the largest labeled region of the image, and outputs it to the image cutout processing unit 44 as the centroid coordinates S8 of the person's face position in the current frame. If the region is judged to be an object other than a person, the centroid extraction processing unit 43 returns the centroid coordinates of the previous frame instead (steps ST13, ST14).
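The centroid calculation and the fall-back to the previous frame described in paragraph [0021] can be sketched as below. The function names and the `(x, y)` return order are assumptions for illustration.

```python
def centroid(pixels):
    """Mean X and Y of a region given as (y, x) pixel coordinates."""
    n = len(pixels)
    return (sum(x for _, x in pixels) / n, sum(y for y, _ in pixels) / n)


def track_centroid(pixels, person_confirmed, previous_centroid):
    """Use the new centroid only for a confirmed person; otherwise keep the old one."""
    if person_confirmed and pixels:
        return centroid(pixels)
    return previous_centroid


region = [(0, 0), (0, 2), (2, 0), (2, 2)]
current = track_centroid(region, True, previous_centroid=(40.0, 30.0))
held = track_centroid(region, False, previous_centroid=(40.0, 30.0))
print(current, held)  # (1.0, 1.0) (40.0, 30.0)
```

Holding the previous coordinates keeps the displayed window steady during frames in which the largest moving region fails the person test.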

[0022] From the input centroid coordinates S8, the image cutout processing unit 44 cuts out the image of the vicinity of the person, that is, an image of a fixed range centered on the center of gravity, and outputs it to the video signal control circuit 21 as person coordinates S9. The video signal control circuit 21 outputs a video signal S10 to the television monitor 23 based on the input person coordinates S9, so that an image of a fixed range centered on the person (the vicinity of the person) is displayed in the middle of the screen of the television monitor 23, as shown in the schematic diagram of FIG. 4.
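One way to picture the cutout in paragraph [0022] is as a fixed-size window positioned around the centroid. The sketch below is illustrative only: the 320×240 window size, the factor-of-8 mapping from 80×60 coordinates back to the 640×480 frame, and the clamping at the frame edges are all assumptions, since the patent only says "a fixed range centered on the center of gravity".

```python
def crop_window(centroid_xy, scale=8, frame_size=(640, 480), window_size=(320, 240)):
    """Return (left, top, width, height) of a window centered on the centroid.

    The centroid is given in reduced-image coordinates and scaled up to the
    full frame; the window is clamped so it never leaves the frame.
    """
    cx, cy = centroid_xy
    frame_w, frame_h = frame_size
    win_w, win_h = window_size
    center_x = int(cx * scale)
    center_y = int(cy * scale)
    left = min(max(center_x - win_w // 2, 0), frame_w - win_w)
    top = min(max(center_y - win_h // 2, 0), frame_h - win_h)
    return left, top, win_w, win_h


centered = crop_window((40, 30))   # person in the middle of the frame
cornered = crop_window((2, 58))    # person near the bottom-left corner
print(centered)  # (160, 120, 320, 240)
print(cornered)  # (0, 240, 320, 240)
```

With clamping, a person near the edge of the frame is no longer exactly centered in the window, which is a trade-off of this particular sketch rather than something the source discusses.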

[0023] A description is omitted here of the operation when a call is established by audio signals exchanged, via the transmission line L1, between the slave unit microphone 11 and slave unit speaker 12 used by the person at the doorphone slave unit 1 in FIG. 2 and the handset 24 used by the person at the doorphone master unit 2. Although the above embodiment shows a television doorphone provided with the person tracking processing device, the invention is not limited to this; it may also be applied to a videophone, where person tracking is possible in the same way. Furthermore, by replacing the person determination processing unit of the person tracking processing device with, for example, an abnormality detection processing unit that detects (determines) security abnormalities, the invention can be applied to abnormality detection as well as to people. Finally, although the above embodiment reduces a 640×480 image to 80×60, the same effect is obtained with other resolutions.

[0024]

[Effects of the Invention]

As is clear from the above description, with the television camera monitor person tracking system of the present invention, a color image consisting of a person and any moving object other than the person is captured by the television camera in at least three consecutive images: the current-frame image, the previous-frame image, and the second-previous-frame image. The person tracking processing device of the doorphone master unit applies difference processing and binarization processing to the current-frame and previous-frame images, and to the previous-frame and second-previous-frame images, to obtain two difference images. After dilation, erosion, and logical product processing of the two difference images, labeling processing is performed, person determination is applied to the largest labeled region, and an image of the vicinity of the person is cut out around the center of gravity of that region. The position of the person is thereby tracked, and an image of a fixed range centered on the person (the vicinity of the person) can be displayed with high accuracy in the middle of the television monitor screen.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the person tracking processing device used in an embodiment of the television camera monitor person tracking system according to the present invention.

FIG. 2 is a block diagram showing the configuration of a television doorphone to which an embodiment of the television camera monitor person tracking system according to the present invention is applied.

FIG. 3 is a flowchart showing the processing operation of the person tracking processing device of FIG. 1.

FIG. 4 is a schematic diagram showing images at intermediate stages of processing in the person tracking processing device of FIG. 1.

FIG. 5 is an explanatory diagram of the processing operation of the person tracking processing device of FIG. 1.

FIG. 6 is an explanatory diagram of image reduction processing, showing the operation of the image reduction processing unit provided in the person tracking processing device of FIG. 1.

FIG. 7 is an explanatory diagram of difference processing, showing the operation of the difference processing unit provided in the person tracking processing device of FIG. 1.

FIG. 8 is an explanatory diagram of dilation and erosion processing, showing the operations of the dilation processing unit and the erosion processing unit provided in the person tracking processing device of FIG. 1.

FIG. 9 is an explanatory diagram of logical product processing, showing the operation of the logical product processing unit provided in the person tracking processing device of FIG. 1.

[Explanation of symbols]

10 ..... television camera
23 ..... television monitor
Sn ..... current-frame image (color image signal)
Sn-1 ..... previous-frame image (color image signal)
Sn-2 ..... second-previous-frame image (color image signal)
S1, S2 ..... two difference images (difference image signals)

Claims

[Claims]

A television camera monitor person tracking system, characterized in that, when an image of a person captured by a television camera (10) is displayed on a television monitor (23): for at least three consecutive images among the input images, namely a current-frame image (Sn), a previous-frame image (Sn-1), and a second-previous-frame image (Sn-2), the absolute value of the difference is taken between the current-frame image and the previous-frame image and between the previous-frame image and the second-previous-frame image, and each is binarized to obtain two difference images (S1, S2); the logical product of the two difference images is taken and dilation processing and erosion processing are performed, after which labeling processing is performed to divide the image into regions; it is determined whether the largest labeled region after the labeling processing is a person; and an image of the vicinity of the person is cut out around the center of gravity of the largest labeled region and displayed on the television monitor.