JP3129595B2

JP3129595B2 - Image processing device

Info

Publication number: JP3129595B2
Application number: JP06042348A
Authority: JP
Inventors: 久好滝井
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1993-12-13
Filing date: 1994-03-14
Publication date: 2001-01-31
Anticipated expiration: 2016-01-31
Also published as: JPH07225841A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to an image processing apparatus that identifies a desired target object in image data, for example an image processing apparatus used in videophones, video conferencing equipment, and the like.

[0002]

[Prior Art] Conventionally, for videophones and video conferencing systems, methods have been developed that give priority to the main object in an image captured by an imaging device such as a CCD (charge-coupled device) camera, for example by enlarging it. Methods have also been developed that detect the main object using object motion in the captured image data or audio information, and raise the resolution of that object so that a sharp image is transmitted. For example, the conventional image processing apparatus described in "Color moving-picture coding apparatus developed" (Journal of the Institute of Electronics, Information and Communication Engineers, December 1991, p. 1356) extracts information on the presence or absence of motion from the image shown on a videophone, treats the moving parts as the person-image region, and compresses and transmits that person image with priority over the background image, so that the person image is displayed sharply on the caller's display monitor.

[0003] Another conventional image processing apparatus is described in IEICE Technical Report IE92-49, "Control of moving-picture coding using sound source position information in TV conferences". There, a plurality of microphones are placed at a conference attended by several people, and the audio signals captured by each microphone are used to estimate the speaker's position from the correlation between the microphone positions and the audio signal levels, thereby detecting the main speaker among the several people.

[0004]

[Problems to Be Solved by the Invention] The above image processing apparatus that transmits the person image preferentially judges the target person merely by the presence or absence of motion, and therefore could not identify the desired target object in the image data. That is, when another person moves behind the target person, when something other than a person is moving in the image, or when several people are in the image, the object treated as the main object ends up being a moving object other than the target person (for example, with several people, the speaker), and the target person could not be compressed and transmitted preferentially.

[0005] Furthermore, an image processing apparatus that identifies the speaker using audio information can only identify the speaker and could not identify the desired target object. In videophone calls, video conferences, and the like, the main target object (the image the caller needs most) is not always the speaker's face. It may be a display frame such as a blackboard on which text, drawings, or charts are written, or it may be some physical object (a product under discussion, a model, and so on). When something other than the speaker becomes the main object in this way, tracking the speaker is actually a drawback. In addition, because audio information is used, a plurality of microphones is always required, and there is the further restriction that the positional relationship between the microphones and the image input device such as a CCD camera must be kept fixed.

[0006] The present invention has been made to solve the above problems, and its object is to provide an image processing apparatus capable of identifying a desired target object.

[0007] Another object of the present invention is to provide an image processing apparatus capable of identifying the position of a speaker.

[0008] Still another object of the present invention is to provide an image processing apparatus capable of measuring the position of a stationary object.

[0009] Still another object of the present invention is to provide an image processing apparatus capable of identifying the position of an elongated moving object.

[0010] Still another object of the present invention is to provide an image processing apparatus capable of identifying each person even when a plurality of persons overlap in the image.

[0011] Still another object of the present invention is to provide an image processing apparatus capable of identifying the position of an object even when no pointing rod or the like is being used to indicate that object.

[0012] Still another object of the present invention is to provide an image processing apparatus capable of identifying the position of an object even when the object is indicated from any direction by a pointing rod or the like.

[0013]

[Means for Solving the Problems] The image processing apparatus according to claim 1 analyzes image data to detect a predetermined object, and includes: difference means for obtaining time-difference data of continuous image data containing a plurality of moving objects; adding means for cumulatively adding the time-difference data; variable binarizing means for binarizing the time-difference data cumulatively added by the adding means, for each of the plurality of moving objects, with a per-object threshold calculated from the cumulatively added time-difference data; and detecting means for detecting the positions of the plurality of moving objects based on the data binarized by the binarizing means.

[0014] In the image processing apparatus according to claim 2, in addition to the configuration of the invention according to claim 1, the image data includes data representing a person, and the apparatus further includes: first specifying means for specifying a first region containing the person's face, based on the image data binarized by the binarizing means and on at least one of data representing the outline of the person's face and data representing the outlines of elements within the face; second specifying means for specifying, within the first region, a second region containing the person's lips; and third specifying means for detecting the amount of lip movement in the second region and detecting the position of the speaker.

[0015] In the image processing apparatus according to claim 3, in addition to the configuration of the invention according to claim 1, the image data includes data representing a stationary object, and the apparatus further includes: differentiating means for differentiating the image data to create differential image data; projecting means for projecting the differential image data in a predetermined direction to create projection data; and stationary object detecting means for detecting the position of the stationary object based on the extrema of the projection data.

[0016] In the image processing apparatus according to claim 4, in addition to the configuration of the invention according to claim 1, the image data includes data representing an elongated moving object, and the apparatus further includes: motion region detecting means for detecting, from the image data, a motion region containing the elongated moving object; differentiating means for differentiating the image data to create differential image data; projecting means for projecting, in a predetermined direction, the density data of a region extracted from within the motion region on the basis of the differential image data, to create projection data; and moving object detecting means for detecting the position of the elongated moving object based on the extrema of the projection data.

[0017] The image processing apparatus according to claim 5 further includes, in addition to the configuration of the invention according to claim 1: motion region detecting means for detecting a motion region in the image formed by the continuous image data, using time-difference data of the luminance data of the continuous image data; binarizing means for binarizing the luminance data corresponding to the motion region with a predetermined threshold and outputting binarized data; hue value calculating means for calculating hue values based on the luminance data and color data of the image data; hue region calculating means for comparing the hue values with a predetermined constant and calculating a hue region having a predetermined hue; and object position calculating means for calculating the position of a predetermined object within the hue region based on the hue region and the binarized data.

[0018] The image processing apparatus according to claim 6 further includes, in addition to the configuration of the invention according to claim 1: motion region detecting means for detecting a motion region in the image formed by the continuous image data, using time-difference data of the continuous image data; binarizing means for binarizing the luminance data corresponding to the motion region with a predetermined threshold and outputting binarized data; narrow region calculating means for performing erosion and dilation on the binarized data and calculating the narrow regions of the binarized data from the difference between the original binarized data and the binarized data after erosion and dilation; and elongated region calculating means for obtaining neighboring-region connectivity information for the narrow regions and using that information to calculate elongated regions that are narrow and have an area equal to or greater than a predetermined value.

[0019] The image processing apparatus according to claim 7 further includes, in addition to the configuration of the invention according to claim 1: differentiating means for differentiating the image data and calculating horizontal and vertical differential image data; horizontal/vertical projecting means for projecting, for a region near the center of the image formed from the image data, the horizontal and vertical differential image data onto the vertical axis and the horizontal axis respectively, and outputting horizontal and vertical projection data; determining means for determining, using the horizontal and vertical projection data, whether a predetermined object is present near the center of the image; and extracting means for extracting the image data corresponding to the predetermined object determined by the determining means to be near the center of the image.

[0020] The image processing apparatus according to claim 8 further includes, in addition to the configuration of the invention according to claim 1: hue value calculating means for calculating horizontal and vertical hue values from the color data of the image data; horizontal/vertical hue projecting means for projecting, for a region near the center of the image formed from the image data, the horizontal and vertical hue values onto the horizontal axis and the vertical axis respectively, and outputting horizontal and vertical hue projection data; determining means for determining, using the horizontal and vertical hue projection data, whether a predetermined object is present near the center of the image; and extracting means for extracting the image data corresponding to the predetermined object determined by the determining means to be near the center of the image.

[0021]

[Operation] In the image processing apparatus according to claim 1, the time-difference data is binarized with a predetermined threshold for each of the plurality of moving objects, so the positions of the plurality of moving objects contained in the image data can be detected.

[0022] In the image processing apparatus according to claim 2, the amount of movement of a person's lips is detected from the image data, so the position of the speaker can be detected.

[0023] In the image processing apparatus according to claim 3, the image data is differentiated and projected in a predetermined direction, so the edges of a stationary object can be identified and the position of the stationary object can be detected.
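
The differentiate-then-project idea of paragraph [0023] can be illustrated with a small Python sketch. The specification does not name a differentiation operator or say how many extrema are used, so the adjacent-pixel difference, the function names `horizontal_derivative` and `edge_columns`, and the `count` parameter are all assumptions made for illustration.

```python
def horizontal_derivative(image):
    """Absolute difference between horizontally adjacent pixels (responds to vertical edges)."""
    return [[abs(row[x + 1] - row[x]) for x in range(len(row) - 1)] for row in image]


def edge_columns(image, count=2):
    """X positions of the `count` largest values of the column-wise projection (assumed rule)."""
    derivative = horizontal_derivative(image)
    # Project the differential image onto the X axis by summing each column.
    projection = [sum(column) for column in zip(*derivative)]
    ranked = sorted(range(len(projection)), key=lambda x: projection[x], reverse=True)
    return sorted(ranked[:count])


# A bright rectangle spanning columns 2..4 on a dark background.
image = [[10, 10, 200, 200, 200, 10, 10] for _ in range(3)]
edges = edge_columns(image)
```

In this toy image the projection peaks at indices 1 and 4, that is, at the transitions between columns 1 and 2 and between columns 4 and 5, which are the left and right edges of the rectangle.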

[0024] In the image processing apparatus according to claim 4, the extrema of the projection data are obtained within the motion region containing the elongated moving object, so the position of the elongated moving object can be detected.

[0025] In the image processing apparatus according to claim 5, a motion region in the image is detected using the time-difference data of the luminance data, and the luminance data corresponding to that motion region is binarized with a predetermined threshold, so the positions of slow-moving objects and of a plurality of moving objects can be identified. In addition, a hue region having a predetermined hue value is detected, so the part of the motion region that has a predetermined color can be identified.

[0026] In the image processing apparatus according to claim 6, a motion region in the image is detected using the time-difference data of the continuous image data, and the luminance data corresponding to the motion region is binarized with a predetermined threshold, which yields binarized data representing a plurality of motion regions. Erosion and dilation are applied to this binarized data to calculate its narrow regions, neighboring-region connectivity information is obtained for the calculated narrow regions, and this connectivity information is used to identify the position of elongated regions that are narrow and have an area equal to or greater than a predetermined value.
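
The narrow-region step of paragraph [0026] (erode, dilate, then take the difference from the original) can be sketched in Python as follows. The 3x3 neighbourhood, the treatment of pixels outside the image as 0, and the function names are assumptions; the specification does not state the structuring element. The connectivity and area checks that follow in the specification are not covered here.

```python
def _morph(binary, keep):
    """Apply a 3x3 neighbourhood rule; pixels outside the image count as 0."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            neighbours = [binary[j][i] if 0 <= j < h and 0 <= i < w else 0
                          for j in (y - 1, y, y + 1) for i in (x - 1, x, x + 1)]
            out[y][x] = 1 if keep(neighbours) else 0
    return out


def erode(binary):
    """A pixel survives only if its whole 3x3 neighbourhood is set."""
    return _morph(binary, all)


def dilate(binary):
    """A pixel is set if any pixel in its 3x3 neighbourhood is set."""
    return _morph(binary, any)


def narrow_regions(binary):
    """Pixels lost by erosion followed by dilation, i.e. parts thinner than the 3x3 element."""
    opened = dilate(erode(binary))
    return [[b - o for b, o in zip(b_row, o_row)]
            for b_row, o_row in zip(binary, opened)]


# A 3x3 blob with a one-pixel-wide bar attached on its right side.
blob_with_bar = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
narrow = narrow_regions(blob_with_bar)
```

Here the blob survives erosion followed by dilation, while the thin bar does not, so only the bar remains in `narrow`. This is the property that lets a thin object such as a pointing rod be separated from wider moving regions.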

[0027] In the image processing apparatus according to claim 7, the differential image data is projected in the horizontal and vertical directions, edge components in the image are detected from the horizontal and vertical projection data, and whether a predetermined object is present near the center of the image can be determined from these edge components.

[0028] In the image processing apparatus according to claim 8, the hue values are projected in the horizontal and vertical directions, and whether a predetermined object is present near the center of the image can be determined from the differences in the amount of hue.

[0029]

[Embodiments] An image processing apparatus according to a first embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the image processing apparatus of the first embodiment.

[0030] In FIG. 1, the image processing apparatus includes an A/D converter 1, an image memory 2, an absolute difference unit 3, a comparison unit 4, a threshold determination unit 5, an X-axis density projection unit 6, continuous-region peak detection units 7 and 10, continuous-region independent comparison units 8 and 11, a time-direction accumulation unit 9, an object region width extraction unit 12, a calculation object selection unit 13, a Y-axis density projection unit 14, a maximum Y-axis projection coordinate extraction unit 15, a face width extraction unit 16, a facial feature extraction determination unit 17, a speaker determination unit 18, an extraction coordinate determination unit 19, an edge detection unit 20, a display frame detection unit 21, a pointing rod detection unit 22, and an extraction-target priority designation unit 23.

[0031] The operation of the above image processing apparatus is described below. First, a video signal captured by an imaging device such as a CCD camera is input to the A/D converter 1, converted from an analog signal to a digital signal, and output to the image memory 2, the absolute difference unit 3, the threshold determination unit 5, and the edge detection unit 20.

[0032] The image memory 2 stores the digitized video signal and, after k frame periods have elapsed, outputs it to the absolute difference unit 3. Here, k is a value that changes with the situation. For example, a small value is used when the moving object in the video signal moves a large amount, and a large value is used when it moves only a little.

[0033] The absolute difference unit 3 takes the absolute difference between the image data output from the A/D converter 1 and the image data delayed by k frame periods in the image memory 2, and outputs the resulting time-difference data to the comparison unit 4.
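
As a rough illustration of paragraphs [0032] and [0033], the k-frame delay and the absolute difference can be sketched in Python as follows. The names `FrameDelay` and `abs_difference`, and the representation of a frame as a list of rows of integer luminance values, are assumptions made for this sketch and are not taken from the specification.

```python
from collections import deque


class FrameDelay:
    """Hypothetical stand-in for image memory 2: hands back the frame stored k frames earlier."""

    def __init__(self, k):
        self.k = k
        self.frames = deque(maxlen=k + 1)  # the oldest frame drops out automatically

    def push(self, frame):
        """Store a frame; return the frame from k frames ago, or None until k frames have elapsed."""
        self.frames.append(frame)
        if len(self.frames) <= self.k:
            return None
        return self.frames[0]


def abs_difference(current, delayed):
    """Pixel-wise absolute difference of two equally sized frames (absolute difference unit 3)."""
    return [[abs(c - d) for c, d in zip(cur_row, old_row)]
            for cur_row, old_row in zip(current, delayed)]


delay = FrameDelay(k=2)
f0 = [[10, 10], [10, 10]]
f1 = [[10, 20], [10, 10]]
f2 = [[10, 50], [5, 10]]
first = delay.push(f0)   # None: nothing to compare against yet
second = delay.push(f1)  # None
old = delay.push(f2)     # f0, the frame from two frames earlier
diff = abs_difference(f2, old)
```

A larger `k` compares frames that are further apart in time, which is how a slowly moving object can still produce a visible difference.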

[0034] Meanwhile, the threshold determination unit 5 determines a predetermined threshold according to the image data output from the A/D converter 1 and outputs that threshold data to the comparison unit 4.

[0035] The comparison unit 4 compares the time-difference data output from the absolute difference unit 3 with the threshold data output from the threshold determination unit 5 to create a binary difference signal, and outputs it to the X-axis density projection unit 6, the Y-axis density projection unit 14, the face width extraction unit 16, the display frame detection unit 21, and the pointing rod detection unit 22.
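
The comparison described in paragraphs [0034] and [0035] amounts to thresholding the time-difference data. The sketch below shows one way this might look in Python. The specification says only that the threshold depends on the image data, so the rule in `choose_threshold` (a fraction of the mean luminance with a lower bound) is purely a hypothetical placeholder, as are both function names and the choice of a strict "greater than" comparison.

```python
def choose_threshold(frame, fraction=0.1, floor=8):
    """Hypothetical rule: a fraction of the mean luminance, never below `floor`."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return max(floor, fraction * mean)


def binarize_difference(diff, threshold):
    """Return 1 where the time-difference value exceeds the threshold, else 0."""
    return [[1 if d > threshold else 0 for d in row] for row in diff]


frame = [[100, 100], [100, 100]]
diff = [[0, 40], [5, 11]]
threshold = choose_threshold(frame)
binary = binarize_difference(diff, threshold)
```

With a mean luminance of 100 the assumed rule gives a threshold of about 10, so only the differences of 40 and 11 are marked as motion.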

[0036] The X-axis density projection unit 6 projects the binary difference signal output from the comparison unit 4 onto the X axis (the horizontal direction of the captured image), that is, it cumulatively adds the values that share the same X coordinate, to obtain density projection data. FIG. 2 illustrates the processing from density projection through object region width extraction. When the image contains a plurality of moving objects, as shown in FIG. 2(a), the density projection data appears as a plurality of continuous regions, as shown in FIG. 2(b). The local maximum of the density projection data in each continuous region varies with the size of the moving object in the image. Therefore, as described later, binarizing the density projection data according to each local maximum makes it possible to detect the positions of the plurality of moving objects regardless of their size. The density projection data created by the X-axis density projection unit 6 is output to the continuous-region peak detection unit 7 and the continuous-region independent comparison unit 8.
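
The density projection of paragraph [0036] is a column-wise sum of the binary difference signal. A minimal Python sketch, assuming the signal is held as a list of rows of 0/1 values (the function name `project_onto_x` is hypothetical):

```python
def project_onto_x(binary):
    """Sum each column of a binary image: one value per X coordinate."""
    return [sum(column) for column in zip(*binary)]


# Two moving objects of different sizes: a wide, tall one and a small one.
binary = [
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
]
projection = project_onto_x(binary)
```

The result has two separate runs of nonzero values with different peak heights (3 and 2), which corresponds to the situation drawn in FIG. 2(b).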

[0037] The continuous-region peak detection unit 7 multiplies each local maximum of the density projection data output from the X-axis density projection unit 6 (peak1 and peak2 in the case shown in FIG. 2(b)) by 1/n, where n is a predetermined number, and outputs the resulting values to the continuous-region independent comparison unit 8.

[0038] The continuous-region independent comparison unit 8 compares the values output from the continuous-region peak detection unit 7 (each local maximum multiplied by 1/n) with the density projection data output from the X-axis density projection unit 6, and binarizes the density projection data. The binarized density projection data is as shown in FIG. 2(c). In this processing, each of the continuous regions is binarized with a threshold that depends on its own local maximum, so the positions of a plurality of moving objects can be detected regardless of how large or small the motion of each object is.
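
The per-region thresholding of paragraphs [0037] and [0038] can be sketched as follows. This is only an illustration: the function names are hypothetical, a continuous region is assumed to be a run of nonzero projection values, and the use of "greater than or equal" in the comparison is an assumption, since the specification does not say how ties are handled.

```python
def continuous_regions(projection):
    """Return (start, end) index pairs, end inclusive, of each run of nonzero values."""
    regions, start = [], None
    for x, value in enumerate(projection):
        if value > 0 and start is None:
            start = x
        elif value == 0 and start is not None:
            regions.append((start, x - 1))
            start = None
    if start is not None:
        regions.append((start, len(projection) - 1))
    return regions


def binarize_per_region(projection, n):
    """Binarize each continuous region against its own peak divided by n."""
    out = [0] * len(projection)
    for start, end in continuous_regions(projection):
        threshold = max(projection[start:end + 1]) / n
        for x in range(start, end + 1):
            out[x] = 1 if projection[x] >= threshold else 0
    return out


projection = [0, 2, 10, 40, 12, 0, 0, 1, 4, 6, 1, 0]
binary = binarize_per_region(projection, n=4)
```

The first region is thresholded at 40/4 = 10 and the second at 6/4 = 1.5. A single global threshold of 10 would have erased the second, smaller region entirely, which is the problem the per-region threshold avoids.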

[0039] The binarized density projection data is input to the time-direction accumulation unit 9 and cumulatively added over a predetermined period, as shown in FIG. 2(d). This cumulative addition has the effect of lengthening the detection time for moving objects whose motion per unit time is small, so the motion region can be detected more appropriately even for slowly moving objects. The density projection data cumulatively added by the time-direction accumulation unit 9 is output to the continuous-region peak detection unit 10 and the continuous-region independent comparison unit 11.
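
The time-direction accumulation of paragraph [0039] can be pictured as an element-wise sum of the binarized projections from several consecutive frames. A minimal sketch, in which the function name and the fixed-length window passed in as a list are assumptions:

```python
def accumulate_over_time(binarized_projections):
    """Add binarized projections element by element over a window of frames."""
    total = [0] * len(binarized_projections[0])
    for projection in binarized_projections:
        for x, value in enumerate(projection):
            total[x] += value
    return total


# Three frames of a slowly moving object: each frame marks only part of it.
window = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
]
accumulated = accumulate_over_time(window)
```

No single frame covers columns 1 to 3, but the accumulated data does, which reflects the stated benefit for objects that move only a little per frame.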

[0040] The continuous-region peak detection unit 10 finds, for each continuous region, the local maximum of the cumulatively added density projection data input from the time-direction accumulation unit 9 (peak3 and peak4 in the case shown in FIG. 2(d)). Each local maximum is multiplied by 1/m, where m is a predetermined value, and output to the continuous-region independent comparison unit 11.

[0041] The continuous-region independent comparison unit 11 compares the values output from the continuous-region peak detection unit 10 (each local maximum multiplied by 1/m) with the density projection data output from the time-direction accumulation unit 9, and binarizes the density projection data as shown in FIG. 2(e). In other words, each continuous region of the density projection data is binarized with a threshold that depends on its own local maximum, so the positions of a plurality of moving objects can be detected regardless of how large or small the motion of each object is. The binarized density projection data is output to the object region width extraction unit 12.

[0042] The object region width extraction unit 12 outputs the X coordinates of each continuous region, (Xa1, Xb1) and (Xa2, Xb2) as shown in FIG. 2(e), to the calculation object selection unit 13. This processing makes it possible to detect the positions of a plurality of moving objects accurately, regardless of how large or small the motion of each object is.
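
Extracting the region widths in paragraph [0042] comes down to reading off the start and end of each run of 1s in the binarized projection. A short sketch, assuming the input is already binarized and that both ends of each interval are inclusive (the function name `runs_of_ones` is hypothetical):

```python
from itertools import groupby


def runs_of_ones(binary):
    """Return (Xa, Xb) pairs, both inclusive, for every run of 1s."""
    spans, x = [], 0
    for value, group in groupby(binary):
        length = len(list(group))
        if value == 1:
            spans.append((x, x + length - 1))
        x += length
    return spans


binary = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0]
spans = runs_of_ones(binary)
```

For this input the two intervals are (2, 4) and (8, 9), playing the role of (Xa1, Xb1) and (Xa2, Xb2) in FIG. 2(e).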

[0043] The calculation object selection unit 13 uses the X coordinate values of each continuous region output from the object region width extraction unit 12 to select a desired moving object from among the plurality of moving objects, and outputs the X coordinate values (Xa, Xb) of that moving object to the Y-axis density projection unit 14 and the face width extraction unit 16.

[0044] The above processing has been described for two moving objects, but the same processing is possible when there are N moving objects (N > 2).

[0045] Next, the processing for determining the speaker is described. FIG. 3 is a diagram for explaining the speaker determination processing.

[0046] First, the Y-axis density projection unit 14 projects the binary difference signal output from the comparison unit 4 onto the Y axis (the vertical direction of the image) within the interval enclosed by the X coordinates (Xa, Xb), that is, for X coordinate values from Xa to Xb, and outputs the density projection data to the maximum Y-axis projection coordinate extraction unit 15.

[0047] As shown in FIG. 3, the maximum Y-axis projection coordinate extraction unit 15 finds, in the density projection data output from the Y-axis density projection unit 14, the topmost position Ytop of the Y coordinates of the maximum vertical length, and outputs it to the face width extraction unit 16, the facial feature extraction determination unit 17, and the speaker determination unit 18. Here, Ytop is the coordinate of the top of the person's head.
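
Paragraphs [0046] and [0047] can be illustrated with the sketch below. The wording "maximum vertical length" is interpreted here as the longest run of consecutive nonzero rows in the Y-axis projection, and row 0 is taken to be the top of the image. Both points are assumptions of this sketch, as are the function names.

```python
def project_onto_y(binary, xa, xb):
    """Sum each row over columns xa..xb inclusive: one value per Y coordinate."""
    return [sum(row[xa:xb + 1]) for row in binary]


def find_ytop(y_projection):
    """Top row of the longest run of nonzero rows; None if every row is zero."""
    best_top, best_len = None, 0
    top = None
    for y, value in enumerate(list(y_projection) + [0]):  # sentinel closes a trailing run
        if value > 0 and top is None:
            top = y
        elif value == 0 and top is not None:
            if y - top > best_len:
                best_top, best_len = top, y - top
            top = None
    return best_top


binary = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],  # isolated noise row
    [0, 0, 0, 0],
    [0, 1, 1, 0],  # the person starts here
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
y_projection = project_onto_y(binary, xa=1, xb=2)
ytop = find_ytop(y_projection)
```

Choosing the longest run rather than the first nonzero row keeps the isolated noise row at Y = 1 from being mistaken for the top of the head.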

[0048] The edge detection unit 20 differentiates the video signal digitized by the A/D converter 1 in the vertical and horizontal directions of the image to obtain vertical differential image data and horizontal differential image data, and outputs them to the face width extraction unit 16, the facial feature extraction determination unit 17, and the speaker determination unit 18.

[0049] The face width extraction unit 16 receives the binary difference signal output from the comparison unit 4, the X coordinates (Xa, Xb) of the moving object output from the calculation object selection unit 13, the top-of-head coordinate Ytop of the person output from the maximum Y-axis projection coordinate extraction unit 15, and the vertical differential image data output from the edge detection unit 20. Using the top-of-head coordinate Ytop as the search start point, the face width extraction unit 16 searches the binary difference signal downward within the interval enclosed by the X coordinates (Xa, Xb) of the moving object and, using the vertical differential image data, obtains the X coordinates (Lcor, Rcor) of the person's face width as shown in FIG. 3. It outputs them to the facial feature extraction determination unit 17, the speaker determination unit 18, the display frame detection unit 21, and the pointing rod detection unit 22.

【００５０】The face feature extraction determination unit 17 receives the top-of-head coordinate Ytop output from the maximum Y-axis projected coordinate extraction unit 15, the X coordinates (Lcor, Rcor) of the face width output from the face width extraction unit 16, and the horizontal and vertical differential image data output from the edge detection unit 20. Based on (Lcor, Rcor) and Ytop, it obtains a head candidate position for the person, examines the horizontal and vertical differential image data at that position to determine whether it is a person's head, and outputs the result to the speaker determination unit 18. By analyzing whether the data contains information corresponding to the vertical lines of the cheeks and the horizontal lines of the eyebrows, eyes, and so on, which are among a person's facial features, the unit can accurately determine whether the candidate is a person's head.

【００５１】Next, the method of identifying the speaker from among the multiple persons found is described. FIG. 4 is a block diagram showing the configuration of the speaker determination unit shown in FIG. 1.

【００５２】In FIG. 4, the speaker determination unit includes a lip region calculation unit 31, a horizontal differential Y-axis density projection unit 32, a projection width cumulative value calculation unit 33, and a lip change amount determination unit 34.

【００５３】When the face feature extraction determination unit 17 determines that the moving object includes a person's head, the following processing is performed. First, the lip region calculation unit 31 uses the input X coordinates (Lcor, Rcor) of the person's head and the top-of-head coordinate Ytop to obtain the vertical positions (MY1, MY2) between which the lips lie, shown in FIG. 3, by the following equations.

【００５４】
Xlen = Rcor - Lcor
mh = Xlen * MH / XL
MY1 = (Ytop + Xlen) - mh / 2
MY2 = (Ytop + Xlen) + mh / 2
where XL is the head reference width and MH is the lip reference height.
Next, for the region bounded by the X coordinates of the head (Lcor, Rcor) and the vertical lip positions (MY1, MY2) (the lip presence region), the horizontal differential Y-axis density projection unit 32 density-projects the horizontal differential image data Xedge detected by the edge detection unit 20 onto the Y axis and cumulatively adds it, and outputs the cumulative data to the projection width cumulative value calculation unit 33.
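The lip-position equations translate directly into code. The sketch below assumes image coordinates with Y increasing downward; the function name and the reference values used in the example call are hypothetical, since the patent does not give numeric values for XL or MH.

```python
def lip_region(lcor, rcor, ytop, xl, mh_ref):
    """Return (MY1, MY2), the vertical band in which the lips are expected.

    xl is the head reference width XL; mh_ref is the lip reference height MH.
    """
    xlen = rcor - lcor           # measured face width
    mh = xlen * mh_ref / xl      # lip height scaled to the measured face width
    center = ytop + xlen         # lips assumed one face width below the head top
    return center - mh / 2, center + mh / 2
```

For example, `lip_region(100, 180, 40, xl=80, mh_ref=20)` places the band between rows 110 and 130 for an 80-pixel-wide face whose top is at row 40.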

【００５５】The projection width cumulative value calculation unit 33 obtains the vertical width Mh of the cumulatively added horizontal differential image data and the area value Sm of the cumulative addition region, and outputs them to the lip change amount determination unit 34.

【００５６】The lip change amount determination unit 34 detects the change over time in the input vertical width Mh and area value Sm to determine whether the person is speaking (whether the lips are moving up and down), and outputs the result to the extraction coordinate determination unit 19. By examining how much the horizontal differential values in the lip presence region change, this processing indirectly measures how far the lips, which contain many horizontal differential values, move up and down. Accordingly, among the multiple persons, the person whose change amount is at or above a predetermined value for the longest time, provided that time is at or above a predetermined duration, is determined to be the speaker.
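The selection rule in the preceding paragraph can be sketched as below. This is an illustrative simplification: it tracks a single per-frame value per person (either Mh or Sm), measures change as the absolute frame-to-frame difference, and counts frames as the unit of time. The function name, the dictionary input, and both thresholds are assumptions.

```python
def pick_speaker(series, min_change, min_frames):
    """Pick the person whose lip measure changes most often.

    series maps a person id to a list of per-frame values (e.g. Mh).
    A frame counts as "changing" when it differs from the previous frame
    by at least min_change. Returns None if nobody reaches min_frames.
    """
    best, best_count = None, 0
    for person, values in series.items():
        count = sum(1 for a, b in zip(values, values[1:]) if abs(b - a) >= min_change)
        if count >= min_frames and count > best_count:
            best, best_count = person, count
    return best
```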

【００５７】Through the above processing, the speaker can be determined from among multiple persons using only the video signal, that is, image data.

【００５８】Next, the operation of the display frame detection unit 21 is described in detail. FIG. 5 is a block diagram showing the configuration of the display frame detection unit shown in FIG. 1. FIG. 6 is a diagram for explaining the operation of the display frame detection unit shown in FIG. 1.

【００５９】In FIG. 5, the display frame detection unit includes a horizontal/vertical differential value density projection unit 51 and a projection maximum value coordinate extraction unit 52.

【００６０】First, the binarized difference signal output from the comparison unit 4 and the horizontal and vertical differential image data output from the edge detection unit 20 are input to the horizontal/vertical differential value density projection unit 51. As shown in FIG. 6, this unit density-projects the horizontal and vertical differential image data onto the Y and X axes, respectively. Because a display frame such as a blackboard is a stationary object, the density projection is performed only on image regions where no binarized difference signal, which indicates motion, is present. In addition, so that the display frame is not lost when a moving object hides it, the unit takes the maximum of the density projection data obtained over a predetermined time.

【００６１】The resulting density projection data is input to the projection maximum value coordinate extraction unit 52, which obtains the coordinates of the two largest local maxima on each of the X and Y axes, (BX1, BX2) and (BY1, BY2). As a result, the coordinates on the image of the display frame, such as a blackboard, are obtained as shown in FIG. 6. The coordinates are output to the extraction coordinate determination unit 19. If, for example, the difference between the two local-maximum coordinates is at or below a predetermined value, the next local maximum (for example, the third) is obtained instead. If the local maximum values are at or below a predetermined value, the image is treated as containing no display frame such as a blackboard.
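One way to read the peak selection above, for a single axis, is sketched below. It assumes the projection is a plain list of numbers, takes the strongest local maximum as the first frame edge, and walks down the remaining maxima until one is far enough away. The function names and both thresholds are hypothetical.

```python
def local_maxima(proj):
    """Indices i where proj rises into i and does not rise after it."""
    padded = [float("-inf")] + list(proj) + [float("-inf")]
    return [i for i in range(len(proj)) if padded[i] < padded[i + 1] >= padded[i + 2]]


def frame_edges(proj, min_value, min_gap):
    """Return the two frame-edge coordinates on one axis, or None.

    Peaks at or below min_value are ignored (no display frame present).
    A second peak closer than or equal to min_gap is skipped in favour
    of the next strongest peak.
    """
    peaks = sorted(local_maxima(proj), key=lambda i: -proj[i])
    if not peaks or proj[peaks[0]] <= min_value:
        return None
    first = peaks[0]
    for p in peaks[1:]:
        if proj[p] <= min_value:
            break
        if abs(p - first) > min_gap:
            return tuple(sorted((first, p)))
    return None
```

Running it once on the X-axis projection and once on the Y-axis projection would yield candidates for (BX1, BX2) and (BY1, BY2).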

【００６２】Next, the pointing stick detection unit 22 shown in FIG. 1 is described in detail. FIG. 7 is a block diagram showing the configuration of the pointing stick detection unit shown in FIG. 1. FIG. 8 is a diagram for explaining the operation of the pointing stick detection unit shown in FIG. 1.

【００６３】In FIG. 7, the pointing stick detection unit includes a horizontal differential value Y-axis density projection unit 71, an indicated Y coordinate extraction unit 72, a horizontal differential value X-axis density projection unit 73, and an indicated X coordinate extraction unit 74.

【００６４】First, the binarized difference signal output from the comparison unit 4 and the horizontal differential image data output from the edge detection unit 20 are input to the horizontal differential value Y-axis density projection unit 71. For image regions where the binarized difference signal is present, this unit density-projects only the horizontal differential image data onto the Y axis. The resulting Y-axis density projection data is output to the indicated Y coordinate extraction unit 72.

【００６５】The indicated Y coordinate extraction unit 72 obtains the Y coordinate PY of the local maximum of the input density projection data. As shown in FIG. 8, this finds the Y coordinate of a horizontally elongated moving object with a large horizontal differential value, that is, the pointing stick. If the local maximum value of the input density projection data is at or below a predetermined value, the image is treated as either containing no pointing stick for indicating an object, or containing a pointing stick that is not being moved.

【００６６】Next, the indicated Y coordinate extraction unit 72 obtains the vertical position coordinates (PY1, PY2) between which the pointing stick lies, shown in FIG. 8, by the following equations.

【００６７】
PY1 = PY - PH / 2
PY2 = PY + PH / 2
where PH is a fixed value.
The resulting vertical position coordinates (PY1, PY2) of the pointing stick are output to the horizontal differential value X-axis density projection unit 73.

【００６８】Next, the vertical position coordinates (PY1, PY2), the binarized difference signal output from the comparison unit 4, and the horizontal differential image data output from the edge detection unit 20 are input to the horizontal differential value X-axis density projection unit 73. This unit density-projects onto the X axis the horizontal differential image data of the region in which the binarized difference signal indicating motion is present and which lies between the vertical position coordinates (PY1, PY2) of the pointing stick, and outputs the density projection data to the indicated X coordinate extraction unit 74. There, as shown in FIG. 8, the region with the greatest horizontal length in the density projection data is found and divided horizontally into left and right regions S1 and S2. When the cumulative values of the density projection data in S1 and S2 are compared, the cumulative value is larger in the region S1 containing the person holding the pointing stick, that is, S1 > S2. The coordinate of the end of the region S2 with the smaller cumulative value is therefore taken as the X coordinate PX of the pointing stick. As a result, the pointing stick coordinates (PX, PY) obtained by the indicated Y coordinate extraction unit 72 and the indicated X coordinate extraction unit 74 are output to the extraction coordinate determination unit 19.
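The band computation and the S1/S2 comparison can be sketched as follows. The sketch assumes the X-axis projection is a list of non-negative numbers, that "maximum horizontal length" means the longest run of non-zero values, and that the run is split at its midpoint. The function names are hypothetical.

```python
def pointer_band(py, ph):
    """Return (PY1, PY2), the vertical band around the stick's Y coordinate."""
    return py - ph / 2, py + ph / 2


def pointer_x(proj):
    """Return the X coordinate of the stick tip, or None if proj is empty of signal.

    The longest non-zero run is split into two halves; the tip is taken as
    the outer end of the half with the smaller cumulative value.
    """
    best = (0, 0)  # (start, end) of the longest run, end exclusive
    start = None
    for x, value in enumerate(list(proj) + [0]):  # sentinel closes a trailing run
        if value > 0 and start is None:
            start = x
        elif value == 0 and start is not None:
            if x - start > best[1] - best[0]:
                best = (start, x)
            start = None
    lo, hi = best
    if hi == lo:
        return None
    mid = (lo + hi) // 2
    left, right = sum(proj[lo:mid]), sum(proj[mid:hi])
    return lo if left < right else hi - 1
```

When the person stands on the left, the left half carries more edge energy and the function returns the right end of the run, and vice versa.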

【００６９】Note that the pointing stick coordinates (PX, PY) may not coincide exactly with the position the stick points to. This is not a particular problem, because it is sufficient here to detect the approximate position indicated by the stick.

【００７０】Next, the coordinates of the speaker among the multiple persons, of the display frame such as a blackboard, and of the position indicated by the pointing stick, obtained by the speaker determination unit 18, the display frame detection unit 21, and the pointing stick detection unit 22 respectively, are input to the extraction coordinate determination unit 19. From these inputs, the extraction coordinate determination unit 19 decides which coordinates are those of the intended target object. The decision follows the usage conditions that the user of the videophone or video conference system containing the image processing apparatus of the present invention specifies in the extraction target priority designation unit 23. For example, the user specifies which target object to display preferentially, whether a display frame such as a blackboard is used, whether the item being explained is indicated with a pointing stick, and whether multiple objects are displayed superimposed. The specified settings are output from the extraction target priority designation unit 23 to the extraction coordinate determination unit 19, which determines the target object accordingly. Processing such as automatically setting the zoom magnification to suit the size of the target object may also be performed. Through the above processing, an extraction region instruction signal designating the region that contains the target object is finally output.

【００７１】In the above embodiment, to identify multiple persons, the approximate horizontal edge coordinates (Xa, Xb) of each person are calculated from multiple independent motion regions. Depending on the imaging angle of the imaging device, such as a CCD camera, however, multiple persons may overlap in the image. If, in that situation, the approximate horizontal position coordinates (Xa, Xb) of an object are obtained from the motion region and the head candidate position is obtained by the processing described above, that candidate position may not correctly represent the person's head, and face recognition applied to it may fail to recognize a face. Even when a face is correctly recognized, only one of the different persons is identified and the others cannot be.

【００７２】Furthermore, to find the position of the object under discussion, the above embodiment uses the position of a blackboard or the position that a pointing stick indicates in the horizontal direction. The pointing stick cannot be detected when none is used for the object, or when it points at the object from a direction other than horizontal. In such cases the position of the object under discussion cannot be obtained. A second embodiment that solves these problems is described in detail below.

【００７３】That is, in the second embodiment, each person can be recognized even when multiple persons overlap in the image. FIG. 9 is a block diagram showing the configuration of the image processing apparatus of the second embodiment of the present invention.

【００７４】Referring to FIG. 9, the image processing apparatus includes an A/D converter 101, a motion area detection unit 102, a binary conversion unit 103, a first face candidate position detection unit 104, a face width extraction unit 108, a face feature extraction determination unit 109, an edge detection unit 110, and a second face candidate position detection unit 111.

【００７５】The operation of this image processing apparatus is described below. First, the luminance information Y of the video signal input from an imaging device such as a CCD camera is input to the A/D converter 101 and converted into a digital luminance signal. The converted digital luminance signal is output to the motion area detection unit 102, the binary conversion unit 103, and the second face candidate position detection unit 111.

【００７６】The motion area detection unit 102 operates in the same way as the image memory 2 and the absolute difference unit 3 shown in FIG. 1. That is, it calculates the inter-frame difference of the input digital luminance signal and outputs the time difference data to the binary conversion unit 103.

【００７７】The binary conversion unit 103 operates in the same way as the threshold determination unit 5 and the comparison unit 4 shown in FIG. 1. That is, it compares a predetermined threshold, determined from the input digital video signal, with the time difference data output from the motion area detection unit 102, binarizes the time difference data, and outputs the binarized difference signal to the first face candidate position detection unit 104, the face width extraction unit 108, and the second face candidate position detection unit 111.
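Taken together, the motion area detection and binary conversion steps amount to frame differencing followed by thresholding. The sketch below assumes two luminance frames given as lists of rows and a threshold supplied by the caller. How the threshold is derived from the input signal is not stated in this passage, so it is simply a parameter here, and the use of `>=` is likewise an assumption.

```python
def binarized_difference(prev, curr, threshold):
    """Return a 0/1 mask marking pixels whose luminance changed by >= threshold."""
    return [
        [1 if abs(c - p) >= threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]
```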

【００７８】The edge detection unit 110 operates in the same way as the edge detection unit 20 shown in FIG. 1. That is, it differentiates the input digital luminance signal in the vertical and horizontal directions of the image to obtain vertical differential image data and horizontal differential image data, and outputs them to the face width extraction unit 108 and the face feature extraction determination unit 109.

【００７９】The first face candidate position detection unit 104 includes a horizontal motion area extraction unit 105, a motion area selection unit 106, and a head-top extraction unit 107. The horizontal motion area extraction unit 105 operates in the same way as the X-axis density projection unit 6, the continuous region peak detection units 7 and 10, the continuous region independent comparison units 8 and 11, the time-direction accumulation unit 9, and the object region width extraction unit 12 shown in FIG. 1. The motion area selection unit 106 operates in the same way as the calculated object selection unit 13 shown in FIG. 1. The head-top extraction unit 107 operates in the same way as the Y-axis density projection unit 14 and the maximum Y-axis projected coordinate extraction unit 15 shown in FIG. 1.

【００８０】Accordingly, the horizontal motion area extraction unit 105 cumulatively adds the input binarized difference signal over time and spatially in the horizontal direction to obtain the horizontal position coordinates of each moving object. From the horizontal position coordinates of the multiple moving objects, the motion area selection unit 106 selects the horizontal position coordinates (Xa, Xb) of the one moving object to be processed in the subsequent steps, and outputs them to the head-top extraction unit 107 and the face width extraction unit 108. The head-top extraction unit 107 cumulatively adds the binarized difference signal in the vertical direction over the region enclosed by the horizontal position coordinates (Xa, Xb) of the moving object (horizontal coordinates from Xa to Xb), and obtains the topmost vertical position Ytop of the cumulative values, that is, the coordinate of the top of the person's head.

【００８１】Next, the second face candidate position detection unit 111 is described in detail. The second face candidate position detection unit 111 includes an A/D converter 112, a hue calculation unit 113, a comparison unit 114, a constant value generation unit 115, and a face hue position calculation unit 116.

【００８２】The color information C of the video signal input from an imaging device such as a CCD camera is input to the A/D converter 112 and converted into a digital color signal.

【００８３】The digital luminance signal and the digital color signal are input to the hue calculation unit 113, which calculates the hue values of the image. The calculated hue values are output to the comparison unit 114.

【００８４】The comparison unit 114 receives a predetermined constant value corresponding to a hue value from the constant value generation unit 115. It compares the input hue values with the constant value and outputs, to the face hue position calculation unit 116, a hue region signal indicating the regions whose hue values correspond to the constant value. For example, to find the position of a human face, the constant value generation unit 115 outputs a constant value corresponding to the skin color of a face as the hue constant. The comparison unit 114 then compares this skin-color constant with the hue values, finds the regions having skin-color hue values, and outputs a hue region signal indicating those regions to the face hue position calculation unit 116.

【００８５】Based on the input hue region signal and the binarized difference signal, the face hue position calculation unit 116 obtains the regions that are both within the target hue region and within the binarized difference signal region, that is, moving skin-color regions. From these, it selects the regions whose area and horizontal length are at or above fixed values, obtains their horizontal position coordinates (Xa′, Xb′) and top-of-head coordinate (Ytop′), and outputs them to the face width extraction unit 108.
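The combination of the hue region signal with the motion mask can be sketched as below. The sketch assumes a per-pixel hue map and a 0/1 motion mask as lists of rows, models the constant-value comparison as a closed hue interval, and handles only a single region (separating several faces would additionally need a connected-component step). The function names and all thresholds are hypothetical.

```python
def moving_skin_mask(hue, motion, hue_lo, hue_hi):
    """Mark pixels that are moving and whose hue lies in [hue_lo, hue_hi]."""
    return [
        [1 if m and hue_lo <= h <= hue_hi else 0 for h, m in zip(hue_row, motion_row)]
        for hue_row, motion_row in zip(hue, motion)
    ]


def region_box(mask, min_area, min_width):
    """Return (Xa', Xb', Ytop') of the marked pixels, or None if too small."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return None
    xs = [x for x, _ in points]
    xa, xb = min(xs), max(xs)
    if len(points) < min_area or xb - xa + 1 < min_width:
        return None
    return xa, xb, min(y for _, y in points)
```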

【００８６】Through the above operation, the horizontal position coordinates and top-of-head coordinate of a region having a specific hue, for example a skin-color hue value, can be obtained, so the face positions of multiple persons can be identified even when the persons overlap. For example, when three persons overlap and there are three skin-color regions, that is, three face regions, three sets of horizontal position coordinates (Xa′, Xb′) and top-of-head coordinates (Ytop′) are output to the face width extraction unit 108.

【００８７】The face width extraction unit 108 receives the binarized difference signal output from the binary conversion unit 103 and the vertical differential image data output from the edge detection unit 110. It also receives the horizontal position coordinates (Xa, Xb) and top-of-head coordinate (Ytop) of the moving object output from the first face candidate position detection unit 104, and the horizontal position coordinates (Xa′, Xb′) and top-of-head coordinate (Ytop′) of the skin-color region output from the second face candidate position detection unit 111. The unit uses the position and top-of-head coordinates from the two detection units selectively; for example, when multiple persons overlap, it gives priority to those from the second face candidate position detection unit 111. Otherwise the face width extraction unit 108 operates in the same way as the face width extraction unit 16 shown in FIG. 1, obtaining the X coordinates (Lcor, Rcor) of the person's face width from the input data and outputting them to the face feature extraction determination unit 109.

【００８８】The face feature extraction determination unit 109 operates in the same way as the face feature extraction determination unit 17 shown in FIG. 1: it determines whether the candidate is a person's head and outputs the result to the speaker determination unit 18 shown in FIG. 1. The blocks from the speaker determination unit 18 onward are the same as in FIG. 1 and operate as in the first embodiment shown in FIG. 1, so their description is omitted.

【００８９】Through the above operation, the second embodiment uses the second face candidate position detection unit 111 to identify a region corresponding to a hue, for example a skin-color region, and output it to the face width extraction unit 108, so the position of each person can be identified even when multiple persons overlap. Next, another configuration example of the pointing stick detection unit 22 shown in FIG. 1 is described. FIG. 10 is a block diagram showing the configuration of the pointing stick detection unit of the image processing apparatus of the third embodiment of the present invention.

【００９０】Referring to FIG. 10, the pointing stick detection unit includes a narrow region calculation unit 130, a region expansion unit 133, a small region removal unit 134, a feature detection unit 135, a pointing stick determination unit 136, and an indicated position detection unit 137. The narrow region calculation unit 130 includes an erosion/dilation unit 131 and a difference unit 132.

[0091] The erosion/dilation unit 131 receives, from the comparison unit 4 shown in FIG. 1, a difference binarized signal indicating the moving regions in the input image. The erosion/dilation unit 131 erodes the input difference binarized signal a predetermined number of times to remove the narrow regions from it, and then dilates the result the same number of times to restore the regions other than the removed ones to their original width. The difference unit 132 receives both the difference binarized signal from which the narrow regions have been removed by the erosion/dilation unit 131 and the original difference binarized signal. The difference unit 132 takes the difference between the two and thereby extracts the narrow regions.
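
The erosion, dilation, and difference steps of paragraph [0091] can be sketched as follows. This is a minimal illustration, not the circuit of FIG. 10: the 3x3 structuring element, the zero padding at the image border, and the names erode, dilate, and narrow_regions are assumptions introduced here.

```python
def _window_test(img, reducer):
    """Apply all() or any() over each pixel's 3x3 neighborhood (outside = 0)."""
    h, w = len(img), len(img[0])
    return [[int(reducer(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]


def erode(img):
    """Binary erosion: a pixel survives only if its whole 3x3 neighborhood is set."""
    return _window_test(img, all)


def dilate(img):
    """Binary dilation: a pixel is set if any pixel in its 3x3 neighborhood is set."""
    return _window_test(img, any)


def narrow_regions(binary, n=1):
    """Erode n times, dilate n times, then keep the pixels that were lost.

    Regions narrower than about 2*n pixels vanish during erosion and are not
    restored by dilation, so the difference from the original isolates them.
    """
    opened = binary
    for _ in range(n):
        opened = erode(opened)
    for _ in range(n):
        opened = dilate(opened)
    return [[int(b and not o) for b, o in zip(b_row, o_row)]
            for b_row, o_row in zip(binary, opened)]
```

With a 5x5 block and a one-pixel-wide line attached to it, narrow_regions returns only the line, which corresponds to the output of the difference unit 132.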

[0092] A signal indicating the extracted narrow regions is input to the region expansion unit 133. The region expansion unit 133 obtains 8-neighbor connectivity information for the extracted regions and, based on that information, merges regions that are adjacent to one another into a single region.

[0093] The small region removal unit 134 extracts, from the regions merged by the region expansion unit 133, only those whose area is equal to or greater than a fixed value. As a result, only regions that are moving, narrow, and of at least a certain area are extracted.
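
The grouping by 8-neighbor connectivity and the subsequent area filter in paragraphs [0092] and [0093] can be illustrated with a small flood-fill labeling routine. The function names label_regions and remove_small and the min_area parameter are hypothetical; the patent only states that regions with an area of at least a fixed value are kept.

```python
from collections import deque


def label_regions(mask):
    """Group set pixels into 8-connected regions; returns a list of (y, x) lists."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, pixels = deque([(y0, x0)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                # Visit all eight neighbors, including diagonals.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append(pixels)
    return regions


def remove_small(regions, min_area):
    """Keep only regions whose pixel count is at least min_area (assumed threshold)."""
    return [r for r in regions if len(r) >= min_area]
```

Because diagonal neighbors count as connected, a thin stick drawn at an angle stays in one region instead of breaking into single pixels.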

[0094] The feature amount detection unit 135 obtains the following feature amounts for the regions obtained by the small region removal unit 134. FIG. 11 is a diagram showing the coordinates of the pointing stick, and FIG. 12 is a diagram showing the feature amounts of the pointing stick shown in FIG. 11. Specifically, the feature amount detection unit 135 obtains the feature amounts of the pointing stick shown in FIGS. 11 and 12, namely the maximum figure length L, the two end points A and B of the maximum figure length, the maximum vertical width W, and the principal axis direction θ, and outputs them to the pointing stick determination unit 136.
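
One way to compute the feature amounts named in paragraph [0094] is sketched below. Since FIGS. 11 and 12 are not reproduced here, the definitions are assumptions: L is taken as the largest distance between any two pixels of the region, A and B as that pixel pair, θ as the direction of the line AB, and W as the extent of the region measured perpendicular to AB. The function name stick_features is hypothetical.

```python
import math


def stick_features(pixels):
    """pixels: list of (x, y) coordinates. Returns (L, A, B, W, theta)."""
    # Maximum figure length: brute-force search for the farthest pixel pair.
    L, A, B = 0.0, pixels[0], pixels[0]
    for i, p in enumerate(pixels):
        for q in pixels[i + 1:]:
            d = math.dist(p, q)
            if d > L:
                L, A, B = d, p, q
    # Principal axis direction, assumed here to be the direction of AB.
    theta = math.atan2(B[1] - A[1], B[0] - A[0])
    # Width: spread of the pixels along the unit normal to AB.
    nx, ny = -math.sin(theta), math.cos(theta)
    offsets = [(x - A[0]) * nx + (y - A[1]) * ny for x, y in pixels]
    W = max(offsets) - min(offsets) + 1.0  # +1 so a one-pixel-wide line has W = 1
    return L, A, B, W, theta
```

The pairwise search is quadratic in the number of pixels, which is acceptable for the small, thin regions left after the area filter; a convex-hull based search would scale better.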

[0095] Based on the input feature amounts, the pointing stick determination unit 136 determines whether the ratio of the maximum figure length L to the maximum vertical width W of the obtained region is equal to or greater than a fixed value, in other words, whether the obtained region is an elongated region such as a pointing stick. If the region is elongated, the pointing stick determination unit 136 examines the area of the difference binarized signal in the regions centered on the two end points (A, B) of the maximum figure length, finds the end point whose surrounding region has the smaller area, that is, the end point B at which no person is holding the stick, and outputs its coordinates as the pointing stick coordinates (PbX, PbY). The coordinates (PbX, PbY) are output to the extraction coordinate determination unit 19 shown in FIG. 1 and processed in the same manner as in the first embodiment. Through the above processing, the pointing stick detection unit shown in FIG. 10 can specify the position of the pointing stick even when the object is pointed at by an elongated object such as a pointing stick from any direction other than the horizontal, and as a result the position of the object can be specified.
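
The two decisions in paragraph [0095], the elongation test and the choice of the free end of the stick, can be sketched as below. The threshold min_ratio, the square window of half-size radius, and the function names are assumptions for illustration; the patent only refers to "a fixed value" and to regions centered on the end points.

```python
def is_elongated(L, W, min_ratio=5.0):
    """True when the length-to-width ratio reaches the (assumed) threshold."""
    return W > 0 and L / W >= min_ratio


def count_motion(mask, center, radius):
    """Count set pixels of the motion mask in a square window around center=(x, y)."""
    cx, cy = center
    h, w = len(mask), len(mask[0])
    return sum(mask[y][x]
               for y in range(max(0, cy - radius), min(h, cy + radius + 1))
               for x in range(max(0, cx - radius), min(w, cx + radius + 1)))


def pointer_tip(mask, A, B, radius=2):
    """Return the end point with less motion around it.

    The end held by a person sits next to the moving hand and arm, so the
    end point with the smaller motion area is taken as the free tip.
    """
    if count_motion(mask, A, radius) <= count_motion(mask, B, radius):
        return A
    return B
```

Here mask is the full difference binarized signal, not just the narrow region, so that the hand holding the stick contributes to the count at its end.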

[0096] Next, an image processing apparatus according to a fourth embodiment of the present invention will be described with reference to the drawings. The fourth embodiment includes, in place of the pointing stick detection unit 22 of the first embodiment shown in FIG. 1, a central object detection unit that determines the object at the center of the screen using horizontal and vertical differential image data. FIG. 13 is a block diagram showing the configuration of the central object detection unit of the image processing apparatus according to the fourth embodiment of the present invention. Apart from this replacement, the fourth embodiment is the same as the first embodiment, so the common description is omitted and only the central object detection unit is described in detail.

[0097] Referring to FIG. 13, the central object detection unit includes a horizontal/vertical projection unit 141 and a central object determination unit 142.

[0098] The horizontal/vertical projection unit 141 receives the horizontal and vertical differential image data output from the edge detection unit 20 shown in FIG. 1. FIG. 15 is a diagram for explaining the operation of the central object detection unit shown in FIG. 13. The horizontal/vertical projection unit 141 obtains the cumulative differential projection values of the differential image data in the horizontal and vertical directions for, for example, the region of the input image shown in FIG. 15 that is bounded by the horizontally central region CX and the region CY below the vertical center. As shown in FIG. 15, the cumulative differential projection values contain a region A whose values are similar in the horizontal and vertical directions, and regions B and C around region A whose values are similar to each other. Here, regions B and C have approximately equal cumulative differential projection values. Region A corresponds to the object near the center of the screen, and regions B and C correspond to a conference table, a desk, or the like. Therefore, by determining whether a region with different differential data exists within a region with similar differential data, such as a conference table or desk, it is possible to determine whether an object is present and where it is.

[0099] Accordingly, the central object determination unit 142 uses the cumulative differential projection values produced by the horizontal/vertical projection unit 141 to determine whether there is a region A whose cumulative differential projection values differ from those of the surrounding regions B and C, in other words, whether a region with different differential data, that is, an object, exists within a region with uniform differential data such as a conference table or desk. The central object determination unit 142 outputs the coordinates (PX, PY) near the center of the region satisfying this condition, that is, region A, to the extraction coordinate determination unit 19 shown in FIG. 1 as the coordinates of the object. Through the above processing, the position of the object can be specified using the horizontal and vertical differential image data even when no pointing stick or the like is used to indicate the object.
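
The projection and determination steps of paragraphs [0098] and [0099] can be sketched as follows. The decision rule is an assumption, since the patent does not give one: the two ends of each projection profile stand in for regions B and C, they must agree within a tolerance tol, and region A is the span whose projection deviates from that flanking level by more than tol. The names project, distinct_span, central_object, and the window tuple are hypothetical.

```python
def project(diff, x0, x1, y0, y1):
    """Cumulative projections of absolute differential values inside a window."""
    cols = [sum(abs(diff[y][x]) for y in range(y0, y1)) for x in range(x0, x1)]
    rows = [sum(abs(diff[y][x]) for x in range(x0, x1)) for y in range(y0, y1)]
    return cols, rows


def distinct_span(profile, tol):
    """Return (first, last) index of the span that differs from the flanks, or None."""
    left, right = profile[0], profile[-1]
    if abs(left - right) > tol:  # regions B and C must look alike
        return None
    base = (left + right) / 2
    idx = [i for i, v in enumerate(profile) if abs(v - base) > tol]
    return (idx[0], idx[-1]) if idx else None


def central_object(diff, window, tol):
    """Return (PX, PY) near the center of region A, or None if no object is found.

    window = (x0, x1, y0, y1) plays the role of the CX/CY search region.
    """
    x0, x1, y0, y1 = window
    cols, rows = project(diff, x0, x1, y0, y1)
    span_x, span_y = distinct_span(cols, tol), distinct_span(rows, tol)
    if span_x is None or span_y is None:
        return None
    return x0 + (span_x[0] + span_x[1]) // 2, y0 + (span_y[0] + span_y[1]) // 2
```

A uniformly textured table yields flat profiles and therefore None, while an object with stronger edges on that table produces a bump in both profiles whose midpoint is reported.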

[0100] Next, an image processing apparatus according to a fifth embodiment of the present invention will be described. In the fifth embodiment, an object is detected by using differences in hue value. FIG. 14 is a block diagram showing the configuration of the central object detection unit of the image processing apparatus according to the fifth embodiment of the present invention. The fifth embodiment is the same as the first embodiment except that a central object detection unit that uses differences in hue is used in place of the pointing stick detection unit 22 of the first embodiment shown in FIG. 1; the common description is therefore omitted, and only the central object detection unit is described in detail.

[0101] Referring to FIG. 14, the horizontal/vertical projection unit 143 receives the hue values output from the hue calculation unit 113 shown in FIG. 9. Like the horizontal/vertical projection unit 141 shown in FIG. 13, the horizontal/vertical projection unit 143 obtains the cumulative hue projection values of the hue values in the horizontal and vertical directions. The central object determination unit 144 operates in the same manner as the central object determination unit 142 shown in FIG. 13: from the cumulative hue projection values, it determines whether a region with a different hue value, that is, an object, exists within a region with a uniform hue value such as a conference table or desk. If an object exists, the position coordinates (PX, PY) near the center of that region are output to the extraction coordinate determination unit 19 shown in FIG. 1 as the position coordinates of the object.
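
The hue-based variant of paragraph [0101] can be illustrated in the same way. The hue formula of the hue calculation unit 113 is not given in this section, so the standard-library conversion colorsys.rgb_to_hsv is used as a stand-in; that substitution, the tolerance tol, and the names hue_image, hue_projections, and odd_span are assumptions.

```python
import colorsys


def hue_image(rgb):
    """Per-pixel hue in [0, 1) from rows of 8-bit (r, g, b) triples."""
    return [[colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[0] for r, g, b in row]
            for row in rgb]


def hue_projections(hue):
    """Cumulative hue projections onto the horizontal and vertical axes."""
    cols = [sum(row[x] for row in hue) for x in range(len(hue[0]))]
    rows = [sum(row) for row in hue]
    return cols, rows


def odd_span(profile, tol):
    """Span whose projection differs from the level at the two ends, or None."""
    base = (profile[0] + profile[-1]) / 2
    idx = [i for i, v in enumerate(profile) if abs(v - base) > tol]
    return (idx[0], idx[-1]) if idx else None
```

Hue is an angular quantity, so a plain sum treats reds near 0 and near 1 as very different; the sketch keeps the simple accumulation described in the text and does not handle that wrap-around.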

[0102] Through the above processing, the position of the object can be specified using differences in hue value even when no pointing stick or the like is used to indicate the object.

[0103] The above embodiments can be combined arbitrarily, in which case the effects of each are obtained.

[0104] Through the above processing, in order to determine the content of the image to be transferred when a videophone or videoconference system is used, the image processing apparatus according to the present invention first obtains the positions of a plurality of persons from the image captured by an imaging device such as a CCD camera, using motion and facial features, and then identifies the person currently speaking among them from lip movement. The apparatus can thereby output an extraction region instruction signal that automatically designates an image centered on the person currently speaking among the plurality of persons in a conference or the like. By using this extraction region instruction signal, an image centered on the speaker can automatically be made the transfer image.

[0105] In addition, the image data captured by an imaging device such as a CCD camera is differentiated, and the positions of a display frame such as a blackboard and of an object such as an elongated pointing stick can be obtained automatically from the differential image data. Thus, in situations other than speaking in a conference or the like, such as when a document or drawing is being explained on a blackboard or when a product under discussion is being explained, an image centered on the display frame being explained or on the position indicated by the pointing stick can automatically be made the transfer image.

[0106] Furthermore, the approximate position of a person is obtained from the image captured by an imaging device such as a CCD camera using the hue information of the face, and the features of the person's facial pattern are then applied at that position, so that the positions of a plurality of persons can be obtained even when they overlap. Moreover, to detect the position of the object under discussion, an elongated object such as a pointing stick can be detected in the moving parts of the image captured by an imaging device such as a CCD camera regardless of the direction in which it points, and each object can be detected by using the difference in differential values or hue values between the object and a conference table, desk, or the like.

[0107] Furthermore, by having the user set in advance, according to the content of the conference, the combination of the selected transfer images, their priority, whether to enlarge the display, and so on, the selection and combination of images and the switching of magnification are performed automatically without the user manually giving detailed instructions for the desired image, and the resulting image can be made the transfer image.

[0108]

[Effects of the Invention] In the image processing apparatus according to claim 1, the positions of a plurality of moving objects can be detected, so a desired object can be specified.

[0109] In the image processing apparatus according to claim 2, the position of the speaker can be detected, so a desired target object can be specified.

[0110] In the image processing apparatus according to claim 3, the position of a stationary object can be detected, so a desired target object can be specified.

[0111] In the image processing apparatus according to claim 4, the position of an elongated moving object can be detected, so a desired target object can be specified.

[0112] In the image processing apparatus according to claim 5, a predetermined hue region is specified using hue values and the position of a predetermined object corresponding to the hue region can be specified, so a desired target object can be specified.

[0113] In the image processing apparatus according to claim 6, an elongated region that is narrow and has an area equal to or greater than a predetermined value can be calculated regardless of the orientation of the object, so a desired target object can be specified.

[0114] In the image processing apparatus according to claim 7, the position of the target object is specified using differential image data, so a desired object can be specified even when no pointing stick or the like is used.

[0115] In the image processing apparatus according to claim 8, the position of the object is specified using differences in hue value, so a desired target object can be specified even when no pointing stick or the like is used.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of an image processing apparatus according to a first embodiment of the present invention.

FIG. 2 is a diagram for explaining the processing from density projection processing to object region width extraction processing.

FIG. 3 is a diagram for explaining speaker determination processing.

FIG. 4 is a block diagram showing the configuration of the speaker determination unit shown in FIG. 1.

FIG. 5 is a block diagram showing the configuration of the display frame detection unit shown in FIG. 1.

FIG. 6 is a diagram for explaining the operation of the display frame detection unit shown in FIG. 1.

FIG. 7 is a block diagram showing the configuration of the pointing stick detection unit shown in FIG. 1.

FIG. 8 is a diagram for explaining the operation of the pointing stick detection unit shown in FIG. 1.

FIG. 9 is a block diagram showing the configuration of an image processing apparatus according to a second embodiment of the present invention.

FIG. 10 is a block diagram showing the configuration of the pointing stick detection unit of an image processing apparatus according to a third embodiment of the present invention.

FIG. 11 is a diagram showing the coordinates of the pointing stick.

FIG. 12 is a diagram showing the feature amounts of the pointing stick shown in FIG. 11.

FIG. 13 is a block diagram showing the configuration of the central object detection unit of an image processing apparatus according to a fourth embodiment of the present invention.

FIG. 14 is a block diagram showing the configuration of the central object detection unit of an image processing apparatus according to a fifth embodiment of the present invention.

FIG. 15 is a diagram for explaining the operation of the central object detection unit shown in FIG. 13.

[Explanation of symbols]

1 A/D converter
2 Image memory
3 Absolute difference unit
4 Comparison unit
5 Threshold determination unit
6 X-axis density projection unit
7, 10 Continuous region peak detection unit
8, 11 Continuous region independent comparison unit
9 Time-direction accumulation unit
12 Object region width extraction unit
13 Calculated object selection unit
14 Y-axis density projection unit
15 Maximum Y-axis projection coordinate extraction unit
16 Face width extraction unit
17 Face feature extraction determination unit
18 Speaker determination unit
19 Extraction coordinate determination unit
20 Edge detection unit
21 Display frame detection unit
22 Pointing stick detection unit
23 Extraction object priority designation unit

Continuation of front page

(56) References: JP-A-2-236787 (JP, A); JP-A-5-205157 (JP, A); JP-A-2-81590 (JP, A); JP-A-62-46384 (JP, A); JP-A-1-314385 (JP, A); JP-A-61-134884 (JP, A); JP-A-62-84391 (JP, A); JP-A-3-182189 (JP, A); JP-A-2-108167 (JP, A); JP-A-5-87631 (JP, A)

(58) Fields searched (Int. Cl.7, DB name): G06T 7/00 - 7/60; G06T 1/00; G06T 5/00

Claims

(57) [Claims]

1. An image processing apparatus that analyzes image data to detect a predetermined object, comprising: means for outputting time difference data of continuous image data including a plurality of moving objects; addition means for cumulatively adding the time difference data; variable binarization means for binarizing the time difference data cumulatively added by the addition means, for each of the plurality of moving objects, with a threshold value calculated for each of the plurality of moving objects based on the cumulatively added time difference data; and detection means for detecting the positions of the plurality of moving objects based on the data binarized by the binarization means.

2. The image processing apparatus according to claim 1, wherein the image data includes data representing a person, and the image processing apparatus further comprises: first specifying means for specifying, from the image data binarized by the binarization means, a first region including the face of the person based on at least one of data representing the contour of the person's face and data representing the contours of elements within the face; second specifying means for specifying, from the first region, a second region including the lips of the person; and third specifying means for detecting the amount of movement of the lips in the second region and detecting the position of the speaker.

3. The image processing apparatus according to claim 1, wherein the image data includes data representing a stationary object, and the image processing apparatus further comprises: differentiating means for differentiating the image data to create differential image data; projection means for projecting the differential image data in a predetermined direction to create projection data; and stationary object detecting means for detecting the position of the stationary object based on an extreme value of the projection data.

4. The image processing apparatus according to claim 1, wherein the image data includes data representing an elongated moving object, and the image processing apparatus further comprises: motion region detecting means for detecting, from the image data, a motion region including the elongated moving object; differentiating means for differentiating the image data to create differential image data; projection means for projecting, in a predetermined direction, density data of a region extracted from the motion region based on the differential image data, to create projection data; and moving object detecting means for detecting the position of the elongated moving object based on an extreme value of the projection data.

5. The image processing apparatus according to claim 1, further comprising: motion region detecting means for detecting a motion region in an image composed of continuous image data using time difference data of luminance data in the continuous image data; binarizing means for binarizing luminance data corresponding to the image data with a predetermined threshold value and outputting binarized data; hue value calculating means for calculating a hue value based on the luminance data and the color data of the image data; hue region calculating means for comparing the hue value with a predetermined constant value and calculating a hue region in which a predetermined hue exists; and target position calculating means for calculating the position of a predetermined object in the hue region based on the hue region and the binarized data.

6. The image processing apparatus according to claim 1, further comprising: motion region detecting means for detecting a motion region in an image composed of the continuous image data using time difference data of the continuous image data; binarizing means for binarizing luminance data corresponding to the motion region with a predetermined threshold value and outputting binarized data; narrow region calculating means for performing erosion and dilation processing on the binarized data and calculating a narrow region of the binarized data based on the difference between the binarized data after erosion and dilation and the original binarized data; and elongated region calculating means for obtaining neighborhood connectivity information for the narrow region and calculating an elongated region that is narrow and has an area equal to or greater than a predetermined value.

7. The image processing apparatus according to claim 1, further comprising: differentiating means for differentiating image data and calculating differential image data in the horizontal and vertical directions; horizontal/vertical projection means for projecting, for a region near the center of an image composed of the image data, the differential image data in the horizontal and vertical directions onto the horizontal axis and the vertical axis, respectively, and outputting horizontal and vertical projection data; determination means for determining, using the horizontal and vertical projection data, whether a predetermined object exists near the center of the image; and extraction means for extracting image data corresponding to the predetermined object determined by the determination means to exist near the center of the image.

8. The image processing apparatus according to claim 1, further comprising: hue value calculating means for calculating hue values in the horizontal and vertical directions from color data of image data; horizontal/vertical hue projection means for projecting the hue values in the horizontal and vertical directions onto the horizontal axis and the vertical axis, respectively, and outputting horizontal and vertical hue projection data; determination means for determining whether a predetermined object exists near the center of the image; and extraction means for extracting image data corresponding to the predetermined object determined by the determination means to exist near the center of the image.