JP2002051316A

JP2002051316A - Image communication terminal

Info

Publication number: JP2002051316A
Application number: JP2001152614A
Authority: JP
Inventors: Kazuyuki Imagawa; 和幸今川; Hideaki Matsuo; 英明松尾; Yuji Takada; 雄二高田; Masabumi Yoshizawa; 正文吉澤; Shogo Hamazaki; 省吾濱崎; Tetsuya Yoshimura; 哲也吉村; Katsuhiro Iwasa; 克博岩佐
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2000-05-22
Filing date: 2001-05-22
Publication date: 2002-02-15

Abstract

PROBLEM TO BE SOLVED: To provide an image communication terminal in which the camera unit side follows the user's position without a large-scale tracking mechanism, so that the user can be imaged at a good position. SOLUTION: The image communication terminal comprises a face extraction unit 7 that extracts the position and size of a face region from an image captured by a camera unit 4, a display unit 3 that displays images to the user, a communication unit 9 that performs bidirectional image communication with the other party's information processing apparatus, and a transmission data processing unit 8 that outputs to the communication unit 9 the image within a rectangular transmission region set movably within the image captured by the camera unit 4. An effective region that moves integrally with the transmission region is set within the image captured by the camera unit 4, and only when the face region deviates from the effective region is the position of the transmission region moved in accordance with the position and size of the face region.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an image communication terminal and, more specifically, to an image communication terminal with which a user photographs himself or herself, or another person nearby, with a camera unit and converses with the other party while transmitting the captured image.

[0002]

[Description of the Related Art] As is well known, image communication terminals with which a user converses while transmitting images to the other party take various forms, such as videophones, video conference systems, and video mail. In any of these forms, for the user to transmit an image of himself or herself, or of another person nearby (hereinafter simply the "user"), to the other party, the camera unit built into or externally connected to the image communication terminal and the user who is the subject must always be in an appropriate positional relationship.

[0003] To maintain this appropriate positional relationship, one conceivable method is to provide the camera unit with a mechanism for moving its optical axis, a zoom mechanism, and the like, so that the camera unit follows the user's movement. With this method, however, the camera unit and the related mechanisms needed for the following operation become large, which prevents the image communication terminal from being made smaller and cheaper. In particular, providing such mechanisms is impractical for image communication terminals in which portability matters, such as mobile terminals and mobile (video) phones.

[0004] Alternatively, the image communication terminal could provide the user with information about the user's position relative to the camera unit, so that the user maintains the appropriate positional relationship by adjusting his or her own position to the camera unit.

[0005] Specifically, as a first technique, part of the screen has conventionally been used to show the user's own image by a picture-in-picture or split-screen method. With this technique, however, a considerable part of the screen is occupied by the user's own image, and as a result the other party's image becomes small and hard to see. As a second technique, the display has conventionally been switched between the user's own image and the other party's image. With this technique, however, the screen is switched frequently, so the user is distracted by the switching and finds it hard to concentrate on the conversation. In addition, with either the first or the second technique, the environment is far removed from that of an ordinary conversation (one held face to face with the other party), and the user cannot help finding it unnatural.

[0006] To address these problems, Japanese Patent Application Laid-Open No. Hei 8-251561 discloses a technique that neither displays the user's own image nor requires a following mechanism for the camera unit. In the technique of this publication, the camera unit photographs the user, the user's position is detected, and it is determined whether the detected position has deviated from the shooting range. Only when it has deviated is the user notified, by one of the following methods.
(1) The other party's image is displayed over almost the entire screen, and on deviation the user is notified by altering the other party's image (for example, by deforming it).
(2) The screen reserves a text display region in addition to the region that shows the other party's image, and on deviation the user is notified by a message to that effect displayed in the text display region.

[0007]

[Problems to Be Solved by the Invention] With either method (1) or (2), however, the user receives no notification unless the user's position deviates from the shooting range. Moreover, a user who uses the terminal in an ordinary manner does not deviate from the shooting range very often. Consequently, most of the time (that is, whenever there is no deviation) the user cannot check his or her position relative to the shooting range.

[0008] Furthermore, with method (1), the other party's image changes suddenly on deviation, so the user may be startled and break off the conversation. With method (2), a fairly large text display region is needed so that the displayed characters (the message) remain legible. The text display region therefore squeezes the image display region, and the other party's image tends to become small and hard to see. In addition, neither method (1) nor method (2) takes any account of the user's size on the screen, so it remains unclear whether the user's distance from the camera unit is appropriate.

[0009] An object of the present invention is therefore to provide an image communication terminal in which the camera unit side follows the user's position without a large-scale following mechanism, so that the user can be photographed at a good position. A further object of the present invention is to provide an image communication terminal that lets the user check at all times how he or she is being captured (the shooting position) while preserving a natural conversation in which the other party is displayed in an easily viewable manner.

[0010]

[Means for Solving the Problems and Effects of the Invention] A first invention is an image communication terminal that transmits an image of a user captured by a camera unit to the other party, comprising: an input unit that accepts input from the user; a camera unit that photographs the user; a face extraction unit that extracts the position and size of the user's face (the face region) from the image captured by the camera unit; a display unit that displays images to the user; a communication unit that communicates at least images with the other party's information processing apparatus; and a transmission data processing unit that outputs to the communication unit the image within a rectangular transmission region that is smaller than the region of the image captured by the camera unit and is set so as to be movable within that region. An effective region that moves integrally with the transmission region is set within the region of the image captured by the camera unit, and when the extracted face region deviates from the effective region, the transmission data processing unit moves the set position of the transmission region in accordance with the position of the face region.

[0011] As described above, according to the first invention, it is determined whether the face region has deviated from the effective region and, if it has, the position of the transmission region is moved in accordance with the position of the face region. The transmission region thus follows the movement of the face region, so a properly framed self-image is transmitted to the other party as long as the user stays roughly in position, without having to worry about how he or she is being captured. Moreover, no large-scale following mechanism such as an optical-axis moving unit or a zoom unit is needed in the camera unit, so the portability of the image communication terminal is not impaired. In addition, the transmission region does not move while the face region remains within the effective region, so the image sent to the other party, in particular the background behind the user, does not shake frequently, which keeps the other party from experiencing motion sickness.
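
The following-by-cropping behavior described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation disclosed in this publication: the `Rect` type, the fixed `margin` that defines the effective region, and the choice to re-center the transmission region on the face are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int  # left edge, in camera-image pixels
    y: int  # top edge, in camera-image pixels
    w: int
    h: int

    def contains(self, other: "Rect") -> bool:
        """True if `other` lies entirely inside this rectangle."""
        return (self.x <= other.x and self.y <= other.y
                and other.x + other.w <= self.x + self.w
                and other.y + other.h <= self.y + self.h)


def effective_region(tx: Rect, margin: int) -> Rect:
    # Assumption: the effective region is the transmission region shrunk by
    # `margin` pixels on every side, so it moves together with it.
    return Rect(tx.x + margin, tx.y + margin, tx.w - 2 * margin, tx.h - 2 * margin)


def clamp(value: int, low: int, high: int) -> int:
    return max(low, min(value, high))


def update_transmission_region(tx: Rect, face: Rect,
                               frame_w: int, frame_h: int, margin: int) -> Rect:
    """Move the transmission region only when the face leaves the effective region."""
    if effective_region(tx, margin).contains(face):
        return tx  # face still inside: keep the crop fixed so the background stays steady
    # Face deviated: re-center the crop on the face (one possible placement rule),
    # keeping the crop inside the camera image.
    face_cx = face.x + face.w // 2
    face_cy = face.y + face.h // 2
    new_x = clamp(face_cx - tx.w // 2, 0, frame_w - tx.w)
    new_y = clamp(face_cy - tx.h // 2, 0, frame_h - tx.h)
    return Rect(new_x, new_y, tx.w, tx.h)
```

Because the crop is updated only on deviation, small head movements inside the effective region leave the transmitted image unchanged, which corresponds to the steadiness effect the text attributes to the effective region.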

[0012] A second invention is dependent on the first invention, wherein the effective region is smaller than the transmission region and is set within the transmission region.

[0013] As described above, according to the second invention, the face region always deviates from the effective region before it deviates from the transmission region, which avoids a situation in which the face region extends outside the transmission region and part of the face is cut off.

[0014] A third invention is dependent on the first invention, wherein, when the extracted face region deviates from the effective region, the transmission data processing unit moves the transmission region so that the face region is positioned at the center of the transmission region.

[0015] A fourth invention is dependent on the first invention, wherein, when the extracted face region deviates from the effective region, the transmission data processing unit moves the transmission region so that the face region is positioned above the center of the transmission region.

[0016] A fifth invention is dependent on the fourth invention, wherein, when the extracted face region deviates from the effective region, the transmission data processing unit moves the transmission region so that the face region is positioned either at the center of the transmission region or above the center, switching between the two in accordance with transmission mode information input from the input unit.

[0017] As described above, according to the third invention, the transmission region is moved so that the face region is positioned at its center, which gives framing suited to a close-up of the face. According to the fourth invention, the transmission region is moved so that the face region is positioned above its center, which gives framing suited to a bust shot. Further, according to the fifth invention, the user can choose between the face close-up and bust-shot framing according to preference.
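
As a rough illustration of the two framing modes, the sketch below places the transmission region so that the face center falls either at the vertical center (face close-up) or above it (bust shot). The enum names, the `bust_percent` value of 30, and the function signature are hypothetical; the text only states that the face is positioned at the center or above the center.

```python
from enum import Enum


class FramingMode(Enum):
    FACE_CLOSE_UP = "face_close_up"  # face at the center of the transmission region
    BUST_SHOT = "bust_shot"          # face above the center, leaving room for the upper body


def place_transmission_region(face_cx, face_cy, tx_w, tx_h,
                              frame_w, frame_h, mode, bust_percent=30):
    """Return the top-left corner (x, y) of the transmission region.

    `bust_percent` is an assumed tuning value: the face center is placed
    that percentage of the region height below the region's top edge.
    """
    percent = 50 if mode is FramingMode.FACE_CLOSE_UP else bust_percent
    x = face_cx - tx_w // 2
    y = face_cy - tx_h * percent // 100
    # Keep the region inside the camera image.
    x = max(0, min(x, frame_w - tx_w))
    y = max(0, min(y, frame_h - tx_h))
    return x, y
```

With a 320 x 240 region, the close-up mode puts the face center 120 pixels below the top edge of the region, while the bust-shot mode puts it 72 pixels below the top edge, so more of the area beneath the face is transmitted.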

[0018] A sixth invention is dependent on the fourth invention, wherein the display unit displays, as a monitor display, the image within the transmission region and the face region in accordance with information input from the input unit, and the user can adjust the position of the transmission region vertically and horizontally by input to the input unit while referring to the monitor display.

[0019] As described above, according to the sixth invention, the user can monitor the image within the transmission region and the face region and adjust the position of the transmission region as appropriate, and can thereby transmit a self-image with any desired framing to the other party.

[0020] A seventh invention is an image communication terminal that transmits an image of a user captured by a camera unit to the other party, comprising: an input unit that accepts input from the user; a camera unit that photographs the user; a face extraction unit that extracts the position and size of the user's face (the face region) from the image captured by the camera unit; a display unit that displays images to the user; a communication unit that communicates at least images with the other party's information processing apparatus; and a transmission data processing unit that outputs to the communication unit the image within a rectangular transmission region that is smaller than the region of the image captured by the camera unit and is set so as to be movable within that region. An effective region that moves integrally with the transmission region is set within the region of the image captured by the camera unit. When the extracted face region deviates from the effective region, the transmission data processing unit moves the set position of the transmission region in accordance with the position of the face region, and, on the basis of the image luminance of the extracted face region, it corrects the image luminance of the transmission region so as to improve the visibility of the face in the image captured by the camera unit and outputs the corrected image to the communication unit.
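
One simple way to correct the luminance of the transmission region from the luminance of the face region is a gain derived from the mean face luminance. The sketch below assumes an 8-bit grayscale image stored as a list of rows; the target value of 128, the `max_gain` limit, and the linear-gain rule itself are assumptions, since this passage does not specify the correction formula.

```python
def mean_luminance(image, region):
    """Mean pixel value inside region = (x, y, w, h) of a 2-D list of 0..255 values."""
    x, y, w, h = region
    total = sum(image[r][c] for r in range(y, y + h) for c in range(x, x + w))
    return total / (w * h)


def correct_transmission_luminance(image, face, tx, target=128, max_gain=4.0):
    """Return the transmission region with a gain chosen from the face luminance.

    The gain maps the mean face luminance to `target` (an assumed value),
    is limited to `max_gain`, and the result is clipped to 255.
    """
    face_mean = mean_luminance(image, face)
    gain = min(max_gain, target / max(face_mean, 1.0))
    x, y, w, h = tx
    return [[min(255, round(image[r][c] * gain)) for c in range(x, x + w)]
            for r in range(y, y + h)]
```

In a backlit scene, where the face is dark against a bright background, the gain is driven by the face rather than by the whole frame, so the face becomes visible even though the background may saturate.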

[0021] An eighth invention is dependent on the seventh invention, wherein the transmission data processing unit corrects the color tone in addition to the image luminance of the transmission region and outputs the result to the communication unit.

[0022] A ninth invention is an image communication terminal that transmits an image of a user captured by a camera unit to the other party, comprising: an input unit that accepts input from the user; a camera unit that photographs the user; a face extraction unit that extracts the position and size of the user's face (the face region) from the image captured by the camera unit; a display unit that displays images to the user; a communication unit that communicates at least images with the other party's information processing apparatus; and a transmission data processing unit that outputs to the communication unit the image within a rectangular transmission region that is smaller than the region of the image captured by the camera unit and is set so as to be movable within that region. An effective region that moves integrally with the transmission region is set within the region of the image captured by the camera unit. When the extracted face region deviates from the effective region, the transmission data processing unit moves the set position of the transmission region in accordance with the position of the face region, and, on the basis of the image luminance of the extracted face region, it sets the exposure level of the camera unit so as to improve the visibility of the face in the image captured by the camera unit.
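
Setting the exposure level from the face luminance can be pictured as a small feedback step applied to each captured frame. The sketch below is hypothetical: the integer exposure scale from 0 to 10, the target of 128, and the tolerance band are assumptions, as the text does not define how the exposure value is chosen.

```python
def next_exposure_level(level, face_mean, target=128, tolerance=16,
                        min_level=0, max_level=10):
    """Step the camera exposure level toward a face luminance near `target`."""
    if face_mean < target - tolerance:
        level += 1  # face too dark (for example, backlit): expose more
    elif face_mean > target + tolerance:
        level -= 1  # face too bright: expose less
    return max(min_level, min(level, max_level))
```

The tolerance band keeps the exposure from oscillating when the face luminance is already close to the target.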

[0023] As described above, according to the seventh to ninth inventions, an image in which the user's face is always visible can be transmitted to the other party even under conditions such as backlighting. The user can therefore converse with the other party using the image communication terminal even outdoors without worrying about the surrounding lighting.

[0024] A tenth invention is an image communication terminal that transmits an image of a user captured by a camera unit to the other party, comprising: a camera unit that photographs the user; a face extraction unit that extracts the position of the user's face from the image captured by the camera unit; a display unit that displays to the user the image received from the other party; a notification control unit that notifies the user, on the basis of the extracted face position, of the position of the user's face in the image captured by the camera unit; and a communication unit that communicates at least images with the other party's information processing apparatus.

[0025] As described above, according to the tenth invention, the user is notified of his or her position in the captured image, and so can carry on the conversation with peace of mind while checking that position even when he or she has not left the screen. Should the user leave the screen, the other party's image does not change suddenly, so the user can calmly return to the correct position by referring to the notification and continue the conversation. Moreover, the camera unit needs no mechanism for following the user, so the image communication terminal can be made lighter and can consume less power. It is therefore well suited to devices in which portability matters, such as mobile (video) phones and mobile terminals.

[0026] An eleventh invention is dependent on the tenth invention, wherein the face extraction unit extracts the size of the face together with the position of the user's face, and the notification control unit notifies the user of the position and size of the user's face in the image captured by the camera unit.

[0027] As described above, according to the eleventh invention, the size of the face region is also extracted and notified, so the user obtains information on both the position and the size of the face region. By referring to this information, the user can keep both the on-screen position and the distance from the camera unit appropriate. The user can also check where on the screen, and at what size, he or she appears without viewing a self-image.
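
To illustrate how a face position and size could be turned into feedback about both the on-screen position and the distance, the sketch below derives normalized offsets and a distance hint from the face rectangle. The dictionary layout and the `min_ratio` and `max_ratio` thresholds are hypothetical and not taken from this publication.

```python
def describe_face(face, frame_w, frame_h, min_ratio=0.2, max_ratio=0.6):
    """Summarize where the face is and whether its size suggests a good distance.

    face = (x, y, w, h) in camera-image pixels. Offsets are normalized to
    -1..1, with 0 meaning the face center coincides with the image center.
    """
    x, y, w, h = face
    cx, cy = x + w / 2, y + h / 2
    dx = (cx - frame_w / 2) / (frame_w / 2)
    dy = (cy - frame_h / 2) / (frame_h / 2)
    size_ratio = h / frame_h
    if size_ratio < min_ratio:
        distance = "too far"   # face appears small
    elif size_ratio > max_ratio:
        distance = "too near"  # face appears large
    else:
        distance = "ok"
    return {"dx": dx, "dy": dy, "size_ratio": size_ratio, "distance": distance}
```

A notification unit could render these values as a small mark whose position follows `dx` and `dy` and whose size follows `size_ratio`, so the user sees the same information without a self-image.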

[0028] A twelfth invention is dependent on the tenth invention, wherein the notification control unit causes the display unit to display a mark indicating either the position alone or the position and size of the extracted face.

[0029] As described above, according to the twelfth invention, the user can concentrate on the conversation as in an ordinary conversation while watching the other party shown on the display unit, and can check his or her own position by referring to a simple mark.

[0030] A thirteenth invention is dependent on the twelfth invention, wherein the mark is displayed on the image received from the other party.

[0031] As described above, according to the thirteenth invention, the mark appears over the other party's image, so no large screen area needs to be reserved just for the mark, and the other party can be displayed larger and more legibly. Moreover, the user does not need to shift his or her gaze to see the mark, and so tires less even in a long conversation.

[0032] A fourteenth invention is dependent on the twelfth invention, wherein the mark is displayed outside the image received from the other party.

[0033] As described above, according to the fourteenth invention, keeping the mark off the other party's image prevents the mark from interfering with that image, so the other party can be seen more clearly.

[0034] A fifteenth invention is dependent on the twelfth invention, wherein the notification control unit gives notification of the extracted face position via a position notification unit provided separately from the display unit.

[0035] As described above, according to the fifteenth invention, providing the position notification unit separately from the display unit allows the entire screen of the display unit to be devoted to the other party's image, so the other party can be displayed larger and more legibly.

[0036] A sixteenth invention is dependent on the tenth invention, wherein the method by which the notification control unit notifies the user can be switched in accordance with an instruction from the user.

[0037] As described above, according to the sixteenth invention, the user can select a preferred notification method.

[0038] A seventeenth invention is dependent on any of the first to sixteenth inventions, wherein the face extraction unit comprises: an edge extraction unit that extracts edge portions (pixels corresponding to, for example, the outline of a person or the contour of a face) from the image captured by the camera unit and generates an image consisting only of the edge portions (an edge image); a template storage unit that stores a template in which a predetermined shape is provided in a plurality of similar but differently sized copies arranged concentrically about a center point; a voting result storage unit that stores, for each shape of each size constituting the template, coordinate positions on the edge image in association with vote counts; a voting unit that moves the center point of the template successively to each pixel position of the edge portions and, at each such pixel position, increases or decreases the vote counts stored in the voting result storage unit for the coordinate positions corresponding to the positions of all pixels forming the shape of each size; and an analysis unit that obtains the position and size of the face contained in the target image on the basis of the vote counts stored in the voting result storage unit.

[0039] As described above, according to the seventeenth invention, the position of the face can be detected at high speed using only a voting process with a light processing load (basically additions only) and an evaluation of its result. Moreover, because a template with a plurality of similar, concentric shapes is used, the process in effect approximates which of these shape sizes the edge portion likely to contain the face is closest to, so the size of the face can also be extracted at high speed. Because the processing load is reduced this much, a face can be extracted in nearly real time even with the processing power of current personal computers. In addition, where in the target image the face regions lie, and how many there are, need not be known before extraction; faces can be detected uniformly across a wide range of target images, which makes the approach highly versatile.
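
The voting scheme above can be sketched in a few lines, assuming circular shapes and a simple "largest vote count wins" analysis. This is an illustrative sketch rather than the disclosed implementation: the function names, the 64-sample circle approximation, and the peak-picking rule are assumptions, and edge extraction is taken as already done.

```python
import math


def circle_offsets(radius, samples=64):
    """Integer pixel offsets approximating a circle of the given radius."""
    points = set()
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        points.add((round(radius * math.cos(angle)), round(radius * math.sin(angle))))
    return sorted(points)


def vote(edge_pixels, width, height, radii):
    """Cast votes into one plane per template size.

    For each edge pixel, the template center is placed on that pixel and
    every pixel on the circle of each size receives one vote.
    """
    planes = {r: [[0] * width for _ in range(height)] for r in radii}
    for r in radii:
        offsets = circle_offsets(r)
        for ex, ey in edge_pixels:
            for dx, dy in offsets:
                x, y = ex + dx, ey + dy
                if 0 <= x < width and 0 <= y < height:
                    planes[r][y][x] += 1  # addition only
    return planes


def best_face(planes):
    """Return ((x, y, radius), votes) for the highest vote count (assumed analysis rule)."""
    best_votes, best = -1, None
    for r, plane in planes.items():
        for y, row in enumerate(plane):
            for x, votes in enumerate(row):
                if votes > best_votes:
                    best_votes, best = votes, (x, y, r)
    return best, best_votes
```

If the edge pixels lie on a circle of radius 8 around some point, the circles drawn around each edge pixel in the radius-8 plane all pass through that point, so its vote count peaks there and identifies both the position and the size.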

[0040] An eighteenth invention is dependent on the seventeenth invention, wherein the predetermined shape is a circle.

[0041] As described above, according to the eighteenth invention, the shapes are circles, so the distance from the center point of the template to every pixel of a shape is always constant, which keeps the accuracy of the voting result high.

[0042] A nineteenth invention is dependent on any of the first to sixteenth inventions, wherein the face extraction unit comprises: a template image processing unit that receives a predetermined template image, obtains edge normal direction vectors of that image, generates evaluation vectors from the edge normal direction vectors, and applies an orthogonal transform to the evaluation vectors; an input image processing unit that receives the image captured by the camera unit, obtains edge normal direction vectors of that image, generates evaluation vectors from the edge normal direction vectors, and applies an orthogonal transform to the evaluation vectors; a product-sum unit that computes the product sum of corresponding spectral data for the orthogonally transformed evaluation vectors generated for the template image and for the captured image; and an inverse orthogonal transform unit that applies an inverse orthogonal transform to the result of the product-sum computation to generate a map of similarity values. The evaluation vectors include components obtained by applying an even-multiple-angle transform to the edge normal direction vectors of the corresponding image, and the formula for the similarity value, the orthogonal transform, and the inverse orthogonal transform are all linear.

【００４３】上記のように、第１９の発明によれば、背
景部分の輝度ばらつきにより、テンプレート画像のエッ
ジ法線方向ベクトルと、カメラ部で撮影された画像（入
力画像）のエッジ法線方向ベクトルとのなす角θの内積
（ｃｏｓθ）の正負が反転する場合でも、類似値に影響
が無く、正当にマッチングを評価できる。As described above, according to the nineteenth invention, even when luminance variation in the background inverts the sign of the inner product (cos θ) for the angle θ between the edge normal direction vector of the template image and the edge normal direction vector of the image captured by the camera unit (input image), the similarity value is unaffected and the matching can be evaluated properly.
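
The robustness to sign inversion described above can be illustrated with a minimal sketch. It assumes that the even multiple is 2 (a double-angle representation), so that an edge normal at angle θ is represented by (cos 2θ, sin 2θ). The function names are hypothetical and do not appear in the source.

```python
import math

def evaluation_vector(theta):
    """Double-angle representation of an edge normal at angle theta (radians).

    Assumption: the even-multiple angle is taken to be 2 * theta.
    """
    return (math.cos(2 * theta), math.sin(2 * theta))

def similarity(theta_template, theta_input):
    """Inner product of two evaluation vectors, equal to cos(2 * angle difference)."""
    t = evaluation_vector(theta_template)
    i = evaluation_vector(theta_input)
    return t[0] * i[0] + t[1] * i[1]

# When the background changes from darker to brighter than the object,
# the edge normal at the same contour point flips by pi.
theta = math.radians(30)
flipped = theta + math.pi

plain_inner_product = math.cos(theta - flipped)  # sign inverts
double_angle_score = similarity(theta, flipped)  # unchanged by the flip
```

With the plain inner product the flipped edge scores as a complete mismatch, whereas the double-angle score is the same as for an unflipped edge, because 2(θ + π) and 2θ denote the same direction.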

【００４４】第２０の発明は、第１９の発明に従属する
発明であって、顔抽出部は、評価ベクトルの表現におい
て、エッジ法線方向ベクトルを極座標表現した場合の角
度に基づいて計算した値を用いることを特徴とする。A twentieth invention is an invention dependent on the nineteenth invention, wherein the face extraction unit uses, in the representation of the evaluation vectors, values calculated from the angle obtained when the edge normal direction vector is expressed in polar coordinates.

【００４５】第２１の発明は、第１〜第１６の発明に従
属する発明であって、顔抽出部は、カメラ部で撮影され
た画像から顔として抽出された位置及び大きさが、真に
顔であるか否かを判定する顔・非顔判定部をさらに備
え、顔と判定した場合にのみ抽出結果を出力することを
特徴とする。A twenty-first invention is an invention dependent on the first to sixteenth inventions, wherein the face extraction unit further comprises a face/non-face determination unit that determines whether the position and size extracted as a face from the image captured by the camera unit truly correspond to a face, and outputs the extraction result only when a face is determined.

【００４６】第２２の発明は、第１７の発明に従属する
発明であって、顔抽出部は、投票結果記憶部に記憶され
ている内容に基づいて、カメラ部で撮影された画像から
顔として抽出された位置及び大きさが、真に顔であるか
否かを判定する顔・非顔判定部をさらに備え、顔と判定
した場合にのみ抽出結果を出力することを特徴とする。A twenty-second invention is an invention dependent on the seventeenth invention, wherein the face extraction unit further comprises a face/non-face determination unit that determines, on the basis of the contents stored in the voting result storage unit, whether the position and size extracted as a face from the image captured by the camera unit truly correspond to a face, and outputs the extraction result only when a face is determined.

【００４７】第２３の発明は、第１９の発明に従属する
発明であって、顔抽出部は、逆直交変換部で生成された
類似値に基づいて、カメラ部で撮影された画像から顔と
して抽出された位置及び大きさが、真に顔であるか否か
を判定する顔・非顔判定部をさらに備え、顔と判定した
場合にのみ抽出結果を出力することを特徴とする。A twenty-third invention is an invention dependent on the nineteenth invention, wherein the face extraction unit further comprises a face/non-face determination unit that determines, on the basis of the similarity values generated by the inverse orthogonal transform unit, whether the position and size extracted as a face from the image captured by the camera unit truly correspond to a face, and outputs the extraction result only when a face is determined.

【００４８】上記のように、第２１〜第２３の発明によ
れば、実際の顔が顔領域の第１候補以外にある場合で
も、安定した顔領域の抽出が可能になる。また、画像中
に顔がない場合でも顔がないと判定することができるの
で、顔の位置を移動して表示する必要がない場合を自動
的に検出することが可能になる。As described above, according to the twenty-first to twenty-third inventions, the face area can be extracted stably even when the actual face is not the first candidate for the face area. Further, when there is no face in the image, it can be determined that there is no face, so cases in which the displayed position of the face does not need to be moved can be detected automatically.

【００４９】第２４の発明は、第２１の発明に従属する
発明であって、顔・非顔判定部は、カメラ部で撮影され
た画像から顔として抽出された領域から得られる画像特
徴を用いて、サポートベクトル関数の判定結果に基づい
て顔・非顔の判定を行うことを特徴とする。A twenty-fourth invention is an invention dependent on the twenty-first invention, wherein the face/non-face determination unit uses image features obtained from the area extracted as a face from the image captured by the camera unit, and performs the face/non-face determination on the basis of the determination result of a support vector function.

【００５０】第２５の発明は、第２４の発明に従属する
発明であって、顔・非顔判定部は、カメラ部で撮影され
た画像から顔として抽出された領域から得られるエッジ
法線方向ベクトルを画像特徴とすることを特徴とする。A twenty-fifth invention is an invention dependent on the twenty-fourth invention, wherein the face/non-face determination unit uses, as the image features, the edge normal direction vectors obtained from the area extracted as a face from the image captured by the camera unit.

【００５１】第２６の発明は、第２４の発明に従属する
発明であって、顔・非顔判定部は、カメラ部で撮影され
た画像から顔として抽出された領域から得られるエッジ
法線のヒストグラムを画像特徴とすることを特徴とす
る。A twenty-sixth invention is an invention dependent on the twenty-fourth invention, wherein the face/non-face determination unit uses, as the image features, a histogram of the edge normals obtained from the area extracted as a face from the image captured by the camera unit.

【００５２】[0052]

【発明の実施の形態】以下、図面を参照しながら、本発
明の各実施形態を説明する。（第１の実施形態）図１は、本発明の第１の実施形態に
係る画像通信端末の構成を示すブロック図である。図１
において、第１の実施形態に係る画像通信端末は、入力
部２と、表示部３と、カメラ部４と、表示制御部５と、
自画像メモリ６と、顔抽出部７と、送信データ処理部８
と、通信部９と、受信データ処理部１０と、相手画像メ
モリ１１とを備える。まず、第１の実施形態に係る画像
通信端末の各構成の概要を説明する。Embodiments of the present invention will be described below with reference to the drawings. (First Embodiment) FIG. 1 is a block diagram showing the configuration of an image communication terminal according to a first embodiment of the present invention. In FIG. 1, the image communication terminal according to the first embodiment includes an input unit 2, a display unit 3, a camera unit 4, a display control unit 5, a self-image memory 6, a face extraction unit 7, a transmission data processing unit 8, a communication unit 9, a reception data processing unit 10, and a partner image memory 11. First, an outline of each component of the image communication terminal according to the first embodiment will be described.

【００５３】図１に示すように、本実施形態の画像通信
端末では、入力部２、表示部３及びカメラ部４が、利用
者１に臨んでいる。入力部２は、キーボード（テンキー
等を含む）やマウス等で構成され、利用者１が送信モー
ド及びその他必要な情報を入力するために利用される。
表示部３は、ＬＣＤ等で構成され、画面上で相手の画像
や表示制御部５の指示に従った目印等を、利用者１に向
けて表示する。目印については後で詳述するが、利用者
１が画面中における自分の顔の位置や大きさを確認でき
る指標である。カメラ部４は、レンズ等の光学系及びＣ
ＣＤ等の電気系で構成され、利用者１を撮影するために
用いられる。このカメラ部４で撮影された画像（以下、
対象画像という）は、フレーム毎に自画像メモリ６に格
納される。表示制御部５は、表示部３の画面表示（主と
して、受信した相手画像の表示）を制御する。また、表
示制御部５は、入力部２から入力される情報に応じて、
顔抽出部７で抽出された顔領域に基づいた目印を、表示
部３の画面上に表示させる。As shown in FIG. 1, in the image communication terminal of the present embodiment, the input unit 2, the display unit 3, and the camera unit 4 face the user 1. The input unit 2 includes a keyboard (including a numeric keypad or the like), a mouse, and the like, and is used by the user 1 to input the transmission mode and other necessary information. The display unit 3 is an LCD or the like and displays to the user 1, on its screen, the image of the other party and marks or the like as instructed by the display control unit 5. The mark, described in detail later, is an indicator that lets the user 1 confirm the position and size of his or her own face on the screen. The camera unit 4 includes an optical system such as a lens and an electrical system such as a CCD, and is used to photograph the user 1. The image captured by the camera unit 4 (hereinafter referred to as the target image) is stored in the self-image memory 6 frame by frame. The display control unit 5 controls the screen display of the display unit 3 (mainly the display of the received partner image). In addition, in accordance with information input from the input unit 2, the display control unit 5 causes a mark based on the face area extracted by the face extraction unit 7 to be displayed on the screen of the display unit 3.

【００５４】顔抽出部７は、自画像メモリ６に格納され
た対象画像に対して、存在する顔の位置及び大きさを調
べ、これらの情報を顔領域として表示制御部５及び送信
データ処理部８へ出力する。なお、この顔抽出部７につ
いては、本発明に適用可能な手法を後で詳細に説明す
る。送信データ処理部８は、顔抽出部７で抽出された顔
領域の位置に合わせて送信領域を設定する。そして、送
信データ処理部８は、入力部２から指示された送信モー
ドに従って、自画像メモリ６に格納された対象画像の
内、送信領域内の画像データを通信部９へ送出する。通
信部９は、通信経路を介して、相手の情報処理装置（画
像通信端末を含む）と、少なくとも画像データの通信を
行う。ここでの通信モードは任意であり、例えば、内線
電話のように基地局等を介さない子機間通信でもよい
し、テレビ電話のような基地局等を介する同期型通信又
は非同期型通信でもよい。受信データ処理部１０は、通
信部９を介して受信した相手の画像データを処理して、
フレーム毎に相手画像メモリ１１へ格納する。The face extraction unit 7 determines the position and size of any face present in the target image stored in the self-image memory 6, and outputs this information as a face area to the display control unit 5 and the transmission data processing unit 8. Methods applicable to the face extraction unit 7 in the present invention will be described in detail later. The transmission data processing unit 8 sets a transmission area in accordance with the position of the face area extracted by the face extraction unit 7. Then, in accordance with the transmission mode specified from the input unit 2, the transmission data processing unit 8 sends to the communication unit 9 the image data within the transmission area of the target image stored in the self-image memory 6. The communication unit 9 communicates at least image data with the other party's information processing device (including an image communication terminal) via a communication path. The communication mode here is arbitrary: for example, it may be handset-to-handset communication that does not pass through a base station or the like, as with an extension telephone, or synchronous or asynchronous communication that passes through a base station or the like, as with a videophone. The reception data processing unit 10 processes the other party's image data received via the communication unit 9 and stores it in the partner image memory 11 frame by frame.

【００５５】なお、本実施形態では、通信部９が双方向
通信を行う場合を一例に挙げて説明するが、利用者１か
ら相手に画像データを単方向通信するビデオメール等に
も本発明を適用することができる。この場合、相手の情
報処理装置は、送信される画像データを受信して画面表
示させる構成のみを持つものであってもよい。In the present embodiment, the case where the communication unit 9 performs two-way communication is described as an example, but the present invention can also be applied to video mail or the like, in which image data is transmitted one way from the user 1 to the other party. In this case, the other party's information processing device may have only a configuration for receiving the transmitted image data and displaying it on a screen.

【００５６】次に、図２〜図６を用いて、送信データ処
理部８が行う顔領域の位置に合わせた追従処理について
説明する。まず、カメラ部４による撮影領域３０と、通
信部９から送信される画像の送信領域３１との関係は、
一般的に図３のようになる。送信領域３１は、撮影領域
３０よりも小さな矩形領域である。カメラ部４は、送信
領域３１より広い撮影領域３０で被写体（利用者１）を
撮影するが、画像通信端末からは送信領域３１内の画像
だけが相手に送信される。図３の例では、撮影領域３０
は、ｘ方向長さＡ、ｙ方向長さＢであり、送信領域３１
は、ｘ方向長さＬ、ｙ方向長さＭである。また、Ｌ＜Ａ
及びＭ＜Ｂであり、各々の長さＡ、Ｂ、Ｌ、Ｍは、固定
的である。図３の例では、送信領域３１の左上点（ｘ
１，ｙ１）を基準点としている。この基準点は、撮影領
域３０内を移動可能であり、基準点が定まることで、送
信領域３１の位置が一意に定まるようにしている。な
お、送信領域３１の左上点以外の点を基準としてもよ
い。Next, the tracking process that the transmission data processing unit 8 performs in accordance with the position of the face area will be described with reference to FIGS. 2 to 6. First, the relationship between the photographing area 30 of the camera unit 4 and the transmission area 31 of the image transmitted from the communication unit 9 is generally as shown in FIG. 3. The transmission area 31 is a rectangular area smaller than the photographing area 30. The camera unit 4 photographs the subject (user 1) in the photographing area 30, which is wider than the transmission area 31, but only the image within the transmission area 31 is transmitted from the image communication terminal to the other party. In the example of FIG. 3, the photographing area 30 has a length A in the x direction and a length B in the y direction, and the transmission area 31 has a length L in the x direction and a length M in the y direction. Here, L < A and M < B, and the lengths A, B, L, and M are fixed. In the example of FIG. 3, the upper left point (x1, y1) of the transmission area 31 is used as the reference point. The reference point is movable within the photographing area 30, and determining the reference point uniquely determines the position of the transmission area 31. A point other than the upper left point of the transmission area 31 may also be used as the reference.
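
The relationship between the photographing area and the transmission area can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Rect` type and the function name are hypothetical, and clamping the reference point so that the whole transmission area stays inside the photographing area is an assumption (the source states only that the reference point is movable within the photographing area).

```python
from dataclasses import dataclass

@dataclass
class Rect:
    x: int       # x coordinate of the upper left point
    y: int       # y coordinate of the upper left point
    width: int
    height: int

def place_transmission_area(x1, y1, A, B, L, M):
    """Return the L-by-M transmission area whose upper left point is (x1, y1),
    clamped so that it lies inside the A-by-B photographing area."""
    if not (L < A and M < B):
        raise ValueError("transmission area must be smaller than photographing area")
    # Assumption: the reference point is limited to [0, A - L] x [0, B - M].
    x = min(max(x1, 0), A - L)
    y = min(max(y1, 0), B - M)
    return Rect(x, y, L, M)
```

For example, with a 640 by 480 photographing area and a 320 by 240 transmission area, a requested reference point of (500, -10) is clamped to (320, 0).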

【００５７】一方、本実施形態では、顔抽出部７で抽出
された顔領域の位置及び大きさを、円形の目印Ｒで表現
する。この目印Ｒの中心が顔領域の中心であり、目印Ｒ
の直径が顔領域の大きさに相当する。なお、目印Ｒは、
円形以外の形状であっても構わない。In the present embodiment, on the other hand, the position and size of the face area extracted by the face extraction unit 7 are represented by a circular mark R. The center of the mark R is the center of the face area, and the diameter of the mark R corresponds to the size of the face area. The mark R may have a shape other than a circle.

【００５８】図３の状態では、目印Ｒで示される顔領域
が送信領域３１の右側へ逸脱している。従って、目印Ｒ
に基づいて図中矢印で示すように、送信領域３１を右側
へ移動させれば、好ましいフレーミングになる。そこ
で、本実施形態では、目印Ｒが内部に含まれるように送
信領域３１を移動させる。図４は、送信領域３１を移動
させた後の状態（左上点（ｘ２，ｙ２））を示してい
る。ここで、本実施形態では、図４に示しているよう
に、送信領域３１の内側にさらに有効領域３２を設定
し、有効領域３２と送信領域３１とが一体的に移動する
ようにしている。そして、目印Ｒが送信領域３１ではな
く有効領域３２を逸脱したかどうかをチェックし、逸脱
した場合には、図３から図４のように送信領域３１及び
有効領域３２を移動させることとした。In the state of FIG. 3, the face area indicated by the mark R deviates to the right of the transmission area 31. Accordingly, moving the transmission area 31 to the right on the basis of the mark R, as indicated by the arrow in the figure, gives preferable framing. In the present embodiment, therefore, the transmission area 31 is moved so that the mark R is contained within it. FIG. 4 shows the state after the transmission area 31 has been moved (upper left point (x2, y2)). Here, in the present embodiment, as shown in FIG. 4, an effective area 32 is additionally set inside the transmission area 31, and the effective area 32 and the transmission area 31 move as one. It is then checked whether the mark R has deviated from the effective area 32, rather than from the transmission area 31, and if it has, the transmission area 31 and the effective area 32 are moved as shown from FIG. 3 to FIG. 4.
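
The deviation check against the effective area can be sketched as below. The source does not specify how the effective area is sized or whether partial protrusion counts as deviation, so two assumptions are made and labeled: the effective area is the transmission area shrunk by a uniform `margin`, and the mark R is treated as deviating as soon as any part of the circle leaves the effective area. All names are hypothetical.

```python
def effective_area(tx, ty, L, M, margin):
    """Effective area as (x, y, width, height).

    Assumption: the transmission area with upper left point (tx, ty),
    shrunk by `margin` pixels on every side.
    """
    return (tx + margin, ty + margin, L - 2 * margin, M - 2 * margin)

def mark_deviates(cx, cy, diameter, area):
    """Return True if the circular mark R centered at (cx, cy) is not
    entirely inside `area` (assumed deviation criterion)."""
    ax, ay, aw, ah = area
    r = diameter / 2
    return (cx - r < ax or cx + r > ax + aw or
            cy - r < ay or cy + r > ay + ah)
```

Because the effective area moves together with the transmission area, it only needs to be recomputed from the new upper left point after each move.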

【００５９】ここで、有効領域３２を狭くすると、目印
Ｒが有効領域３２を逸脱する確率が上がり、相手の酔い
を招来しやくすくなる。従って、図４に示しているよう
に、有効領域３２を広めにとって、送信領域３１の移動
を抑えることが望ましい。このようにしても、顔領域は
見易い位置にある。If the effective area 32 is made narrow, the probability that the mark R deviates from the effective area 32 increases, and the resulting frequent movement tends to give the other party motion sickness. It is therefore desirable, as shown in FIG. 4, to make the effective area 32 fairly wide so as to suppress movement of the transmission area 31. Even so, the face area remains at an easily viewable position.

【００６０】加えて、本実施形態では、送信領域３１の
移動直後の目印Ｒの位置を、送信モード（バストアップ
モード又はバストアップモード）によって切り替えられ
るようにしている。図４は、目印Ｒが送信領域３１に対
してｘ方向中心かつｙ方向中心よりやや上方に位置す
る、バストアップモードによる画像表示手法の例であ
る。なお、顔アップモードとは、目印Ｒが送信領域３１
に対してｘ方向中心かつｙ方向中心に位置する画像表示
手法である。さらには、本実施形態では、図５に示すよ
うに、これらのモードから目印Ｒを好みの方向にオフセ
ットさせることを可能とする。これによれば、例えば、
利用者１が自分と共に持参している物を一緒に相手に見
せたいと考えるような場合等、種々の要求に対応できる
ようになる。In addition, in the present embodiment, the position of the mark R immediately after the transmission area 31 is moved can be switched according to the transmission mode (bust-up mode or face-up mode). FIG. 4 shows an example of the image display method in the bust-up mode, in which the mark R is centered in the x direction and slightly above center in the y direction with respect to the transmission area 31. The face-up mode is an image display method in which the mark R is centered in both the x direction and the y direction with respect to the transmission area 31. Furthermore, in the present embodiment, as shown in FIG. 5, the mark R can be offset from the position given by these modes in any desired direction. This makes it possible to meet various needs, for example when the user 1 wants to show the other party an object he or she has brought along, together with himself or herself.
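
How the transmission mode and the offset determine the upper left point can be sketched as follows. The source says only that the mark R is "slightly above center" in the bust-up mode, so placing it one third of the way down from the top is an assumption, as are the function name, the mode strings, and the sign convention of the offset.

```python
def upper_left_for_mode(cx, cy, L, M, mode, offset=(0, 0)):
    """Return the upper left point of the L-by-M transmission area that puts
    the center (cx, cy) of the mark R at the position required by `mode`.

    `offset` = (dx, dy) shifts the mark R within the transmission area,
    away from the position the mode would give (assumed convention).
    """
    if mode == "face_up":
        target_x, target_y = L // 2, M // 2   # centered in x and y
    elif mode == "bust_up":
        target_x, target_y = L // 2, M // 3   # assumption: one third from the top
    else:
        raise ValueError("unknown transmission mode: " + mode)
    dx, dy = offset
    return (cx - target_x - dx, cy - target_y - dy)
```

A positive `dx` moves the mark R to the right within the frame, which leaves room on the left, for instance for an object the user wants to show.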

【００６１】次に、図２を参照して、送信データ処理部
８が行う追従処理の各プロセスを説明する。まず、利用
者１が、入力部２から送信モード（バストアップモード
／顔アップモード）を入力する（ステップＳ２０１）。
次に、カメラ部４によって利用者１が撮影され、対象画
像として自画像メモリ６に格納される（ステップＳ２０
２）。この撮影の時には、利用者１は、広い撮影領域３
０内に顔が写る位置に居さえすれば十分である。次に、
顔抽出部７が、対象画像内の顔領域（顔の位置及び大き
さ）を抽出し、抽出した顔領域を送信データ処理部８へ
出力する（ステップＳ２０３）。Next, each step of the tracking process performed by the transmission data processing unit 8 will be described with reference to FIG. 2. First, the user 1 inputs the transmission mode (bust-up mode or face-up mode) from the input unit 2 (step S201). Next, the user 1 is photographed by the camera unit 4, and the result is stored in the self-image memory 6 as the target image (step S202). For this photographing, it is sufficient for the user 1 simply to be at a position where his or her face appears within the wide photographing area 30. Next, the face extraction unit 7 extracts the face area (position and size of the face) in the target image and outputs the extracted face area to the transmission data processing unit 8 (step S203).

【００６２】顔領域が抽出されると、送信データ処理部
８は、送信モードに従って顔領域に送信領域３１を合わ
せる（ステップＳ２０４）。具体的には、図４に示すよ
うに、顔領域が送信領域３１内に含まれるように送信領
域３１の左上点が決定される。次に、送信領域３１内
に、有効領域３２が設定され（ステップＳ２０５）、図
４の送信領域３１内の画像が表示部３によって利用者１
へモニタ表示される（ステップＳ２０６）。なお、この
ステップＳ２０６では、利用者１自身の画像表示を省略
し、目印Ｒのみを表示してもよい。次に、利用者１が、
入力部２を用いてモニタ表示されたフレーミングでよい
か（送信領域３１をロックするか）どうかを入力する
（ステップＳ２０７）。利用者１が、送信領域３１のオ
フセットを希望する場合には、入力部２は、移動情報の
入力を受け付けて、送信領域３１の位置を調節する（ス
テップＳ２１５）。その後、処理がステップＳ２０５へ
戻り、再度利用者１の確認を仰ぐ。When the face area has been extracted, the transmission data processing unit 8 fits the transmission area 31 to the face area in accordance with the transmission mode (step S204). Specifically, as shown in FIG. 4, the upper left point of the transmission area 31 is determined so that the face area is contained in the transmission area 31. Next, the effective area 32 is set within the transmission area 31 (step S205), and the image within the transmission area 31 of FIG. 4 is displayed on the display unit 3 as a monitor image for the user 1 (step S206). In step S206, display of the user 1's own image may be omitted and only the mark R displayed. Next, the user 1 uses the input unit 2 to indicate whether the framing shown on the monitor is acceptable (whether to lock the transmission area 31) (step S207). If the user 1 wishes to offset the transmission area 31, the input unit 2 accepts input of movement information, and the position of the transmission area 31 is adjusted (step S215). The process then returns to step S205, and the user 1 is asked to confirm again.

【００６３】上記ステップＳ２０７でフレーミングが完
了すると、相手との画像通信が開始される（ステップＳ
２０８）。なお、適当な割り込み処理部を設けて、通信
途中でもステップＳ２０１〜Ｓ２０７の処理を行えるよ
うにすることもできる。通信が開始されると、通信部９
及び受信データ処理部１０を介して、相手画像メモリ１
１に格納された相手の画像が、表示部３の画面上に表示
される（ステップＳ２０９）。ここで再び、カメラ部４
が利用者１を撮影し（ステップＳ２１０）、顔抽出部７
が顔領域を抽出し（ステップＳ２１１）、送信データ処
理部８が顔領域が有効領域３２を逸脱したかどうかチェ
ックする（ステップＳ２１２）。When framing is completed in step S207, image communication with the other party starts (step S208). An appropriate interrupt processing unit may also be provided so that the processing of steps S201 to S207 can be performed even during communication. When communication starts, the image of the other party, received via the communication unit 9 and the reception data processing unit 10 and stored in the partner image memory 11, is displayed on the screen of the display unit 3 (step S209). Here, the camera unit 4 again photographs the user 1 (step S210), the face extraction unit 7 extracts the face area (step S211), and the transmission data processing unit 8 checks whether the face area has deviated from the effective area 32 (step S212).

【００６４】ここで、図６に示すように逸脱していれ
ば、送信データ処理部８は、上記ステップＳ２０４と同
様に送信モードに従って送信領域３１の左上点を移動さ
せた後（ステップＳ２１３）、顔抽出部７において再び
抽出された顔領域が有効領域３２を逸脱したかどうか再
チェックする（ステップＳ２１１，Ｓ２１２）。一方、
逸脱していなければ、送信データ処理部８は、送信領域
３１を移動させることなく通信を継続させる。なお、利
用者が自分の写り具合を確認しながら安心して通信した
い場合には、例えばピクチャインピクチャ方式を用い
て、相手の画像と共に自分の画像が画面内に表示される
ようにしてもよい。そして、ステップＳ２０９〜Ｓ２１
３の処理が、通信終了まで繰り返される（ステップＳ２
１４）。If the face area has deviated as shown in FIG. 6, the transmission data processing unit 8 moves the upper left point of the transmission area 31 in accordance with the transmission mode, as in step S204 (step S213), and then checks again whether the face area newly extracted by the face extraction unit 7 has deviated from the effective area 32 (steps S211 and S212). If it has not deviated, the transmission data processing unit 8 lets communication continue without moving the transmission area 31. If the user wishes to communicate with peace of mind while checking how he or she appears, the user's own image may be displayed on the screen together with the image of the other party, for example by using a picture-in-picture method. The processing of steps S209 to S213 is repeated until communication ends (step S214).
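
The per-frame loop of steps S210 to S213 can be sketched as a small simulation. This is a simplified illustration under stated assumptions, not the flow of FIG. 2 itself: face extraction is replaced by a precomputed list of marks, the effective area is the transmission area shrunk by a uniform `margin`, re-fitting uses the face-up placement (mark R centered), and the re-check after a move is folded into the next frame. The function name is hypothetical.

```python
def track(face_marks, start, L, M, margin):
    """Simulate the tracking loop over a sequence of extracted face marks.

    face_marks: iterable of (cx, cy, diameter), one per frame (stands in for S210, S211)
    start:      upper left point of the transmission area after S204 to S207
    Returns the upper left point used for each frame.
    """
    tx, ty = start
    history = []
    for cx, cy, diameter in face_marks:
        r = diameter / 2
        # S212: is the mark R entirely inside the effective area?
        inside = (tx + margin <= cx - r and cx + r <= tx + L - margin and
                  ty + margin <= cy - r and cy + r <= ty + M - margin)
        if not inside:
            # S213: move the transmission area (face-up placement assumed).
            tx, ty = round(cx - L / 2), round(cy - M / 2)
        history.append((tx, ty))
    return history
```

In the usage below, small movements of the face leave the transmission area where it is, and only the third frame, in which the mark R leaves the effective area, triggers a move.

```python
positions = track([(260, 170, 80), (270, 175, 80), (400, 170, 80)],
                  start=(100, 50), L=320, M=240, margin=20)
```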

【００６５】以上のように、本発明の第１の実施形態に
係る画像通信端末によれば、大掛かりな追従機構を用い
ることなく、画像通信端末の携帯性を損なわずに、実質
的に利用者の動きに追従した撮影及び画像通信を行うこ
とができる。すなわち、利用者は、写り具合を気にしな
くとも好ましいフレーミングで撮影され、自画像が相手
に送信される。また、顔領域が有効領域内にあれば、送
信領域は移動しないので、相手側に送信される画像、特
に利用者側の背景画像が、頻繁にぶれるようなことがな
くなり相手の酔いを防止できる。As described above, the image communication terminal according to the first embodiment of the present invention can perform photographing and image communication that in effect follow the movement of the user, without using a large-scale tracking mechanism and without impairing the portability of the image communication terminal. That is, the user is photographed with preferable framing without having to pay attention to how he or she appears, and the user's own image is transmitted to the other party. Moreover, as long as the face area is within the effective area, the transmission area does not move, so the image transmitted to the other party, and in particular the background behind the user, does not shake frequently, which prevents the other party from experiencing motion sickness.

【００６６】ところで、周知のように、カメラ部４に用
いるカメラによっては、自動露出補正の機能を有するも
のがある。自動露出補正とは、明るさが最適となるよう
に自動的に画像の輝度を補正する機能であり、一般に画
像全体又は数点の平均輝度に基づいて画像内の各画素の
輝度を変更することで行われる。しかしながら、逆光等
のように対象画像全体の平均輝度に比べ顔領域の平均輝
度が低い場合には、利用者１の顔が真っ黒になってしま
うという問題が残る。そこで、このような場合の対策と
して、送信データ処理部８では、顔抽出部７で抽出され
た顔領域に基づいて、カメラ部４が撮影した対象画像の
明るさを顔の視認性が向上するように輝度を補正した
後、通信部９へ送信するようにすればよい。As is well known, some cameras used for the camera unit 4 have an automatic exposure correction function. Automatic exposure correction is a function that automatically corrects the luminance of an image so that its brightness is optimal, and it is generally performed by changing the luminance of each pixel in the image on the basis of the average luminance of the entire image or of several points. However, when the average luminance of the face area is lower than the average luminance of the entire target image, as in backlit conditions, the problem remains that the face of the user 1 comes out completely dark. As a countermeasure in such cases, the transmission data processing unit 8 may, on the basis of the face area extracted by the face extraction unit 7, correct the luminance of the target image captured by the camera unit 4 so that the visibility of the face improves, and then send the image to the communication unit 9.

【００６７】具体的には、送信データ処理部８が、顔領
域内部の平均輝度の理想値（理想平均輝度ａ）を予め記
憶している。そして、送信データ処理部８は、顔抽出部
７で抽出された顔領域内部の平均輝度Ｉを求め、カメラ
部４で撮影された対象画像の輝度Ｙ１を新たな輝度Ｙ２
に変更するため、対象画像の各画素に対して、Ｙ２＝Ｙ
１×（ａ／Ｉ）を施す。これにより、顔領域内部が理想
平均輝度ａとなるように補正することができる。また、
この理想平均輝度ａを用いて、輝度だけでなく色相につ
いても同様に変更することも考えられる。なお、これ以
外に、送信データ処理部８が、顔領域内部が平均輝度Ｉ
である場合に顔領域が理想平均輝度ａとなる、設定すべ
きカメラ部４の露出レベルを持っている場合もあり得
る。この場合には、送信データ処理部８が、顔領域内部
の平均輝度Ｉに対する露出レベルをカメラ部４へ通知す
ることにより、顔領域の明るさが理想値になるように補
正することが可能となる。Specifically, the transmission data processing unit 8 stores in advance an ideal value of the average luminance inside the face area (ideal average luminance a). The transmission data processing unit 8 then obtains the average luminance I inside the face area extracted by the face extraction unit 7 and, to change the luminance Y1 of the target image captured by the camera unit 4 to a new luminance Y2, applies Y2 = Y1 × (a / I) to each pixel of the target image. This corrects the image so that the inside of the face area has the ideal average luminance a. The ideal average luminance a could also be used to change hue as well as luminance in a similar way. Alternatively, the transmission data processing unit 8 may hold the exposure level that should be set in the camera unit 4 so that the face area attains the ideal average luminance a when the inside of the face area has the average luminance I. In this case, the transmission data processing unit 8 notifies the camera unit 4 of the exposure level corresponding to the average luminance I inside the face area, so that the brightness of the face area can be corrected to the ideal value.
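
The correction Y2 = Y1 × (a / I) can be sketched as follows for a luminance image held as a list of rows. The rectangular `face_box`, the 8-bit range with clipping at 255, and the handling of a completely black face area are assumptions made for the illustration; the source gives only the scaling formula.

```python
def correct_luminance(image, face_box, ideal_mean):
    """Scale every pixel by a / I, where a is `ideal_mean` and I is the
    mean luminance inside `face_box` = (x, y, width, height)."""
    x, y, w, h = face_box
    face_pixels = [image[row][col]
                   for row in range(y, y + h)
                   for col in range(x, x + w)]
    mean_i = sum(face_pixels) / len(face_pixels)
    if mean_i == 0:
        # Assumption: leave the image unchanged rather than divide by zero.
        return [row[:] for row in image]
    gain = ideal_mean / mean_i
    # Assumption: 8-bit luminance, so results are clipped to 255.
    return [[min(255, round(value * gain)) for value in row] for row in image]

# A backlit scene: bright background (200) and a dark face area (mean 50).
backlit = [[200, 200, 200, 200],
           [200,  40,  60, 200],
           [200,  40,  60, 200],
           [200, 200, 200, 200]]
corrected = correct_luminance(backlit, face_box=(1, 1, 2, 2), ideal_mean=100)
```

In this example the gain is 2, so the face area reaches the ideal mean of 100 while the already bright background saturates, which is the trade-off accepted in favor of face visibility.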

【００６８】このようにすれば、逆光のような場合で
も、利用者１の顔が常に見えるような画像を相手側へ送
信することが可能となる。これにより、屋外においても
周りの照明環境を気にすることなく画像通信端末を用い
て相手と対話することが可能となる。In this way, even in the case of backlight, it is possible to transmit an image in which the face of the user 1 is always visible to the other party. Thus, it is possible to have a conversation with the other party using the image communication terminal even outdoors, without worrying about the surrounding lighting environment.

【００６９】（第２の実施形態）上記第１の実施形態で
は、簡単な追従機構を用い、画像通信端末側が利用者の
動きに自動的に合わせることによって、利用者をフレー
ム内に捉えた適切な画像を相手側に送信できる手法を説
明した。次に、この第２の実施形態では、追従機構を用
いることなく、利用者側が画像通信端末に合わせて動け
るような表示を行うことにより、利用者をフレーム内に
捉えた適切な画像を相手側に送信できる手法を説明す
る。(Second Embodiment) The first embodiment described a method in which, using a simple tracking mechanism, the image communication terminal automatically adapts to the movement of the user so that an appropriate image with the user captured in the frame can be transmitted to the other party. The second embodiment describes a method in which, without using a tracking mechanism, the terminal presents a display that lets the user adjust his or her own position to the image communication terminal, so that an appropriate image with the user captured in the frame can be transmitted to the other party.

【００７０】図７は、本発明の第２の実施形態に係る画
像通信端末の構成を示すブロック図である。図７におい
て、第２の実施形態に係る画像通信端末は、入力部２２
と、表示部３と、カメラ部４と、表示制御部２５と、自
画像メモリ６と、顔抽出部７と、送信データ処理部８
と、通信部９と、受信データ処理部１０と、相手画像メ
モリ１１とを備える。まず、第２の実施形態に係る画像
通信端末の各構成の概要を説明する。FIG. 7 is a block diagram showing the configuration of an image communication terminal according to the second embodiment of the present invention. In FIG. 7, the image communication terminal according to the second embodiment includes an input unit 22, a display unit 3, a camera unit 4, a display control unit 25, a self-image memory 6, a face extraction unit 7, a transmission data processing unit 8, a communication unit 9, a reception data processing unit 10, and a partner image memory 11. First, an outline of each component of the image communication terminal according to the second embodiment will be described.

【００７１】図７に示すように、本実施形態の画像通信
端末では、入力部２２、表示部３及びカメラ部４が、利
用者１に臨んでいる。入力部２２は、キーボード（テン
キー等を含む）やマウス等で構成され、利用者１が通知
モード、送信モード及びその他必要な情報を入力するた
めに利用される。本実施形態では、入力部２２に点灯
（又は点滅）が可能なテンキーが具備されている。表示
部３は、ＬＣＤ等で構成され、画面上で相手の画像や表
示制御部２５の指示に従った目印等を、利用者１に向け
て表示する。目印については後で詳述するが、利用者１
が画面中における自分の顔の位置や大きさを確認できる
指標である。なお、入力部２２及び表示部３によって、
相手側への送信画像における利用者１の顔の位置及び大
きさを、利用者１へ通知する通知部１２が構成される。
カメラ部４は、レンズ等の光学系及びＣＣＤ等の電気系
で構成され、利用者１を撮影するために用いられる。こ
のカメラ部４で撮影された画像（対象画像）は、フレー
ム毎に自画像メモリ６に格納される。表示制御部２５
は、表示部３の画面表示（主として、受信した相手画像
の表示）を制御する。また、表示制御部２５は、入力部
２２から入力される通知モードに応じ、顔抽出部７で抽
出された顔領域に基づいて、目印を表示部３の画面上に
表示させたり、入力部２２のテンキーを点灯させたりす
る。As shown in FIG. 7, in the image communication terminal of the present embodiment, the input unit 22, the display unit 3, and the camera unit 4 face the user 1. The input unit 22 includes a keyboard (including a numeric keypad or the like), a mouse, and the like, and is used by the user 1 to input the notification mode, the transmission mode, and other necessary information. In the present embodiment, the input unit 22 has a numeric keypad whose keys can be lit (or made to blink). The display unit 3 is an LCD or the like and displays to the user 1, on its screen, the image of the other party and marks or the like as instructed by the display control unit 25. The mark, described in detail later, is an indicator that lets the user 1 confirm the position and size of his or her own face on the screen. The input unit 22 and the display unit 3 together constitute a notification unit 12 that notifies the user 1 of the position and size of the user 1's face in the image transmitted to the other party. The camera unit 4 includes an optical system such as a lens and an electrical system such as a CCD, and is used to photograph the user 1. The image captured by the camera unit 4 (the target image) is stored in the self-image memory 6 frame by frame. The display control unit 25 controls the screen display of the display unit 3 (mainly the display of the received partner image). In addition, in accordance with the notification mode input from the input unit 22, the display control unit 25 displays a mark on the screen of the display unit 3 or lights a key of the numeric keypad of the input unit 22, on the basis of the face area extracted by the face extraction unit 7.

【００７２】顔抽出部７は、自画像メモリ６に格納され
た対象画像に対して、存在する顔の位置及び大きさを調
べ、これらの情報を顔領域として表示制御部２５及び送
信データ処理部８へ出力する。なお、この顔抽出部７に
ついては、本発明に適用可能な手法を後で詳細に説明す
る。送信データ処理部８は、入力部２２から指示された
送信モードに従って、自画像メモリ６に格納された対象
画像を、そのまま又は後述する加工を施して通信部９へ
送出する。通信部９は、通信経路を介して、相手の情報
処理装置（画像通信端末を含む）と、少なくとも画像デ
ータの通信を行う。ここでの通信モードは任意であり、
例えば、内線電話のように基地局等を介さない子機間通
信でもよいし、テレビ電話のような基地局等を介する同
期型通信又は非同期型通信でもよい。受信データ処理部
１０は、通信部９を介して受信した相手の画像データを
処理して、フレーム毎に相手画像メモリ１１へ格納す
る。The face extraction unit 7 determines the position and size of any face present in the target image stored in the self-image memory 6, and outputs this information as a face area to the display control unit 25 and the transmission data processing unit 8. Methods applicable to the face extraction unit 7 in the present invention will be described in detail later. In accordance with the transmission mode specified from the input unit 22, the transmission data processing unit 8 sends the target image stored in the self-image memory 6 to the communication unit 9, either as it is or after applying the processing described later. The communication unit 9 communicates at least image data with the other party's information processing device (including an image communication terminal) via a communication path. The communication mode here is arbitrary: for example, it may be handset-to-handset communication that does not pass through a base station or the like, as with an extension telephone, or synchronous or asynchronous communication that passes through a base station or the like, as with a videophone. The reception data processing unit 10 processes the other party's image data received via the communication unit 9 and stores it in the partner image memory 11 frame by frame.

【００７３】次に、図８〜図１０を参照して、表示制御
部２５が表示部３の画面上に表示させる目印の一例を説
明する。なお、これらの例は、適宜組み合わせて用いる
ことができる。まず、図８（ａ）〜（ｄ）は、利用者１
の顔の位置（ここでは、顔抽出部７で抽出された顔領域
の中心）だけを、表示部３の画面上に目印Ｒで表示させ
る例である。図中矩形で示した領域が表示部３の画面で
あり、ここに相手の画像が表示される。図８（ａ）〜
（ｃ）では、目印Ｒが相手の画像内に重畳させて表示さ
れる。図８（ｄ）では、目印Ｒが相手の画像外に表示さ
れる。これらの目印Ｒの表示は、相手の画像のフレーム
に同期して更新してもよいし、非同期で更新してもよ
い。図８（ａ）は、目印Ｒとして十字線を用い、線の交
点が利用者１の顔の位置を示すようにしたものである。
図８（ｂ）は、目印Ｒとして矢印を用い、双方の矢印で
特定される点が利用者１の顔の位置を示すようにしたも
のである。図８（ｃ）は、目印Ｒとして十字又は×印の
図形を用い、図形の位置が顔の位置を示すようにしたも
のである。図８（ｄ）は、目印Ｒとして相手の画像の枠
外に表示される縦横ルーラを用い、縦ルーラ上に付され
た印と横ルーラ上に付された印とで特定される点が利用
者１の顔の位置を示すようにしたものである。Next, examples of the mark that the display control unit 25 displays on the screen of the display unit 3 will be described with reference to FIGS. 8 to 10. These examples can be used in combination as appropriate. First, FIGS. 8(a) to 8(d) are examples in which only the position of the face of the user 1 (here, the center of the face area extracted by the face extraction unit 7) is displayed on the screen of the display unit 3 by a mark R. The rectangular area in each figure is the screen of the display unit 3, on which the image of the other party is displayed. In FIGS. 8(a) to 8(c), the mark R is superimposed on the image of the other party. In FIG. 8(d), the mark R is displayed outside the image of the other party. The display of the mark R may be updated in synchronization with the frames of the other party's image or asynchronously. In FIG. 8(a), crosshairs are used as the mark R, and the intersection of the lines indicates the position of the face of the user 1. In FIG. 8(b), arrows are used as the mark R, and the point specified by the two arrows indicates the position of the face of the user 1. In FIG. 8(c), a cross or X-shaped figure is used as the mark R, and the position of the figure indicates the position of the face. In FIG. 8(d), vertical and horizontal rulers displayed outside the frame of the other party's image are used as the mark R, and the point specified by a mark on the vertical ruler and a mark on the horizontal ruler indicates the position of the face of the user 1.

【００７４】次に、図９（ａ）〜（ｃ）は、利用者１の
顔の位置及び大きさ（顔抽出部７で抽出された顔領域全
体）を、表示部３の画面上に目印Ｒで表示させる例であ
る。図９（ａ）は、目印Ｒとして縦横２本ずつの平行線
を用い、この平行線で囲まれた矩形領域が利用者１の顔
の位置及び大きさを示すようにしたものである。図９
（ｂ）は、目印Ｒとして相手の画像の枠外に表示される
縦横ルーラを用い、縦ルーラ上に付された幅付き印と横
ルーラ上に付された幅付き印とで特定される領域が利用
者１の顔の位置及び大きさを示すようにしたものであ
る。図９（ｃ）では、目印Ｒとして顔領域に近似する円
（又は楕円）を用い、円領域が利用者１の顔の位置及び
大きさを示すようにしたものである。Next, FIGS. 9A to 9C show examples in which the position and size of the face of the user 1 (the entire face area extracted by the face extraction unit 7) are displayed as a mark R on the screen of the display unit 3. In FIG. 9A, two vertical and two horizontal parallel lines are used as the mark R, and the rectangular area enclosed by these lines indicates the position and size of the face of the user 1. In FIG. 9B, vertical and horizontal rulers displayed outside the frame of the image of the other party are used as the mark R, and the area specified by a mark with width on the vertical ruler and a mark with width on the horizontal ruler indicates the position and size of the face of the user 1. In FIG. 9C, a circle (or ellipse) approximating the face area is used as the mark R, and the circular area indicates the position and size of the face of the user 1.

【００７５】なお、これらの目印Ｒは、相手の画像に依
存せずに表示させてもよいし、依存して表示させてもよ
い。前者としては、例えば、相手の画像にかかわらず所
定の色（黒一色等）で目印Ｒを表示させることである。
後者としては、例えば、表示させる目印Ｒが相手の画像
上でわかり難くなる場合に、目印Ｒを表示させる画素の
輝度を変化させたり、そのＲＧＢ値を変化（反転）させ
ることである。いずれにしても、これらの目印Ｒは、相
手の画像の邪魔にならぬように表示することが望まし
い。These marks R may be displayed independently of the image of the other party, or depending on it. An example of the former is displaying the mark R in a predetermined color (such as solid black) regardless of the image of the other party. An example of the latter is changing the luminance of the pixels where the mark R is displayed, or changing (inverting) their RGB values, when the mark R would otherwise be hard to see on the image of the other party. In either case, it is desirable to display these marks R so that they do not obstruct the image of the other party.
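
The two display policies described in this paragraph can be sketched as follows. This is only an illustration, not the terminal's actual implementation: the function name `draw_mark`, the image representation (rows of `(r, g, b)` tuples), and the 8-bit channel range are assumptions made for the example.

```python
def draw_mark(image, mark_pixels, fixed_color=None):
    """Return a copy of `image` with the mark R drawn on it.

    image       : list of rows, each row a list of (r, g, b) tuples (0-255, assumed)
    mark_pixels : iterable of (x, y) positions belonging to the mark R
    fixed_color : if given, the mark is drawn in this color regardless of the
                  other party's image; if None, the RGB value under the mark
                  is inverted so the mark depends on the image
    """
    out = [list(row) for row in image]  # copy so the source frame is untouched
    for x, y in mark_pixels:
        if 0 <= y < len(out) and 0 <= x < len(out[y]):  # ignore off-screen pixels
            if fixed_color is not None:
                out[y][x] = fixed_color
            else:
                r, g, b = out[y][x]
                out[y][x] = (255 - r, 255 - g, 255 - b)
    return out
```

Inverting the pixel value keeps the mark visible on both dark and bright regions, at the cost of a mark whose color varies across the image.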

【００７６】さらに、図１０は、利用者１の顔のおおよ
その位置を、表示部３ではなく入力部２２で表示させる
例である。図１０に示すように、目印Ｒとして点灯が可
能なテンキーを用い、このテンキーのいずれかを点灯さ
せることで顔の位置を利用者１へ通知することができ
る。図１０では、「３」のキーを点灯させているので、
顔の位置が画面の「右上」にあることを通知できる。同
様に、「１」のキーなら画面の「左上」、「５」のキー
なら画面の「真中」、「９」のキーなら画面の「右下」
というように、概略の位置表示を行える。なお、このよ
うな概略位置の通知であっても、十分実用に値する。Furthermore, FIG. 10 shows an example in which the approximate position of the face of the user 1 is indicated by the input unit 22 rather than by the display unit 3. As shown in FIG. 10, a numeric keypad whose keys can be lit is used as the mark R, and the position of the face is notified to the user 1 by lighting one of the keys. In FIG. 10, the “3” key is lit, which indicates that the face is at the “upper right” of the screen. Similarly, the “1” key indicates the “upper left” of the screen, the “5” key the “center”, and the “9” key the “lower right”, so that an approximate position can be indicated. Even such a notification of the approximate position is sufficiently practical.
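
The keypad notification amounts to quantizing the face position onto a 3 x 3 grid. A minimal sketch follows, assuming that the screen is divided into nine equal cells, that image coordinates have their origin at the upper left with y increasing downward, and that the keys are laid out in the usual telephone order (1 2 3 / 4 5 6 / 7 8 9). The function name and the behavior for an off-screen face are hypothetical.

```python
def keypad_key_for_face(cx, cy, width, height):
    """Return the keypad digit (1-9) to light for a face centered at (cx, cy).

    The screen of size width x height is split into a 3 x 3 grid (assumed
    equal cells). Returns None when the face center is outside the screen.
    """
    if not (0 <= cx < width and 0 <= cy < height):
        return None  # nothing to light; the caller may warn the user instead
    col = min(int(3 * cx / width), 2)   # 0 = left, 1 = middle, 2 = right
    row = min(int(3 * cy / height), 2)  # 0 = top,  1 = middle, 2 = bottom
    return row * 3 + col + 1


# Usage: a face near the top-right corner of a 320 x 240 screen lights key 3.
key = keypad_key_for_face(300, 10, 320, 240)
```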

【００７７】なお、本実施形態では、入力部２２から表
示制御部２５へ与えられる通知モードによって、図８〜
図１０のいずれの方法で顔の位置を通知させるかを切り
替えることが可能なようにしている。さらに、これらの
通知は、常時行ってもよいし、利用者１が入力部２２で
通知を指示した時のみ行ってもよい。また、概略位置の
通知を行う方法としては、図１０に示す入力部２２のテ
ンキーの点灯以外にも、音や光によることもできる。例
えば、音の場合には、スピーカから発するインターバル
や周波数を顔の位置に応じて変化させたり、光の場合に
は、点灯させる明るさや点滅のインターバルを顔の位置
に応じて変化させること等が考えられる。In this embodiment, the notification mode given from the input unit 22 to the display control unit 25 makes it possible to switch which of the methods of FIGS. 8 to 10 is used to notify the position of the face. These notifications may be given at all times, or only when the user 1 requests notification through the input unit 22. The approximate position may also be notified by sound or light instead of lighting the numeric keypad of the input unit 22 shown in FIG. 10. For example, with sound, the interval or frequency of the sound emitted from the speaker may be changed according to the position of the face; with light, the brightness or the blinking interval may be changed according to the position of the face.

【００７８】次に、図１１を参照して、送信データ処理
部８が通信部９を介して送信する利用者１の画像例につ
いて説明する。本実施形態では、相手側へ送信される画
像が、入力部２２から送信データ処理部８へ与えられる
送信モードによって、選択できるようになっている。こ
こで、利用者１側（自分側）では、相手の画像の上に、
図１１（ａ）のような目印Ｒ（図９（ａ）〜（ｃ）の組
み合わせ）が表示されているものとする。このとき、送
信データ処理部８は、送信モードによって、種々の形態
で自画像を相手に送信することができる。例えば、送信
モードが「通常」であれば、図１１（ｂ）のように、送
信データ処理部８は、カメラ部４の取得画像をそのまま
送信する。また、送信モードが「目印付き」であれば、
図１１（ｃ）に示すように、送信データ処理部８は、顔
抽出部７で抽出された顔領域を参照して、取得画像に目
印Ｒを合成した自画像を作成し、相手に送信する。さら
に、送信モードが「顔のみ」であれば、図１１（ｄ）に
示すように、送信データ処理部８は、取得画像から顔抽
出部７で抽出された顔領域のみを切り取った自画像を、
相手に送信する。Next, examples of the image of the user 1 that the transmission data processing unit 8 transmits via the communication unit 9 will be described with reference to FIG. 11. In this embodiment, the image transmitted to the other party can be selected by the transmission mode given from the input unit 22 to the transmission data processing unit 8. Here, it is assumed that on the user 1 side (the user's own side), a mark R as shown in FIG. 11A (a combination of FIGS. 9A to 9C) is displayed over the image of the other party. The transmission data processing unit 8 can then transmit the user's own image to the other party in various forms depending on the transmission mode. For example, if the transmission mode is “normal”, the transmission data processing unit 8 transmits the image acquired by the camera unit 4 as it is, as shown in FIG. 11B. If the transmission mode is “with mark”, the transmission data processing unit 8 refers to the face area extracted by the face extraction unit 7, creates an image of the user in which the mark R is combined with the acquired image, and transmits it to the other party, as shown in FIG. 11C. If the transmission mode is “face only”, the transmission data processing unit 8 transmits to the other party an image of the user obtained by cutting out of the acquired image only the face area extracted by the face extraction unit 7, as shown in FIG. 11D.
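
The three transmission modes can be illustrated with a small dispatcher. This is a sketch under stated assumptions rather than the patented processing itself: the frame is a list of rows of grayscale values, the face area is an axis-aligned box `(left, top, right, bottom)` with exclusive right and bottom edges, the mark is drawn as a rectangular outline, and the mode strings and function name are hypothetical.

```python
def make_outgoing_image(frame, face_box, mode, mark_value=255):
    """Build the image sent to the other party for the given transmission mode."""
    left, top, right, bottom = face_box
    if mode == "normal":
        # Send the camera image as it is.
        return [row[:] for row in frame]
    if mode == "with_mark":
        # Overlay a rectangular mark R around the extracted face area.
        out = [row[:] for row in frame]
        for x in range(left, right):
            out[top][x] = mark_value
            out[bottom - 1][x] = mark_value
        for y in range(top, bottom):
            out[y][left] = mark_value
            out[y][right - 1] = mark_value
        return out
    if mode == "face_only":
        # Cut out only the face area; the background is never transmitted.
        return [row[left:right] for row in frame[top:bottom]]
    raise ValueError("unknown transmission mode: %r" % (mode,))
```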

【００７９】送信モードに基づくこれらの画像処理は、
周知技術によって簡単に実現することができるので、そ
の詳しい説明は省略する。ここで、図１１（ｃ）のよう
に「目印付き」で自画像を送信すれば、例えば自分がど
こに居るのかわかり難い画像（暗闇の中に居る画像）を
送信する場合であっても、相手に自分の位置を正確に把
握させることが可能となる。また、図１１（ｄ）のよう
に、「顔のみ」で自画像を送信すれば、背景が写らない
ので相手に見られたくない部分を隠すことができ、プラ
イバシーを保護できる。なお、このように背景を隠して
も、顔の表情等は相手に伝わるので、会話に支障はな
い。なお、上記説明した送信モードは、互いにユニーク
であれば、他の任意の区別法によることも可能である。The image processing for each transmission mode can be easily realized by well-known techniques, so a detailed description is omitted. If the user's own image is transmitted “with mark” as in FIG. 11C, the other party can accurately grasp the user's position even when the transmitted image makes it hard to tell where the user is (for example, an image taken in the dark). If the user's own image is transmitted as “face only” as in FIG. 11D, the background is not shown, so parts that the user does not want the other party to see can be hidden and privacy can be protected. Even with the background hidden in this way, facial expressions and the like are still conveyed to the other party, so conversation is not hindered. The transmission modes described above may be distinguished in any other way, as long as they are distinct from one another.

【００８０】以上のように、本発明の第２の実施形態に
係る画像通信端末によれば、抽出した顔領域に基づく目
印を用いて、利用者の画面上の位置関係を簡潔かつ適切
に表現することができる。従って、利用者は、自分の顔
の位置が画面を逸脱した場合はもとより、自分の顔の位
置が画面を逸脱していない場合であっても、自分の顔の
位置を確認しながら、安心して相手との会話を進めるこ
とができる。また、上記第１の実施形態に比べ追従機構
を省略しているので、画像通信端末の携帯性を良好にさ
せることができる。As described above, the image communication terminal according to the second embodiment of the present invention can express the position of the user on the screen simply and appropriately by using a mark based on the extracted face area. The user can therefore carry on a conversation with the other party with confidence while checking the position of his or her own face, not only when the face has moved off the screen but also when it has not. In addition, since the tracking mechanism of the first embodiment is omitted, the portability of the image communication terminal can be improved.

【００８１】（顔抽出部７の詳細な実施例）次に、上述
した本発明の第１及び第２の実施形態に係る画像通信端
末に適用できる顔抽出部７の具体的な実施例を、３通り
説明する。なお、顔抽出部７には、以下に説明する３つ
の手法の他に、色情報に基づくもの、目や口等の顔の部
分に着目するもの、テンプレートマッチングによるもの
等、周知のさまざまな手法を適用させることが可能であ
る。(Detailed Examples of the Face Extraction Unit 7) Next, three specific examples of the face extraction unit 7 applicable to the image communication terminals according to the first and second embodiments of the present invention will be described. In addition to the three methods described below, various well-known methods can be applied to the face extraction unit 7, such as methods based on color information, methods that focus on parts of the face such as the eyes and mouth, and methods based on template matching.

【００８２】＜実施例１＞図１２は、実施例１の顔抽出
部７の構成を示すブロック図である。図１２において、
顔抽出部７は、エッジ抽出部５１と、テンプレート記憶
部５２と、投票結果記憶部５３と、投票部５４と、解析
部５５とを備える。<Example 1> FIG. 12 is a block diagram showing the configuration of the face extraction unit 7 of Example 1. In FIG. 12, the face extraction unit 7 includes an edge extraction unit 51, a template storage unit 52, a voting result storage unit 53, a voting unit 54, and an analysis unit 55.

【００８３】エッジ抽出部５１は、カメラ部４が撮影し
た対象画像からエッジ部を抽出し、エッジ部だけの画像
（以下、エッジ画像という）を生成する。ここで、エッ
ジ部とは、人物の外郭や顔の輪郭等に相当する部分（画
素）であって、対象画像内の高周波成分となる部分であ
る。このエッジ抽出部５１には、対象画像から高周波成
分を取り出すＳｏｂｅｌフィルタ等を用いるのが好まし
い。The edge extraction unit 51 extracts edge portions from the target image captured by the camera unit 4 and generates an image consisting only of the edge portions (hereinafter referred to as an edge image). Here, an edge portion is a portion (pixels) corresponding to the outline of a person, the contour of a face, or the like, and is a high-frequency component of the target image. A filter that extracts high-frequency components from the target image, such as a Sobel filter, is preferably used for the edge extraction unit 51.
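
As a rough illustration of how an edge image might be produced with a Sobel filter, the sketch below computes the gradient magnitude at each interior pixel and marks pixels above a threshold. The standard 3 x 3 Sobel kernels, the binarization threshold, the grayscale list-of-rows representation, and the function name are all assumptions for the example; the text does not specify how the filter output is turned into an edge image.

```python
import math

# Standard 3 x 3 Sobel kernels (assumed form).
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def edge_image(gray, threshold):
    """Return a binary edge image (1 = edge pixel) for a grayscale image.

    gray      : list of rows of numeric intensities
    threshold : minimum gradient magnitude for a pixel to count as an edge
                (hypothetical parameter)
    Border pixels are left at 0 because the kernel does not fit there.
    """
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    v = gray[y + dy][x + dx]
                    gx += SOBEL_X[dy + 1][dx + 1] * v
                    gy += SOBEL_Y[dy + 1][dx + 1] * v
            if math.hypot(gx, gy) >= threshold:
                edges[y][x] = 1
    return edges
```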

【００８４】テンプレート記憶部５２には、予め定めた
形状を、相似で大きさを異ならせた種々のサイズによっ
て、中心点で同心状に複数設けたテンプレートのデータ
が記憶されている。このテンプレートの形状には、円、
楕円、正多角形、多角形等を用いることができるが、中
心点から形状線（形状を形成する各画素）までの距離が
常に一定である円を用いることが最も好ましい。これに
より、後述する投票結果の精度を高くさせることができ
る。以下、この実施例１では、図１３に示すように、中
心点がＰで半径が異なる同心円を複数設けたテンプレー
トを用いた場合を説明する。ここで、テンプレートを構
成する複数の円ｔ１〜ｔｎ（ｎは、任意の整数）は、図
１３に示すテンプレートのように、一定間隔で半径が変
化する構成であってもよいし、不定間隔で半径が変化す
る構成であってもよい。また、テンプレートを構成する
複数の円ｔ１〜ｔｎは、全ての線幅が１ドット（対象画
像の１画素に相当）で構成されてもよいし、一部又は全
部の線幅が２ドット以上（すなわち、円環形状）で構成
されてもよい。なお、以下の説明では、円及び円環を総
称して単に「円」という。The template storage unit 52 stores data of a template in which a predetermined shape is provided multiple times, concentrically about a center point, in various sizes that are similar in shape. A circle, ellipse, regular polygon, polygon, or the like can be used as the shape of the template, but it is most preferable to use a circle, for which the distance from the center point to the shape line (each pixel forming the shape) is always constant. This increases the accuracy of the voting result described later. In Example 1, as shown in FIG. 13, a template with multiple concentric circles having a center point P and different radii is described. The circles t1 to tn (n is an arbitrary integer) forming the template may have radii that change at regular intervals, as in the template shown in FIG. 13, or at irregular intervals. All of the circles t1 to tn may have a line width of one dot (corresponding to one pixel of the target image), or some or all of them may have a line width of two or more dots (that is, an annular shape). In the following description, circles and annuli are collectively referred to simply as “circles”.

【００８５】この複数の円ｔ１〜ｔｎは、まとめて１つ
のテンプレートとして扱われてテンプレート記憶部５２
に記憶されるが、実際の処理では、テンプレートを構成
する各円ｔ１〜ｔｎは、独立して扱われることとなる。
このため、各円ｔ１〜ｔｎを形成する画素データは、テ
ンプレート記憶部５２において、例えばテーブル形式で
それぞれ記憶される。The circles t1 to tn are handled together as one template and stored in the template storage unit 52, but in the actual processing each of the circles t1 to tn forming the template is handled independently. For this reason, the pixel data forming each of the circles t1 to tn is stored separately in the template storage unit 52, for example in table form.
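
One possible table layout for such a template is a mapping from each radius to the list of pixel offsets, relative to the center point P, that form the corresponding circle. The sketch below builds one-pixel-wide circles by sampling angles and rounding; the sampling density, the dictionary layout, and the function names are assumptions, not the stored format used by the template storage unit 52.

```python
import math


def circle_offsets(radius):
    """Return the sorted (dx, dy) offsets of a one-pixel-wide circle around P."""
    points = set()
    steps = max(8, int(round(8 * radius)))  # enough samples to avoid gaps (assumed)
    for k in range(steps):
        a = 2 * math.pi * k / steps
        points.add((int(round(radius * math.cos(a))),
                    int(round(radius * math.sin(a)))))
    return sorted(points)


def build_template(radii):
    """Return {radius: [(dx, dy), ...]}, one independent table per circle t1..tn."""
    return {r: circle_offsets(r) for r in radii}


# Usage: three concentric circles at a regular radius interval.
template = build_template([2, 4, 6])
```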

【００８６】投票結果記憶部５３には、後述する投票部
５４において行われる投票処理の結果を記憶する領域
（以下、投票記憶領域という）が、テンプレート記憶部
５２に記憶されているテンプレートを構成する各サイズ
の形状毎に、設けられている。この例では、各サイズの
形状が円ｔ１〜ｔｎであるので、投票結果記憶部５３に
は、円ｔ１〜ｔｎに関してｎ個の投票記憶領域が設けら
れることとなる。なお、この投票記憶領域は、対象画像
に対応する範囲を有する。The voting result storage unit 53 is provided with an area for storing the results of the voting process performed by the voting unit 54 described later (hereinafter referred to as a voting storage area) for each size of shape forming the template stored in the template storage unit 52. In this example, the shapes of the respective sizes are the circles t1 to tn, so n voting storage areas, one for each of the circles t1 to tn, are provided in the voting result storage unit 53. Each voting storage area covers a range corresponding to the target image.

【００８７】投票部５４は、エッジ抽出部５１で生成さ
れたエッジ画像について、テンプレート記憶部５２に記
憶されているテンプレートを用いて、投票処理を行う。
図１４は、投票部５４で行われる投票処理の手順を示す
フローチャートである。図１４を参照して、投票部５４
は、まず、投票結果記憶部５３にアクセスして、各投票
記憶領域内の座標を表す成分（投票値）を、全て零に初
期化する（ステップＳ６０１）。次に、投票部５４は、
エッジ画像内のエッジ部の先頭画素位置に、テンプレー
トの中心点Ｐをセットする（ステップＳ６０２）。この
先頭画素位置は、例えば、エッジ画像上を左上から右上
又は左下へ順次走査して行き、最初に検出されたエッジ
部の画素の位置とすればよい。The voting unit 54 performs the voting process on the edge image generated by the edge extraction unit 51, using the template stored in the template storage unit 52. FIG. 14 is a flowchart showing the procedure of the voting process performed by the voting unit 54. Referring to FIG. 14, the voting unit 54 first accesses the voting result storage unit 53 and initializes all components (vote values) representing coordinates in each voting storage area to zero (step S601). Next, the voting unit 54 sets the center point P of the template at the first pixel position of the edge portion in the edge image (step S602). The first pixel position may be, for example, the position of the first edge pixel detected when the edge image is scanned sequentially from the upper left toward the upper right or the lower left.

【００８８】次に、投票部５４は、テンプレートを構成
する形状（この例では、円ｔ１〜ｔｎ）を特定するカウ
ンタｉを、「１」に初期化する（ステップＳ６０３）。
次に、投票部５４は、カウンタｉ（＝１）によって特定
される円ｔ１について、円ｔ１を形成する全画素のエッ
ジ画像上のｘｙ座標をそれぞれ取得する（ステップＳ６
０４）。そして、投票部５４は、投票結果記憶部５３に
設けられた円ｔ１に関する投票記憶領域において、取得
した各ｘｙ座標を表す成分にそれぞれ「１」を加算して
投票を行う（ステップＳ６０５）。この処理が終わる
と、投票部５４は、カウンタｉを１つインクリメントし
て、ｉ＝２とする（ステップＳ６０７）。次に、投票部
５４は、カウンタｉ（＝２）によって特定される円ｔ２
について、円ｔ２を形成する全画素のエッジ画像上のｘ
ｙ座標をそれぞれ取得する（ステップＳ６０４）。そし
て、投票部５４は、投票結果記憶部５３に設けられた円
ｔ２に関する投票記憶領域において、取得した各ｘｙ座
標を表す成分にそれぞれ「１」を加算して投票を行う
（ステップＳ６０５）。Next, the voting unit 54 initializes a counter i, which specifies a shape forming the template (in this example, one of the circles t1 to tn), to “1” (step S603). Next, for the circle t1 specified by the counter i (= 1), the voting unit 54 obtains the xy coordinates on the edge image of every pixel forming the circle t1 (step S604). The voting unit 54 then votes by adding “1” to the component representing each obtained xy coordinate in the voting storage area for the circle t1 provided in the voting result storage unit 53 (step S605). When this processing is finished, the voting unit 54 increments the counter i by one so that i = 2 (step S607). Next, for the circle t2 specified by the counter i (= 2), the voting unit 54 obtains the xy coordinates on the edge image of every pixel forming the circle t2 (step S604), and votes by adding “1” to the component representing each obtained xy coordinate in the voting storage area for the circle t2 provided in the voting result storage unit 53 (step S605).

【００８９】以降同様にして、投票部５４は、ｉ＝ｎに
なるまでカウンタｉを１つずつインクリメントしながら
（ステップＳ６０６，Ｓ６０７）、テンプレートを構成
する全形状である円ｔ３〜ｔｎについて、上記ステップ
Ｓ６０４及びＳ６０５の投票処理を繰り返し行う。これ
により、各円ｔ１〜ｔｎに関する投票記憶領域のそれぞ
れに、先頭画素位置における投票処理が行われることに
なる。そしてさらに、投票部５４は、エッジ部の次の画
素位置にテンプレートの中心点Ｐをセットして上記ステ
ップＳ６０３〜Ｓ６０７の処理を繰り返し行うことを、
エッジ画像内のエッジ部の全画素に対して、１回ずつ行
う（ステップＳ６０８，Ｓ６０９）。すなわち、投票部
５４による投票処理は、テンプレートの中心点Ｐがエッ
ジ部の全画素を這うように行われる。In the same manner, the voting unit 54 increments the counter i by one at a time until i = n (steps S606 and S607) and repeats the voting process of steps S604 and S605 for the remaining circles t3 to tn forming the template. As a result, the voting process at the first pixel position is performed on each of the voting storage areas for the circles t1 to tn. The voting unit 54 then sets the center point P of the template at the next pixel position of the edge portion and repeats the processing of steps S603 to S607, doing this once for every pixel of the edge portion in the edge image (steps S608 and S609). In other words, the voting process by the voting unit 54 is performed so that the center point P of the template traverses every pixel of the edge portion.
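
The voting procedure of steps S601 to S609 can be sketched as three nested loops. The data layout here is an assumption made for illustration: the edge image is a list of rows with nonzero values at edge pixels, the template is a dictionary mapping each radius to a list of `(dx, dy)` offsets from the center point P, and votes that would fall outside the image are simply discarded.

```python
def vote(edges, template):
    """Run the voting process and return {radius: voting storage area}.

    edges    : binary edge image, list of rows (nonzero = edge pixel)
    template : {radius: [(dx, dy), ...]} offsets of each circle around P
    """
    h, w = len(edges), len(edges[0])
    # S601: one voting storage area per circle, all vote values set to zero.
    votes = {r: [[0] * w for _ in range(h)] for r in template}
    # S602, S608, S609: place the center point P on every edge pixel in turn.
    for y in range(h):
        for x in range(w):
            if not edges[y][x]:
                continue
            # S603, S606, S607: loop over the circles t1..tn.
            for r, offsets in template.items():
                area = votes[r]
                # S604: coordinates of the circle pixels on the edge image.
                for dx, dy in offsets:
                    px, py = x + dx, y + dy
                    if 0 <= px < w and 0 <= py < h:
                        area[py][px] += 1  # S605: cast one vote
    return votes
```

When the edge pixels lie on a circle of radius r, the circles of radius r drawn around them all pass through the common center, so that cell of the radius-r voting storage area collects the most votes.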

【００９０】例えば、図１５に示すエッジ画像に上記投
票処理を施すことによって、投票結果記憶部５３に設け
られたｎ個の投票記憶領域には、図１６に示すような投
票数が記憶される。なお、図１６では、図面を見易くす
るため、エッジ部の一部の画素位置で投票処理が行われ
た場合を示している。図１６において、実線円の部分
が、上記ステップＳ６０５においてテンプレートの各サ
イズの形状（円ｔ１〜ｔｎ）に基づいて投票された座標
成分に相当し、座標数「１」となる。また、上述したよ
うに各投票数は累積加算されるので、図１６の実線円が
交差する部分（図中、●印で示す）は、交差する数が多
いほど投票数が高いことを表している。For example, when the voting process is applied to the edge image shown in FIG. 15, vote counts such as those shown in FIG. 16 are stored in the n voting storage areas provided in the voting result storage unit 53. To keep the drawing legible, FIG. 16 shows the case where the voting process is performed at only some of the pixel positions of the edge portion. In FIG. 16, the solid circles correspond to the coordinate components voted for in step S605 based on the shapes of each size of the template (circles t1 to tn), each having a vote count of “1”. Since the vote counts are accumulated as described above, a point where solid circles intersect in FIG. 16 (indicated by ● in the figure) has a higher vote count the more circles intersect there.

【００９１】そのため、中心点を持つ円又は楕円に近似
した顔の輪郭を表現するエッジ部に、上述した投票処理
を施せば、その中心点付近に高い投票数が集中すること
となる。従って、高い投票値が集中する部分を判断すれ
ば、顔の中心を特定することが可能になる。また、この
ような高い投票値が集中する現象は、テンプレートの中
でも、顔の輪郭を表現するエッジ部の最小幅と等しい又
は非常に近い半径を持つ円形状を用いた場合に、より顕
著に現れる。従って、この現象がどの円形状の投票記憶
領域に顕著に現れているかを判断すれば、顔の大きさを
特定することが可能になる。この点は、一般化ハフ変換
と似ていると言える。しかし、本発明の顔抽出方法で
は、同心状に複数サイズの形状を持つテンプレートを使
用することにより、エッジ部の中心点と共にその大きさ
も一度に特定できるという点で、一般化ハフ変換とは明
確に異なる。Therefore, when the voting process described above is applied to an edge portion representing the contour of a face, which approximates a circle or ellipse having a center point, high vote counts concentrate near that center point. The center of the face can thus be identified by finding where high vote values concentrate. This concentration of high vote values is more pronounced for the circle in the template whose radius is equal or very close to the minimum width of the edge portion representing the face contour. The size of the face can thus be identified by determining in which circle's voting storage area the phenomenon is most pronounced. In this respect the method resembles the generalized Hough transform. However, the face extraction method of the present invention clearly differs from the generalized Hough transform in that, by using a template with concentric shapes of multiple sizes, it identifies the size of the edge portion together with its center point at the same time.

【００９２】なお、上記ステップＳ６０１において、各
投票記憶領域内の座標を表す成分を、全て予め定めた最
大値に初期化し、上記ステップＳ６０５において、取得
した各ｘｙ座標を表す成分からそれぞれ「１」を減算し
て投票を行ってもよい。この場合、低い投票値が集中す
る部分を判断すれば、顔の中心を特定することが可能で
あり、この集中現象がどの円形状の投票記憶領域に顕著
に現れているかを判断すれば、顔の大きさを特定するこ
とが可能になる。また、上記ステップＳ６０５におい
て、投票数を加算又は減算させる値は「１」以外であっ
てもよく、値を自由に設定することができる。In step S601, all components representing coordinates in each voting storage area may instead be initialized to a predetermined maximum value, and in step S605 voting may be performed by subtracting “1” from the component representing each obtained xy coordinate. In this case, the center of the face can be identified by finding where low vote values concentrate, and the size of the face can be identified by determining in which circle's voting storage area this concentration is most pronounced. The value added or subtracted in step S605 may also be other than “1”, and can be set freely.

【００９３】次に、投票結果記憶部５３に記憶された投
票結果に基づいて、対象画像の顔領域を特定する手法を
説明する。解析部５５は、投票部５４による投票処理が
完了した後、投票結果記憶部５３に記憶された投票結果
に基づいて、そのクラスタを評価して、対象画像に含ま
れる顔の位置及び大きさを求める。図１７は、解析部５
５で行われる解析処理の手順を示すフローチャートであ
る。Next, a method of identifying the face area of the target image based on the voting results stored in the voting result storage unit 53 will be described. After the voting process by the voting unit 54 is complete, the analysis unit 55 evaluates clusters in the voting results stored in the voting result storage unit 53 and obtains the position and size of the face contained in the target image. FIG. 17 is a flowchart showing the procedure of the analysis process performed by the analysis unit 55.

【００９４】図１７を参照して、解析部５５は、まず、
テンプレートを構成する形状（この例では、円ｔ１〜ｔ
ｎ）を特定するカウンタｊを、「１」にセットする（ス
テップＳ７０１）。次に、解析部５５は、カウンタｊ
（＝１）によって特定される円ｔ１について、投票結果
記憶部５３の円ｔ１に関する投票記憶領域に記憶されて
いる投票結果を参照して、投票数が予め定めたしきい値
Ｇ（例えば、２００等）を越える成分だけを抽出する
（ステップＳ７０２）。このしきい値Ｇは、対象画像の
精細度や所望する抽出精度に基づいて、任意に定めるこ
とができる。次に、解析部５５は、抽出した成分だけを
対象に、クラスタリングを行い（ステップＳ７０３）、
クラスタ化された各領域の分散値及び共分散値をそれぞ
れ計算する（ステップＳ７０４）。このクラスタリング
における類似度は、ユークリッド平方距離、標準化ユー
クリッド平方距離、マハラノビスの汎距離又はミンコフ
スキー距離のいずれを用いて判断されてもよい。また、
クラスタの形成には、最短距離法（ＳＬＩＮＫ：single
linkage clustering method）、最長距離法（ＣＬＩＮ
Ｋ：complete linkage clustering method）又は群平均
法（ＵＰＧＭＡ：unweighted pair-group method using
arithmetic averages）のいずれを用いてもよい。Referring to FIG. 17, the analysis unit 55 first sets a counter j, which specifies a shape forming the template (in this example, one of the circles t1 to tn), to “1” (step S701). Next, for the circle t1 specified by the counter j (= 1), the analysis unit 55 refers to the voting results stored in the voting storage area for the circle t1 in the voting result storage unit 53 and extracts only the components whose vote count exceeds a predetermined threshold G (for example, 200) (step S702). The threshold G can be set freely based on the resolution of the target image and the desired extraction accuracy. Next, the analysis unit 55 performs clustering on the extracted components only (step S703) and calculates the variance and covariance of each clustered region (step S704). The similarity used in this clustering may be judged using any of the squared Euclidean distance, the standardized squared Euclidean distance, the Mahalanobis generalized distance, and the Minkowski distance. The clusters may be formed using any of the shortest-distance method (SLINK: single linkage clustering method), the longest-distance method (CLINK: complete linkage clustering method), and the group average method (UPGMA: unweighted pair-group method using arithmetic averages).

【００９５】次に、解析部５５は、クラスタ化された各
領域の分散値及び共分散値を、予め定めたしきい値Ｈと
比較する（ステップＳ７０５）。そして、このステップ
Ｓ７０５において各値がしきい値Ｈ未満の場合、解析部
５５は、その領域の中心点を顔の中心点とみなして、こ
の時のカウンタｊ（＝１）が指す円ｔ１のサイズ（直
径）を顔の短軸長とし（ステップＳ７０６）、この短軸
長に一定値（経験的に定める）を加えた長さを顔の長軸
長として決定する（ステップＳ７０７）。そして、解析
部５５は、この決定した中心点、短軸長及び長軸長を、
解析結果として保持する（ステップＳ７０８）。一方、
上記ステップＳ７０５において各値がしきい値Ｈ以上の
場合、解析部５５は、その領域の中心点が顔の中心点で
はないと判断して、次の処理に移る。Next, the analysis unit 55 compares the variance and covariance of each clustered region with a predetermined threshold H (step S705). If each value is less than the threshold H in step S705, the analysis unit 55 regards the center point of that region as the center point of the face, takes the size (diameter) of the circle t1 indicated by the counter j (= 1) at this time as the minor axis length of the face (step S706), and determines the major axis length of the face as this minor axis length plus a constant (determined empirically) (step S707). The analysis unit 55 then holds the determined center point, minor axis length, and major axis length as an analysis result (step S708). If, on the other hand, the values are equal to or greater than the threshold H in step S705, the analysis unit 55 judges that the center point of that region is not the center point of a face and proceeds to the next processing.
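
The analysis of steps S701 to S708 can be sketched as follows. Several details are assumptions chosen to keep the example short: clusters are formed by single linkage with plain Euclidean distance cut at a hypothetical `link_dist`, the covariance is compared with H by absolute value, and the names `analyse`, `single_linkage`, and `extra_length` (the empirically chosen constant added to the minor axis) do not come from the source.

```python
import math


def single_linkage(points, link_dist):
    """Group points so that any two points closer than link_dist share a cluster."""
    remaining = set(points)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            px, py = frontier.pop()
            near = [q for q in remaining
                    if math.hypot(q[0] - px, q[1] - py) <= link_dist]
            for q in near:
                remaining.remove(q)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(cluster)
    return clusters


def analyse(votes, g, h, link_dist=1.5, extra_length=0):
    """Return face candidates as dicts with center, minor_axis, major_axis."""
    results = []
    for radius, area in votes.items():                      # S701, S709, S710
        cells = [(x, y) for y, row in enumerate(area)
                 for x, v in enumerate(row) if v > g]       # S702
        for cluster in single_linkage(cells, link_dist):    # S703
            n = len(cluster)
            mx = sum(x for x, _ in cluster) / n
            my = sum(y for _, y in cluster) / n
            var_x = sum((x - mx) ** 2 for x, _ in cluster) / n   # S704
            var_y = sum((y - my) ** 2 for _, y in cluster) / n
            cov = sum((x - mx) * (y - my) for x, y in cluster) / n
            if var_x < h and var_y < h and abs(cov) < h:    # S705
                minor = 2 * radius                          # S706: circle diameter
                results.append({"center": (mx, my),
                                "minor_axis": minor,
                                "major_axis": minor + extra_length})  # S707, S708
    return results
```

A tight cluster of high vote counts has small variance and passes the test against H, while a diffuse smear of votes is rejected.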

【００９６】この処理が終わると、解析部５５は、カウ
ンタｊを１つインクリメントして、ｊ＝２とする（ステ
ップＳ７１０）。次に、解析部５５は、カウンタｊ（＝
２）によって特定される円ｔ２について、投票結果記憶
部５３の円ｔ２に関する投票記憶領域に記憶されている
投票結果を参照して、投票数が予め定めたしきい値Ｇを
越える成分だけを抽出する（ステップＳ７０２）。次
に、解析部５５は、抽出した成分だけを対象に、クラス
タリングを行い（ステップＳ７０３）、クラスタ化され
た各領域の分散値及び共分散値をそれぞれ計算する（ス
テップＳ７０４）。次に、解析部５５は、クラスタ化さ
れた各領域の分散値及び共分散値を、予め定めたしきい
値Ｈと比較する（ステップＳ７０５）。そして、このス
テップＳ７０５において各値がしきい値Ｈ未満の場合、
解析部５５は、その領域の中心点を顔の中心点とみなし
て、この時のカウンタｊ（＝２）が指す円ｔ２のサイズ
を顔の短軸長とし（ステップＳ７０６）、この短軸長に
一定値を加えた長さを顔の長軸長として決定する（ステ
ップＳ７０７）。そして、解析部５５は、この決定した
中心点、短軸長及び長軸長を、解析結果として追加して
保持する（ステップＳ７０８）。一方、上記ステップＳ
７０５において各値がしきい値Ｈ以上の場合、解析部５
５は、その領域の中心点が顔の中心点ではないと判断し
て、次の処理に移る。When this processing is finished, the analysis unit 55 increments the counter j by one so that j = 2 (step S710). Next, for the circle t2 specified by the counter j (= 2), the analysis unit 55 refers to the voting results stored in the voting storage area for the circle t2 in the voting result storage unit 53 and extracts only the components whose vote count exceeds the predetermined threshold G (step S702). The analysis unit 55 then performs clustering on the extracted components only (step S703), calculates the variance and covariance of each clustered region (step S704), and compares them with the predetermined threshold H (step S705). If each value is less than the threshold H in step S705, the analysis unit 55 regards the center point of that region as the center point of the face, takes the size of the circle t2 indicated by the counter j (= 2) at this time as the minor axis length of the face (step S706), and determines the major axis length of the face as this minor axis length plus a constant (step S707). The analysis unit 55 then adds the determined center point, minor axis length, and major axis length to the held analysis results (step S708). If, on the other hand, the values are equal to or greater than the threshold H in step S705, the analysis unit 55 judges that the center point of that region is not the center point of a face and proceeds to the next processing.

【００９７】以降同様にして、解析部５５は、ｊ＝ｎに
なるまでカウンタｊを１つずつインクリメントしながら
（ステップＳ７０９，Ｓ７１０）、投票結果記憶部５３
に記憶されている各円ｔ３〜ｔｎに関する投票記憶領域
について、上記ステップＳ７０２〜Ｓ７０８の解析処理
を繰り返し行う。これにより、各円ｔ１〜ｔｎに関する
投票記憶領域における、顔領域抽出の解析結果を得るこ
とができる。この解析結果は、表示制御部５，２５及び
送信データ処理部８へ出力される。In the same manner, the analysis unit 55 increments the counter j by one at a time until j = n (steps S709 and S710) and repeats the analysis process of steps S702 to S708 for the voting storage areas for the circles t3 to tn stored in the voting result storage unit 53. The analysis results of face area extraction are thereby obtained for the voting storage areas of all the circles t1 to tn. These analysis results are output to the display control units 5 and 25 and the transmission data processing unit 8.

【００９８】このように、実施例１の顔抽出部７では、
負担が軽い投票処理（基本的には加算処理のみ）と投票
数の評価だけで、顔の位置を高速に抽出できる。しか
も、相似で同心状の複数サイズの形状を備えたテンプレ
ートを用いているので、顔領域であろうエッジ部が、こ
れらの形状のいずれのサイズに近いかという実質的な近
似を行っていることになり、顔の大きさも高速に抽出で
きる。As described above, the face extraction unit 7 of Example 1 can extract the position of the face at high speed using only a lightweight voting process (basically additions only) and an evaluation of the vote counts. Moreover, because it uses a template with similar, concentric shapes of multiple sizes, it in effect determines which of these sizes best approximates the edge portion that is likely to be the face area, so the size of the face can also be extracted at high speed.

【００９９】＜実施例２＞次に、実施例２として、直交
変換後の空間でパターンマッチングを行うことにより処
理量の削減を図り、携帯電話等のような限られた処理量
を要求される端末において有効となる手法を説明する。
図１８は、実施例２の顔抽出部７の構成を示すブロック
図である。図１８において、顔抽出部７は、テンプレー
ト画像処理部８０と、入力画像処理部９０と、積算部１
０１と、逆直交変換部（逆ＦＦＴ）１０２と、マップ処
理部１０３とを備える。この実施例２の手法は、テンプ
レート画像処理部８０及び入力画像処理部９０におい
て、テンプレート画像及び入力画像（対象画像）にそれ
ぞれ線形性を有する直交変換を施し、それらを積算した
後に逆直交変換して、類似値Ｌを求めるものである。<Example 2> Next, as Example 2, a method will be described that reduces the amount of processing by performing pattern matching in the space after orthogonal transformation, and that is therefore effective in terminals with limited processing capacity, such as mobile phones. FIG. 18 is a block diagram showing the configuration of the face extraction unit 7 of Example 2. In FIG. 18, the face extraction unit 7 includes a template image processing unit 80, an input image processing unit 90, an integration unit 101, an inverse orthogonal transformation unit (inverse FFT) 102, and a map processing unit 103. In the method of Example 2, the template image processing unit 80 and the input image processing unit 90 apply an orthogonal transformation having linearity to the template image and the input image (target image), respectively; the results are integrated and then inverse orthogonally transformed to obtain a similarity value L.
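
The saving comes from the correlation theorem: a correlation over all shifts can be obtained from one forward transform of each signal, a pointwise combination, and one inverse transform. The one-dimensional sketch below checks this numerically. It is illustrative only: it assumes the combination step is a pointwise product with the complex conjugate of the template spectrum, uses a naive O(n²) DFT in place of an FFT to stay dependency-free, and works on scalar signals rather than the evaluation vectors used in Example 2. All function names are hypothetical.

```python
import cmath


def dft(signal, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def correlate_via_dft(template, image):
    """Circular cross-correlation computed in the transform domain."""
    t_spec = dft(template)
    i_spec = dft(image)
    product = [t.conjugate() * i for t, i in zip(t_spec, i_spec)]
    return [v.real for v in dft(product, inverse=True)]


def correlate_directly(template, image):
    """Reference: L(s) = sum over t of template[t] * image[(t + s) mod n]."""
    n = len(image)
    return [sum(template[t] * image[(t + s) % n] for t in range(n))
            for s in range(n)]


# Usage: the template (zero-padded to the image length) appears at shift 3.
template = [1, 2, 0, 0, 0, 0, 0, 0]
image = [0, 0, 0, 1, 2, 0, 0, 0]
similarity = correlate_via_dft(template, image)
best_shift = max(range(len(similarity)), key=lambda s: similarity[s])
```

Because the template spectrum does not depend on the input image, it can be computed once in advance, which is presumably why the template side of FIG. 18 includes a recording unit.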

【０１００】ここで、実施例２では、直交変換としてＦ
ＦＴ（高速離散フーリエ変換）を使用する場合を説明す
るが、この他にＨａｒｔｌｅｙ変換や数論的変換等を用
いることもできる。これら他の変換方法を使用する場合
には、以下の説明中の「フーリエ変換」とある部分を、
これらの変換方法に読み替えればよい。また、テンプレ
ート画像処理部８０及び入力画像処理部９０のいずれに
おいても、エッジ法線方向ベクトルの内積を利用し、エ
ッジ法線方向ベクトルの方向が近いほど、高い相関が出
るようにしている。しかも、この内積は、偶数倍角表現
を用いて評価される。以下簡単のため、偶数倍角の例と
して２倍角の場合を説明するが、４倍角や６倍角等の他
の偶数倍角においても、実施例２と同様の効果を奏する
ことができる。In Example 2, the case where the FFT (fast discrete Fourier transform) is used as the orthogonal transformation is described, but the Hartley transform, a number-theoretic transform, or the like can also be used. When one of these other transformations is used, “Fourier transform” in the following description should be read as that transformation. In both the template image processing unit 80 and the input image processing unit 90, the inner product of edge normal direction vectors is used, so that the closer the directions of the edge normal direction vectors are, the higher the correlation. Moreover, this inner product is evaluated using an even-multiple-angle representation. For simplicity, the double-angle case is described below as an example of an even multiple angle, but other even multiples such as the quadruple or sextuple angle provide the same effect as Example 2.

【０１０１】まず、テンプレート画像処理部８０につい
て説明する。図１８において、テンプレート画像処理部
８０は、エッジ抽出部８１と、評価ベクトル生成部８２
と、直交変換部（ＦＦＴ）８３と、圧縮部８４と、記録
部８５とを備える。First, the template image processing unit 80 will be described. In FIG. 18, the template image processing unit 80 includes an edge extraction unit 81, an evaluation vector generation unit 82, an orthogonal transformation unit (FFT) 83, a compression unit 84, and a recording unit 85.

【０１０２】エッジ抽出部８１は、入力されるテンプレ
ート画像に対して、ｘ方向及びｙ方向のそれぞれについ
て微分処理（エッジ抽出）を施し、テンプレート画像の
エッジ法線方向ベクトルを出力する。本実施例２では、
ｘ方向について、The edge extraction unit 81 applies differentiation (edge extraction) to the input template image in each of the x and y directions and outputs the edge normal direction vector of the template image. In Example 2, the following Sobel filter is used for the x direction:

【数１】なるＳｏｂｅｌフィルタを用い、ｙ方向について、(Equation 1) and the following Sobel filter is used for the y direction:

【数２】なるＳｏｂｅｌフィルタを用いている。これらのフィル
タ（１）及び（２）より、次式（３）で定義されるテン
プレート画像のエッジ法線方向ベクトルが求められる。(Equation 2) From these filters (1) and (2), the edge normal direction vector of the template image defined by the following equation (3) is obtained.

【数３】 (Equation 3)
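
Equations (1) to (3) are not reproduced in this text. For orientation only, the block below gives the commonly used 3 x 3 Sobel kernels and a two-component edge normal direction vector. The sign convention of the kernels and the symbols $S_x$, $S_y$, $\vec{T}$, $T_X$, $T_Y$ are assumptions and may differ from the original equations.

```latex
% Assumed standard Sobel kernels for the x and y directions
S_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix},
\qquad
S_y = \begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix}

% Edge normal direction vector of the template image: the x component is the
% response to S_x and the y component is the response to S_y (assumed notation)
\vec{T} = (T_X,\; T_Y)
```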

【０１０３】The evaluation vector generation unit 82 receives the edge normal direction vector of the template image from the edge extraction unit 81, performs the processing described below, and outputs the evaluation vector of the template image to the orthogonal transform unit 83. First, the evaluation vector generation unit 82 normalizes the edge normal direction vector of the template image with respect to its length using the following equation (4).

(Equation 4)

This takes into account that, when imaging conditions change (for example, when the illumination fluctuates), the strength (length) of an edge is easily affected, whereas the angle of an edge is hardly affected. In the second embodiment, as described later, the input image processing unit 90 normalizes the edge normal direction vector of the target image to length "1". Accordingly, the template image processing unit 80 also normalizes the edge normal direction vector of the template image to length "1". As is well known, the double-angle formulas of the following equation (5) hold for trigonometric functions.

(Equation 5)

Using these double-angle formulas, the edge vector is normalized based on the following equation (6).

(Equation 6)
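The normalization and double-angle steps can be sketched as follows. Since equations (4) to (6) are not legible here, the exact form is an assumption: the vector is reduced to unit length, vectors shorter than a threshold `a` are zeroed, and the result is (cos 2θ, sin 2θ) computed from the standard double-angle formulas. The function name `evaluation_vector` and the default value of `a` are hypothetical.

```python
import math


def evaluation_vector(tx, ty, a=1.0):
    """Map an edge normal vector (tx, ty) to a double-angle evaluation vector.

    `a` is the small-edge threshold; its default value is arbitrary.
    """
    length = math.hypot(tx, ty)
    if length < a:
        return (0.0, 0.0)               # weak edges are treated as noise
    ux, uy = tx / length, ty / length   # unit vector (cos t, sin t)
    # Double-angle formulas: cos 2t = 2 cos^2 t - 1, sin 2t = 2 cos t sin t.
    return (2.0 * ux * ux - 1.0, 2.0 * ux * uy)


outward = evaluation_vector(3.0, 4.0)
inward = evaluation_vector(-3.0, -4.0)   # same edge, opposite contrast
print(outward, inward)                   # the two results coincide
```

The example shows the property the description relies on: reversing the direction of the edge normal leaves the evaluation vector unchanged.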

【０１０４】Equation (6) is explained below. First, the constant a is a threshold for removing minute edges; vectors shorter than a are set to the zero vector in order to remove noise and the like. Next, the reason why the x and y components are functions of the cosine and sine of the double angle of the x and y components of equation (4) is explained. Let θ be the angle between the evaluation vector T of the template and the evaluation vector I of the target image. If their inner product, that is, cos θ, is used as the similarity measure, the following problem arises. Suppose, for example, that the template image is as shown in FIG. 19(a) and the target image is as shown in FIG. 19(b). In the background of FIG. 19(b), the left half is brighter than the object and the right half is darker than the object. Judging from the images alone, when the center of the template image of FIG. 19(a) coincides with the center of the target image of FIG. 19(b), the objects match completely, so the similarity value must be at its maximum. If an edge normal direction vector pointing outward from the object is taken as positive, the vectors ought to point the same way (outward or inward) as seen from the object in both the bright and the dark background portions of FIG. 19(b). However, when the luminance of the background in FIG. 19(b) differs between the left and right of the object, the directions become opposite, as indicated by the arrows in FIG. 19(b) (outward from the object in the bright background portion and inward in the dark background portion). In such a case, the similarity value does not necessarily become high even though it should be at its maximum, and misrecognition is likely.

【０１０５】The above point is explained in more detail with reference to FIG. 20. When the inner product cos θ, where θ is the angle between the evaluation vector T of the template image and the evaluation vector I of the target image, is used as the similarity value, the evaluation vector of the target image may, as described above, point either in the direction I or in the exactly opposite direction I′, depending on luminance variations in the background around the object. The inner product used as the similarity measure can therefore take either of two values, cos θ and cos θ′. Moreover, θ + θ′ = π, so cos θ = cos(π − θ′) = −cos θ′. That is, when cos θ is used as the similarity measure, a contribution that should increase the similarity value may instead decrease it, and a contribution that should decrease the similarity value may instead increase it.

【０１０６】Therefore, in the second embodiment, the cosine of the double angle of θ (cos 2θ) is used in the similarity value formula. Then, even if cos θ′ = −cos θ, the double-angle formula of equation (5) gives cos 2θ′ = cos 2θ. That is, where a contribution should increase the similarity value, the similarity value becomes high regardless of the background. Accordingly, image matching can be evaluated properly even when the background has luminance variations. The same holds not only for the double angle but also for the quadruple angle, the sextuple angle, and so on. Even-multiple-angle evaluation thus allows a pattern to be extracted stably regardless of the luminance conditions of the background. Besides this representation, the vector can also be expressed by a single value instead of the two values Tx and Ty, namely the value of θ for which cos θ = Tx and sin θ = Ty (that is, the phase angle when the edge normal direction vector is expressed in polar coordinates). Furthermore, if θ is expressed not as 0 to 360 degrees but, for example, in 8 bits, with negative values in two's complement binary (that is, as −128 to 127), then adding 1 to 127 yields −128, so the representation is cyclic. Consequently, in the double-angle calculation and the similarity value calculation on θ, the wrap-around from values exceeding 127 to −128 is performed automatically.
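The cyclic 8-bit representation of the angle can be illustrated with a short sketch. It assumes that a full turn is mapped onto 256 steps, so that a half turn is 128 steps; this scaling and the helper names `wrap_int8`, `angle_to_int8` and `double_angle_int8` are assumptions for illustration. Python integers do not overflow, so the wrap-around that 8-bit hardware performs implicitly is written out explicitly.

```python
import math


def wrap_int8(value):
    """Wrap an integer into the signed 8-bit range -128..127."""
    return (value + 128) % 256 - 128


def angle_to_int8(theta):
    """Quantise an angle in radians; one full turn spans 256 steps (assumed)."""
    return wrap_int8(round(theta / (2.0 * math.pi) * 256))


def double_angle_int8(q):
    """Doubling is a plain multiply; overflow wraps around automatically."""
    return wrap_int8(2 * q)


q = 20                          # some quantised edge direction
q_opposite = wrap_int8(q + 128) # the same edge with reversed contrast
print(wrap_int8(127 + 1))       # -128
print(double_angle_int8(q), double_angle_int8(q_opposite))  # 40 40
```

Because a half turn is exactly 128 steps, doubling maps opposite directions to the same value without any conditional code.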

【０１０７】Next, the calculation of the similarity value is described. Specifically, in the second embodiment, the similarity value L is defined by the following equation (7).

(Equation 7)

When the evaluation vectors are expressed as Vθ and Tθ instead of (Vx, Vy) and (Tx, Ty), the following equation (8) is used.

(Equation 8)

Here, an evaluation vector is called a vector even when it has only one element.
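For orientation, a direct (spatial-domain) similarity computation might look like the following sketch. Equation (7) is not legible in this text, so the form used here, a sum over the template window of per-pixel inner products of the two evaluation vectors, is an assumption, as are the function name `similarity_map` and the argument names `target` and `template`.

```python
def similarity_map(target, template):
    """Slide the template over the target and sum per-pixel inner products.

    Both arguments are 2-D grids of (x, y) evaluation vectors. The result
    holds one value per offset at which the template fits entirely.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(target), len(target[0])
    result = []
    for y in range(ih - th + 1):
        row = []
        for x in range(iw - tw + 1):
            total = 0.0
            for j in range(th):
                for i in range(tw):
                    vx, vy = target[y + j][x + i]
                    tx, ty = template[j][i]
                    total += vx * tx + vy * ty   # inner product per pixel
            row.append(total)
        result.append(row)
    return result


template = [[(1.0, 0.0), (0.0, 1.0)]]
target = [[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]]
print(similarity_map(target, template))  # [[0.0, 2.0]]
```

The map peaks at the offset where the template's vectors line up with those of the target.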

【０１０８】Since equations (7) and (8) consist only of additions and multiplications, the similarity value L is linear in each of the evaluation vectors of the target image and the template image. Therefore, applying the Fourier transform to equations (7) and (8) yields, by the discrete correlation theorem of the Fourier transform,

(Equation 9)

(Equation 10)

In equations (9) and (10), "~" denotes a Fourier-transformed value and "*" denotes the complex conjugate.
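The discrete correlation theorem invoked here can be checked numerically on a small one-dimensional example. The sketch below uses a naive O(n²) DFT as a stand-in for an FFT and a circular (wrap-around) correlation; these simplifications and all function names are assumptions made for illustration, not the publication's implementation.

```python
import cmath


def dft(seq, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(seq[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def circular_correlation_direct(target, template):
    """L[x] = sum_i target[(x + i) mod n] * template[i]."""
    n = len(target)
    return [sum(target[(x + i) % n] * template[i] for i in range(n))
            for x in range(n)]


def circular_correlation_dft(target, template):
    """Same values via DFT(L) = DFT(target) * conj(DFT(template))."""
    ft_target = dft(target)
    ft_template = dft(template)
    product = [a * b.conjugate() for a, b in zip(ft_target, ft_template)]
    return [v.real for v in dft(product, inverse=True)]


target = [1.0, 2.0, 3.0, 4.0]
template = [1.0, 0.0, -1.0, 0.0]   # zero-padded to the target length
direct = circular_correlation_direct(target, template)
via_dft = circular_correlation_dft(target, template)
print(direct)   # [-2.0, -2.0, 2.0, 2.0]
```

Both routes give the same correlation values up to rounding, which is what allows the template's transform to be computed once and reused.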

【０１０９】Applying the inverse Fourier transform to equation (9) or (10) yields the similarity value L of equation (7) or (8). Equations (9) and (10) make the following two points clear.
1. In the transformed domain after the orthogonal transform, the Fourier-transformed values of the template image and those of the target image need only be multiplied and summed.
2. The Fourier-transformed values of the template image and those of the target image need not be computed at the same time; those of the template image may be computed before those of the target image.

【０１１０】Therefore, in the second embodiment, the recording unit 85 is provided in the template image processing unit 80, and the output of the compression unit 84 is stored before the target image is input. Once the target image has been input to the input image processing unit 90, the template image processing unit 80 does not need to process the template image at all. The processing capacity of the image communication terminal can therefore be concentrated on the input image processing unit 90 and on the processing from the integration unit 101 onward, which further speeds up processing.

【０１１１】Next, the configuration downstream of the evaluation vector generation unit 82 is described. As shown in FIG. 18, in the template image processing unit 80, the evaluation vector of the template image output from the evaluation vector generation unit 82 is Fourier-transformed by the orthogonal transform unit 83 and output to the compression unit 84. The compression unit 84 reduces the Fourier-transformed evaluation vector and stores it in the recording unit 85. As shown in FIG. 21, the transformed evaluation vector contains a variety of high and low frequency components in both the x and y directions. Experiments by the present inventors have shown that sufficient accuracy is obtained by processing only the low-frequency components (for example, the lower half of the frequencies in each of the x and y directions) rather than all frequency components. In FIG. 21, the unhatched region (−a ≤ x ≤ a, −b ≤ y ≤ b) is the original region, and the hatched region (−a/2 ≤ x ≤ a/2, −b/2 ≤ y ≤ b/2) is the region after reduction. The amount of processing thus becomes 1/4. Reducing the data to be processed in this way enables even faster processing. The compression unit 84 and the recording unit 85 may be omitted when the amount of data is small or when high speed is not required.
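The reduction to the low-frequency quarter of the spectrum can be sketched as below. The sketch assumes the usual DFT bin ordering, in which indices above n/2 represent negative frequencies, and keeps the half-open band −n/4 ≤ f < n/4 in each direction so that exactly one quarter of the bins survives; both choices, and the names `signed_frequency`, `in_low_band` and `keep_low_frequencies`, are assumptions.

```python
def signed_frequency(k, n):
    """Map DFT bin index k (0..n-1) to its signed frequency (-n/2..n/2-1)."""
    return k if k < n // 2 else k - n


def in_low_band(k, n):
    """True if bin k lies in the lower half of the band (assumed half-open)."""
    f = signed_frequency(k, n)
    return -(n // 4) <= f < n // 4


def keep_low_frequencies(spectrum):
    """Return {(ky, kx): value} for the bins retained in both directions."""
    h, w = len(spectrum), len(spectrum[0])
    return {(ky, kx): spectrum[ky][kx]
            for ky in range(h) if in_low_band(ky, h)
            for kx in range(w) if in_low_band(kx, w)}


spectrum = [[complex(ky, kx) for kx in range(8)] for ky in range(8)]
kept = keep_low_frequencies(spectrum)
print(len(kept), "of", 8 * 8)   # 16 of 64, i.e. one quarter
```

The same selection would have to be applied to the template and the target so that the retained bins correspond.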

【０１１２】Next, the input image processing unit 90 is described. In FIG. 18, the input image processing unit 90 includes an edge extraction unit 91, an evaluation vector generation unit 92, an orthogonal transform unit (FFT) 93, and a compression unit 94. The input image processing unit 90 performs processing equivalent to that of the template image processing unit 80. That is, the edge extraction unit 91 outputs the edge normal direction vector of the target image using equations (1) and (2). The evaluation vector generation unit 92 receives the edge normal direction vector of the target image from the edge extraction unit 91 and generates an evaluation vector by processing equivalent to that of the evaluation vector generation unit 82 of the template image processing unit 80. The evaluation vector of the target image output from the evaluation vector generation unit 92 is Fourier-transformed by the orthogonal transform unit 93 and output to the compression unit 94. The compression unit 94 reduces the Fourier-transformed evaluation vector and outputs it to the integration unit 101. Here, the compression unit 94 reduces the data to the same frequency band as the compression unit 84 of the template image processing unit 80.

【０１１３】Next, the integration unit 101 and the subsequent stages are described. When the processing of the template image processing unit 80 and the input image processing unit 90 is complete, the integration unit 101 receives the Fourier-transformed values of the evaluation vectors of the template image and the target image from the recording unit 85 and the compression unit 94, respectively. The integration unit 101 then performs the product-sum operation of equation (9) or (10) and outputs the result (the Fourier-transformed value of the similarity value L) to the inverse orthogonal transform unit 102. The inverse orthogonal transform unit 102 applies the inverse Fourier transform to this value and outputs a map L(x, y) of the similarity value L to the map processing unit 103. The map processing unit 103 extracts high-valued points (peaks) from the map L(x, y) and outputs their positions and values. The stages after the map processing unit 103 can be configured freely as needed.
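The peak extraction performed on the similarity map can be sketched in a few lines. This version returns only the single highest point; how many peaks the map processing unit actually reports is not stated here, so that choice and the name `find_peak` are assumptions.

```python
def find_peak(similarity_map):
    """Return ((x, y), value) of the highest entry in a 2-D similarity map."""
    best_pos, best_val = None, float("-inf")
    for y, row in enumerate(similarity_map):
        for x, value in enumerate(row):
            if value > best_val:
                best_pos, best_val = (x, y), value
    return best_pos, best_val


peak = find_peak([[0.1, 0.3],
                  [0.9, 0.2]])
print(peak)   # ((0, 1), 0.9)
```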

【０１１４】Let the size of the target image be A (= 2^γ) and the size of the template image be B. Scanning the template image sequentially over the target image and computing the correlation value at each position requires a number of multiplications equal to 2AB. Here the number of operations is counted as the number of multiplications, which are computationally expensive. In contrast, the second embodiment requires two FFTs by the orthogonal transform units 83 and 93, the product-sum calculation by the integration unit 101, and one inverse FFT by the inverse orthogonal transform unit 102, so that a number of multiplications equal to 3{(2γ − 4)A + 4} + 2A suffices. Comparing the two, for example with A = 256 × 256 = 2^16 and B = 60 × 60, the number of multiplications in the second embodiment is about 1/100 of that of the direct method, which enables very fast processing and reduces the amount of processing.
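The two operation counts can be evaluated directly for the example values given in the text (A = 2^16, γ = 16, B = 60 × 60). The function names below are hypothetical; the formulas are those stated in the paragraph.

```python
def direct_multiplications(a, b):
    """Multiplications for scanning the template over the image: 2AB."""
    return 2 * a * b


def fft_multiplications(gamma):
    """Multiplications for the FFT-based method: 3{(2*gamma - 4)A + 4} + 2A."""
    a = 2 ** gamma
    return 3 * ((2 * gamma - 4) * a + 4) + 2 * a


direct_count = direct_multiplications(256 * 256, 60 * 60)
fft_count = fft_multiplications(16)
print(direct_count)              # 471859200
print(fft_count)                 # 5636108
print(direct_count / fft_count)  # roughly 84
```

The ratio works out to roughly 1/84, the same order of magnitude as the "about 1/100" quoted in the text.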

【０１１５】As described above, the face extraction unit 7 of the second embodiment can extract the position of a face with a small amount of processing. The position and size of a face can therefore be extracted even where the available amount of processing is limited, as in a portable image communication terminal. In addition, the double-angle representation makes it possible to extract a face stably even where, as with a portable image communication terminal, the place and time of shooting are not restricted and every shooting condition must be anticipated.

【０１１６】<Embodiment 3> With the face extraction methods of the first and second embodiments, even when no face is present in the target image, the part that most resembles a face is forcibly extracted as a face area. As a third embodiment, a method is therefore described for further determining whether the position and size extracted by the face extraction methods of the first and second embodiments truly correspond to a face.

【０１１７】To achieve this, a component that determines whether the extracted face area is truly a face (a face/non-face determination unit) is provided after the analysis unit 55 of the first embodiment shown in FIG. 12, or after the map processing unit 103 of the second embodiment shown in FIG. 18. When the face/non-face determination unit is provided after the analysis unit 55 of the first embodiment, the simplest approach is to set a threshold for face/non-face determination in advance and to judge the area to be a face if a value obtained from the vote value and the face size of the area output from the analysis unit 55 exceeds this threshold. Here, the value obtained from the vote value and the face size is the vote value divided by the face size. This is done because the vote value is proportional to the face size and is thereby normalized by the face size. When the face/non-face determination unit is provided after the map processing unit 103 of the second embodiment, the simplest approach is likewise to set a threshold in advance and to judge the area to be a face if the similarity value of the area output from the map processing unit 103 exceeds this threshold. Although the first and second embodiments were described for the case where the face extraction unit 7 outputs a single face area, the face/non-face determination of the third embodiment can also be applied when a plurality of face areas are output.
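The two threshold tests described above reduce to a few lines. The function names and the guard against a non-positive face size are assumptions; the threshold values themselves are not given in the text and would have to be chosen empirically.

```python
def is_face_by_votes(vote_value, face_size, threshold):
    """First-embodiment style: normalise the vote value by face size, then compare."""
    if face_size <= 0:
        return False                      # defensive guard, not from the text
    return vote_value / face_size > threshold


def is_face_by_similarity(similarity_value, threshold):
    """Second-embodiment style: compare the peak similarity value directly."""
    return similarity_value > threshold


# Two candidates with the same vote value: the larger region is less convincing.
print(is_face_by_votes(120.0, 40.0, 2.5))   # True  (3.0 > 2.5)
print(is_face_by_votes(120.0, 60.0, 2.5))   # False (2.0 <= 2.5)
```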

【０１１８】A face area judged not to be a face by the face/non-face determination unit is not output from the face extraction unit 7 to the display control unit 5 or the transmission data processing unit 8. When no face area is output from the face extraction unit 7, the transmission data processing unit 8 of the first embodiment described above does not move the transmission area 31 and continues to use the transmission area 31 of the previous time. When no face area is output for a certain period, the transmission area 31 is set to its initial position (for example, the center of the shooting area 30).
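The fallback behaviour of the transmission area can be sketched as a small state holder. Measuring the "certain period" in frames, the class name `TransmissionAreaTracker`, and passing `None` to signal that no face area was output are all assumptions made for illustration.

```python
class TransmissionAreaTracker:
    """Keeps the transmission area position across frames (illustrative only)."""

    def __init__(self, initial_position, timeout_frames):
        self.initial_position = initial_position
        self.position = initial_position
        self.timeout_frames = timeout_frames
        self.missing_frames = 0

    def update(self, new_position):
        """`new_position` is None when no face area was output for this frame."""
        if new_position is not None:
            self.position = new_position
            self.missing_frames = 0
        else:
            self.missing_frames += 1
            if self.missing_frames >= self.timeout_frames:
                self.position = self.initial_position   # reset to the centre
        return self.position


tracker = TransmissionAreaTracker(initial_position=(80, 60), timeout_frames=3)
history = [tracker.update(p) for p in [(10, 10), None, None, None]]
print(history)   # [(10, 10), (10, 10), (10, 10), (80, 60)]
```

The previous position is held while the face is briefly lost, and the area returns to its initial position only after the timeout.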

【０１１９】Instead of the threshold-based determination described above, face/non-face determination can also be performed using a support vector function. Face/non-face determination using a support vector function is outlined below. The support vector technique itself is well known and is described in detail in "Classification of multiple categories by Support Vector Machines" (IEICE Technical Report PRMU98-36 (1998-06)).

【０１２０】FIG. 22 is a block diagram showing, of the configuration of the face extraction unit 7 of the third embodiment, the components added to the configurations of the first and second embodiments. In FIG. 22, the additional components of the third embodiment are an image size normalization unit 111, a feature vector extraction unit 112, a face/non-face determination unit 113, and a face/non-face learning dictionary 114. The configuration of FIG. 22 is added after the analysis unit 55 of the first embodiment or after the map processing unit 103 of the second embodiment.

【０１２１】The image size normalization unit 111 cuts out, from the target image, the image of the face area output from the analysis unit 55 or the map processing unit 103. For the cut-out image (hereinafter, the face area candidate image), the image size normalization unit 111 obtains an image feature at each pixel (for example, edge strength, color value, or luminance value) and then normalizes the image to a fixed size. Here, an example is described in which the face area candidate image is enlarged or reduced (that is, normalized) to 10 × 10 pixels. The feature vector extraction unit 112 acquires the luminance information of the normalized face area candidate image as one kind of feature data. In this example, since the image is normalized to 10 × 10 pixels, a 100-dimensional feature vector xi (0 ≤ i < 100) is obtained.
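The size normalization and flattening into a 100-dimensional vector can be sketched as follows. The text does not say which resampling method is used, so nearest-neighbour sampling is assumed here purely for brevity; the function names are hypothetical.

```python
def normalize_size(image, size=10):
    """Nearest-neighbour resampling of a 2-D luminance grid to size x size."""
    h, w = len(image), len(image[0])
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def feature_vector(image, size=10):
    """Flatten the normalised image row by row into a size*size vector."""
    return [value for row in normalize_size(image, size) for value in row]


# A 20 x 20 candidate region whose pixel value encodes its own position.
candidate = [[r * 20 + c for c in range(20)] for r in range(20)]
x = feature_vector(candidate)
print(len(x))   # 100
```

The same functions also enlarge a region smaller than 10 × 10, since the source index is always computed from the target index.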

【０１２２】The feature vector extraction unit 112 may instead extract edge normal direction vectors as the feature vector. Specifically, an X-direction Sobel filter and a Y-direction Sobel filter are applied to the face area candidate image, and a direction vector is calculated from the X-direction and Y-direction strengths at each pixel. Since this calculation yields both an angle and a strength, the strength is ignored and only the angle is taken. Each direction is then normalized on a 256-level scale and used as the feature vector. The feature vector extraction unit 112 may also compute a histogram over the normalized angles within the face area candidate image and extract this histogram of edge normals as the feature vector.
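A sketch of the angle-only feature follows. It assumes that the 256 levels cover one full turn starting at the positive x axis and that pixels with a zero gradient are skipped when building the histogram; the number of histogram bins (16) and all names are likewise assumptions.

```python
import math


def quantized_angle(gx, gy, levels=256):
    """Quantise the direction of (gx, gy) to 0..levels-1, ignoring magnitude."""
    theta = math.atan2(gy, gx) % (2.0 * math.pi)   # 0 <= theta < 2*pi
    return int(theta / (2.0 * math.pi) * levels) % levels


def angle_histogram(gradients, levels=256, bins=16):
    """Histogram of quantised edge directions over a list of (gx, gy) pairs."""
    hist = [0] * bins
    for gx, gy in gradients:
        if gx == 0 and gy == 0:
            continue                      # no edge at this pixel
        hist[quantized_angle(gx, gy, levels) * bins // levels] += 1
    return hist


hist = angle_histogram([(1, 0), (2, 0), (0, 1), (0, 0)])
print(quantized_angle(0, 1))   # 64, a quarter turn
print(hist[0], hist[4])        # 2 1
```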

【０１２３】The face/non-face determination unit 113 then determines whether the face area is a face or a non-face by the following formula, using the feature images and parameters prepared in advance in the face/non-face learning dictionary 114.

(Equation 11)

Here, K() denotes the kernel function, αi the corresponding Lagrange coefficient, and yi the teacher data, which is +1 when the learning-dictionary entry is a face and −1 when it is a non-face. Besides the kernel function of equation (12) above, a polynomial kernel K(Si, Xi) = (Si·Xi + 1) or a two-layer neural network kernel K(Si, Xi) = tanh(Si·Xi − δ) can also be used.

【０１２４】FIG. 23 shows the results of face/non-face discrimination. The face/non-face determination unit 113 judges the face area candidate image to be a face image when the result of equation (12) above is greater than 0, and a non-face image when it is less than 0. Face/non-face determination is performed in the same way for the other face area candidate images. In the example of FIG. 23, image 121 is judged to be a face image and images 122 to 124 are judged to be non-face images. For the face/non-face learning dictionary 114, face images and non-face images are prepared as teacher data, and the dictionary is created using the same feature data as is used for discrimination.
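The sign test on a support vector decision function can be sketched as below. Equations (11) and (12) are not legible in this text, so the sketch uses the textbook form f(x) = Σ αi·yi·K(si, x) − b with a Gaussian kernel; the kernel choice, the bias term `bias`, and every name and numeric value are assumptions rather than the publication's formula.

```python
import math


def gaussian_kernel(s, x, sigma=1.0):
    """Assumed kernel: exp(-|s - x|^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(s, x))
    return math.exp(-sq / (2.0 * sigma ** 2))


def svm_decision(x, support_vectors, alphas, labels, bias=0.0):
    """f(x) = sum_i alpha_i * y_i * K(s_i, x) - bias."""
    return sum(a * y * gaussian_kernel(s, x)
               for s, a, y in zip(support_vectors, alphas, labels)) - bias


def is_face(x, support_vectors, alphas, labels, bias=0.0):
    """Face if the decision value is greater than 0, non-face otherwise."""
    return svm_decision(x, support_vectors, alphas, labels, bias) > 0


# Toy dictionary: one face example (+1) and one non-face example (-1).
dictionary = [[0.0, 0.0], [10.0, 10.0]]
alphas = [1.0, 1.0]
labels = [+1, -1]
print(is_face([0.5, 0.5], dictionary, alphas, labels))   # True
print(is_face([9.0, 9.0], dictionary, alphas, labels))   # False
```

In practice the feature vector would be the 100-dimensional vector described earlier, and the support vectors and coefficients would come from training on the teacher data.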

【０１２５】As described above, the face extraction unit 7 of the third embodiment can extract the face area stably even when the actual face is not the first candidate for the face area. Moreover, since it can determine that there is no face when the image contains none, cases in which the face position need not be moved for display can be detected automatically.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of an image communication terminal according to the first embodiment of the present invention.

FIG. 2 is a flowchart showing the procedure of the tracking process performed by the transmission data processing unit 8.

FIG. 3 is a diagram illustrating the relationship between the shooting area 30 and the transmission area 31.

FIG. 4 is a diagram illustrating the relationship between the shooting area 30 and the transmission area 31.

FIG. 5 is a diagram illustrating the relationship between the shooting area 30 and the transmission area 31.

FIG. 6 is a diagram illustrating the relationship between the shooting area 30 and the transmission area 31.

FIG. 7 is a block diagram showing the configuration of an image communication terminal according to the second embodiment of the present invention.

FIG. 8 is a diagram showing an example of a mark displayed on the screen of the display unit 3.

FIG. 9 is a diagram showing an example of a mark displayed on the screen of the display unit 3.

FIG. 10 is a diagram showing an example of a mark notified using the numeric keypad of the input unit 22.

FIG. 11 is a diagram showing an example of the image of the user 1 side displayed on the screen of the partner's information processing apparatus.

FIG. 12 is a block diagram showing the configuration of the face extraction unit 7 of the first embodiment.

FIG. 13 is a diagram showing an example of a template stored in the template storage unit 52.

FIG. 14 is a flowchart showing the procedure of the voting process performed by the voting unit 54.

FIG. 15 is a diagram illustrating an example of an edge image extracted by the edge extraction unit 51.

FIG. 16 is a diagram illustrating the concept of the number of votes stored in the vote storage area of the voting result storage unit 53 by the voting process.

FIG. 17 is a flowchart showing the procedure of the analysis process performed by the analysis unit 55.

FIG. 18 is a block diagram showing the configuration of the face extraction unit 7 of the second embodiment.

FIG. 19 is a diagram showing an example of a template image and a target image input to the edge extraction units 81 and 91.

FIG. 20 is a diagram illustrating the sign inversion of the inner product.

FIG. 21 is a diagram illustrating the compression of an evaluation vector.

FIG. 22 is a block diagram showing part of the configuration of the face extraction unit 7 of the third embodiment.

FIG. 23 is a diagram showing an example of face/non-face discrimination results produced by the face/non-face determination unit 113.

[Explanation of symbols]

1 ... user
2, 22 ... input unit
3 ... display unit
4 ... camera unit
5, 25 ... display control unit
6 ... own-image memory
7 ... face extraction unit
8 ... transmission data processing unit
9 ... communication unit
10 ... reception data processing unit
11 ... partner image memory
12 ... notification unit
30 ... shooting area
31 ... transmission area
32 ... effective area
51, 81, 91 ... edge extraction unit
52 ... template storage unit
53 ... voting result storage unit
54 ... voting unit
55 ... analysis unit
82, 92 ... evaluation vector generation unit
83, 93 ... orthogonal transform unit (FFT)
84, 94 ... compression unit
85 ... recording unit
101 ... integration unit
102 ... inverse orthogonal transform unit (inverse FFT)
103 ... map processing unit
111 ... image size normalization unit
112 ... feature vector extraction unit
113 ... face/non-face determination unit
114 ... face/non-face learning dictionary
121 to 124 ... image
R ... mark
t1 to tn ... circular shape

Continuation of front page

(51) Int.Cl.7: H04N 1/46, H04N 7/14
FI: H04N 7/14, H04N 1/46 Z

(72) Inventors (each c/o Matsushita Electric Industrial Co., Ltd., 1006 Oaza Kadoma, Kadoma-shi, Osaka): Yuji Takada; Masabumi Yoshizawa; Shogo Hamazaki; Tetsuya Yoshimura; Katsuhiro Iwasa

F-terms (reference):
5B057 AA20 BA02 CA01 CA08 CA12 CA16 CB01 CB08 CB12 CB16 CC01 CE03 CE17 DA07 DB02 DB06 DB09 DC16 DC32
5C064 AA01 AA02 AB02 AB04 AC04 AC12 AC15 AD01 AD08 AD14
5C079 HA01 HB01 LA07 LB11 MA17
5L096 CA02 EA21 FA06 FA59 FA69 JA09

Claims

[Claims]

1. An image communication terminal for transmitting an image of a user photographed by a camera unit to a partner, comprising: an input unit that receives input from the user; a camera unit that photographs the user; a face extraction unit that extracts the position and size of the user's face (hereinafter referred to as the face area) from the image captured by the camera unit; a display unit that displays an image to the user; a communication unit that communicates at least images with the partner's information processing device; and a transmission data processing unit that outputs, to the communication unit, the image of a rectangular transmission area that is smaller than the area of the image captured by the camera unit and is set so as to be movable within that area, wherein an effective area that moves integrally with the transmission area is set within the area of the image captured by the camera unit, and the transmission data processing unit moves the set position of the transmission area in accordance with the position of the face area when the extracted face area deviates from the effective area.

2. The image communication terminal according to claim 1, wherein the effective area is smaller than the transmission area and set within the transmission area.

3. The image communication terminal according to claim 1, wherein, when the extracted face area deviates from the effective area, the transmission data processing unit moves the transmission area so that the face area is located at the center of the transmission area.

4. The image communication terminal according to claim 1, wherein, when the extracted face area deviates from the effective area, the transmission data processing unit moves the transmission area so that the face area is located above the center of the transmission area.

5. The image communication terminal according to claim 4, wherein, when the extracted face area deviates from the effective area, the transmission data processing unit moves the transmission area so that the face area is located either at the center of the transmission area or above the center, switching between the two according to transmission mode information input from the input unit.

6. The image communication terminal according to claim 4, wherein the display unit displays the image in the transmission area and the face area on a monitor in accordance with information input from the input unit, and the user, referring to the monitor display, can adjust the position of the transmission area in the vertical and horizontal directions by input to the input unit.
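
The behavior recited in claims 1 to 6 can be illustrated with a minimal Python sketch. It is not the implementation disclosed in this publication. The names Rect and update_transmission_area, the fixed margin that defines the effective area, the one-third height used for the "upper" mode, and the reading of "deviates" as "the face rectangle is not entirely inside the effective area" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height

    def contains(self, other: "Rect") -> bool:
        """True when `other` lies entirely inside this rectangle."""
        return (self.x <= other.x and self.y <= other.y
                and other.x + other.w <= self.x + self.w
                and other.y + other.h <= self.y + self.h)


def clamp(value: int, low: int, high: int) -> int:
    return max(low, min(value, high))


def update_transmission_area(frame: Rect, tx: Rect, margin: int,
                             face: Rect, mode: str = "center") -> Rect:
    """Return the new transmission area for one captured frame.

    The effective area is derived from the transmission area with a fixed
    margin, so it moves integrally with it (assumed construction).
    """
    effective = Rect(tx.x + margin, tx.y + margin,
                     tx.w - 2 * margin, tx.h - 2 * margin)
    if effective.contains(face):
        return tx  # face still inside the effective area: do not move

    face_cx = face.x + face.w // 2
    face_cy = face.y + face.h // 2
    # Vertical position inside the transmission area where the face center
    # should land: the middle ("center") or one third from the top ("upper").
    anchor_y = tx.h // 2 if mode == "center" else tx.h // 3
    # Keep the transmission area inside the captured frame.
    new_x = clamp(face_cx - tx.w // 2, frame.x, frame.x + frame.w - tx.w)
    new_y = clamp(face_cy - anchor_y, frame.y, frame.y + frame.h - tx.h)
    return Rect(new_x, new_y, tx.w, tx.h)
```

Because the area moves only when the face leaves the effective area, small head movements do not shift the transmitted picture, which is the point of the hysteresis described in the claims.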

7. An image communication terminal for transmitting an image of a user photographed by a camera unit to a partner, comprising: an input unit that receives input from the user; a camera unit that photographs the user; a face extraction unit that extracts the position and size of the user's face (hereinafter referred to as the face area) from the image captured by the camera unit; a display unit that displays an image to the user; a communication unit that communicates at least images with the partner's information processing device; and a transmission data processing unit that outputs, to the communication unit, the image of a rectangular transmission area that is smaller than the area of the image captured by the camera unit and is set so as to be movable within that area, wherein an effective area that moves integrally with the transmission area is set within the area of the image captured by the camera unit, and the transmission data processing unit, when the extracted face area deviates from the effective area, moves the set position of the transmission area in accordance with the position of the face area, and, on the basis of the image luminance of the extracted face area, corrects the image luminance of the transmission area so that the visibility of the face in the image captured by the camera unit is improved, and outputs the corrected image to the communication unit.

8. The image communication terminal according to claim 7, wherein the transmission data processing unit corrects the color tone of the transmission area in addition to its image luminance and outputs the corrected image to the communication unit.
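
As a rough illustration of the luminance correction recited in claims 7 and 8, the following Python sketch scales the transmission area by a single gain chosen so that the mean luminance of the face area reaches a target value. The function names, the target value of 128, the simple linear gain, and the representation of an image as a list of rows of 0 to 255 integers with rectangles given as (x, y, w, h) tuples are assumptions, not details taken from this publication.

```python
def mean_luminance(image, rect):
    """Average luminance of the pixels inside rect = (x, y, w, h)."""
    x, y, w, h = rect
    pixels = [p for row in image[y:y + h] for p in row[x:x + w]]
    return sum(pixels) / len(pixels)


def correct_transmission_luminance(image, tx_rect, face_rect, target=128):
    """Return the transmission area with its luminance rescaled.

    The gain is derived from the face area only, so a backlit face is
    brightened even if the background then saturates (values clip at 255).
    """
    gain = target / max(mean_luminance(image, face_rect), 1.0)
    x, y, w, h = tx_rect
    return [[min(255, round(p * gain)) for p in row[x:x + w]]
            for row in image[y:y + h]]
```

Deriving the gain from the face area rather than from the whole frame is what distinguishes this from ordinary auto-exposure style correction.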

9. An image communication terminal for transmitting an image of a user photographed by a camera unit to a partner, comprising: an input unit that receives input from the user; a camera unit that photographs the user; a face extraction unit that extracts the position and size of the user's face (hereinafter referred to as the face area) from the image captured by the camera unit; a display unit that displays an image to the user; a communication unit that communicates at least images with the partner's information processing device; and a transmission data processing unit that outputs, to the communication unit, the image of a rectangular transmission area that is smaller than the area of the image captured by the camera unit and is set so as to be movable within that area, wherein an effective area that moves integrally with the transmission area is set within the area of the image captured by the camera unit, and the transmission data processing unit, when the extracted face area deviates from the effective area, moves the set position of the transmission area in accordance with the position of the face area, and, on the basis of the image luminance of the extracted face area, sets the exposure level of the camera unit so that the visibility of the face in the image captured by the camera unit is improved.

10. An image communication terminal for transmitting an image of a user photographed by a camera unit to a partner, comprising: a camera unit that photographs the user; a face extraction unit that extracts the position of the user's face from the image captured by the camera unit; a display unit that displays, to the user, an image received from the partner; a notification control unit that, on the basis of the extracted face position, notifies the user of the position of the user's face within the image captured by the camera unit; and a communication unit that communicates at least images with the partner's information processing device.

11. The image communication terminal according to claim 10, wherein the face extraction unit extracts the size of the user's face together with its position, and the notification control unit notifies the user of the position and size of the user's face within the image captured by the camera unit.

12. The image communication terminal according to claim 10, wherein the notification control unit causes the display unit to display a mark indicating only the position of the extracted face, or a mark indicating both its position and size.

13. The image communication terminal according to claim 12, wherein the mark is displayed on an image received from a partner.

14. The image communication terminal according to claim 12, wherein the mark is displayed outside an image received from a partner.

15. The image communication terminal according to claim 12, wherein the notification control unit notifies the user of the position of the extracted face via a position notification unit provided separately from the display unit.

16. The image communication terminal according to claim 10, wherein a method of notifying a user performed by the notification control unit can be switched according to an instruction from the user.

17. The image communication terminal according to any one of claims 1 to 16, wherein the face extraction unit comprises: an edge extraction unit that extracts edge portions (pixels corresponding to the outline of a person, the contour of a face, and the like) from the image captured by the camera unit and generates an image of only the edge portions (hereinafter referred to as the edge image); a template storage unit that stores a template in which a plurality of shapes, each similar to a predetermined shape but differing in size, are arranged concentrically about a center point; a voting result storage unit that stores, for each of the shapes of each size constituting the template, coordinate positions on the edge image in association with vote counts; a voting unit that moves the center point of the template sequentially to each pixel position of the edge portions and, for each such pixel position, increases or decreases the vote count stored in the voting result storage unit at each coordinate position corresponding to the positions of all pixels forming the shape of each size; and an analysis unit that determines the position and size of the face contained in the captured image on the basis of the vote counts stored in the voting result storage unit.

18. The image communication terminal according to claim 17, wherein the predetermined shape is a circle.
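
The voting scheme of claims 17 and 18 resembles a Hough-style accumulation with concentric circles. The Python sketch below is one possible reading under stated assumptions: the function names, the half-pixel ring used to rasterize each circle, the choice to only increment votes (the claim allows increasing or decreasing), and the rule of picking the single highest vote count are all illustrative and are not taken from this publication.

```python
import math
from collections import defaultdict


def circle_offsets(radius):
    """Integer offsets lying within half a pixel of a circle of this radius."""
    reach = radius + 1
    return {(dx, dy)
            for dx in range(-reach, reach + 1)
            for dy in range(-reach, reach + 1)
            if abs(math.hypot(dx, dy) - radius) < 0.5}


def cast_votes(edge_pixels, radii, width, height):
    """Place the template center on every edge pixel and vote along each circle.

    Returns one vote table per circle size: votes[radius][(x, y)] -> count.
    """
    offsets = {r: circle_offsets(r) for r in radii}
    votes = {r: defaultdict(int) for r in radii}
    for ex, ey in edge_pixels:
        for r in radii:
            for dx, dy in offsets[r]:
                x, y = ex + dx, ey + dy
                if 0 <= x < width and 0 <= y < height:
                    votes[r][(x, y)] += 1
    return votes


def analyze(votes):
    """Return ((x, y), radius) of the strongest peak; assumes at least one vote."""
    count, (x, y), radius = max(
        (count, pos, r)
        for r, table in votes.items()
        for pos, count in table.items())
    return (x, y), radius
```

If the edge image contains a roughly circular face contour, the circles drawn around its edge pixels all pass through the contour's center, so the vote table for the matching radius peaks there, giving both position and size.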

19. The image communication terminal according to any one of claims 1 to 16, wherein the face extraction unit comprises: a template image processing unit that receives a predetermined template image, obtains an edge normal direction vector of that image, generates an evaluation vector from the edge normal direction vector, and orthogonally transforms the evaluation vector; an input image processing unit that receives the image captured by the camera unit, obtains an edge normal direction vector of that image, generates an evaluation vector from the edge normal direction vector, and orthogonally transforms the evaluation vector; a product-sum unit that computes the product-sum of the corresponding spectral data of the orthogonally transformed evaluation vectors generated for the template image and for the captured image; and an inverse orthogonal transformation unit that applies an inverse orthogonal transform to the product-sum result to generate a map of similarity values, wherein each evaluation vector includes a component obtained by applying an even-multiple-angle transformation to the edge normal direction vector of the corresponding image, and the formula for the similarity value, the orthogonal transform, and the inverse orthogonal transform are all linear.

20. The image communication terminal according to claim 19, wherein the face extraction unit uses, in the representation of the evaluation vector, a value calculated from the angle obtained when the edge normal direction vector is expressed in polar coordinates.
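
The matching of claims 19 and 20 can be sketched in one dimension. In the Python example below, each edge normal with polar angle theta is encoded as the complex number exp(2j * theta), which is one even-multiple-angle (double-angle) encoding, and the similarity map is obtained by transforming both evaluation vectors, multiplying the spectra, and transforming back. Reducing the image to a single row, using a naive discrete Fourier transform as the orthogonal transform, and all function names are assumptions made to keep the sketch short. It is not the implementation disclosed in this publication.

```python
import cmath
import math


def evaluation_vector(angles):
    """Double-angle encoding of edge normal angles; None means 'no edge here'."""
    return [cmath.exp(2j * a) if a is not None else 0j for a in angles]


def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), adequate for a small sketch."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [sum(values[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
               for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def similarity_map(template_angles, input_angles):
    """Circular cross-correlation of the two evaluation vectors via the DFT.

    Entry u equals the sum over n of cos(2 * (input[n + u] - template[n]))
    taken over positions where both have an edge.
    """
    t_spec = dft(evaluation_vector(template_angles))
    i_spec = dft(evaluation_vector(input_angles))
    product = [i * t.conjugate() for i, t in zip(i_spec, t_spec)]
    return [v.real for v in dft(product, inverse=True)]
```

Because exp(2j * (theta + pi)) equals exp(2j * theta), a normal that flips direction when the contrast between face and background is reversed still contributes positively, which is the sign reversal issue that FIG. 20 refers to. Linearity is what allows the per-pixel sum to be moved into the spectral domain.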

21. The image communication terminal according to claim 1, wherein the face extraction unit further comprises a face/non-face determination unit that determines whether the position and size extracted as a face from the image captured by the camera unit truly correspond to a face, and outputs the extraction result only when they are determined to be a face.

22. The image communication terminal according to claim 17, wherein the face extraction unit further comprises a face/non-face determination unit that determines, on the basis of the content stored in the voting result storage unit, whether the position and size extracted as a face from the image captured by the camera unit truly correspond to a face, and outputs the extraction result only when they are determined to be a face.

23. The image communication terminal according to claim 19, wherein the face extraction unit further comprises a face/non-face determination unit that determines, on the basis of the similarity values generated by the inverse orthogonal transformation unit, whether the position and size extracted as a face from the image captured by the camera unit truly correspond to a face, and outputs the extraction result only when they are determined to be a face.

24. The image communication terminal according to claim 21, wherein the face/non-face determination unit determines face or non-face on the basis of the result of a support vector function evaluated on image features obtained from the area extracted as a face from the image captured by the camera unit.

25. The image communication terminal according to claim 24, wherein the face/non-face determination unit uses, as the image feature, an edge normal direction vector obtained from the area extracted as a face from the image captured by the camera unit.

26. The image communication terminal according to claim 24, wherein the face/non-face determination unit uses, as the image feature, a histogram of edge normals obtained from the area extracted as a face from the image captured by the camera unit.
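
The face/non-face determination of claims 24 to 26 can be sketched as a histogram of edge normal directions fed to a support vector decision function. In the Python example below, the function names, the number of bins, the RBF kernel, and the idea of passing support vectors and coefficients in as plain lists (standing in for a learned dictionary) are assumptions for illustration. This publication does not specify them here, and any values supplied by a caller are placeholders rather than trained data.

```python
import math


def edge_normal_histogram(angles, bins=8):
    """Normalized histogram of edge normal angles (radians) over [0, 2*pi)."""
    hist = [0] * bins
    for a in angles:
        index = int((a % (2 * math.pi)) / (2 * math.pi) * bins) % bins
        hist[index] += 1
    total = sum(hist) or 1  # avoid division by zero for an empty region
    return [h / total for h in hist]


def svm_decision(feature, support_vectors, coefficients, bias=0.0, gamma=1.0):
    """Support vector decision value: bias + sum_i c_i * K(sv_i, feature).

    K is an RBF kernel here (assumed); each c_i is the signed weight of one
    support vector as it would come from a previously trained dictionary.
    """
    value = bias
    for sv, c in zip(support_vectors, coefficients):
        dist2 = sum((a - b) ** 2 for a, b in zip(feature, sv))
        value += c * math.exp(-gamma * dist2)
    return value


def is_face(feature, support_vectors, coefficients, bias=0.0, gamma=1.0):
    """Candidate is accepted as a face when the decision value is positive."""
    return svm_decision(feature, support_vectors, coefficients, bias, gamma) > 0
```

A candidate region produced by the voting or similarity-map stage would be passed through such a check, and the extraction result would be output only when the check accepts it.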