JPH0832947A

JPH0832947A - Image communication equipment

Info

Publication number: JPH0832947A
Application number: JP6158854A
Authority: JP
Inventors: Tsutomu Imai (今井 勉); Takuya Imaide (今出 宅哉); Ryushi Nishimura (西村 龍志); Kenji Ichige (市毛 健志)
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1994-07-11
Filing date: 1994-07-11
Publication date: 1996-02-02

Abstract

PURPOSE: To provide image communication equipment capable of displaying images rich in motion while suppressing any increase in the quantity of transmitted and received signals. CONSTITUTION: Video information captured by a television camera 100 is transmitted once every plural frames, and motion information discriminated by a motion discrimination processing unit 103 is transmitted every frame. A motion processing unit 111 generates an interpolated image using the video information and the motion information and thereby gives motion to the display image, so that a display image rich in motion is obtained from a small signal quantity.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to image communication apparatus, such as videophone apparatus and videoconference apparatus, that communicates image information.

[0002]

[Prior Art] A communication method for videophone apparatus in which the signal quantity of image information is compressed as far as possible before transmission and reception is described in, for example, Japanese Examined Patent Publication No. H2-36687. In this method, still-image data is used for every part of the communicator's face other than the mouth, and the mouth is synthesized on the receiving side according to the audio information transmitted during the call, thereby generating a moving image of the face. The method is devised so that the information signal quantity of the moving face image can be compressed substantially and the quantity of transmitted and received signals can be reduced substantially.

[0003]

[Problems to Be Solved by the Invention] However, in a videophone apparatus that transmits and receives image information in this way, the face as a whole is a still image. Even if the communicator nods by moving the head vertically, or expresses "no" by shaking the head horizontally, such gestures are not conveyed to the other party, which is a drawback.

[0004] An object of the present invention is to solve this problem and to provide image communication apparatus that can enrich the motion of an image while suppressing any increase in the quantity of transmitted and received signals. For example, in videophone or videoconference apparatus, the aim is to convey the communicator's gestures of intent and facial expressions to the other party so that the call can be expressive.

[0005]

[Means for Solving the Problems] One feature of the present invention is that it comprises: a television camera; extraction means for extracting, from video information captured by the television camera, partial image information on a predetermined portion of the communicator; motion discrimination means for discriminating the movement of the predetermined portion on the basis of the extracted partial image information and generating motion information; transmitting/receiving means for transmitting and receiving the video information once every plural frames and the motion information every frame; and image generation processing means for generating a display image on the basis of the received video information and motion information.

[0006] Another feature of the present invention is that it comprises: a television camera; extraction means for extracting, from video information captured by the television camera, partial image information on a constituent part of the communicator's face; motion discrimination means for discriminating the movement of the constituent part on the basis of the extracted partial image information and generating motion information; encoding means for encoding the video information and the motion information; transmitting/receiving means for transmitting and receiving the encoded video information once every plural frames and the motion information every frame; restoring means for restoring the video information and the motion information from the encoded signal received by the transmitting/receiving means; and image generation processing means for generating a display image on the basis of the restored video information and motion information.

[0007] Yet another feature of the present invention is that it comprises: a television camera; extraction means for extracting, from video information captured by the television camera, partial image information on a predetermined portion of the communicator; motion discrimination means for discriminating the movement of the predetermined portion on the basis of the extracted partial image information and generating motion information; transmitting/receiving means for transmitting and receiving the video information once every predetermined number of frames and the partial image information and motion information every frame; memory means for storing the transmitted and received image information; and image transforming means for transforming the video information read from the memory means according to the partial image information and the motion information.

[0008]

[Operation] On the transmitting side, the video information is transmitted once every plural frames, while the motion information discriminated from the partial image information extracted from that video information is transmitted every frame. On the receiving side, an interpolated image is generated using the video information, or the partial image information, together with the motion information, to give motion to the display image. A display image rich in motion is thus obtained from a small signal quantity.
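
The transmission schedule described in this paragraph can be sketched as follows. This is a minimal illustration only: the interval `VIDEO_INTERVAL`, the packet dictionaries, the motion labels, and the function name `build_packets` are assumptions made for the example and do not appear in the specification.

```python
from typing import Dict, List

VIDEO_INTERVAL = 5  # assumed value: video information is sent once every 5 frames


def build_packets(frames: List[bytes], motions: List[str],
                  interval: int = VIDEO_INTERVAL) -> List[Dict]:
    """Attach motion information to every frame and video only every `interval` frames."""
    packets = []
    for index, (frame, motion) in enumerate(zip(frames, motions)):
        packet = {"frame": index, "motion": motion}  # motion information goes out every frame
        if index % interval == 0:
            packet["video"] = frame  # video information only on every `interval`-th frame
        packets.append(packet)
    return packets


if __name__ == "__main__":
    frames = [bytes([i]) for i in range(10)]
    motions = ["none"] * 10
    motions[3] = "nod"
    for p in build_packets(frames, motions):
        print(p["frame"], p["motion"], "video" in p)
```

A receiver built on this schedule would hold the most recent `video` payload and use the per-frame `motion` field to generate the frames in between.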

[0009]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0010] FIG. 1 is a block diagram of a videophone apparatus according to one embodiment of the present invention. Reference numeral 100 denotes a television camera, 101 an extraction processing unit, 102 a control circuit, and 103 a motion discrimination processing unit. The motion discrimination processing unit 103 comprises a motion detection processing unit 104 and a motion processing unit 105. Reference numeral 106 denotes an encoding processing unit, 107 a multiplexing processing unit, and 108 a transmitting/receiving unit. Reference numeral 109 denotes a separation processing unit, 110 a decoding processing unit, and 111 a motion processing unit. The motion processing unit 111 comprises a motion discrimination processing unit 112 and a motion processing unit 113. Reference numeral 114 denotes a synthesis processing unit, 115 an image display unit, and 116 a zoom control mechanism of the television camera 100.

[0011] The television camera 100 captures the communicator and converts the scene into video information. The extraction processing unit 101 extracts the constituent parts of the communicator's face, such as the eyes, mouth, nose, eyebrows, and ears, by focusing on information such as the luminance signal of the video information. The control circuit 102 controls the zoom control mechanism 116 of the television camera 100 so that the image of the face always remains at a certain size. The motion detection processing unit 104 determines whether the extracted parts have moved vertically or horizontally by a predetermined amount or more. If they have moved vertically, it judges that the head was nodded vertically; if they have moved horizontally, it judges that the head was shaken horizontally. If neither applies, it judges that there is no head movement. The motion processing unit 105 applies signal processing to express the motion content discriminated by the motion detection processing unit 104, and the encoding processing unit 106 encodes this motion information and the video information. The multiplexing processing unit 107 combines the encoded motion information and video information into a transmission information signal, and the transmitting/receiving unit 108 transmits that signal.

[0012] The transmitting/receiving unit 108 passes the reception information signal received from the other party to the separation processing unit 109, which separates it into its component parts. The decoding processing unit 110 decodes the separated signal, supplying the video information to the synthesis processing unit 114 and the motion information to the motion processing unit 111. In the motion processing unit 111, the motion discrimination processing unit 112 discriminates the motion content from the motion information, and the motion processing unit 113 performs interpolated image generation processing according to that content. This interpolated image generation is based on the image information stored in the synthesis processing unit 114. The synthesis processing unit 114 stores the restored video information, combines and stores the interpolated image information generated by the motion processing unit 113, and generates the display image signal to be displayed on the image display unit 115.

[0013] Next, a specific example of the motion detection processing unit 104 in the motion discrimination processing unit 103 is described with reference to FIG. 2. Reference numeral 201 denotes a coordinate value calculation processing unit, 202 a centroid position calculation processing unit, 203 a discrimination processing unit, and 204 a memory.

[0014] The coordinate value calculation processing unit 201 calculates the coordinate values of both eyes and the mouth extracted by the extraction processing unit 101. The centroid position calculation processing unit 202 obtains the coordinate value of the centroid of the triangle connecting the three coordinates calculated by the coordinate value calculation processing unit 201. The discrimination processing unit 203 compares the centroid coordinate value calculated by the centroid position calculation processing unit 202 with the average of the centroid coordinate values of the preceding several frames recorded in the memory 204. When it detects that the centroid position has moved vertically or horizontally by a predetermined amount or more, it judges vertical movement to be a nod, or "yes" gesture, and horizontal movement to be a refusal, or "no" gesture. The calculated centroid coordinate value is then recorded in the memory 204.
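
The centroid-based discrimination described in this paragraph can be sketched as follows. This is a hedged illustration rather than the circuit of FIG. 2: the class name `MotionDetector`, the history length, the pixel threshold, the labels `"nod"`, `"shake"`, and `"none"`, and the tie-break used when both directions exceed the threshold are all assumptions, since the specification speaks only of "several frames" and "a predetermined amount".

```python
from collections import deque
from typing import Deque, Tuple

Point = Tuple[float, float]


def centroid(left_eye: Point, right_eye: Point, mouth: Point) -> Point:
    """Centroid of the triangle whose vertices are the two eyes and the mouth."""
    xs = (left_eye[0], right_eye[0], mouth[0])
    ys = (left_eye[1], right_eye[1], mouth[1])
    return (sum(xs) / 3.0, sum(ys) / 3.0)


class MotionDetector:
    """Compares the current centroid with the mean of the last few centroids."""

    def __init__(self, history: int = 4, threshold: float = 5.0) -> None:
        self.past: Deque[Point] = deque(maxlen=history)  # plays the role of memory 204
        self.threshold = threshold  # assumed "predetermined amount", in pixels

    def update(self, left_eye: Point, right_eye: Point, mouth: Point) -> str:
        current = centroid(left_eye, right_eye, mouth)
        result = "none"
        if self.past:
            mean_x = sum(p[0] for p in self.past) / len(self.past)
            mean_y = sum(p[1] for p in self.past) / len(self.past)
            dx = abs(current[0] - mean_x)
            dy = abs(current[1] - mean_y)
            # Assumed tie-break: the larger displacement wins if both exceed the threshold.
            if dy >= self.threshold and dy >= dx:
                result = "nod"    # vertical movement: "yes"
            elif dx >= self.threshold:
                result = "shake"  # horizontal movement: "no"
        self.past.append(current)  # record the new centroid for later frames
        return result
```

Averaging over several earlier frames, as the text describes, makes the decision less sensitive to single-frame jitter in the extracted coordinates than a comparison with the immediately preceding frame alone.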

[0015] Next, another specific example of the motion detection processing unit 104 in the motion discrimination processing unit 103 is described with reference to FIG. 3. Reference numeral 301 denotes a coordinate value calculation processing unit, 302 a determination processing unit, and 303 a memory.

[0016] The coordinate value calculation processing unit 301 calculates the coordinate value of the nose extracted by the extraction processing unit 101. The determination processing unit 302 detects movement of the nose by comparing that coordinate value with the average of the nose coordinate values of the preceding several frames stored in the memory 303, and performs the same discrimination processing as described above. The calculated nose coordinate value is then recorded in the memory 303. This discrimination processing can also be performed by focusing on any one of the eyes, mouth, eyebrows, or ears and detecting the movement of its coordinates.

[0017] FIGS. 4 and 5 show examples of the interpolated image generation processing of the motion processing unit 111 in this videophone apparatus. In FIG. 4, (a) is a figure (image) of the communicator seen from the front, and the dash-dot lines 501 and 502 are the reference axes for rotating the image. (b) is a figure (image) of the communicator seen from the side, and the black dot 501 is the corresponding reference axis for the rotation. (c) is a figure (image) of the communicator seen from above, and the black dot 502 is the corresponding reference axis for the rotation.

[0018] In a videophone apparatus, gestures such as moving the head vertically or shaking it horizontally must be expressed as changes in the image on the display screen. When the motion processing unit 111 on the receiving side receives motion information representing the head gesture discriminated by the motion discrimination processing unit 103 on the transmitting side, the motion discrimination processing unit 112 in the motion processing unit 111 discriminates the content of the gesture, and the motion processing unit 113 performs image processing that produces the corresponding motion. That is, on receiving motion information for a vertical head movement, the motion processing unit 113 performs image processing that rotates the image about reference axis 501, as shown in (b). Similarly, on receiving motion information for a horizontal head shake, it performs image processing that rotates the image about reference axis 502, as shown in (c). By successively generating images of the head facing front, diagonally downward, diagonally left, and diagonally right, it carries out the image processing needed to display nodding and head shaking. Expressing the movement smoothly would require displaying intermediate states of the movement with multiple images, but the simplest example of expressing the motion is shown here.

[0019] FIG. 5 shows an example of image processing that expresses vertical head movement: from the front-facing face drawn in thick lines, an image of the face looking diagonally downward, drawn in thin lines, is generated. This image processing rotates the face (image) about reference axis 501. Reference numeral 601 denotes frame N, the front-facing face drawn in thick lines, and 602 denotes frame N+1 after processing, the diagonally downward-facing face drawn in thin lines.

[0020] Image processing applied to an eye is described as an example. Image data b of the eye in frame N+1 (nodding) is generated in relation to image data a in frame N (facing front). The coordinates of image data b can be calculated from the coordinates of image data a using the swing angle of the head and a three-dimensional shape model of the head. In other words, by applying a three-dimensional mapping to each pixel of the two-dimensional image, the two-dimensional image after the motion can be generated. The other constituent parts of the face are obtained by the same method. In this way, an image of the face looking diagonally downward is produced from the image data of the front-facing face.
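
The per-pixel three-dimensional mapping described in this paragraph can be sketched as follows. The specification does not state which shape model or projection is used, so everything specific here is an assumption made for illustration: a spherical head of radius `radius` centred at the image origin, a rotation axis placed at `axis_y`, `axis_z`, an orthographic projection, and the function names themselves.

```python
import math
from typing import Tuple


def depth_on_head(x: float, y: float, radius: float) -> float:
    """Assumed shape model: the face lies on a sphere of `radius` centred at the origin."""
    return math.sqrt(max(radius * radius - x * x - y * y, 0.0))


def nod_pixel(x: float, y: float, angle_deg: float, radius: float = 100.0,
              axis_y: float = 100.0, axis_z: float = 0.0) -> Tuple[float, float]:
    """Map a pixel of frame N to its position in frame N+1 for a nod of `angle_deg`.

    Image y grows downward and z points toward the camera. The rotation axis
    (reference axis 501 in the text) is taken as the horizontal line
    y = axis_y, z = axis_z, which is an assumed placement.
    """
    z = depth_on_head(x, y, radius)  # lift the 2-D pixel onto the 3-D model
    theta = math.radians(angle_deg)
    dy, dz = y - axis_y, z - axis_z
    # Rotate about the horizontal axis; a positive angle tips the face downward.
    new_y = axis_y + dy * math.cos(theta) + dz * math.sin(theta)
    # Orthographic projection back to the image plane: x is unchanged, z is dropped.
    return (x, new_y)
```

Applying `nod_pixel` to every pixel of an eye region in frame N gives the positions of those pixels in frame N+1. A horizontal head shake would use the same construction with the roles of x and y exchanged and a vertical axis in place of the horizontal one.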

[0021] Images of the face looking diagonally left or diagonally right can be produced by the same method.

[0022] By combining the two images obtained through this processing, one facing front and one facing diagonally downward, the gesture of nodding by moving the head vertically can be expressed. By combining the three images facing diagonally left, front, and diagonally right, the gesture of shaking the head horizontally can be expressed.
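
One way to sequence these images for display is sketched below. The pose names, the function `pose_sequence`, the repeat count, and the choice to start and end on the front-facing image are assumptions for illustration; the text states only which images are combined, not their order or timing.

```python
from typing import List


def pose_sequence(motion: str, repeats: int = 2) -> List[str]:
    """Return the order in which pre-generated head poses are shown for a gesture."""
    if motion == "nod":
        cycle = ["front", "down"]                    # two images: front and diagonally down
    elif motion == "shake":
        cycle = ["front", "left", "front", "right"]  # three images: left, front, right
    else:
        return ["front"]                             # no gesture: keep the front-facing image
    return cycle * repeats + ["front"]               # assumed: always finish facing front


if __name__ == "__main__":
    print(pose_sequence("nod", repeats=1))
    print(pose_sequence("shake", repeats=1))
```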

[0023] This example discriminates and reproduces vertical and horizontal head movements, but the same approach can be applied to express other gestures, such as tilting the head to one side.

[0024] FIG. 6 shows an example in which a three-dimensional screen is used as the image display unit 115 to make the videophone apparatus more realistic. Reference numeral 701 denotes a communicator, 702 a television camera, 703 a processing device, 704 a communication network, 705 a three-dimensional screen, and 706 a three-dimensional screen drive unit. Images are projected onto the three-dimensional screen by a projection device.

[0025] The television camera 702 captures the subject on the transmitting side and converts the scene into video information. The processing device 703 has the same processing units as in the preceding embodiment, processes the video information in the same way, encodes it, and transmits it over the communication network 704. It also restores the information sent from the other party, projects it onto the three-dimensional screen 705, and moves the three-dimensional screen 705 by means of the three-dimensional screen drive unit 706.

[0026] FIG. 7 shows a specific example of motion control in the videophone apparatus using the three-dimensional screen 705. When the communicator makes the affirmative gesture of nodding by moving the head vertically, it is desirable that the three-dimensional screen 705 on the other party's side move in the same way. Accordingly, when the motion discrimination processing unit (103) determines that the head has been moved vertically in a nod and that motion information is transmitted, the motion processing unit (113) gives the three-dimensional screen drive unit 706 a control signal that drives the three-dimensional screen 705 vertically, as shown in (a). Likewise, when motion information indicating a negative gesture of shaking the head horizontally is transmitted, the motion processing unit (113) controls the three-dimensional screen drive unit 706 so that the three-dimensional screen 705 shakes horizontally, as shown in (b).

[0027] FIG. 8 shows another embodiment of the videophone apparatus according to the present invention. Reference numeral 901 denotes a television camera that generates video information, 902 an encoding processing unit, 903 an extraction processing unit, 904 a memory, 905 a motion discrimination processing unit, 906 a transmitting/receiving unit, 907 a decoding processing unit, 908 a memory, and 909 a transformation processing unit.

[0028] Part of the transmitted information is the video information captured by the television camera 901, which is encoded by the encoding processing unit 902 and transmitted over route 912 once every plural frames. The extraction processing unit 903 extracts region information on the eyes, mouth, face, and so on from the video information. As in the preceding embodiment, it calculates the coordinate values of the extracted eyes and mouth, takes them as the vertices of a triangle, obtains the centroid value, and stores it in the memory 904. The motion discrimination processing unit 905 discriminates the content of the movement by comparing the newly obtained centroid value with the average of the centroid coordinate values obtained several fields earlier and stored in the memory 904. This motion information is transmitted every frame over route 914. The extracted region information (coordinate information) is also transmitted, over route 913.
Determines the content of the movement by comparing the newly obtained barycentric value with the average value of the coordinate values of the barycentric value stored several times before and stored in the memory 904. This operation information is transmitted for each frame by the route 914. The extracted area information (coordinate information) is also transmitted by the route 913.

[0029] On the receiving side, the video information received once every plural frames is passed over route 916 to the decoding processing unit 907, restored, and stored in the memory 908. The transformation processing unit 909 transforms the image of the extracted region within the video information stored in the memory 908, on the basis of the motion information and region information conveyed over route 915. As in the preceding embodiment, this transformation generates the two-dimensional image after the motion (transformation) by applying a three-dimensional mapping to each pixel of the two-dimensional image. The display unit 910 then displays the transformed image.

[0030] The extraction target may be a constituent part of the face rather than a region, in which case the image of the corresponding part is read out from the image information stored in the memory 908 and transformed.

[0031] The result of the motion discrimination processing performed in the present invention can also be used in place of keyboard operations or mouse clicks on a computer. Nodding by moving the head vertically corresponds to an input operation equivalent to "yes", and the negating gesture of moving the head horizontally can be used as an input equivalent to "no".
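
Using the discrimination result as a computer input could look like the sketch below. The labels `"nod"` and `"shake"`, the mapping table, and the functions `motion_to_input` and `confirm` are hypothetical names chosen for the example; the specification states only that a nod is equivalent to "yes" and a head shake to "no".

```python
from typing import Optional

# Assumed labels for the discrimination result; the specification names only the gestures.
MOTION_TO_INPUT = {"nod": "yes", "shake": "no"}


def motion_to_input(motion: str) -> Optional[str]:
    """Translate a discriminated head motion into a yes/no input, or None if no gesture."""
    return MOTION_TO_INPUT.get(motion)


def confirm(prompt: str, motion: str) -> str:
    """Answer a yes/no prompt from the detected motion instead of a key press or click."""
    answer = motion_to_input(motion)
    if answer is None:
        return f"{prompt} (waiting for a nod or a head shake)"
    return f"{prompt} {answer}"


if __name__ == "__main__":
    print(confirm("Save changes?", "nod"))
    print(confirm("Save changes?", "none"))
```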

[0032]

[Effects of the Invention] As described above, the present invention exchanges motion information between transmissions of video information and generates interpolated images to give motion to the display image, so an image display rich in motion can be achieved with a small signal quantity.

[Brief description of drawings]

FIG. 1 is a block diagram showing one embodiment of a videophone apparatus according to the present invention.

FIG. 2 is a block diagram showing a specific example of the motion detection processing unit in FIG. 1.

FIG. 3 is a block diagram showing another specific example of the motion detection processing unit in FIG. 1.

FIG. 4 shows an example of the interpolated image generation processing executed by the motion processing unit in FIG. 1.

FIG. 5 shows an example of the interpolated image generation processing executed by the motion processing unit in FIG. 1, focusing on an eye.

FIG. 6 is a block diagram showing another embodiment of the videophone apparatus according to the present invention.

FIG. 7 is a diagram explaining the operation of the three-dimensional screen in FIG. 6.

FIG. 8 is a block diagram showing yet another embodiment of the videophone apparatus according to the present invention.

[Explanation of symbols]

100 ... television camera, 101 ... extraction processing unit, 103 ... motion discrimination processing unit, 104 ... motion detection processing unit, 105 ... motion processing unit, 106 ... encoding processing unit, 107 ... multiplexing processing unit, 108 ... transmitting/receiving unit, 109 ... separation processing unit, 110 ... decoding processing unit, 111 ... motion processing unit, 112 ... motion discrimination processing unit, 113 ... motion processing unit, 114 ... synthesis processing unit.

Continuation of front page: (72) Inventor Kenji Ichige, c/o Video Media Research Laboratory, Hitachi, Ltd., 292 Yoshida-cho, Totsuka-ku, Yokohama-shi, Kanagawa

Claims

[Claims]

1. An image communication apparatus comprising: a television camera; extraction means for extracting partial image information on a predetermined portion of a communicator from video information captured by the television camera; motion discrimination means for discriminating movement of the predetermined portion on the basis of the extracted partial image information and generating motion information; transmitting/receiving means for transmitting and receiving the video information once every plural frames and the motion information every frame; and image generation processing means for generating a display image on the basis of the received video information and motion information.

2. An image communication apparatus comprising: a television camera; extraction means for extracting partial image information on a constituent part of a communicator's face from video information captured by the television camera; motion discrimination means for discriminating movement of the constituent part on the basis of the extracted partial image information and generating motion information; encoding means for encoding the video information and the motion information; transmitting/receiving means for transmitting and receiving the encoded video information once every plural frames and the motion information every frame; restoring means for restoring the video information and the motion information from the encoded signal received by the transmitting/receiving means; and image generation processing means for generating a display image on the basis of the restored video information and motion information.

3. An image communication apparatus comprising: a television camera; extraction means for extracting partial image information on a predetermined portion of a communicator from whole video information captured by the television camera; motion discrimination means for discriminating movement of the predetermined portion on the basis of the extracted partial image information and generating motion information; transmitting/receiving means for transmitting and receiving the whole video information once every predetermined number of frames and the partial image information and motion information every frame; memory means for storing the transmitted and received image information; and image transforming means for transforming video information read from the memory means according to the partial image information and the motion information.

4. The image communication apparatus according to claim 1, wherein the extraction means extracts, from the video information, image information of a region including a predetermined portion of the communicator.

5. The image communication apparatus according to claim 1, wherein the extraction means extracts, from the video information, image information of a predetermined portion of the communicator.

6. The image communication apparatus according to claim 1, wherein the motion discrimination means has memory means for storing information on the constituent parts.

7. The image communication apparatus according to claim 1, wherein the motion discrimination means discriminates vertical or horizontal movement of the head.

8. The image communication apparatus according to claim 1, wherein the motion discrimination means obtains coordinate values of the extracted constituent parts and discriminates the motion of the constituent parts on the basis of changes in the coordinate values.

9. The image communication apparatus according to claim 1, wherein the extraction means has means for extracting image information of both eyes and the mouth of the communicator, and the motion discrimination means has means for obtaining the coordinate values of the extracted eyes and mouth, means for obtaining the coordinate value of the centroid of the area enclosed by those coordinate values, and means for discriminating movement of the head on the basis of changes in the centroid coordinates.