JPH08161505A

JPH08161505A - Dynamic image processor, dynamic image encoding device, and dynamic image decoding device

Info

Publication number: JPH08161505A
Application number: JP29768494A
Authority: JP
Inventors: Shigeru Arisawa; 繁有沢
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1994-11-30
Filing date: 1994-11-30
Publication date: 1996-06-21

Abstract

PURPOSE: To provide an image processor that can encode, at a high compression ratio, three-dimensional shape information on the respective segments constituting a dynamic image. CONSTITUTION: A three-dimensional model of the respective segments constituting the continuous dynamic image is obtained from the dynamic image and projected onto a specific reference two-dimensional plane by a projection part 21. A connection line element extraction part 22 extracts connection line elements from the projection image, and the depth value from the two-dimensional projection image to the original three-dimensional model is obtained at the respective feature points of the projection image. The projection image is chain-coded by a chain coding part 23, Huffman-encoded by a 1st Huffman code generation part 24, and then sent out. The depth value is ADPCM-encoded by an ADPCM encoding part 25, Huffman-encoded by a 2nd Huffman code generation part 26, and then sent out.

Description

Detailed Description of the Invention

【０００１】[0001]

[Field of Industrial Application] The present invention relates to a moving image processing apparatus capable of extracting, from a moving image, a three-dimensional shape model of an object in that moving image and of encoding and decoding the shape model at a high compression ratio, and to a moving image encoding apparatus and a moving image decoding apparatus to which the moving image processing apparatus is applied to perform, for example, transmission and recording of moving images at a high compression ratio.

【０００２】[0002]

[Prior Art] A method has been proposed in which a moving image is encoded using a three-dimensional model of each object in the moving image sequence, thereby compressing the sequence. If the three-dimensional shape of each object and its motion are known, a moving image sequence identical to the original can be generated. Thus, in image communication for example, if the transmitting side and the receiving side share a three-dimensional model, the transmitting side detects motion information from the input image, and the receiving side synthesizes an image from that motion information, the image can be reproduced. In this case only the motion information needs to be transmitted, so image communication at a very low bit rate can be expected. Specifically, methods have been tried in which the three-dimensional structure of a face, expressed as a wireframe face model, is shared between the transmitting and receiving sides, and only features such as facial expressions are transmitted to synthesize the face image.

[0003] However, when this encoding method is applied to natural images, a three-dimensional model cannot be prepared in advance, so the model must be extracted from the given moving image sequence. One method of extracting a three-dimensional shape model from a moving image sequence and encoding the moving image using that model is disclosed in Hans George Musmann, Michael Hotter, and Jorn Ostermann, "OBJECT-ORIENTED analysis coding of moving images," Signal Processing: Image Communication 1 (1989): 117-138, Elsevier Science Publishers B.V. In this method, motion vectors are obtained for edge portions and used to derive depth, and depth is interpolated for non-edge portions, to estimate the three-dimensional shape. Luminance information is mapped onto the three-dimensional shape as an attribute and transmitted together with the interpolated three-dimensional shape data.

[0004] However, in this method of estimating the three-dimensional shape and compressing the moving image sequence, it is difficult to obtain motion vectors even at edge portions where the luminance changes abruptly, and it has been very difficult to estimate accurate depth from the motion vectors. Consequently, even when the three-dimensional shape is estimated by interpolating the depth at the edge portions, an accurate three-dimensional shape cannot be obtained. The result deviates from the actual shape, the extra information needed to represent that deviation grows, and the compression ratio cannot be raised. In addition, because the luminance information mapped as an attribute onto the three-dimensional shape is transmitted as initial information, the amount of initial information is very large.

[0005] Another, similar method of extracting a three-dimensional shape model is as follows. Predetermined depth information is added to the two-dimensional shape information of each segment in the first frame of the moving image sequence to form the initial value of the three-dimensional model. The motion of the three-dimensional model is estimated, the positions of the feature points in the image obtained from that estimate are compared with those in each input frame, and the overall motion of the three-dimensional model and the deviation of each feature point are extracted. The three-dimensional model is then updated successively on the basis of its overall motion (the motion estimate) and the deviation of each feature point, so that a faithful three-dimensional model is obtained. Compared with the method described above, this method yields a more faithful three-dimensional model and also reduces the amount of initial information.

【０００６】[0006]

[Problem to Be Solved by the Invention] However, even in the moving image encoding apparatus described above, the data of the three-dimensional shape model and the analysis values must be transmitted as initial information, and that amount of data is not sufficiently small. There has therefore been a demand to reduce the amount of data further, and in particular the amount of initial data.

[0007] Accordingly, an object of the present invention is to provide a moving image processing apparatus capable of encoding the three-dimensional shape information of each segment constituting a moving image with a smaller amount of data.

【０００８】[0008]

[Means for Solving the Problem] In the moving image processing apparatus of the present invention, the three-dimensional shape information to be encoded is divided into position information of the image projected onto a two-dimensional plane and depth information of that image, and each is encoded by a method suited to its nature. In particular, noting that the difference in the depth direction along the edges of each segment is small, the depth information is encoded by the ADPCM method.
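The reason ADPCM suits depth values that change little along an edge can be illustrated with a minimal sketch. The step-adaptation rule, the parameter names (`step`, `levels`, `grow`, `shrink`, `min_step`), and the choice to send the first depth value verbatim are all assumptions made for illustration; the source does not specify the ADPCM variant it uses.

```python
def _adapt(step, code, levels, grow, shrink, min_step):
    """Hypothetical adaptation rule: widen the step when the code saturates,
    narrow it when the prediction error quantizes to zero."""
    if abs(code) == levels:
        return step * grow
    if code == 0:
        return max(min_step, step * shrink)
    return step


def adpcm_encode(depths, step=0.5, levels=4, grow=1.5, shrink=0.75, min_step=0.125):
    """Encode depth values as (first value, list of small integer codes)."""
    first = depths[0]
    predicted = first  # predictor = previously reconstructed depth
    codes = []
    for z in depths[1:]:
        code = max(-levels, min(levels, round((z - predicted) / step)))
        codes.append(code)
        predicted += code * step  # track what the decoder will reconstruct
        step = _adapt(step, code, levels, grow, shrink, min_step)
    return first, codes


def adpcm_decode(first, codes, step=0.5, levels=4, grow=1.5, shrink=0.75, min_step=0.125):
    """Rebuild the depth sequence by replaying the same predictor and step rule."""
    predicted = first
    out = [first]
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code, levels, grow, shrink, min_step)
    return out
```

Because neighbouring depths along an edge differ only slightly, most codes are small integers near zero, which is what makes a subsequent entropy coder effective.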

[0009] Accordingly, the moving image processing apparatus of the present invention comprises: three-dimensional shape acquisition means for acquiring three-dimensional shape information of each segment constituting an input continuous moving image; projection means for projecting the three-dimensional shape acquired by the three-dimensional shape acquisition means onto a predetermined two-dimensional plane; position information encoding means for encoding the position of the image projected onto the two-dimensional plane by the projection means; and depth information encoding means for encoding, by the ADPCM method, the depth from the image projected onto the two-dimensional plane by the projection means to the three-dimensional shape. The apparatus thereby encodes the three-dimensional shape acquired by the three-dimensional shape acquisition means.

[0010] Preferably, the apparatus further comprises entropy encoding means for further entropy-encoding the position of the image of the three-dimensional shape on the two-dimensional plane encoded by the position information encoding means and the depth of the three-dimensional shape encoded by the depth information encoding means.
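As an illustration of the entropy-coding stage, the following is a minimal Huffman coder (the abstract names Huffman encoding as the entropy code). It is a generic textbook construction, not the code tables of the patent; the function names `huffman_table` and `huffman_encode` are hypothetical.

```python
import heapq
from collections import Counter


def huffman_table(symbols):
    """Build a prefix-free code table {symbol: bit string} from symbol counts."""
    counts = Counter(symbols)
    if len(counts) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, partial table); the tie breaker
    # keeps the dictionaries from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def huffman_encode(symbols, table):
    """Concatenate the code words of a symbol sequence."""
    return "".join(table[s] for s in symbols)
```

Applied to a stream of chain codes or ADPCM codes in which a few values dominate, the frequent symbols receive the shortest code words.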

[0011] Specifically, the three-dimensional shape acquisition means comprises: modeling means for analyzing a predetermined still image of the input continuous moving image and obtaining three-dimensional shape information of each segment constituting that image; storage means for storing the three-dimensional shape information of each segment obtained by the modeling means; motion estimation means for estimating, on the basis of the three-dimensional shape information of each segment stored in the storage means, the three-dimensional motion of each segment constituting the moving image between frames of the continuous moving image; and updating means for updating the three-dimensional shape information of each segment stored in the storage means on the basis of the difference between the position that results from moving each segment three-dimensionally by the estimated motion and its actual position.

[0012] The moving image encoding apparatus of the present invention comprises: the moving image processing apparatus described above; motion estimation means for estimating, on the basis of the three-dimensional shape information, the three-dimensional motion of each segment between frames of the continuous moving image; difference detection means for obtaining the difference between the position of each feature point that results from moving each segment three-dimensionally by the motion estimated by the motion estimation means and the actual position of that feature point; and encoding means for encoding, for each frame, the motion estimate of each segment estimated by the motion estimation means and the positional difference of each feature point detected by the difference detection means. The apparatus thereby encodes the continuous moving image.

[0013] The moving image decoding apparatus of the present invention comprises: two-dimensional image decoding means for decoding the encoded position of the image, projected onto a two-dimensional plane, of each segment constituting a continuous moving image; depth information decoding means for decoding the depth of the image on the two-dimensional plane, which has been encoded by the ADPCM method; three-dimensional shape generation means for generating three-dimensional shape information of each segment constituting the moving image on the basis of the position of the image on the two-dimensional plane decoded by the two-dimensional image decoding means and the depth of that image decoded by the depth information decoding means; motion decoding means for decoding the motion estimate of each segment and the displacement of each feature point, which have been encoded for each frame of the continuous moving image; moving means for moving the position of each segment three-dimensionally on the basis of the motion estimate of each segment decoded by the motion decoding means; projection means for obtaining a projection image in which each segment, at the position to which it has been moved by the moving means, is projected onto a two-dimensional screen; deformation means for moving the position of each feature point of each segment in the projection image on the basis of the per-feature-point displacement decoded by the motion decoding means; and image synthesis means for synthesizing an image on the basis of the shape of each segment deformed by the deformation means. The apparatus thereby decodes the encoded continuous moving image.

【００１４】[0014]

[Operation] The moving image processing apparatus of the present invention acquires the three-dimensional shape of each segment constituting an input continuous moving image, projects that shape onto a predetermined two-dimensional plane to obtain a two-dimensional projection image, and then obtains depth information of the three-dimensional shape relative to that projection image. The three-dimensional shape information is thus separated into position information of the two-dimensional projection image and depth information corresponding to that image. The position information of the two-dimensional projection image is encoded by any suitable encoding method, and the depth information is encoded by the ADPCM method. Each encoded signal is then entropy-encoded, giving a high compression ratio.
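One candidate for the "suitable encoding method" for the projected positions is chain coding, which the abstract names. The sketch below is a generic 8-direction Freeman chain code, assuming the edge is given as an ordered list of 8-connected pixel coordinates; the direction numbering and the function names are illustrative assumptions, not taken from the source.

```python
# 8-neighbour steps (dx, dy), indexed by direction code 0..7.
# The numbering convention here is an assumption for illustration.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]
CODE_OF = {step: code for code, step in enumerate(DIRECTIONS)}


def chain_encode(points):
    """Encode an ordered, 8-connected pixel path as (start point, direction codes)."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in CODE_OF:
            raise ValueError(f"points {(x0, y0)} and {(x1, y1)} are not 8-connected")
        codes.append(CODE_OF[step])
    return points[0], codes


def chain_decode(start, codes):
    """Rebuild the pixel path from the start point and the direction codes."""
    x, y = start
    points = [(x, y)]
    for code in codes:
        dx, dy = DIRECTIONS[code]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points
```

Only the start point needs full coordinates; every further point costs one 3-bit symbol before entropy coding, while the depth of each point travels in the separate ADPCM stream.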

[0015] The moving image encoding apparatus of the present invention uses the three-dimensional shape of each segment constituting the continuous moving image, as acquired and encoded by the moving image processing apparatus, to estimate the three-dimensional motion of each segment between frames of the continuous moving image. It then obtains the difference between the position of each feature point given by the estimate and the actual position of that feature point. The motion estimate of each segment and the positional difference of each feature point are encoded for each frame.

[0016] The moving image decoding apparatus of the present invention decodes the encoded position of the image projected onto the two-dimensional plane and the depth information of that image encoded by the ADPCM method, and combines them to generate three-dimensional shape information. For each frame of the moving image, it decodes the encoded motion estimate of each segment and the displacement of each feature point, moves the position of each segment three-dimensionally according to the motion estimate, projects each moved segment onto a two-dimensional screen, and moves the position of each feature point of each segment in the projection image according to the displacement of that feature point. An image is synthesized on the basis of the resulting shape of each segment.

【００１７】[0017]

[Embodiment] A moving image encoding apparatus according to an embodiment of the present invention will be described with reference to FIGS. 1 to 8. FIG. 1 is a block diagram showing the configuration of the moving image encoding apparatus 10 of this embodiment. The moving image encoding apparatus 10 has an image analysis unit 11, an analyzed image storage unit 12, a segmentation unit 13, a storage unit 14, a motion estimation / correspondence search unit 15, a projection unit 16, an error detection unit 17, a Kalman filter 18, and an encoding unit 19.

[0018] The moving image encoding apparatus 10 forms an image processing system together with a moving image decoding apparatus 30 described later. The moving image encoding apparatus 10 of this embodiment is a moving image transmission apparatus that detects continuous moving image data in a moving image sequence from a VTR or the like using a continuous sequence detection unit (not shown), and encodes and transmits that moving image data sequence. The transmission is divided into a first step of extracting, from the continuous sequence, the three-dimensional shape information of the segments constituting the sequence, and a second step of encoding using the extracted three-dimensional shape information. The operation of each unit is described below for the first step and the second step in turn.

[0019] First, the operation of each unit in the first step, which extracts the three-dimensional shape information of each segment constituting the moving image from the continuous moving image sequence, will be described. A continuous sequence is detected in the moving image sequence input from a VTR or the like and is input to the image analysis unit 11. The image analysis unit 11 analyzes the image data of each sequentially input frame, extracts feature points, and obtains the positions and analysis values of the feature points. In this embodiment, the input image data is analyzed with a plurality of filters having different resolution scales, points constituting edges are detected as feature points, the input image data is converted into edge image data serving as feature image data, and the position and analysis value of each point constituting an edge are extracted.

[0020] The analyzed image storage unit 12 is a memory that stores the feature point images of the continuous sequence analyzed by the image analysis unit 11. The stored feature point information of each frame is referred to sequentially by the motion estimation / correspondence search unit 15, and the feature point information of the first frame is referred to by the segmentation unit 13 and the encoding unit 19.

[0021] The segmentation unit 13 performs segmentation on the basis of the feature point positions and analysis values of the feature image data of the first frame input from the image analysis unit 11, extracts the segments constituting the image data, and stores the feature point information of each segment in the storage unit 14. The segmentation is performed by recursive thresholding: nine features in total are extracted from the color image, namely the red, green, blue, lightness, hue, and saturation signals and the Y, I, and Q signals corresponding to television signals, and segmentation is carried out on the basis of histograms of those features.
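The nine per-pixel features and their histograms can be sketched with Python's standard `colorsys` module. This covers only the feature-extraction and histogram steps, not the recursive threshold selection itself. Using the HLS model for lightness, hue, and saturation is an assumption, since the source does not define those three signals; `nine_features` and `histogram` are hypothetical helper names.

```python
import colorsys


def nine_features(r, g, b):
    """Return the nine features of one pixel; r, g, b are floats in [0, 1]."""
    hue, lightness, saturation = colorsys.rgb_to_hls(r, g, b)  # assumed HLS model
    y, i, q = colorsys.rgb_to_yiq(r, g, b)
    return {
        "red": r, "green": g, "blue": b,
        "lightness": lightness, "hue": hue, "saturation": saturation,
        "Y": y, "I": i, "Q": q,
    }


def histogram(values, bins, lo, hi):
    """Count values into equal-width bins over [lo, hi]; out-of-range values are clamped."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        index = int((v - lo) / width)
        counts[min(bins - 1, max(0, index))] += 1
    return counts
```

A recursive thresholding scheme would then look for a clear valley in one of the nine histograms, split the region at that threshold, and repeat on each part until no histogram shows a usable valley.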

[0022] The storage unit 14 is storage means, implemented as a memory, that stores the position information X, Y, Z, the analysis value g, the probability covariance matrix v, and the additional information a for each feature point of each segment. When the input continuous sequence has S segments, each segment s (s = 1 to S) is composed of U_s edges, and each edge u (u = 1 to U_s) is composed of N_su feature points, the information stored in the storage unit 14 is expressed as in Equation 1.

【００２３】[0023]

[Equation 1]
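Equation 1 itself is not reproduced in this text, but the stored record can be sketched from the description as a nested segment / edge / point structure. The class name `FeaturePoint`, the helper names, and the 3×3 list-of-lists representation of v are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any, List

Matrix3 = List[List[float]]  # 3x3 probability covariance matrix (assumed layout)


@dataclass
class FeaturePoint:
    """One record F_sun: position, analysis value, covariance, additional info."""
    X: float
    Y: float
    Z: float
    g: float
    v: Matrix3
    a: Any = None


def make_edge(points, analysis_values, covariance, extra=None):
    """Build the feature points of one edge; all of them share one covariance."""
    return [FeaturePoint(X, Y, Z, g, covariance, extra)
            for (X, Y, Z), g in zip(points, analysis_values)]


def count_points(model):
    """model[s][u][n] is point n of edge u of segment s."""
    return sum(len(edge) for segment in model for edge in segment)
```

Indexing as `model[s][u][n]` mirrors the subscripts s, u, n used in the text, with zero-based rather than one-based indices.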

[0024] Because the probability covariance matrix v_sun (n = 1 to N_su, u = 1 to U_s, s = 1 to S) represents the scatter of the points constituting each edge, the same value is assigned to all feature points belonging to the same edge.

[0025] For the information stored in the storage unit 14, the information on the first frame is first input from the segmentation unit 13 to generate the initial data. Thereafter, each time the image data of the second and subsequent frames is input, the contents are updated by the Kalman filter 18 described later.

[0026] The motion estimation / correspondence search unit 15 estimates the amount by which each segment has moved from the information {F_sun} of each point of the edge image of the previous frame and from the edge positions and analysis values of the current frame, and associates the information {F_sun} of each point of the previous frame with the feature points of the edge image of the current frame.

[0027] The method is described concretely below. In FIG. 2, the coordinate system XYZ is the camera coordinate system, its origin is the center of the lens, and the optical axis coincides with the Z axis, which is the depth direction. In this coordinate system, the image of a point P can be considered to be projected onto a plane parallel to the XY plane and located at a distance equal to the focal length f of the camera from the origin. The position of the image of the point P on this projection plane is the position of the pixel on the image input from the camera. On the projection plane, a coordinate system xy is set whose origin is the intersection of the plane with the Z axis and whose axes are parallel to the X and Y axes.

[0028] If the coordinates of a point P in XYZ space are p = (X_p, Y_p, Z_p) and the coordinates of the point Q, the image of P on the xy plane, are q = (x_q, y_q), then q is expressed by Equation 2.

【００２９】[0029]

[Equation 2]
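The equation image is not reproduced in this text. Under the camera geometry described above (projection plane parallel to the XY plane at distance f from the lens center, optical axis along Z), the standard pinhole projection would read as follows. This is a reconstruction from the surrounding definitions, not a transcription of the original Equation 2.

```latex
q =
\begin{pmatrix} x_q \\ y_q \end{pmatrix}
= \frac{f}{Z_p}
\begin{pmatrix} X_p \\ Y_p \end{pmatrix}
```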

[0030] Suppose that a segment s (s = 1 to S) is composed of U_s edges, that each edge u (u = 1 to U_s) is represented by the information of N_su points, and that the positions of those points are p_sun = (X_sun, Y_sun, Z_sun) (n = 1 to N_su). When this segment rotates relatively by Δωx about the X axis, Δωy about the Y axis, and Δωz about the Z axis, and translates by Δt = (Δtx, Δty, Δtz), the displacement Δp_sun = (ΔX_sun, ΔY_sun, ΔZ_sun) of each point p_sun constituting the segment is given by Equation 3, provided that the rotations Δωx, Δωy, Δωz and the translation Δt are small.

【００３１】[0031]

[Equation 3]
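The equation image is not reproduced here. For small rotations, the usual rigid-body approximation is Δp ≈ Δω × p + Δt, and the sketch below computes it. Treating Equation 3 as this cross-product form, and the right-handed sign convention used, are assumptions; `small_motion` is a hypothetical helper name.

```python
def small_motion(p, d_omega, d_t):
    """Displacement of point p under a small rotation d_omega (radians about
    X, Y, Z) and a small translation d_t, using dp = d_omega x p + d_t."""
    X, Y, Z = p
    wx, wy, wz = d_omega
    tx, ty, tz = d_t
    return (
        wy * Z - wz * Y + tx,  # dX
        wz * X - wx * Z + ty,  # dY
        wx * Y - wy * X + tz,  # dZ
    )
```

For example, a point on the optical axis at depth 5 that rotates by 0.02 rad about the Y axis shifts by about 0.1 along X under this convention, with no first-order change in depth.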

[0032] If the projection of the point p_sun onto the xy plane is q_sun = (x_sun, y_sun), the displacement Δq_sun = (Δx_sun, Δy_sun) of the projected point q_sun caused by the movement of the segment is given by Equation 4.

【００３３】[0033]

[Equation 4]

[0034] Equation 5 is obtained from Equations 2 and 4.

【００３５】[0035]

[Equation 5]

[0036] Applying Equations 2 and 5 to M of the N points (m = 1 to M) gives Equation 6.

【００３７】[0037]

[Equation 6]

[0038] Here, Δt^t denotes the transpose of the matrix Δt. As for Δq, the point on the new image that corresponds to the virtual projection point q_m', obtained by Equation 2 from the three-dimensional position p_m held before the new image was input, is not known. It is therefore assumed, as shown in FIG. 3, that each point of the virtual projection image Ip of the three-dimensional position information corresponds to the nearest projected feature point q_m of the object image Ir. When M ≥ 3, the estimate ΔC' of the rotation and translation parameter ΔC is obtained by the least squares method using Equation 7.
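The nearest-point assumption can be sketched as a brute-force search. This covers only the correspondence step; the least-squares solve of Equation 7 is not reproduced because the equation is not available in this text. The function names are hypothetical.

```python
def nearest_correspondences(virtual_points, observed_points):
    """Pair each virtual projection point with the nearest observed feature point.

    Both arguments are lists of (x, y) tuples; returns a list of
    (virtual point, observed point) pairs.
    """
    pairs = []
    for qv in virtual_points:
        nearest = min(
            observed_points,
            key=lambda qo: (qo[0] - qv[0]) ** 2 + (qo[1] - qv[1]) ** 2,
        )
        pairs.append((qv, nearest))
    return pairs


def total_squared_residual(pairs):
    """Sum of squared distances over all pairs, usable as a convergence measure."""
    return sum((qo[0] - qv[0]) ** 2 + (qo[1] - qv[1]) ** 2 for qv, qo in pairs)
```

The search is quadratic in the number of points, which is adequate for a sketch; a spatial index would be the natural replacement for larger edge images.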

【００３９】[0039]

[Equation 7]

[0040] The displacement Δq_sun of the three-dimensional position information due to the ΔC' obtained from Equation 7 is calculated by Equation 2, a new virtual projection image is produced by Equation 3, the nearest points are again assumed to be the corresponding points, and the calculation of Equation 7 is repeated. As this iteration proceeds as in Equation 8, the virtual projection image approaches the object image Ir.

【００４１】[0041]

[Equation 8]

[0042] By repeating this calculation until Σ|Δq_sun|² falls to or below a predetermined value ε, the corresponding points q_sun in the new image are obtained for the three-dimensional position information p_sun of the original image.

[0043] In the motion estimation and correspondence search method described above, the object is assumed to be a rigid body, and the points constituting the three-dimensional position information are constrained by six parameters for rotation and translation, so that motion estimation and correspondence search are performed globally rather than independently for each point. As a result, a mutually consistent correspondence is obtained for all points, and noise in the three-dimensional position information caused by false correspondences is reduced.

[0044] The projection unit 16 translates and rotates, in three-dimensional space, the position information in the information {F_sun} of each segment by the translation t and rotation ω of each segment obtained by the motion estimation / correspondence search unit 15, converts the three-dimensional position p_sun = (X_sun, Y_sun, Z_sun) of each point of each segment into a position q_sun = (x_sun, y_sun) on the image, and assigns the analysis value g_sun to the resulting image point q_sun. The conversion from the three-dimensional position p_sun to the projected point q_sun is performed by Equation 3.

[0045] The error detection unit 17 obtains the difference between the on-image position qsun' of the information {Fsun} of each feature point after projection and the corresponding edge position qsun in the input image, and then quantizes that difference to obtain the fluctuation. Possible quantization methods include linear quantization with a fixed, suitably chosen quantization step (for example, one pixel width), nonlinear quantization with several non-uniform quantization steps, and quantization in which the step is not fixed but is varied according to the properties of the input image. An appropriate method may be chosen according to the required transmission rate, image quality, and so on. For example, when a high compression rate is required, the quantization step is made larger; when the image contains many straight lines and discontinuities in those lines caused by quantization noise are conspicuous, nonlinear quantization is used so that small fluctuations are quantized more finely. The resulting fluctuation is input to the Kalman filter 18.
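
As a rough illustration of the first two quantization options, the sketch below quantizes one component of a position difference. The step size and the table of non-uniform reconstruction levels are made-up values chosen only to show the idea (fine near zero, coarse for large differences); they are not taken from the embodiment.

```python
import math

def quantize_linear(diff, step=1.0):
    """Uniform quantization: index of the nearest multiple of `step`."""
    return int(math.floor(diff / step + 0.5))

def dequantize_linear(index, step=1.0):
    """Reconstruct the quantized difference from its index."""
    return index * step

# Hypothetical non-uniform reconstruction levels: fine near zero, coarse far away.
NONLINEAR_LEVELS = [-8.0, -4.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 4.0, 8.0]

def quantize_nonlinear(diff, levels=NONLINEAR_LEVELS):
    """Index of the reconstruction level closest to `diff`."""
    return min(range(len(levels)), key=lambda i: abs(levels[i] - diff))
```

Raising `step` in the linear variant trades accuracy for fewer distinct symbols, which corresponds to the high-compression case mentioned in the text.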

[0046] The Kalman filter 18 updates the three-dimensional position psun from the three-dimensional position psun of the information {Fsun} of each point of each segment in the previous image and the corresponding edge position qsun in the input image. A Kalman filter is a filter that, for a system containing noise, sequentially produces a least-squares estimate of the state from a time series of observations. In this embodiment the state is the three-dimensional position psun and the observation is the two-dimensional position qsun. The two-dimensional position qsun contains noise caused by the quantization of Δqsun, and the motion estimate also contains noise. The three-dimensional shape {psun}, which initially lies on a plane, is updated by the Kalman filter each time the segment moves so that it approaches the actual three-dimensional shape. The covariance matrix vsun in the information {Fsun} of each point is the 3 × 3 covariance matrix of psun; it is used to update psun and is itself updated at the same time.
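
One way to picture a single measurement update of psun and vsun is the extended-Kalman-style step below. It is only a sketch under stated assumptions: the observation model is taken to be a pinhole projection with focal length `f`, the measurement noise covariance `r_noise` is supplied by the caller, and the point is assumed to have already been moved by the estimated rigid motion. None of these details are specified in this passage, and the names are hypothetical.

```python
import numpy as np

def project(p, f=1.0):
    """Assumed observation model: pinhole projection of a 3-D point."""
    X, Y, Z = p
    return np.array([f * X / Z, f * Y / Z])

def kalman_update(p, v, q_obs, r_noise, f=1.0):
    """Update the 3-D position p (3,) and covariance v (3x3) from a 2-D observation."""
    X, Y, Z = p
    # Jacobian of the projection with respect to (X, Y, Z).
    H = f * np.array([[1.0 / Z, 0.0, -X / Z**2],
                      [0.0, 1.0 / Z, -Y / Z**2]])
    S = H @ v @ H.T + r_noise           # innovation covariance (2x2)
    K = v @ H.T @ np.linalg.inv(S)      # Kalman gain (3x2)
    p_new = p + K @ (q_obs - project(p, f))
    v_new = (np.eye(3) - K @ H) @ v
    return p_new, v_new
```

Repeating this update over successive frames, with the segment viewed from different poses, is what lets the initially planar shape drift toward the true depth.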

[0047] When the Kalman filter update described above has been carried out for every frame of the continuous moving image, the storage unit 14 finally holds a high-fidelity three-dimensional shape model for each segment.

[0048] Next, the second step, in which the continuous moving image is encoded using the three-dimensional shape model extracted in the first step, is described. In the second step each unit operates in the same way as in the first step. In the second step, however, motion estimation and correspondence search are performed using the final three-dimensional shape information of each segment stored in the storage unit 14, the motion of each segment is extracted, and the difference from the actual position of each feature point is obtained.

[0049] First, therefore, the motion estimation / correspondence search unit 15 uses the three-dimensional shape information of each segment stored in the storage unit 14 to obtain, from the feature point positions of each frame stored in the analysis image storage unit 12, the overall motion of each segment and the correspondence of each feature point. The procedure is the same as in the first step. The motion obtained here is output to the projection unit 16 and the encoding unit 19.

[0050] The projection unit 16 then translates and rotates the position information of each segment in three-dimensional space according to the movement amount of each segment obtained by the motion estimation / correspondence search unit 15, and converts the three-dimensional position of each point of each segment into a position on the image. The error detection unit 17 obtains the difference between the on-image position of each feature point after projection and the corresponding edge position in the input image, and quantizes that difference to obtain the fluctuation. The resulting fluctuation is output to the encoding unit 19.

[0051] Finally, the operation of the encoding unit 19 is described. The encoding unit 19 is the means that encodes the information obtained in the first step and in the second step and sends it to the transmission path, so its operation is described here without distinguishing between the steps. For each continuous image sequence, the encoding unit 19 first encodes the three-dimensional shape information and the analysis values stored in the storage unit 14. For the image data of each frame, it encodes and outputs, for each segment, the motion estimate output from the motion estimation / correspondence search unit 15 and the fluctuation output from the error detection unit 17.

[0052] The configuration and operation of the initial encoding unit 20, which is included in the encoding unit 19 and which, in accordance with the present invention, encodes the three-dimensional shape information of each continuous image sequence, are described with reference to FIGS. 4 to 8. FIG. 4 is a block diagram showing the configuration of the initial encoding unit 20 included in the encoding unit 19 of the moving picture coding apparatus 10 shown in FIG. 1. The initial encoding unit 20 has a projection unit 21, a connected line element extraction unit 22, a chain coding unit 23, a first Huffman code generation unit 24, an ADPCM encoding unit 25, and a second Huffman code generation unit 26.

[0053] The projection unit 21 projects the three-dimensional shape of each segment onto a predetermined reference two-dimensional plane, and obtains the projected two-dimensional image together with depth information from each feature point on that two-dimensional image to the original three-dimensional model. Specifically, as shown in FIG. 5(A), a three-dimensional model 50 of a face is projected onto a two-dimensional plane 51, yielding a two-dimensional face image 52 and, for each feature point, the depth between the two-dimensional face image 52 and the original three-dimensional model 50. The connected line element extraction unit 22 extracts connected line elements of feature points from the two-dimensional image of each segment obtained by the projection unit 21. From the two-dimensional image shown in FIG. 5(A), the six connected line elements 1 to 6 shown in FIG. 5(B) are extracted.

[0054] The position of each extracted connected line element is chain coded by the chain coding unit 23, then Huffman coded by the first Huffman code generation unit 24, and sent to the transmission path as two-dimensional position information. The depth information from each feature point of the two-dimensional image to the original three-dimensional model is ADPCM encoded by the ADPCM encoding unit 25, then Huffman coded by the second Huffman code generation unit 26, and is likewise sent to the transmission path.
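
The chain coding of one connected line element can be illustrated as follows. The text does not say which chain code is used, so this sketch assumes the common Freeman 8-direction code, in which each step to a neighbouring pixel is replaced by a direction number from 0 to 7. The direction numbering and the function names are choices made for this example.

```python
# Freeman 8-direction codes: 0 = +x, numbered counter-clockwise (y axis pointing up).
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def chain_encode(pixels):
    """Encode an 8-connected pixel path as (start point, list of direction codes).

    Raises ValueError if two consecutive pixels are not 8-neighbours.
    """
    start = pixels[0]
    codes = []
    for (x0, y0), (x1, y1) in zip(pixels, pixels[1:]):
        codes.append(DIRECTIONS.index((x1 - x0, y1 - y0)))
    return start, codes

def chain_decode(start, codes):
    """Rebuild the pixel path from the start point and the direction codes."""
    pixels = [start]
    for c in codes:
        dx, dy = DIRECTIONS[c]
        x, y = pixels[-1]
        pixels.append((x + dx, y + dy))
    return pixels
```

Only the start point needs full coordinates; every later pixel costs one small symbol, which is what makes the subsequent Huffman stage effective.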

[0055] Why encoding the depth information with the ADPCM method encodes the three-dimensional shape data efficiently is explained with reference to FIGS. 6 to 8. FIG. 6 shows the depth of each connected line element; (A) to (F) show the depth of connected line elements 1 to 6, respectively. As FIG. 6 shows, the depth values of a three-dimensional shape model are generally correlated along the connection direction and do not change abruptly.

[0056] Taking the depth difference between adjacent pixels therefore gives the signals shown in FIG. 7. FIG. 7 shows the depth differences of each connected line element; (A) to (F) show the depth differences of connected line elements 1 to 6, respectively. As FIG. 7 shows, the depth differences take values near 0. Furthermore, tallying the depth difference values of all pixels of the connected line elements shown in FIG. 7 gives the distribution in FIG. 8, which is concentrated near 0. Such a signal can therefore be compressed efficiently by an encoding method that performs first-order prediction by taking differences of the depth values.
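
The effect described here can be reproduced with a few lines of code. The sketch below uses plain first-order differencing (DPCM without quantization or step adaptation), which is a simplification of the ADPCM method named in the text, and the depth samples are made-up numbers rather than data from FIGS. 6 to 8.

```python
from collections import Counter

def dpcm_encode(depths):
    """First-order prediction: keep the first value, then differences to the previous value."""
    diffs = [depths[0]]
    for prev, cur in zip(depths, depths[1:]):
        diffs.append(cur - prev)
    return diffs

def dpcm_decode(diffs):
    """Invert dpcm_encode by accumulating the differences."""
    depths = [diffs[0]]
    for d in diffs[1:]:
        depths.append(depths[-1] + d)
    return depths

# Made-up depth samples along one connected line element.
depths = [40, 41, 41, 42, 42, 42, 41, 40, 40, 39]
diffs = dpcm_encode(depths)
histogram = Counter(diffs[1:])   # distribution of the differences
```

The raw depths span several values around 40, while every difference falls in {-1, 0, 1}; a distribution this narrow is what an entropy coder exploits.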

[0057] As described above, the initial encoding unit 20 of the moving picture coding apparatus 10 of this embodiment projects the three-dimensional shape to be transmitted as initial data onto a two-dimensional plane and separates it into two-dimensional image information and depth values. The two-dimensional image information is chain coded and then Huffman coded, the depth values are ADPCM encoded and then Huffman coded, and each is transmitted. The three-dimensional shape model can therefore be transmitted efficiently at a high compression rate.
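
The final entropy-coding stage of both branches can be sketched with a textbook Huffman code builder. This is a generic construction, not the specific code tables of the first and second Huffman code generation units, which are not described in this passage; the symbol sequence in the usage note is invented.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return a {symbol: bitstring} prefix code built from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]

def encode(symbols, table):
    """Concatenate the code words of all symbols."""
    return "".join(table[s] for s in symbols)
```

For a difference sequence such as `[0, 0, 0, 0, 1, 1, -1, -1, -1, 0]`, the most frequent symbol 0 receives the shortest code word, so a signal concentrated near zero needs fewer bits than a fixed-length representation.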

[0058] Next, a moving picture decoding apparatus according to an embodiment of the present invention is described with reference to FIGS. 9 and 10. FIG. 9 is a block diagram showing the configuration of the moving picture decoding apparatus 30 according to an embodiment of the present invention. The moving picture decoding apparatus 30 has a decoding unit 31, a storage unit 32, a motion processing unit 33, a projection unit 34, a deformation unit 35, a resynthesis unit 36, and a synthesized image storage unit 37. The moving picture decoding apparatus 30 of this embodiment expands, synthesizes, and outputs an encoded moving image sequence received from the transmission path, and forms an image processing system together with the moving picture coding apparatus 10 described above.

[0059] The operation of each unit is described below. The decoding unit 31 is a receiving means that receives the signal transmitted over the transmission path, decodes it to extract each piece of information, and outputs the information to the appropriate units. The decoding unit 31 first receives and decodes the three-dimensional shape information of each segment constituting the continuous moving image sequence and stores it in the storage unit 32. The position of each segment at that point is expressed as its position in the first frame of the continuous moving image. For the second and subsequent frames, the decoding unit 31 receives and decodes the encoded overall movement of each segment (global motion) and the fine movement of each feature point (fluctuation); the global motion is output to the motion processing unit 33 and the fluctuation to the deformation unit 35.

[0060] The configuration of the initial decoding unit 40, which is included in the decoding unit 31 and which, in accordance with the present invention, receives the three-dimensional shape information of each segment, is described with reference to FIG. 10. FIG. 10 is a block diagram showing the configuration of the initial decoding unit 40 included in the decoding unit 31 of the moving picture decoding apparatus 30 shown in FIG. 9. The initial decoding unit 40 has a first Huffman code decoding unit 41, a chain code decoding unit 42, a second Huffman code decoding unit 43, an ADPCM decoding unit 44, and a combining unit 45.

[0061] The position of the image of each segment projected onto the two-dimensional plane, which has been chain coded and then Huffman coded, is first Huffman decoded by the first Huffman code decoding unit 41 and then chain-code decoded by the chain code decoding unit 42. The depth information of the image on the two-dimensional plane, which has been ADPCM encoded and then Huffman coded, is first Huffman decoded by the second Huffman code decoding unit 43, after which the ADPCM-encoded depth information is decoded by the ADPCM decoding unit 44. The image projected onto the two-dimensional plane and the depth information for that projected image are then combined by the combining unit 45 to generate three-dimensional shape information, which is stored in the storage unit 32.
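
The role of the combining unit can be pictured with the sketch below, which starts after both Huffman decoding stages (the symbol streams are assumed to be already recovered). It assumes a Freeman 8-direction chain code, plain first-order depth differences in place of full ADPCM, and a reference plane at z = 0 with depth measured along its normal, so that a point is simply (x, y, depth). All of these are assumptions of the example, not details stated in the text.

```python
# Freeman 8-direction codes: 0 = +x, numbered counter-clockwise (y axis pointing up).
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def decode_positions(start, codes):
    """Chain-code decoding: rebuild the 2-D pixel path of one line element."""
    pts = [start]
    for c in codes:
        dx, dy = DIRECTIONS[c]
        pts.append((pts[-1][0] + dx, pts[-1][1] + dy))
    return pts

def decode_depths(first, diffs):
    """Differential decoding: accumulate depth differences from the first depth."""
    depths = [first]
    for d in diffs:
        depths.append(depths[-1] + d)
    return depths

def combine(start, codes, first_depth, depth_diffs):
    """Merge decoded 2-D positions and depths into 3-D points (x, y, z)."""
    positions = decode_positions(start, codes)
    depths = decode_depths(first_depth, depth_diffs)
    if len(positions) != len(depths):
        raise ValueError("position and depth streams have different lengths")
    return [(x, y, z) for (x, y), z in zip(positions, depths)]
```

The length check reflects the fact that the two streams are transmitted separately and must describe the same feature points, one depth per decoded position.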

[0062] The storage unit 32 is a memory that stores the three-dimensional shape information of each segment constituting the continuous moving image sequence and the position information of each segment. The storage unit 32 stores, as initial values, the three-dimensional shape information of each segment input from the decoding unit 31; thereafter the positions are updated for each frame by the motion processing unit 33. The motion processing unit 33 moves each segment stored in the storage unit 32 based on the motion estimate input from the decoding unit 31. It outputs the moved information to the projection unit 34 and updates the position information of each segment in the storage unit 32.

[0063] The projection unit 34 projects each segment moved by the motion processing unit 33 onto a two-dimensional image. The deformation unit 35 adds the fluctuation input from the decoding unit 31 to the position of each feature point of the image projected by the projection unit 34, thereby correcting the position of each feature point. The resynthesis unit 36 restores the image data based on the information of each feature point input from the deformation unit 35, the analysis value of each feature point stored in the storage unit 32, and the DC component of the continuous moving image. The synthesized image storage unit 37 is a memory that stores the image data restored by the resynthesis unit 36. The moving image information stored in the synthesized image storage unit 37 is output to a display device or the like as appropriate.

[0064] As described above, the moving picture decoding apparatus 30 of this embodiment can properly receive and reproduce the moving image information encoded and sent by the moving picture coding apparatus of this embodiment. In particular, it can properly receive three-dimensional shape information that has been transmitted after chain coding and ADPCM encoding, and regenerate the three-dimensional shape information from it.

[0065]

[Effects of the Invention] According to the moving image processing apparatus of the present invention, the three-dimensional shape information of each segment constituting a moving image can be encoded efficiently. According to the moving picture coding apparatus of the present invention, the amount of initial data can be reduced and a moving image can be encoded at a high compression rate. According to the moving picture decoding apparatus of the present invention, a moving image encoded at a high compression rate by the moving picture coding apparatus can be reproduced properly.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a moving picture coding apparatus according to an embodiment of the present invention.

FIG. 2 is a diagram for explaining the method used by the motion estimation / correspondence search unit of the moving picture coding apparatus shown in FIG. 1; it shows a point P in three-dimensional space being viewed and illustrates the coordinate system.

FIG. 3 is a diagram for explaining the motion estimation / correspondence search method of the moving picture coding apparatus shown in FIG. 1.

FIG. 4 is a block diagram showing the configuration of the initial encoding unit included in the encoding unit of the moving picture coding apparatus shown in FIG. 1.

FIG. 5 is a diagram for explaining the operation of the projection unit of the initial encoding unit shown in FIG. 4; (A) shows the projection being performed, and (B) shows the connected line elements on the two-dimensional plane that result from the projection.

FIG. 6 is a diagram showing the depth of each connected line element of the projected image shown in FIG. 5; (A) to (F) show the depth of connected line elements 1 to 6, respectively.

FIG. 7 is a diagram showing the depth differences of each connected line element of the projected image shown in FIG. 5; (A) to (F) show the depth differences of connected line elements 1 to 6, respectively.

FIG. 8 is a diagram showing the distribution of the depth differences of the connected line elements shown in FIG. 7.

FIG. 9 is a block diagram showing the configuration of a moving picture decoding apparatus according to an embodiment of the present invention.

FIG. 10 is a block diagram showing the configuration of the initial decoding unit included in the decoding unit of the moving picture decoding apparatus shown in FIG. 9.

[Explanation of symbols]

10 ... moving picture coding apparatus
11 ... image analysis unit
12 ... analysis image storage unit
13 ... segmentation unit
14 ... storage unit
15 ... motion estimation / correspondence search unit
16 ... projection unit
17 ... error detection unit
18 ... Kalman filter
19 ... encoding unit
20 ... initial encoding unit
21 ... projection unit
22 ... connected line element extraction unit
23 ... chain coding unit
24 ... first Huffman code generation unit
25 ... ADPCM encoding unit
26 ... second Huffman code generation unit
30 ... moving picture decoding apparatus
31 ... decoding unit
32 ... storage unit
33 ... motion processing unit
34 ... projection unit
35 ... deformation unit
36 ... resynthesis unit
37 ... synthesized image storage unit
40 ... initial decoding unit
41 ... first Huffman code decoding unit
42 ... chain code decoding unit
43 ... second Huffman code decoding unit
44 ... ADPCM decoding unit
45 ... combining unit

Continuation of front page: (51) Int. Cl.⁶ H04N 7/24

Claims

[Claims]

1. A moving image processing apparatus for encoding a three-dimensional shape, comprising: three-dimensional shape acquisition means for acquiring three-dimensional shape information of each segment constituting a continuous moving image; projection means for projecting the three-dimensional shape acquired by the three-dimensional shape acquisition means onto a predetermined two-dimensional plane; position information encoding means for encoding the position of the image on the two-dimensional plane projected by the projection means; and depth information encoding means for encoding, by the adaptive differential pulse code modulation (ADPCM) method, the depth from the image on the two-dimensional plane projected by the projection means to the three-dimensional shape.

2. The moving image processing apparatus according to claim 1, further comprising entropy coding means for entropy coding the position of the image of the three-dimensional shape on the two-dimensional plane encoded by the position information encoding means and the depth of the three-dimensional shape encoded by the depth information encoding means.

3. The moving image processing apparatus according to claim 1, wherein the three-dimensional shape acquisition means comprises: modeling means for analyzing a predetermined still image of the continuous moving image and obtaining three-dimensional shape information of each segment constituting the image; storage means for storing the three-dimensional shape information of each segment obtained by the modeling means; motion estimation means for estimating the three-dimensional movement, between frames of the continuous moving image, of each segment constituting the moving image, based on the three-dimensional shape information of each segment stored in the storage means; and updating means for updating the three-dimensional shape information of each segment stored in the storage means, based on the difference between the actual position and the position resulting from moving each segment three-dimensionally by the estimated motion.

4. A moving picture coding apparatus for encoding a continuous moving image, comprising: the moving image processing apparatus according to claim 1; motion estimation means for estimating, based on the three-dimensional shape information, the three-dimensional movement of each segment between frames of the continuous moving image; difference detection means for obtaining the difference between the position of each feature point resulting from moving each segment three-dimensionally by the motion estimated by the motion estimation means and the position of the actual feature point; and encoding means for encoding, for each frame, the motion estimation value of each segment estimated by the motion estimation means and the differences in feature point positions detected by the difference detection means.

5. A moving picture decoding apparatus for decoding an encoded continuous moving image, comprising: two-dimensional image decoding means for decoding the encoded position of the image, projected onto a two-dimensional plane, of each segment constituting the continuous moving image; depth information decoding means for decoding the depth of the image on the two-dimensional plane, encoded by the adaptive differential pulse code modulation (ADPCM) method; three-dimensional shape generation means for generating three-dimensional shape information of each segment constituting the moving image, based on the position of the image on the two-dimensional plane decoded by the two-dimensional image decoding means and the depth of the image on the two-dimensional plane decoded by the depth information decoding means; motion decoding means for decoding the motion estimation value of each segment and the displacement of each feature point, encoded for each frame of the continuous moving image; moving means for moving the position of each segment three-dimensionally based on the motion estimation value of each segment decoded by the motion decoding means; projection means for obtaining a projection image by projecting each segment, at the position to which it was moved by the moving means, onto a two-dimensional screen; deforming means for moving the position of each feature point of each segment in the projection image based on the displacement of each feature point of each segment decoded by the motion decoding means; and image synthesizing means for synthesizing an image based on the shape of each segment deformed by the deforming means.