JPH1066088A

JPH1066088A - Multi-point video conference control device

Info

Publication number: JPH1066088A
Application number: JP23368896A
Authority: JP
Inventors: Giichi Watanabe; 義一渡邊
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1996-08-16
Filing date: 1996-08-16
Publication date: 1998-03-06

Abstract

PROBLEM TO BE SOLVED: To calculate, by a simple operation and almost accurately, the motion vector information to be added to the encoded composite moving image transmitted to each video conference terminal device, by deriving it from the motion vector information obtained when each original moving image is decoded, the composite position, and the reduction ratio used for reduction compositing. SOLUTION: A system control unit 22 reads, from a motion vector table, the motion vector values of the four macroblocks (MBs) located at the position corresponding to the MB that a video encoding/decoding unit 26 has read from an audio/video multiplex unit 27. Motion vector information calculated from the motion vector information obtained when decoding the original moving images that make up the composite moving image, the composite position, and the reduction ratio is added to each encoded composite moving image. The motion vector information to be added to the encoded composite moving image can thus be calculated almost accurately, without having to consider how strongly the motion vector information of the original moving images correlates with that of the composite moving image.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a multipoint video conference control device. The device has at least the following functions: it decodes the encoded original moving images received from video conference terminal devices installed at a plurality of points, each of which has been motion-compensated interframe predictive coded with motion vector information; it creates a composite moving image by reducing and compositing the original moving images obtained from the terminal devices; it generates an encoded composite moving image by again applying motion-compensated interframe predictive coding with motion vector information to the composite moving image; and it transmits the encoded composite moving image to each video conference terminal device.

[0002]

[Description of the Related Art] The communication scheme of most current video conference systems conforms to ITU-T Recommendation H.320, and the moving-image encoding and decoding scheme conforms to ITU-T Recommendation H.261, which belongs to the same series.

[0003] The moving-image coding scheme defined in ITU-T Recommendation H.261 is an interframe predictive coding scheme that encodes the difference between frames. The Recommendation also defines, as an option, motion-compensated interframe predictive coding, which reduces the prediction error by comparing successive frames to detect the motion of an object and shifting the object in the previous frame by the amount it moved before predicting the next frame. Most video conference terminal devices and multipoint video conference control devices support this motion-compensated interframe predictive coding.

[0004] In motion-compensated interframe predictive coding, a motion vector is detected by computation for each of the many small areas (blocks) that make up a frame.

[0005] Known methods of detecting motion vectors include the block matching method and the gradient method. ITU-T Recommendation H.261 adopts block matching: for each macroblock (MB: 16 × 16 pixels), the block in the previous frame that most closely approximates the macroblock of interest in the current frame is searched for within a range of ±15 pixels, and the difference between the two pixel positions is taken as the motion vector of that macroblock.
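The block matching search described above can be sketched as follows. This is a minimal illustration, not the encoder of any particular product: frames are assumed to be plain lists of rows of luminance values, the matching criterion is assumed to be the sum of absolute differences (the text does not name one), and the function names `sad` and `full_search` are hypothetical.

```python
def sad(cur, prev, cx, cy, px, py, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - prev[py + dy][px + dx])
    return total


def full_search(cur, prev, mb_x, mb_y, size=16, search=15):
    """Return the (dx, dy) offset of the best match in prev for one macroblock.

    (mb_x, mb_y) is the top-left pixel of the macroblock in the current frame.
    Every offset within +/-search pixels is tried; candidates that would fall
    outside the previous frame are skipped (an assumption of this sketch).
    """
    height, width = len(prev), len(prev[0])
    best = (0, 0)
    best_cost = sad(cur, prev, mb_x, mb_y, mb_x, mb_y, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = mb_x + dx, mb_y + dy
            if px < 0 or py < 0 or px + size > width or py + size > height:
                continue
            cost = sad(cur, prev, mb_x, mb_y, px, py, size)
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best
```

With the default arguments the two nested loops visit 31 × 31 candidate offsets per macroblock, which is the cost the following paragraphs are concerned with.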

[0006] Encoding in the motion-compensated interframe predictive coding scheme consists of many data processing steps besides motion vector detection, such as DCT, inverse DCT, quantization, inverse quantization, and variable-length coding. Frames occur about 30 times per second, and each frame contains many macroblocks (396, that is 22 × 18, when the frame format is FCIF at 352 pixels × 288 lines). Motion vector detection requires 961 comparison operations for each of these macroblocks, one for every position in the ±15-pixel search range (31 × 31 = 961). This processing accounts for more than half of the total encoding workload.
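Multiplying out the figures in the paragraph above gives a feel for the load. The short calculation below uses only the numbers stated in the text; the final line additionally assumes that each block comparison touches all 16 × 16 pixel pairs, which the text does not state.

```python
MB_SIZE = 16
WIDTH, HEIGHT = 352, 288   # FCIF
SEARCH = 15                # +/-15 pixel search range
FPS = 30                   # "about 30" frames per second per the text

mbs_per_frame = (WIDTH // MB_SIZE) * (HEIGHT // MB_SIZE)   # 22 * 18
candidates_per_mb = (2 * SEARCH + 1) ** 2                  # 31 * 31
block_comparisons_per_second = mbs_per_frame * candidates_per_mb * FPS

# Assumption: one block comparison evaluates every pixel pair in the block.
pixel_differences_per_second = block_comparisons_per_second * MB_SIZE * MB_SIZE

print(mbs_per_frame, candidates_per_mb, block_comparisons_per_second)
```

That is roughly 11.4 million block comparisons per second for a single encoder, before any of the other coding steps are counted.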

[0007] A multipoint video conference control device, for its part, mixes the moving images received from the video conference terminal devices as appropriate, using one of the mixing schemes described in the draft of ITU-T Recommendation T.120, and transmits the result to each video conference terminal device. Of these schemes, the moving-image processing performed by a multipoint video conference control device that uses transcoder-based mixing is outlined below with reference to FIG. 17. FIG. 17 assumes that five video conference terminal devices (hereinafter simply "terminals"), A through E, are connected to the multipoint video conference control device.

[0008] In FIG. 17, the multipoint video conference control device decodes the encoded original moving images received from terminals A through E to obtain the original moving images, reduces each original moving image to one half in both height and width to create reduced moving images, and composites the reduced moving images separately for each terminal. That is, for terminal A it creates a composite moving image from the reduced versions of the original moving images received from terminals B through E, excluding the image received from terminal A itself, and it creates composite moving images for the other terminals in the same way. It then re-encodes the composite moving image for each terminal to create an encoded composite moving image and transmits it to that terminal.
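The reduce-and-composite step for five terminals can be sketched as below. The text specifies only the 1/2 reduction and the exclusion of the destination terminal's own image; averaging each 2 × 2 pixel group, the alphabetical quadrant order, and the names `reduce_half` and `compose_for` are assumptions of this sketch. Frame dimensions are assumed to be even.

```python
def reduce_half(frame):
    """Halve a frame in both directions by averaging each 2 x 2 pixel group."""
    return [
        [
            (frame[y][x] + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
            for x in range(0, len(frame[0]), 2)
        ]
        for y in range(0, len(frame), 2)
    ]


def compose_for(target, frames):
    """Tile the reduced frames of the four other terminals into one composite.

    frames maps a terminal name to its decoded frame (list of pixel rows).
    Returns the composite and a dict giving each source's top-left position.
    """
    others = [name for name in sorted(frames) if name != target]
    reduced = [reduce_half(frames[name]) for name in others]
    h, w = len(reduced[0]), len(reduced[0][0])
    canvas = [[0] * (2 * w) for _ in range(2 * h)]
    layout = {}
    for index, (name, small) in enumerate(zip(others, reduced)):
        ox, oy = (index % 2) * w, (index // 2) * h   # quadrant origin in pixels
        layout[name] = (ox, oy)
        for y in range(h):
            canvas[oy + y][ox:ox + w] = small[y]
    return canvas, layout
```

Calling `compose_for` once per terminal yields five different composites, which is why the transcoder approach needs one re-encoding pass per terminal.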

[0009]

[Problem to Be Solved by the Invention] As described above, transcoder-based mixing first decodes the encoded original moving images received from the terminals back into original moving images, so reduction and compositing can then be applied freely. This gives the multipoint video conference control device the advantage of being able to perform a wide variety of moving-image processing. On the other hand, the moving image destined for each terminal must be re-encoded concurrently for every terminal, so the device must be equipped with as many encoders as there are communication channels that can be connected to terminals.

[0010] The encoder for each communication channel must offer functions equivalent to those of the encoder in a single video conference terminal device. If these encoders support motion-compensated interframe predictive coding, the multipoint video conference control device must contain one encoder with a motion vector detection function per communication channel. As noted above, motion vector detection accounts for more than half of the total encoding workload. A multipoint video conference control device has therefore needed a high-performance processor to supply motion vector detection for every communication channel, and the cost of the device has risen accordingly.

[0011] The present invention has been made in view of these circumstances. Its object is to provide a multipoint video conference control device that can calculate, by a simple operation, the motion vector information to be added to the encoded composite moving image transmitted to each video conference terminal device.

[0012]

[Means for Solving the Problem] To achieve the above object, the multipoint video conference control device according to claim 1 comprises at least: receiving means that connects to video conference terminal devices installed at a plurality of points and receives from each of them an encoded original moving image that has been motion-compensated interframe predictive coded with motion vector information added for each small area making up a frame; decoding means that decodes the encoded original moving image received from each video conference terminal device; reduction compositing means that creates a composite moving image by reducing and compositing the decoded original moving images; re-encoding means that generates an encoded composite moving image by motion-compensated interframe predictive coding of the composite moving image, with motion vector information added for each small area making up a frame; and transmitting means that transmits the encoded composite moving image to each video conference terminal device. The device is characterized by further comprising: current-frame motion vector storage means that stores, for each current frame, the motion vector information obtained when the decoding means decodes the current frame of the encoded original moving image received from each video conference terminal device; and motion vector calculating means that calculates the motion vector information the re-encoding means adds to each small area of the current frame of the encoded composite moving image, based on the motion vector information that the current-frame motion vector storage means holds for the current frames of the original moving images reduced and composited into that composite moving image, the composite position of each original moving image on the composite moving image, and the reduction ratio applied to the original moving images by the reduction compositing means.
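One way to picture the calculation named in claim 1 is sketched below for a 1/2 reduction, where one composite macroblock covers a 2 × 2 group of source macroblocks (the abstract mentions reading the vectors of four macroblocks). How the stored vectors are combined is not spelled out in this section, so the averaging and rounding used here are assumptions, as are the name `derive_vector` and the table layout.

```python
def derive_vector(mv_table, mb_x, mb_y, origin_mb, scale=2):
    """Estimate the motion vector of composite macroblock (mb_x, mb_y).

    mv_table[row][col] holds the (dx, dy) vector stored when the source image
    was decoded, one entry per source macroblock.
    origin_mb is the composite position of that source image, in macroblocks.
    scale is the inverse of the reduction ratio (2 for a 1/2 reduction).
    """
    local_x, local_y = mb_x - origin_mb[0], mb_y - origin_mb[1]
    sum_x = sum_y = 0
    for j in range(scale):
        for i in range(scale):
            dx, dy = mv_table[local_y * scale + j][local_x * scale + i]
            sum_x += dx
            sum_y += dy
    count = scale * scale
    # Assumption: average the covered source vectors, then shrink the result
    # by the reduction ratio because the picture itself was shrunk.
    return (round(sum_x / (count * scale)), round(sum_y / (count * scale)))
```

The point of the sketch is the cost: a handful of table reads and additions per macroblock replace the 961-candidate search of a conventional encoder.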

[0013] The multipoint video conference control device according to claim 2 comprises at least: receiving means that connects to video conference terminal devices installed at a plurality of points and receives from each of them an encoded original moving image that has been motion-compensated interframe predictive coded with motion vector information added for each small area making up a frame; decoding means that decodes the encoded original moving image received from each video conference terminal device; reduction compositing means that creates a composite moving image by reducing and compositing the decoded original moving images; re-encoding means that generates an encoded composite moving image by motion-compensated interframe predictive coding of the composite moving image, with motion vector information added for each small area making up a frame; and transmitting means that transmits the encoded composite moving image to each video conference terminal device. The device is characterized by further comprising: current-frame motion vector storage means that stores, for each current frame, the motion vector information obtained when the decoding means decodes the current frame of the encoded original moving image received from each video conference terminal device; and search range center calculating means that calculates, for each small area, the center of the search range for the motion vector information that the re-encoding means adds to each small area of the current frame of the encoded composite moving image, based on the motion vector information that the current-frame motion vector storage means holds for the current frames of the original moving images reduced and composited into that composite moving image, the composite position of each original moving image on the composite moving image, and the reduction ratio applied to the original moving images by the reduction compositing means. The re-encoding means searches for the motion vector information to be added to each small area of the current frame of the encoded composite moving image within a predetermined range centered on the search range center calculated for that small area by the search range center calculating means.
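The search of claim 2 can be sketched as a block matching search confined to a small window around a precomputed center. The window radius (the "predetermined range"), the sum-of-absolute-differences criterion, and the names `block_cost` and `search_around` are assumptions of this sketch.

```python
def block_cost(cur, prev, cx, cy, px, py, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(cur[cy + j][cx + i] - prev[py + j][px + i])
        for j in range(size)
        for i in range(size)
    )


def search_around(cur, prev, mb_x, mb_y, center, radius=2, size=16):
    """Search offsets within +/-radius of `center` and return the best (dx, dy).

    center is the predicted vector supplied by the search range center
    calculation. Returns None if every candidate lies outside the frame.
    """
    height, width = len(prev), len(prev[0])
    best, best_cost = None, None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            px, py = mb_x + dx, mb_y + dy
            if px < 0 or py < 0 or px + size > width or py + size > height:
                continue
            cost = block_cost(cur, prev, mb_x, mb_y, px, py, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

With a radius of 2 the search visits at most 25 candidates per macroblock instead of 961, while still correcting a prediction that is off by a pixel or two.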

[0014] The multipoint video conference control device according to claim 3 is the device according to claim 1, further comprising previous-frame motion vector storage means that stores the motion vector information obtained when the decoding means decodes the frame one frame before the current frame of the encoded original moving image received from each video conference terminal device. When a frame skip occurs while the re-encoding means is encoding the composite moving image, the motion vector calculating means calculates the motion vector information that the re-encoding means adds to each small area of the current frame of the encoded composite moving image based on the following: motion vector information obtained by adding together, for each small area of the current frame of each original moving image, the motion vector that the current-frame motion vector storage means holds for that small area and the motion vector that the previous-frame motion vector storage means holds, for the frame one frame before the current frame of that original moving image, for the small area pointed to by the current-frame vector; the composite position of each original moving image on the composite moving image; and the reduction ratio applied to the original moving images by the reduction compositing means.
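The vector addition across a skipped frame can be sketched as follows. The text says only that the current-frame vector is added to the previous-frame vector of the small area it points to; resolving "points to" as the nearest macroblock, clamping at the frame edge, and the name `accumulate_over_skip` are assumptions of this sketch.

```python
def accumulate_over_skip(cur_mvs, prev_mvs, mb_size=16):
    """Combine current-frame and previous-frame vectors per macroblock.

    cur_mvs[r][c] and prev_mvs[r][c] hold the (dx, dy) vectors stored for the
    current frame and for the frame one before it. The result spans two frame
    intervals, which is what the re-encoder needs after skipping a frame.
    """
    rows, cols = len(cur_mvs), len(cur_mvs[0])
    total = []
    for r in range(rows):
        row = []
        for c in range(cols):
            dx, dy = cur_mvs[r][c]
            # Macroblock of the previous frame that the current vector points
            # at (nearest macroblock, clamped to the frame: an assumption).
            pc = min(max(c + round(dx / mb_size), 0), cols - 1)
            pr = min(max(r + round(dy / mb_size), 0), rows - 1)
            pdx, pdy = prev_mvs[pr][pc]
            row.append((dx + pdx, dy + pdy))
        total.append(row)
    return total
```

The summed table would then be fed to the same position-and-reduction calculation used when no frame is skipped.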

[0015] The multipoint video conference control device according to claim 4 is the device according to claim 2, further comprising previous-frame motion vector storage means that stores the motion vector information obtained when the decoding means decodes the frame one frame before the current frame of the encoded original moving image received from each video conference terminal device. When a frame skip occurs while the re-encoding means is encoding the composite moving image, the search range center calculating means calculates the center of the search range for the motion vector information that the re-encoding means adds to each small area of the current frame of the encoded composite moving image based on the following: motion vector information obtained by adding together, for each small area of the current frame of each original moving image, the motion vector that the current-frame motion vector storage means holds for that small area and the motion vector that the previous-frame motion vector storage means holds, for the frame one frame before the current frame of that original moving image, for the small area pointed to by the current-frame vector; the composite position of each original moving image on the composite moving image; and the reduction ratio applied to the original moving images by the reduction compositing means.

[0016]

[Embodiments of the Invention] A multipoint video conference control device according to an embodiment of the present invention is described in detail below with reference to the accompanying drawings.

[0017] FIG. 1 shows the configuration of a video conference system that includes the multipoint video conference control device according to the present embodiment.

[0018] In FIG. 1, reference numerals 1, 19, and 20 denote video conference terminal devices of identical configuration that relate to the present invention; each is connected to the ISDN network by an ISDN line 18. Although not shown, the video conference terminal devices relating to the present invention are not limited to the three devices 1, 19, and 20. Reference numeral 21 denotes the multipoint video conference control device according to the present invention, which is connected to the ISDN network by ISDN lines 28.

[0019] FIG. 2 shows the block configuration of video conference terminal device 1, one of the video conference terminal devices relating to the present invention.

[0020] In FIG. 2, 2 is a system control unit that controls the entire system and comprises a CPU, memory, a timer, and so on; 3 is a magnetic disk device for storing programs and data; 4 is an ISDN interface unit that performs ISDN layer 1 signal processing and D-channel layer 2 signal processing; 5 is a multimedia multiplexing/demultiplexing unit that multiplexes and demultiplexes the data of multiple media using the signal processing defined in ITU-T Recommendation H.221; 6 is a microphone for audio input; 7 is an audio input processing unit that amplifies the input signal from the microphone 6 and then A/D-converts it; 8 is an audio encoding/decoding unit that performs encoding, decoding, and echo cancellation of audio signals; 9 is an audio output processing unit that D/A-converts and then amplifies the audio signal decoded by the audio encoding/decoding unit 8; 10 is a speaker that outputs the audio from the audio output processing unit 9; 11 is a video camera for video input; 12 is a video input processing unit that applies signal processing such as NTSC decoding and A/D conversion to the video signal from the video camera 11; 13 is a video encoding/decoding unit that encodes and decodes moving images in conformance with ITU-T Recommendation H.261; 14 is a video output processing unit that applies signal processing such as D/A conversion, NTSC encoding, and graphics compositing to the video signal decoded by the video encoding/decoding unit 13; 15 is a monitor that displays received video and graphics information; 16 is a user interface control unit that controls the console; 17 is a console comprising operation keys and a display; and 18 is an ISDN line.

[0021] FIG. 3 shows the block configuration of the multipoint video conference control device 21. In FIG. 3, 22 is a system control unit that controls the entire system and comprises a CPU, memory, a timer, and so on.

[0022] Reference numeral 23 is an ISDN interface unit, 24 is a multimedia multiplexing/demultiplexing unit, 25 is an audio encoding/decoding unit that encodes and decodes audio signals, and 26 is a video encoding/decoding unit that encodes and decodes moving images in conformance with ITU-T Recommendation H.261. Components 23 through 26 constitute communication channel 1.

[0023] The configuration described above is that of communication channel 1. As illustrated, the multipoint video conference control device 21 has communication channels 1 through n. The channels other than communication channel 1, although not drawn in detail, have the same configuration as communication channel 1, and each is connected to an ISDN line.

[0024] Reference numeral 27 is an audio/video multiplex unit that composites, across channels, the audio and moving-image data decoded in each communication channel and delivers the result to each channel, and 28 denotes the ISDN lines connected to the communication channels. In the figure, the connection from the audio encoding/decoding unit 25 to the audio/video multiplex unit 27, the connection from the video encoding/decoding unit 26 to the audio/video multiplex unit 27, and the connection from the audio/video multiplex unit 27 to the video encoding/decoding unit 26 are each drawn as a single connection to keep the figure simple. In practice they are provided separately for each of communication channels 1 through n.

[0025] Next, the basic operation of the video conference system is described with reference to FIG. 4. To start a video conference, a line connection must first be established between each video conference terminal device and the multipoint video conference control device 21 (the remote terminal). This follows the normal call origination procedure carried out over LAPD. The SETUP (call setup) message is sent with the bearer capability (BC) set to unrestricted digital, the low layer compatibility (LLC) set to H.221, and the high layer compatibility (HLC) set to conference.

[0026] When the remote terminal analyzes the SETUP message and confirms that communication is possible, it returns CONN (connect), and the call is established. Here, H.221 in the low layer compatibility indicates that ITU-T Recommendation H.221, which is executed by the multimedia multiplexing/demultiplexing unit 5 in FIG. 2, is implemented.
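The acceptance decision on the called side can be pictured with a small sketch. The three field values come from the text; the `Setup` class, the string spellings, the `answer` function, and the "REJECT" label for the refusal case (which the text does not describe) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setup:
    bearer_capability: str    # BC
    low_layer_compat: str     # LLC
    high_layer_compat: str    # HLC


def answer(setup):
    """Return "CONN" when the called side accepts the SETUP.

    "REJECT" is only a placeholder for whatever the real signalling does when
    the capabilities do not match.
    """
    acceptable = (
        setup.bearer_capability == "unrestricted-digital"
        and setup.low_layer_compat == "H.221"
        and setup.high_layer_compat == "conference"
    )
    return "CONN" if acceptable else "REJECT"
```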

[0027] Once the call is established, the system control unit 2 controls the multimedia multiplexing/demultiplexing unit 5 to send the multiframe synchronization signal and establish multiframe synchronization. The system control unit 2 then controls the multimedia multiplexing/demultiplexing unit 5 in accordance with ITU-T Recommendation H.242 to exchange capabilities and establish the communication mode. This exchange takes place over the BAS signal of H.221; the required channels are set up and bit rates are allocated according to the common capabilities. In the present embodiment, three channels are assigned: audio, video, and data (LSD). Once the communication mode is fixed, each channel can be handled as independent data, and operation as a video conference begins.

[0028] When the above procedure has been carried out between each video conference terminal and the corresponding communication channel of the multipoint video conference control device 21, a multipoint video conference becomes possible via the multipoint video conference control device 21.

[0029] When the video conference starts, the system control unit 2 starts the audio encoding/decoding unit 8 and the video encoding/decoding unit 13, enabling two-way communication of audio, video, and data.

[0030] When the video conference ends, the system control unit 2 stops the audio encoding/decoding unit 8 and the video encoding/decoding unit 13, and controls the ISDN interface unit 4 to release the call according to the procedure shown in FIG. 4.

[0031] The user starts each of the operations described so far (call origination and conference termination) by operating the console 17. The entered operation data is passed to the system control unit 2 via the user interface control unit 16. The system control unit 2 analyzes the operation data, starts or stops the operation that corresponds to its content, creates display data for user guidance, and displays it on the console 17 via the user interface control unit 16.

[0032] On the multipoint video conference control device 21 side, each communication channel is connected to one video conference terminal by the procedure described above, and the device runs the video conference among the multiple points. The example above describes connection by a call originating from the video conference terminal device, but the multipoint video conference control device 21 can also originate calls to designated video conference terminal devices at a predetermined time and connect to them.

【００３３】In the multipoint video conference control device 21, the audio/video multiplexing unit 27 mixes the audio decoded on each communication channel and distributes it to each communication channel, and likewise reduces and composites the moving images decoded on each communication channel and distributes the result to each communication channel.

【００３４】FIG. 5 shows how the video conference terminal devices (terminals) are connected, together with examples of the images that are transmitted from the multipoint video conference control device 21 (not shown) to each terminal and displayed there.

【００３５】In FIG. 5, terminals A to E are the video conference terminal devices participating in the conference (corresponding to 1, 19, 20, etc. in FIG. 1), and each is drawn with the image it displays, namely the moving image reduced and composited by the multipoint video conference control device. The letters A to E in the squares denote the moving images sent to the multipoint video conference control device from terminals A to E, respectively. As the figure shows, each terminal displays a moving image in which the multipoint video conference control device has reduced and composited the moving images from the four terminals other than itself. The participants at each terminal thus hold the conference while watching the scenes at the other terminals.

【００３６】To create such a composite moving image, the audio/video multiplexing unit 27 of the multipoint video conference control device 21 reduces the decoded moving image from the video encoding/decoding unit 26 of each communication channel, stores each channel's reduced moving image in an image memory located inside the audio/video multiplexing unit 27, reads the reduced images out of the image memory sequentially according to the compositing layout for each communication channel, composites them, and transfers the composite moving image to each communication channel. The transfers to the communication channels proceed in parallel. The composite moving image transferred to each communication channel is encoded by that channel's video encoding/decoding unit 26 and transmitted to the corresponding video conference terminal device, where, as shown in FIG. 5, the received encoded composite moving image is decoded by the video encoding/decoding unit 13 and displayed on the monitor 15. The compositing layout that the audio/video multiplexing unit 27 applies to the moving images from the communication channels can vary with the situation during the conference (for example, when the right to speak moves) and can be changed at any time by a setting from the system control unit 22.
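
The reduce-and-composite step described above can be sketched as follows. This is only an illustration under stated assumptions: a 2 × 2 box average stands in for the 1/2 reduction (the text does not specify the reduction filter), the layout is a fixed 2 × 2 tiling as in FIG. 5, frames are grayscale lists of rows with even dimensions, and the names `downscale_half` and `compose_quad` are hypothetical.

```python
def downscale_half(frame):
    """Halve width and height by averaging each 2x2 pixel block (assumed filter)."""
    h, w = len(frame), len(frame[0])
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def compose_quad(sources):
    """Tile four equally sized frames, each reduced to 1/2, into one frame.

    sources is ordered top-left, top-right, bottom-left, bottom-right.
    """
    tl, tr, bl, br = (downscale_half(s) for s in sources)
    top = [a + b for a, b in zip(tl, tr)]        # join rows side by side
    bottom = [a + b for a, b in zip(bl, br)]
    return top + bottom


# Four tiny 4x4 "frames", each filled with one constant value.
frames = [[[v] * 4 for _ in range(4)] for v in (10, 20, 30, 40)]
composite = compose_quad(frames)
```

The composite has the same size as one source frame, which is why each terminal can decode it with an ordinary decoder.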

【００３７】Next, the decoding of the moving images received from each video conference terminal device and the encoding of the composite moving image transmitted to each video conference terminal device in the multipoint video conference control device according to the present invention are described separately for the first to third embodiments.

【００３８】First, the decoding of the moving images received from each video conference terminal device, shown in FIG. 6, and the encoding of the composite moving image transmitted to each video conference terminal device, shown in FIG. 7, are described for the first embodiment.

【００３９】These two processes run in parallel, under control that keeps writes to and reads from the motion vector table shown in FIG. 8 (described later) from conflicting and that preserves the correspondence between the images in the image memory of the audio/video multiplexing unit 27 and the motion vector table. Because of the reduction performed in the audio/video multiplexing unit 27, the motion vector table is updated before the image memory in the audio/video multiplexing unit 27 is updated. During that interval the table and the image in the image memory do not correspond, so encoding of the affected position (in macroblock units) is prohibited. These processes are executed in parallel for each communication channel.

【００４０】The decoding procedure according to the first embodiment shown in FIG. 6 is described below. The moving image decoded here is assumed to be in FCIF format (352 pixels × 288 lines). One frame therefore contains GOBs numbered 1 to 12, and each GOB contains 33 macroblocks (MBs), so one frame consists of 12 × 33 = 396 macroblocks. The order of the GOBs within a frame and the arrangement of the macroblocks within a GOB are specified in ITU-T Recommendation H.261 and are well known, so a detailed description is omitted.
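
As a quick check of the counts above, the same total follows from the picture size and the 16 × 16 luminance macroblock of H.261. The constant names below are chosen only for illustration.

```python
MB_SIZE = 16                 # H.261 macroblock covers 16x16 luminance pixels
WIDTH, HEIGHT = 352, 288     # FCIF picture size given in the text
GOBS_PER_FRAME = 12
MBS_PER_GOB = 33

mb_cols = WIDTH // MB_SIZE   # macroblocks per row
mb_rows = HEIGHT // MB_SIZE  # macroblock rows per frame
total_mbs = mb_cols * mb_rows
```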

【００４１】First, the video encoding/decoding unit 26 of each communication channel decodes one MB of image data in the current frame of the encoded moving image received from the video conference terminal device (process 101). This image has been encoded by motion-compensated inter-frame predictive coding, with motion vector information attached per macroblock.

【００４２】The motion vector value of that MB obtained during decoding is then reported to the system control unit 22 together with the number of the GOB containing the MB (see ITU-T Recommendation H.261) and the MB address that identifies the MB's position within that GOB (process 102). If no motion vector value accompanies the MB, a motion vector with both X and Y components equal to 0 is reported.

【００４３】The system control unit 22 allocates a motion vector table such as that shown in FIG. 8 in a temporary storage memory inside the system control unit 22, and stores the motion vector value of the MB at the MB position in the current frame determined by the MB address and GOB number reported in process 102 (process 103).
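
The mapping from a GOB number and MB address to a position in the table can be sketched as below. The layout assumed here (GOBs two across and six down, each GOB 11 MBs wide and 3 MBs tall, MB addresses 1 to 33 in raster order) is the CIF arrangement of ITU-T Recommendation H.261 as commonly described; the function name `mb_position` and the zero-based positions are assumptions made for illustration.

```python
def mb_position(gob_number, mb_address):
    """Return the zero-based (px, py) macroblock position within a CIF frame."""
    if not 1 <= gob_number <= 12:
        raise ValueError("GOB number must be 1..12 for CIF")
    if not 1 <= mb_address <= 33:
        raise ValueError("MB address must be 1..33")
    gob_col = (gob_number - 1) % 2       # odd GOBs on the left, even on the right
    gob_row = (gob_number - 1) // 2      # six rows of GOBs
    px = gob_col * 11 + (mb_address - 1) % 11
    py = gob_row * 3 + (mb_address - 1) // 11
    return px, py
```

Every (GOB number, MB address) pair lands on a distinct cell of the 22 × 18 table, so the table can be indexed directly by the returned position.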

【００４４】The motion vector table shown in FIG. 8 is now described. In the figure, (a) is the table that stores the X component of the motion vector V for each of the (22 in the X direction) × (18 in the Y direction) = 396 MBs making up the current frame, and (b) is the corresponding table for the Y component. In each table, Px is the position in the X direction, in MB units, within the current frame as decoded and before reduction, and Py is the position in the Y direction, also in MB units. One such motion vector table is provided for each image that serves as a source for compositing (five in this embodiment) and stores the motion vector values reported by the video encoding/decoding unit 26 of the corresponding communication channel. When MBs carry no data and are skipped, the vector values at their positions are rewritten to 0 at the same time. The positions of skipped MBs are determined from the MB data transmission order specified in ITU-T Recommendation H.261.
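
A table of this kind might be held as two 22 × 18 arrays, one per component, with one table per source image. The sketch below is an assumption about the data layout only; the class and method names are hypothetical, and how skipped MBs are detected from the transmission order is not shown.

```python
class MotionVectorTable:
    """Per-source table of motion vectors, indexed by MB position (px, py)."""

    COLS, ROWS = 22, 18  # MBs per row and per column in an FCIF frame

    def __init__(self):
        self.vx = [[0] * self.COLS for _ in range(self.ROWS)]  # table (a)
        self.vy = [[0] * self.COLS for _ in range(self.ROWS)]  # table (b)

    def store(self, px, py, mv):
        """Record the vector reported for the decoded MB at (px, py)."""
        self.vx[py][px], self.vy[py][px] = mv

    def clear(self, px, py):
        """A skipped MB carries no data, so its vector is rewritten to 0."""
        self.store(px, py, (0, 0))

    def get(self, px, py):
        return self.vx[py][px], self.vy[py][px]


# One table per source image used for compositing (five in the embodiment).
tables = [MotionVectorTable() for _ in range(5)]
tables[0].store(3, 2, (4, -1))
```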

【００４５】The video encoding/decoding unit 26 transfers the decoded one MB of image data to the audio/video multiplexing unit 27 (process 104). The above processing is repeated for every MB in the current frame (No loop of decision 105). As described earlier, the audio/video multiplexing unit 27 reduces the decoded current-frame moving image from the video encoding/decoding unit 26 of each communication channel, stores each channel's reduced moving image in its internal image memory, reads the images out sequentially according to the compositing layout for each communication channel, composites them, and transfers the composite moving image to the communication channels in parallel. As a result, the motion vector information of the current frame for each communication channel is stored in that channel's motion vector table in the system control unit 22, and the reduced moving image corresponding to each table is stored in the audio/video multiplexing unit 27.

【００４６】The procedure for motion-compensated inter-frame predictive coding of the reduced composite moving image destined for each communication channel, based on the per-channel motion vector tables obtained by the decoding described above, is explained with reference to the encoding procedure of the first embodiment shown in FIG. 7.

【００４７】In FIG. 7, the video encoding/decoding unit 26 of each communication channel reads, from the reduced image data stored per communication channel in the audio/video multiplexing unit 27, one MB of image data corresponding to the compositing position and MB position (process 201).

【００４８】The system control unit 22 reads from the motion vector table of FIG. 8 the motion vector values of the four MBs at the position corresponding to the MB that the video encoding/decoding unit 26 read from the audio/video multiplexing unit 27 in process 201 (the MB currently being processed) (process 202). From those four values it computes the motion vector of the MB currently being processed by the calculation below (process 203).

【００４９】Let V = (Vx, Vy) be the motion vector value at each MB position stored, per communication channel, in the motion vector table of the audio/video multiplexing unit 27, and let MV = (MVx, MVy) be the motion vector value to be obtained for the MB currently being processed. The X component MVx of MV is given by (Equation 1) below, and the Y component MVy by (Equation 2) below.

【００５０】[0050]

【数１】 (Equation 1)

【００５１】Qualitatively, (Equation 1) and (Equation 2) mean the following. The motion vector value of the MB currently being processed is obtained by summing the motion vectors of those MBs of the original moving image received on each communication channel that the reduction and compositing effectively folds into the MB currently being processed (four MBs in this embodiment), dividing the sum by their number (4 in this embodiment) to average it, and multiplying the average by the reduction ratio (1/2 in this embodiment).

【００５２】FIG. 9 shows an example of the correspondence between the motion vector MV(MPx, MPy) of the MB currently being processed in the reduced composite image being encoded and the motion vectors V(Px, Py), V(Px+1, Py), V(Px, Py+1), and V(Px+1, Py+1) of the 2 × 2 = 4 MBs of the original image that the reduction and compositing effectively folds into that MB. As the figure shows, the reduced composite image being encoded contains four images, each an original image reduced to one half both vertically and horizontally. The arrow of each motion vector in the figure indicates that the image at the corresponding position in the frame preceding the current frame is referenced.
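
The formula image for (Equation 1) and (Equation 2) is not reproduced in this text. Going only by the qualitative description and by the four source vectors named for FIG. 9, the two equations would take roughly the following form for a reduction ratio of 1/2. This is a reconstruction under those assumptions, not the published formula; any rounding to integer pixel values is not stated in the text and is left out.

```latex
\begin{aligned}
MV_x &= \frac{1}{2}\cdot\frac{V_x(P_x,P_y)+V_x(P_x+1,P_y)+V_x(P_x,P_y+1)+V_x(P_x+1,P_y+1)}{4}\\[4pt]
MV_y &= \frac{1}{2}\cdot\frac{V_y(P_x,P_y)+V_y(P_x+1,P_y)+V_y(P_x,P_y+1)+V_y(P_x+1,P_y+1)}{4}
\end{aligned}
```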

【００５３】A 2 × 2 = 4 block of MBs in the original image before reduction and compositing corresponds to one MB in the reduced composite image. The motion vector value of one MB in the reduced composite image is therefore strongly correlated with the motion vector values of the four corresponding MBs of the original image, and the average of those four motion vector values can be regarded as nearly identical to the motion vector value of the corresponding MB in the reduced composite image. In practice, however, the original image is reduced by a ratio of 1/2 in both the X and Y directions in the reduced composite image, so its motion in pixel units is also halved. Consequently, one half of the average of the motion vector values of the four MBs of the original image becomes the motion vector value of the corresponding MB in the reduced composite image.
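
The averaging and scaling described above can be sketched as follows. The function name `composite_mv` and the use of exact fractions are assumptions for illustration; the text does not say how the result is rounded to the integer pixel values an H.261 encoder uses, so the sketch returns the exact value and leaves rounding to the caller.

```python
from fractions import Fraction


def composite_mv(vx, vy, px, py, ratio=Fraction(1, 2)):
    """Average the four source vectors at (px..px+1, py..py+1), then scale.

    vx and vy are the X and Y component tables of one source image,
    indexed as table[py][px]. ratio is the reduction ratio.
    """
    cells = [(px, py), (px + 1, py), (px, py + 1), (px + 1, py + 1)]
    mvx = ratio * Fraction(sum(vx[y][x] for x, y in cells), 4)
    mvy = ratio * Fraction(sum(vy[y][x] for x, y in cells), 4)
    return mvx, mvy


# Worked example with a 2x2 corner of a source table.
vx_table = [[4, 4], [8, 0]]      # sum 16, average 4, halved gives 2
vy_table = [[-2, -2], [-2, -2]]  # average -2, halved gives -1
mv = composite_mv(vx_table, vy_table, 0, 0)
```

Four table reads, a few additions, and a scale replace the whole motion search for that MB, which is where the saving over block matching comes from.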

【００５４】The system control unit 22 computes the motion vector of the MB currently being processed in this way. The computation is far smaller than that of the conventional block matching method and the like, so the amount of computation needed for motion vector detection when encoding the reduced composite moving image is greatly reduced. In addition, when computing the motion vector values for the MBs of the reduced composite image shown in FIG. 9, the system control unit 22 clips the motion vector values of the hatched MBs along the periphery of the reduced composite image to the range specified in ITU-T Recommendation H.261, because H.261 does not permit references outside the picture.
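
One way to express this clipping is sketched below. It assumes integer-pixel vectors within ±15, a 16 × 16 macroblock, an FCIF picture, and the convention that the predicted block sits at the MB position displaced by the vector; the function name `clip_mv` and the zero-based MB coordinates are hypothetical.

```python
def clip_mv(mvx, mvy, mpx, mpy, width=352, height=288, mb=16, limit=15):
    """Clip a vector so the referenced 16x16 block stays inside the picture.

    (mpx, mpy) is the zero-based MB position in the composite frame.
    """
    x0, y0 = mpx * mb, mpy * mb                      # top-left pixel of the MB
    lo_x, hi_x = max(-limit, -x0), min(limit, width - mb - x0)
    lo_y, hi_y = max(-limit, -y0), min(limit, height - mb - y0)
    return (min(max(mvx, lo_x), hi_x), min(max(mvy, lo_y), hi_y))
```

For an interior MB only the ±15 limit applies; for an MB on the picture edge, the component pointing outward is forced to 0.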

【００５５】The system control unit 22 then sets the computed motion vector MV of the MB currently being processed in the video encoding/decoding unit 26 (process 204).

【００５６】The motion vector MV of the MB currently being processed, once set in the video encoding/decoding unit 26, is supplied to the image memory 100 with the variable delay function for motion compensation in the video encoder shown in FIG. 10, which performs the motion-compensated inter-frame predictive coding of the input moving image inside the video encoding/decoding unit 26. The video encoder of FIG. 10 basically conforms to ITU-T Recommendation H.261. It differs in that the motion vector of the MB currently being processed, which the image memory 100 conventionally detected by itself, is supplied from the system control unit 22 as the motion vector MV. This removes the need for the image memory 100 to detect motion vectors through an enormous amount of computation.

【００５７】The video encoder of FIG. 10 in the video encoding/decoding unit 26 performs motion-compensated inter-frame predictive coding of the one MB of reduced composite image read from the audio/video multiplexing unit 27 in process 201, using the motion vector MV set by the system control unit 22 (process 205). As in the conventional case, the video encoder of FIG. 10 outputs the motion vector v, together with the other coding parameters, to a video signal multiplexing unit (not shown) in the video encoding/decoding unit 26; the vector output as v is exactly the motion vector MV set by the system control unit 22. Apart from motion vector detection, decisions such as INTER/INTRA selection and loop filter ON/OFF are left to the adaptive control of the video encoder of FIG. 10, as before.

【００５８】The above processing is repeated for every MB in the current frame of the reduced composite moving image destined for each communication channel (No loop of decision 206).

【００５９】Next, the decoding of the moving images received from each video conference terminal device and the encoding of the composite moving image transmitted to each video conference terminal device are described for the second embodiment.

【００６０】The decoding of the moving images received from each video conference terminal device is the same as the decoding procedure of the first embodiment shown in FIG. 6, so its description is omitted. Through the decoding of FIG. 6, the motion vectors of the current frame for each communication channel are stored in that channel's motion vector table in the system control unit 22, and the reduced moving image corresponding to each table is stored in the audio/video multiplexing unit 27.

【００６１】The procedure for motion-compensated inter-frame predictive coding of the reduced composite moving image destined for each communication channel, based on the per-channel motion vector tables obtained by the decoding of FIG. 6, is explained with reference to the encoding procedure of the second embodiment shown in FIG. 11.

【００６２】In FIG. 11, the video encoding/decoding unit 26 of each communication channel reads, from the reduced image data stored per communication channel in the audio/video multiplexing unit 27, one MB of image data corresponding to the compositing position and MB position (process 301).

【００６３】The system control unit 22 reads from the motion vector table of FIG. 8, as in the first embodiment, the motion vector values of the four MBs at the position corresponding to the MB that the video encoding/decoding unit 26 read from the audio/video multiplexing unit 27 in process 301 (the MB currently being processed) (process 302). From those four values it computes the motion vector MV of the MB currently being processed using (Equation 1) and (Equation 2), as in the first embodiment, and determines the motion vector search range SA from the computed MV (process 303).

【００６４】The method of determining the motion vector search range SA is described with reference to FIG. 12. In the figure, (p) is the center point of the MB currently being processed, for which the search range SA is to be determined; (S) is the motion vector search range (±15 pixels) of a conventional encoder conforming to ITU-T Recommendation H.261; (a), (b), and (c) are several concrete examples of the computed motion vector MV; and (A), (B), and (C) are the motion vector search ranges SA corresponding to the motion vectors (a), (b), and (c), respectively.

【００６５】Basically, the system control unit 22 takes the center point (p) of the MB currently being processed as the origin and sets, as the motion vector search range SA, the range of ±3 pixels around the pixel pointed to by the computed motion vector MV ((a), (b), (c), etc.). The range is clipped, however, so that it does not extend beyond the conventional search range (±15 pixels, the upper limit of the search range in ITU-T Recommendation H.261). In FIG. 12, for example, ranges (A) and (B) are not clipped, while (C) is a clipped range. As in the first embodiment, when the MB being processed is at a hatched position in the reduced composite image of FIG. 9, the search range is also clipped to the range specified in ITU-T Recommendation H.261, because H.261 does not permit references outside the picture. Clipping the search range in this way lets the video encoder described below obtain a motion vector v that does not depart from ITU-T Recommendation H.261.
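
The ±3 window with clipping to ±15 can be sketched as below. Offsets are relative to the position of the MB being processed; the function name `search_range` and the (min, max) tuple representation are assumptions, and the additional clipping at the picture boundary for edge MBs is omitted for brevity.

```python
def search_range(mv, radius=3, limit=15):
    """Return ((x_min, x_max), (y_min, y_max)) of candidate displacements.

    The window is centred on the computed vector mv and clipped to +/-limit.
    """
    mvx, mvy = mv

    def axis(centre):
        return max(-limit, centre - radius), min(limit, centre + radius)

    return axis(mvx), axis(mvy)


unclipped = search_range((4, -2))   # window lies fully inside +/-15
clipped = search_range((14, 0))     # right edge is cut at +15
```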

【００６６】In the first embodiment, the computed motion vector MV was used directly as the motion vector of the MB currently being processed. In this second embodiment, the motion vector MV computed in process 303 is used only to determine the motion vector search range and is not itself adopted as the motion vector of the MB currently being processed.

【００６７】The system control unit 22 then sets the determined motion vector search range SA in the video encoding/decoding unit 26 (process 304).

【００６８】The motion vector search range SA for the MB currently being processed, once set in the video encoding/decoding unit 26, is supplied to the image memory 101 with the variable delay function for motion compensation in the video encoder shown in FIG. 13, which performs the motion-compensated inter-frame predictive coding of the input moving image inside the video encoding/decoding unit 26.

【００６９】The video encoder of FIG. 13 basically conforms to ITU-T Recommendation H.261. Conventionally, the image memory 101 with the variable delay function for motion compensation itself set the search range for the MB currently being processed, within ±15 pixels as specified by ITU-T Recommendation H.261 and clipped as necessary so that the referenced pixels lie inside the frame being encoded. The difference here is that the search range for the MB currently being processed is supplied by the system control unit 22 as the motion vector search range SA.

【００７０】The video encoder of FIG. 13 in the video encoding/decoding unit 26 searches for the motion vector of the MB of reduced composite image currently being processed within the motion vector search range SA set by the system control unit 22 (process 305). The search method itself, such as block matching, is already well known.
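
A restricted search of this kind can be sketched with a plain exhaustive block match over the supplied range. The sum of absolute differences (SAD) used as the matching cost is an assumption, since the text says only that a known method such as block matching is used; `sad` and `search_mv` are hypothetical names, and frames are grayscale lists of rows.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the current block and a shifted reference block."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def search_mv(cur, ref, bx, by, x_range, y_range, n=16):
    """Exhaustively test every displacement in the given ranges; return the best (dx, dy)."""
    best = None
    for dy in range(y_range[0], y_range[1] + 1):
        for dx in range(x_range[0], x_range[1] + 1):
            # Skip candidates whose reference block would leave the picture.
            if not (0 <= bx + dx and bx + dx + n <= len(ref[0])
                    and 0 <= by + dy and by + dy + n <= len(ref)):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return (best[1], best[2]) if best else (0, 0)


# Toy 12x12 frames: the current frame equals the reference displaced by (+2, -1).
ref = [[7 * x + 13 * y for x in range(12)] for y in range(12)]
cur = [[7 * (x + 2) + 13 * (y - 1) for x in range(12)] for y in range(12)]
found = search_mv(cur, ref, bx=4, by=4, x_range=(-2, 4), y_range=(-3, 3), n=4)
```

Here the ranges play the role of SA: a 7 × 7 window centred on an estimate of (1, 0), which is near enough to the true displacement for the search to find it.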

【００７１】Compare the processing load of the conventional motion vector search with that of the search in this second embodiment. The load is roughly proportional to the area of the search range. Assuming no clipping of the search range, the conventional search requires 31 (±15 pixels) × 31 (±15 pixels) = 961 comparison operations, whereas this second embodiment requires only 7 (±3 pixels) × 7 (±3 pixels) = 49, so the load is about 1/20.
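
The counts above follow directly from the window size: a range of ±r pixels gives 2r + 1 candidate positions per axis. The helper name below is chosen only for illustration.

```python
def candidates(radius):
    """Number of candidate displacements in a square +/-radius search window."""
    side = 2 * radius + 1
    return side * side


full = candidates(15)    # conventional +/-15 search
narrow = candidates(3)   # +/-3 search around the computed vector
ratio = narrow / full    # fraction of the conventional load
```

The exact ratio is 49/961, slightly above 1/20, which matches the "about 1/20" stated in the text.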

【００７２】The search range can be narrowed in this second embodiment for the following reason. As explained for the first embodiment, the motion vector MV on which the search range SA for the MB currently being processed is based is the average of the motion vectors of the original-image MBs that the reduction and compositing folds into the MB currently being processed, and that are strongly correlated with it, multiplied by the reduction ratio. It therefore takes a value close to the true motion vector of that MB. In other words, without a wide search range as in the conventional case, the true motion vector of the MB currently being processed can be said to lie within a small area around the pixel pointed to by MV, so searching only that area is sufficient.

【００７３】This makes it possible to detect motion vectors with far less processing than in the conventional case and with greater accuracy than in the first embodiment.

[0074] The video encoder shown in FIG. 13, inside the video encoding/decoding unit 26, applies motion-compensated inter-frame predictive coding to the reduced composite image of the MB currently being processed, which was read from the audio/video multiplexing unit 27 in process 301, in accordance with the motion vector obtained for that MB in process 305 (process 306). As in the conventional device, the encoder of FIG. 13 then outputs the motion vector v, together with the other coding parameters, to a video signal multiplexing unit (not shown) in the video encoding/decoding unit 26. The vector output as v, however, is one detected within the motion vector search range SA set by the system control unit 22. Everything in the encoding other than motion vector detection, such as the INTER/INTRA decision and switching the loop filter on or off, is left to the adaptive control of the encoder of FIG. 13, as in the conventional device.

[0075] The above processing is repeated for every MB making up the current frame of the reduced composite moving image for each communication channel (the No loop of decision 307).

[0076] Next, the decoding of moving images received from each video conference terminal device and the encoding of the composite moving image sent to each video conference terminal device are described for the third embodiment.

[0077] The third embodiment is a modification of the first embodiment, and the decoding of moving images received from each video conference terminal device is basically the same as the decoding process of the first embodiment shown in FIG. 6. The difference is that, in the third embodiment, the system control unit 22 holds two motion vector tables of the kind shown in FIG. 8 for each communication channel: one for the current frame and one for the frame immediately preceding it (the previous frame).

[0078] That is, while the current frame is being decoded, one of the two motion vector tables is used to store the motion vector values obtained from decoding that frame. When the next frame is decoded, the motion vector values obtained from it are stored in the other table, that is, the one that does not hold the current frame's values. The two motion vector tables are thus swapped every frame.
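The alternating use of two tables is a double-buffering scheme, and could be sketched as below. The class and method names are hypothetical and the table is modelled as a simple grid of `(x, y)` tuples indexed by MB row and column; the patent does not say how MBs that carry no vector in a given frame are handled, so this sketch leaves stale entries untouched.

```python
from typing import List, Tuple

Vec = Tuple[int, int]


class MotionVectorTables:
    """Two per-channel tables that swap roles every decoded frame."""

    def __init__(self, mb_rows: int, mb_cols: int) -> None:
        self._tables: List[List[List[Vec]]] = [
            [[(0, 0)] * mb_cols for _ in range(mb_rows)] for _ in range(2)
        ]
        self._current = 0  # index of the table receiving the current frame

    def begin_frame(self) -> None:
        """Switch tables; the old current table becomes the previous one."""
        self._current ^= 1

    def store(self, row: int, col: int, mv: Vec) -> None:
        self._tables[self._current][row][col] = mv

    def current(self, row: int, col: int) -> Vec:
        return self._tables[self._current][row][col]

    def previous(self, row: int, col: int) -> Vec:
        return self._tables[self._current ^ 1][row][col]


tables = MotionVectorTables(2, 2)
tables.store(0, 0, (1, 2))   # frame n
tables.begin_frame()
tables.store(0, 0, (3, 4))   # frame n + 1; frame n stays readable
```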

[0079] As a result, whenever the current frame is being decoded, the motion vector values obtained when the previous frame was decoded are always held in a motion vector table separate from the one that stores the current frame's values.

[0080] With the motion vector values of the current frame and of the previous frame stored in their respective motion vector tables as described above, the motion-compensated inter-frame predictive coding of the reduced composite image in the third embodiment is described with reference to FIG. 14.

[0081] In FIG. 14, the video encoding/decoding unit 26 of each communication channel reads, from the reduced image data stored per communication channel in the audio/video multiplexing unit 27 described above, one MB of image data corresponding to the image composition position and the MB position (process 401).

[0082] The system control unit 22 then reads, from the current-frame motion vector table shown in FIG. 8, the motion vector values of the four MBs at the position corresponding to the MB (the MB currently being processed) that the video encoding/decoding unit 26 of each communication channel read from the audio/video multiplexing unit 27 in process 401 (process 402).
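For the 2 × 2 composition with a 1/2 reduction ratio used in the embodiments, the lookup of the four corresponding source MBs might look like the sketch below. Everything here is an assumption made for illustration: the function name `source_mbs`, the CIF-sized frame of 18 × 22 MBs, the numbering of the source images 0 to 3 in raster order, and the `(row, col)` indexing.

```python
from typing import List, Tuple


def source_mbs(row: int, col: int,
               mb_rows: int = 18, mb_cols: int = 22
               ) -> Tuple[int, List[Tuple[int, int]]]:
    """Map one composite-image MB to its source image and four source MBs.

    Assumes four source images, each reduced to 1/2 in both directions and
    tiled two across and two down; the frame size defaults to CIF (assumed).
    """
    half_rows, half_cols = mb_rows // 2, mb_cols // 2
    # Quadrant of the composite frame selects the source image (0..3).
    source = (row // half_rows) * 2 + (col // half_cols)
    # Position inside the quadrant, scaled back up to source-image MB units.
    r0 = (row % half_rows) * 2
    c0 = (col % half_cols) * 2
    return source, [(r0, c0), (r0, c0 + 1), (r0 + 1, c0), (r0 + 1, c0 + 1)]


top_left = source_mbs(0, 0)
bottom_right = source_mbs(17, 21)
```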

[0083] It is then checked whether a frame skip has occurred, that is, whether the decoded image has been rewritten two or more times between the previously encoded frame and the current frame (decision 403). If no frame skip has occurred (No in decision 403), the motion vector values stored in the current-frame motion vector table allow the same encoding as in the first embodiment, so control moves to process 406, and process 406 and the subsequent steps are carried out as in the first embodiment. In fact, processes 406, 407 and 408 and decision 409 are identical to processes 203, 204 and 205 and decision 206, respectively, of the first embodiment's encoding process shown in FIG. 7. In other words, what distinguishes the encoding process of the third embodiment is the procedure for calculating the motion vector MV of the current-frame MB being processed when a frame skip has occurred.

[0084] When a frame skip has occurred (Yes in decision 403), for each of the four current-frame MB motion vectors read in process 402, the motion vector of the previous-frame MB at the position it points to (the corresponding position) is read (process 404). Each of the four current-frame motion vectors is then added to the previous-frame motion vector it points to, and the sums become the new motion vectors of the four current-frame MBs (process 405).

[0085] A concrete example of adding the motion vector of one current-frame MB to the motion vector of the previous-frame MB it points to is described with reference to FIG. 15. FIG. 15(a) shows the motion vectors of current-frame MBs read in process 402. FIG. 15(b) shows the motion vector (dotted line) of a current-frame MB (dotted frame) read in process 402, together with the motion vector (solid line) of the previous-frame MB (solid frame) that it points to, read in process 404.

[0086] The current-frame MBs actually read in process 402 are four MBs in a 2 × 2 block, but the figure shows two patterns for a single MB. In FIG. 15(a), vector (a), when placed at the center of its MB, points to the lower-left MB. The vector read in process 404 is therefore vector (b) of the lower-left MB, and vector (b) is added to vector (a). Vector (c), on the other hand, points within its own MB. The vector read in process 404 is therefore vector (d) of the MB at the same position, and vector (d) is added to vector (c).

[0087] That is, for the motion vector of a current-frame MB (16 × 16 pixels) read in process 402, the magnitude (absolute value) of each of its x and y components determines which vector is read in process 404: where a magnitude is 8 or more, the motion vector of one of the eight surrounding MBs in the previous frame is read, and where it is less than 8, the motion vector of the MB at the same position in the previous frame is read. The vector read is then added in process 405.
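The selection rule above can be sketched as a per-component threshold test. This is an illustration under stated assumptions rather than the patent's implementation: `referenced_mb` and `_step` are hypothetical names, x is taken to grow rightwards and y downwards, vector components are assumed to stay within ±15 pixels (so the target is at most one MB away), and positions are clamped at the frame edge, which the text does not specify.

```python
from typing import Tuple

MB_SIZE = 16
HALF_MB = MB_SIZE // 2  # a vector drawn from the MB center leaves the MB at 8


def _step(component: int) -> int:
    """Return -1, 0 or +1: the MB offset implied by one vector component."""
    if abs(component) < HALF_MB:
        return 0
    return 1 if component > 0 else -1


def referenced_mb(row: int, col: int, mv: Tuple[int, int],
                  mb_rows: int, mb_cols: int) -> Tuple[int, int]:
    """Previous-frame MB that the vector of MB (row, col) points to."""
    mvx, mvy = mv
    target_row = min(max(row + _step(mvy), 0), mb_rows - 1)  # clamping assumed
    target_col = min(max(col + _step(mvx), 0), mb_cols - 1)
    return target_row, target_col


lower_left = referenced_mb(1, 1, (-9, 10), 3, 3)
same_mb = referenced_mb(1, 1, (3, -7), 3, 3)
```

With these conventions the first call corresponds to the vector (a) case in FIG. 15 (a neighbouring MB is selected) and the second to the vector (c) case (the MB at the same position is selected).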

[0088] In process 405, each of the four current-frame motion vectors is added to the previous-frame motion vector it points to in the same way that vector (v1) and vector (v2) combine into vector (v) in FIG. 16. Concretely, the x components and the y components of the two vectors are added separately.
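The component-wise addition of process 405 is simple enough to show directly. The sketch below assumes the previous-frame vectors have already been fetched in process 404 and are supplied in the same order as the current-frame vectors; the function names and sample values are illustrative only.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[int, int]


def add_vectors(v1: Vec, v2: Vec) -> Vec:
    """Add two motion vectors component by component (x with x, y with y)."""
    return (v1[0] + v2[0], v1[1] + v2[1])


def compensate_skip(current_mvs: Sequence[Vec],
                    previous_mvs: Sequence[Vec]) -> List[Vec]:
    """New current-frame vectors after adding the referenced previous ones."""
    return [add_vectors(cur, prev)
            for cur, prev in zip(current_mvs, previous_mvs)]


new_mvs = compensate_skip([(-9, 10), (3, -7)], [(-2, 1), (1, 1)])
```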

[0089] In this way, when a frame skip occurs, the per-MB motion vectors of the previous frame, which were stored at decoding time in a motion vector table like that of FIG. 8 for exactly this case, are added to the motion vectors of the current-frame MBs that point to those MBs. This interpolates the image motion of the skipped frame, so the same motion-compensated inter-frame predictive coding of the reduced composite image as in the first embodiment remains possible even when a frame skip occurs.

[0090] The encoding process of the third embodiment described above adds the frame-skip handling (decision 403 and processes 404 and 405) to the encoding process of the first embodiment shown in FIG. 7, but the same handling can also be added to the encoding process of the second embodiment shown in FIG. 11. Specifically, inserting decision 403, process 404 and process 405 of the third embodiment's encoding procedure (FIG. 14) between process 302 and process 303 of the second embodiment's encoding procedure (FIG. 11) makes the same motion-compensated inter-frame predictive coding of the reduced composite image as in the second embodiment possible even when a frame skip occurs.

[0091] The embodiments above describe the invention for the case where the encoded reduced composite moving image for each communication channel is produced by encoding a reduced composite image in which four original moving images, each reduced to 1/2 both vertically and horizontally, are arranged two across and two down. Because the invention exploits the high correlation between the motion vector information of the original moving images and that of the reduced composite moving image, it is not limited to this case and can accommodate various reduction ratios (including different vertical and horizontal ratios), numbers of combined moving images, and composition positions within the reduced composite moving image.

[0092] The embodiments above apply the motion-compensated inter-frame predictive coding method for reduced composite moving images according to the invention to a multipoint video conference control device. It goes without saying, however, that the method is equally applicable to equipment such as digital video recording, playback and editing devices that decode several motion-compensated inter-frame predictive coded original moving images, reduce and combine them, and apply motion-compensated inter-frame predictive coding again. The benefit is greatest in devices such as a multipoint video conference control device, which must perform this decode, reduce, combine and re-encode processing in real time for several communication channels, must therefore detect motion vectors for several reduced composite images simultaneously, and so carry a very heavy motion vector detection load.

[0093]

[Effects of the Invention] According to the invention of claim 1, the motion vector information added to each encoded composite moving image is calculated from the motion vector information obtained when decoding the original moving images that make up the composite moving image, the composition position, and the reduction ratio used in the reduction and synthesis. The conventional approach runs motion vector detection on the composite moving image all over again without taking into account the high correlation between the motion vector information of the original moving images and that of the composite moving image. Here no such detection is needed, and the motion vector information to be added to the encoded composite moving image sent to each video conference terminal device can be calculated almost exactly by a simple operation. The computational load of motion vector detection is therefore reduced, which allows a slower and cheaper processor and lowers the cost of the device.

[0094] According to the invention of claim 2, the center of the range in which the re-encoding means searches for the motion vector information to be added to each encoded composite moving image is calculated from the motion vector information obtained when decoding the original moving images that make up the composite moving image, the composition position, and the reduction ratio used in the reduction and synthesis, and the re-encoding means searches for the motion vector only within a predetermined range centered on that calculated center. The conventional approach ignores the high correlation between the motion vector information of the original moving images and that of the composite moving image and runs motion vector detection on the composite moving image again over a comparatively wide search range. Here no such wide search is needed, and, as with the invention of claim 1, the motion vector information to be added to the encoded composite moving image sent to the video conference terminal devices can be calculated by a simple operation. The computational load of motion vector detection is therefore far smaller than before. Moreover, the approximate value of the composite moving image's motion vector is obtained as the search range center by exploiting that correlation, and the re-encoding means then detects the true motion vector, which should lie within the predetermined range around it. The motion vector information of the composite moving image can therefore be calculated more accurately than with the invention of claim 1.

[0095] According to the invention of claim 3, when a frame skip occurs while the re-encoding means encodes the composite moving image, the motion vector calculating means adds the previous frame's motion vector information to the motion vector information of each small region of the current frame, thereby interpolating the motion information of the skipped frame. The invention of claim 1 can therefore be carried out effectively even when a frame skip occurs in the re-encoding means, which broadens the applicability of the invention of claim 1.

[0096] According to the invention of claim 4, when a frame skip occurs while the re-encoding means encodes the composite moving image, the search range center calculating means adds the previous frame's motion vector information to the motion vector information of each small region of the current frame, thereby interpolating the motion information of the skipped frame. The invention of claim 2 can therefore be carried out effectively even when a frame skip occurs in the re-encoding means, which broadens the applicability of the invention of claim 2.

[Brief description of the drawings]

FIG. 1 is a diagram showing the configuration of a video conference system including a multipoint video conference control device according to an embodiment of the present invention.

FIG. 2 is a block diagram of a video conference terminal device according to the embodiment of the present invention.

FIG. 3 is a block diagram of the multipoint video conference control device according to the embodiment of the present invention.

FIG. 4 is a diagram showing the basic operation of the video conference system according to the embodiment of the present invention.

FIG. 5 is a schematic diagram of the reduced composite image that is transmitted from the multipoint video conference control device according to the embodiment of the present invention to each video conference terminal device and displayed there.

FIG. 6 is a flowchart showing the decoding procedure according to the first embodiment of the present invention.

FIG. 7 is a flowchart showing the encoding procedure according to the first embodiment of the present invention.

FIG. 8 is a diagram showing a concrete example of the motion vector table stored by the system control unit of the multipoint video conference control device.

FIG. 9 is a diagram showing the correspondence between four macroblocks in an original image and one MB in the reduced composite image.

FIG. 10 is a block diagram of the video encoder according to the first embodiment of the present invention.

FIG. 11 is a flowchart showing the encoding procedure according to the second embodiment of the present invention.

FIG. 12 is a diagram showing a concrete example of the motion vector search range set in the encoding procedure according to the second embodiment of the present invention.

FIG. 13 is a block diagram of the video encoder according to the second embodiment of the present invention.

FIG. 14 is a flowchart showing the encoding procedure according to the third embodiment of the present invention.

FIG. 15 is a diagram showing the relationship between the current-frame motion vector and the previous-frame motion vector that are added in the encoding procedure according to the third embodiment of the present invention.

FIG. 16 is a diagram showing how motion vectors are added.

FIG. 17 is a schematic diagram of the processing in a conventional multipoint video conference control device, in which the encoded original moving images received from the video conference terminal devices are decoded, reduced, combined, re-encoded, and transmitted to each video conference terminal device.

[Explanation of symbols]

1, 19, 20: Video conference terminal device
2: System control unit
3: Magnetic disk device
4: ISDN interface unit
5: Multimedia multiplexing/demultiplexing unit
6: Microphone
7: Audio input processing unit
8: Audio encoding/decoding unit
9: Audio output processing unit
10: Speaker
11: Video camera
12: Video input processing unit
13: Video encoding/decoding unit
14: Video output processing unit
15: Monitor
16: User interface control unit
17: Console
18, 28: ISDN line
22: System control unit
23: ISDN interface unit
24: Multimedia multiplexing/demultiplexing unit
25: Audio encoding/decoding unit
26: Video encoding/decoding unit
27: Audio/video multiplexing unit

Claims

[Claims]

1. A multipoint video conference control device to which video conference terminal devices installed at a plurality of points are connected, comprising at least: receiving means for receiving, from each of the video conference terminal devices, an encoded original moving image that has been motion-compensated inter-frame predictive coded with motion vector information added for each small region constituting a frame; decoding means for decoding the received encoded original moving images from the video conference terminal devices; reduction synthesizing means for creating a composite moving image by reducing and combining the decoded original moving images; re-encoding means for generating an encoded composite moving image by motion-compensated inter-frame predictive coding of the composite moving image with motion vector information added for each small region constituting a frame; and transmitting means for transmitting the encoded composite moving image to each of the video conference terminal devices, the device further comprising: current frame motion vector storage means for storing, for each current frame, the motion vector information obtained when the decoding means decodes the encoded original moving images; and motion vector calculating means for calculating the motion vector information that the re-encoding means adds to each small region constituting the current frame of the encoded composite moving image, based on the motion vector information stored in the current frame motion vector storage means for the current frame of each original moving image reduced and combined into the composite moving image, the position at which each original moving image is combined in the composite moving image, and the reduction ratio applied to each original moving image by the reduction synthesizing means.

2. A multipoint video conference control device to which video conference terminal devices installed at a plurality of points are connected, comprising at least: receiving means for receiving, from each of the video conference terminal devices, an encoded original moving image that has been motion-compensated inter-frame predictive coded with motion vector information added for each small region constituting a frame; decoding means for decoding the received encoded original moving images from the video conference terminal devices; reduction synthesizing means for creating a composite moving image by reducing and combining the decoded original moving images; re-encoding means for generating an encoded composite moving image by motion-compensated inter-frame predictive coding of the composite moving image with motion vector information added for each small region constituting a frame; and transmitting means for transmitting the encoded composite moving image to each of the video conference terminal devices, the device further comprising: current frame motion vector storage means for storing, for each current frame, the motion vector information obtained when the decoding means decodes the encoded original moving images; and search range center calculating means for calculating, for each small region, the center of the search range for the motion vector information that the re-encoding means adds to each small region constituting the current frame of the encoded composite moving image, based on the motion vector information stored in the current frame motion vector storage means for the current frame of each original moving image reduced and combined into the composite moving image, the position at which each original moving image is combined in the composite moving image, and the reduction ratio applied to each original moving image by the reduction synthesizing means, wherein the re-encoding means searches for the motion vector information to be added to each small region constituting the current frame of the encoded composite moving image within a predetermined range centered on the search range center calculated for that small region by the search range center calculating means.

3. The multipoint video conference control device according to claim 1, further comprising previous frame motion vector storage means for storing the motion vector information obtained when the decoding means decodes the encoded original moving image, received from each of the video conference terminal devices, of the frame one frame before the current frame, wherein, when a frame skip occurs while the re-encoding means encodes the composite moving image, the motion vector calculating means calculates the motion vector information that the re-encoding means adds to each small region constituting the current frame of the encoded composite moving image based on: motion vector information obtained by adding, to the motion vector of each small region stored in the current frame motion vector storage means for the current frame of each original moving image reduced and combined into the composite moving image, the motion vector stored in the previous frame motion vector storage means, for the frame one frame before the current frame of that original moving image, for the small region pointed to by that motion vector; the position at which each original moving image is combined in the composite moving image; and the reduction ratio applied to each original moving image by the reduction synthesizing means.

4. The multipoint video conference control device according to claim 2, further comprising previous frame motion vector storage means for storing the motion vector information obtained when the decoding means decodes the encoded original moving image, received from each of the video conference terminal devices, of the frame one frame before the current frame, wherein, when a frame skip occurs while the re-encoding means encodes the composite moving image, the search range center calculating means calculates the center of the search range for the motion vector information that the re-encoding means adds to each small region constituting the current frame of the encoded composite moving image based on: motion vector information obtained by adding, to the motion vector of each small region stored in the current frame motion vector storage means for the current frame of each original moving image reduced and combined into the composite moving image, the motion vector stored in the previous frame motion vector storage means, for the frame one frame before the current frame of that original moving image, for the small region pointed to by that motion vector; the position at which each original moving image is combined in the composite moving image; and the reduction ratio applied to each original moving image by the reduction synthesizing means.