JPH02234587A - Picture decoder

Info

Publication number: JPH02234587A
Application number: JP1053772A
Authority: JP
Inventors: Koichi Oyama; 大山　公一; Toshibumi Sakaguchi; 俊文坂口; Yasuo Katayama; 片山　泰男
Original assignee: GRAPHICS COMMUN TECHNOL KK
Current assignee: GRAPHICS COMMUN TECHNOL KK
Priority date: 1989-03-08
Filing date: 1989-03-08
Publication date: 1990-09-17

Abstract

PURPOSE: To obtain moving picture information without using motion vectors by decoding frame-dropped picture information with a decoder, storing the decoded picture information frame by frame in a plurality of frame memories, and reproducing the dropped frames with a neural network. CONSTITUTION: A frame memory 5 stores one frame of picture information and outputs its picture signal to an output device 9 and a frame memory 6, and the output device 9 outputs that frame of picture information externally. Partial picture signals 72 and 71 of the one-frame picture information stored in the frame memories 6 and 5 are processed by a neural network 7, which reproduces the dropped frame to be interpolated. The device thus outputs the picture information from the frame memory 5 and the picture information from a frame memory 8 alternately, delivering consecutive moving picture information externally without using motion vector information.

Description

[Detailed description of the invention]

[Industrial Field of Application]

The present invention relates to an image decoding device, and more particularly to an image decoding device that decodes a moving image from image information from which whole frames have been dropped by an encoding device.

[Prior Art]

In low-bit-rate moving image communication, such as that used for videophones, the transmitting side generally sends an encoded signal from which whole frames have been dropped in order to reduce the amount of information transferred, and the receiving side uses frame interpolation to reproduce the missing frames and raise the frame rate. Among conventional frame interpolation techniques, one has been proposed that detects motion vectors between the frames on either side of the dropped frames in units of pixel blocks and performs motion-compensated coding.

One document on this frame interpolation technique is the article "Motion-compensated frame interpolation method for color moving image signals" on pages 39 to 46 of document IE88-42 (published 1988) of the Image Engineering Technical Group of the Institute of Electronics, Information and Communication Engineers.

[Problems to be Solved by the Invention]

Because the conventional frame interpolation technique relies on motion vectors, interpolation cannot be performed when motion vector information is not sent to the decoding side. In addition, motion vector estimation is not always accurate, so visually good images are not always obtained.

An object of the present invention is to solve these problems of the prior art and to provide an image decoding device that can reproduce good interpolated frames without using motion vector information.

[Means for Solving the Problems]

To solve the above problems, the image decoding device according to the present invention comprises a decoder that decodes image information from which whole frames have been dropped; a plurality of frame memories that store the frame-by-frame image information decoded by the decoder; and a neural network that takes as input the image information of the consecutive frames stored in the frame memories and reproduces the image information of the dropped frame by performing a spatial weighted sum operation.

[Operation]

The image decoding device decodes the frame-dropped image information with the decoder, stores the resulting frame-by-frame image information in the plurality of frame memories, and, based on the image information of these consecutive frames, uses the neural network to reproduce the interpolated frame to be inserted between them.

The device can therefore reproduce the frame-by-frame image information that was dropped during encoding and obtain moving image information without using motion vectors.

[Embodiment]

An embodiment of the image decoding device according to the present invention is described in detail below with reference to the drawings.

FIG. 1 is a conceptual diagram explaining the principle of the image decoding device of this embodiment. It illustrates an example in which the i-th frame 10 and the (i+k)-th frame 30 received from an image encoding device are used to reproduce the (i+j)-th frame 20, which corresponds to a frame dropped between frames 10 and 30.

The principle of this reproduction is as follows. The image signals of partial images a and a' are extracted from the i-th frame 10 and the (i+k)-th frame 30 and input to a neural network C that has been trained in advance, which produces partial image b of the (i+j)-th frame 20 to be interpolated. Partial image b is produced in this way over the entire frame 20.

The neural network C is modeled on the way large numbers of nerve cells in living organisms process information. It corresponds to a neural computer in which many relatively simple information processing elements (units) operate simultaneously in parallel, the strengths of the connections between the elements are generated by learning, and a spatial weighted sum operation is performed. A suitable learning method is the so-called backpropagation method, which corrects the connection strengths so that the difference between the output produced for a given input and the target output becomes small.

As is clear from the conceptual diagram of FIG. 1, the image decoding device of this embodiment therefore reproduces the frame to be interpolated with the neural network C, trained by backpropagation, using the frames before and after the dropped image information.

An embodiment of an image decoding device using this principle is now described with reference to FIGS. 2 and 3. FIG. 2 shows the overall configuration of the image decoding device of this embodiment. The device comprises:

a decoder 4 that receives an encoded signal 1 representing a plurality of frames of a frame-dropped moving image and decodes the signal 1;

a frame memory 5 that stores one frame of image information output from the decoder 4;

a frame memory 6 that stores the output of the memory 5;

a neural network 7 that takes as input the output signals of the memories 6 and 5, that is, the partial image signals 72 and 71 of the frames before and after the dropped frame, and outputs the partial image 73 of the interpolated frame;

a frame memory 8 that takes the output signal 73 of the network 7 as input and stores one frame of image information; and

an output device 9 that outputs continuous moving image information to an external device by outputting the image information from the frame memory 8 following the image information from the frame memory 5.

As shown in FIG. 3, the neural network 7 consists of an input layer 40 made up of a plurality of units (shown as circles in the figure) $u_{111}$ to $u_{12n}$, which receive the partial image signals 71 and 72 pixel by pixel; an output layer 60 made up of a plurality of units $u_{31}$ to $u_{3l}$, which outputs the partial image signal 73 of the frame 20 to be interpolated; and an intermediate layer 50 made up of a plurality of units $u_{21}$ to $u_{2m}$, located between the input layer 40 and the output layer 60.

In this neural network 7, the units are connected in the direction from the input layer 40 to the output layer 60, and the input signal propagates from the input layer 40 to the output layer 60. Learning proceeds in the opposite direction, from the output layer 60 toward the input layer 40, by the learning method described above, which changes the connections between units so as to reduce the difference between the actual output values and the desired output values.

Specifically, let $y_{1pq}$ be the input value of unit $u_{1pq}$ of the input layer 40, $y_{2r}$ the output value of unit $u_{2r}$ ($r = 1$ to $m$) of the intermediate layer 50, $y_{3s}$ the output value of unit $u_{3s}$ ($s = 1$ to $l$) of the output layer 60, $w_{1pqr}$ the strength of the connection between unit $u_{1pq}$ of the input layer 40 and unit $u_{2r}$ of the intermediate layer 50, and $w_{2rs}$ the strength of the connection between unit $u_{2r}$ of the intermediate layer 50 and unit $u_{3s}$ of the output layer 60. The output value $y_{2r}$ of unit $u_{2r}$ and the output value $y_{3s}$ of unit $u_{3s}$ are then given by Formula 1 below.

[Formula 1]

$$
\begin{aligned}
y_{3s} &= f(v_{2s}), & v_{2s} &= \sum_{r=1}^{m} w_{2rs}\, y_{2r} + w_{20s} \\
y_{2r} &= f(v_{1r}), & v_{1r} &= \sum_{p}\sum_{q} w_{1pqr}\, y_{1pq} + w_{100r}
\end{aligned}
$$

Here $w_{20s}$ and $w_{100r}$ denote the strengths of the connections between a constant value and each unit of the output layer 60 and of the intermediate layer 50, respectively, and $v_{1r}$ and $v_{2s}$ are temporary variables used for substitution.

The connection strengths $w_{1pqr}$ and $w_{2rs}$ are set by training the network beforehand with the backpropagation method described above, using the dropped frames as teacher data. That is, partial image a of the i-th frame 10 and partial image a' of the (i+k)-th frame 30 are used as inputs, and the partial image of the (i+j)-th frame that was actually dropped, corresponding to partial image b of the (i+j)-th frame 20, is used as teacher data. Training is repeated while varying the spatial position of the partial images within the frame and the value of i, and the connection strengths of the neural network are thereby determined in advance.

Neural networks of this kind are described in, among other sources, the article "Using neural nets for pattern recognition and knowledge processing" on pages 115 to 124 of Nikkei Electronics (No. 427), issued August 10, 1987.

In the image decoding device configured in this way, the decoder 4 decodes the encoded signal 1, which was produced by an image encoding device (not shown) that drops frames of the moving image and encodes the remainder, and outputs the decoded signal to the frame memory 5. The memory 5 stores one frame of image information and outputs its image signal to the output device 9 and the frame memory 6, and the output device 9 outputs that frame of image information externally. When the image signal of the next frame is then input to the frame memory 5, the image information of two consecutive decoded frames is held in the memories 5 and 6 respectively.

The partial image signals 72 and 71 of the one-frame image information stored in the frame memories 6 and 5 are processed by the neural network 7 according to the above formula, which reproduces the dropped frame to be interpolated. That is, the neural network 7 processes the partial image signals 72 and 71 repeatedly over the entire frame, and the frame to be interpolated is reproduced. The image information of the interpolated frame is output to the output device 9 via the frame memory 8.

By alternately outputting the image information from the frame memory 5 and the image information from the frame memory 8, the device can therefore output continuous moving image information externally without using motion vector information.

The above embodiment describes an example in which one interpolated frame is generated using the two frame memories 5 and 6, but the present invention is not limited to this. A plurality of interpolated frames can also be generated by providing a plurality of neural networks. Three or more frame memories may also be provided ahead of the neural network, with the device configured so that interpolation is performed from those three or more frame memories and their respective partial image signals are input to the input layer. Furthermore, the neural network is not limited to the three layers described above and may have four or more layers.

[Effects of the Invention]

As described above, the image decoding device according to the present invention can reproduce the dropped frames by using a neural network to apply a spatial weighted sum operation to the image information of the plurality of frames that remain after frame dropping.
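As a minimal sketch of Formula 1 and of the backpropagation training described in the embodiment, the following Python code builds a three-layer network from plain lists. Several things here are assumptions that do not appear in the source: the activation f is taken to be the logistic sigmoid, the error is taken to be a squared error, the two-index input units $u_{1pq}$ are flattened into one list, and the names `ThreeLayerNet`, `train_step`, `make_toy_samples`, and `mean_error`, the learning rate, and the toy data are purely illustrative.

```python
import math
import random


def sigmoid(v):
    # Assumed activation f; the source does not state which function f is.
    return 1.0 / (1.0 + math.exp(-v))


class ThreeLayerNet:
    """Input layer 40 -> intermediate layer 50 -> output layer 60 (hypothetical class)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def rand():
            return rng.uniform(-0.5, 0.5)

        # w1[r][k]: input unit k -> intermediate unit r (w_{1pqr}, (p, q) flattened to k)
        self.w1 = [[rand() for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [rand() for _ in range(n_hidden)]  # w_{100r}
        # w2[s][r]: intermediate unit r -> output unit s (w_{2rs})
        self.w2 = [[rand() for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [rand() for _ in range(n_out)]  # w_{20s}

    def forward(self, y1):
        # v_{1r} = sum_k w1[r][k] * y1[k] + b1[r];  y_{2r} = f(v_{1r})
        y2 = [sigmoid(sum(w * x for w, x in zip(row, y1)) + b)
              for row, b in zip(self.w1, self.b1)]
        # v_{2s} = sum_r w2[s][r] * y2[r] + b2[s];  y_{3s} = f(v_{2s})
        y3 = [sigmoid(sum(w * h for w, h in zip(row, y2)) + b)
              for row, b in zip(self.w2, self.b2)]
        return y2, y3

    def train_step(self, y1, target, lr=0.5):
        """One backpropagation update; returns the squared error before the update."""
        y2, y3 = self.forward(y1)
        # Output-layer deltas dE/dv_{2s} for E = 0.5 * sum((y3 - target)^2)
        d3 = [(o - t) * o * (1.0 - o) for o, t in zip(y3, target)]
        # Intermediate-layer deltas, propagated back through w2 before w2 changes
        d2 = [h * (1.0 - h) * sum(d3[s] * self.w2[s][r] for s in range(len(d3)))
              for r, h in enumerate(y2)]
        for s, d in enumerate(d3):
            for r, h in enumerate(y2):
                self.w2[s][r] -= lr * d * h
            self.b2[s] -= lr * d
        for r, d in enumerate(d2):
            for k, x in enumerate(y1):
                self.w1[r][k] -= lr * d * x
            self.b1[r] -= lr * d
        return 0.5 * sum((o - t) ** 2 for o, t in zip(y3, target))


def make_toy_samples(count, patch=4, seed=1):
    """Toy teacher data: the 'dropped' patch is the pixel-wise mean of patches a and a'."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        a = [rng.random() for _ in range(patch)]
        a2 = [rng.random() for _ in range(patch)]
        b = [(x + y) / 2.0 for x, y in zip(a, a2)]
        samples.append((a + a2, b))
    return samples


def mean_error(net, samples):
    """Average squared error of the network over (input, target) pairs."""
    total = 0.0
    for y1, target in samples:
        _, y3 = net.forward(y1)
        total += 0.5 * sum((o - t) ** 2 for o, t in zip(y3, target))
    return total / len(samples)
```

For instance, `ThreeLayerNet(8, 6, 4)` maps two 4-pixel patches to one 4-pixel patch, and calling `train_step` repeatedly over the samples from `make_toy_samples` should lower the value reported by `mean_error`. Real teacher data would be patches taken from frames that were actually dropped, as the embodiment describes.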

[Brief Description of the Drawings]

FIG. 1 is a diagram explaining the concept of the image processing device according to the present invention, FIG. 2 is a diagram showing the image processing device according to an embodiment of the present invention, and FIG. 3 is a diagram explaining the neural network.

1: encoded signal; 4: decoder; 5 and 6: frame memories; 7: neural network; 8: frame memory; 9: output device.

Patent applicant: Graphics Communication Technologies Co., Ltd.

Claims

[Claims]

An image decoding device that obtains moving image information by decoding image information from which whole frames have been dropped, the device comprising: a decoder that decodes input image information; a plurality of frame memories that store the frame-by-frame image information decoded by the decoder; and a neural network that takes as input the image information of consecutive frames stored in the plurality of frame memories and performs a spatial weighted sum operation to reproduce the image information of the dropped frame.
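The data flow among the claimed elements can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the decoder is not modelled (the input is taken to be already decoded frames), frames are lists of pixel rows, and `average_patches` is a stand-in that averages two patches pixel by pixel, not the trained neural network of the claim. The names `interpolate_frame` and `decode_stream` and the patch size are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List, Optional

Frame = List[List[float]]  # one frame as rows of pixel values
PatchNet = Callable[[List[float], List[float]], List[float]]


def average_patches(a: List[float], a2: List[float]) -> List[float]:
    # Stand-in for the trained neural network (an assumption, not the claimed network):
    # it simply averages the two input patches pixel by pixel.
    return [(x + y) / 2.0 for x, y in zip(a, a2)]


def interpolate_frame(prev: Frame, curr: Frame,
                      net: PatchNet = average_patches, patch: int = 2) -> Frame:
    """Build the interpolated frame patch by patch over the whole frame."""
    height, width = len(curr), len(curr[0])
    out = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            coords = [(r, c)
                      for r in range(top, min(top + patch, height))
                      for c in range(left, min(left + patch, width))]
            a = [prev[r][c] for r, c in coords]    # patch from the earlier frame
            a2 = [curr[r][c] for r, c in coords]   # patch from the later frame
            b = net(a, a2)                         # patch of the interpolated frame
            for value, (r, c) in zip(b, coords):
                out[r][c] = value
    return out


def decode_stream(decoded_frames: Iterable[Frame],
                  net: PatchNet = average_patches) -> Iterator[Frame]:
    """Yield received and interpolated frames alternately, in display order."""
    previous: Optional[Frame] = None       # plays the role of the older frame memory
    for newest in decoded_frames:          # plays the role of the newest frame memory
        if previous is not None:
            # The interpolated frame is emitted before the newer received frame.
            yield interpolate_frame(previous, newest, net)
        yield newest
        previous = newest
```

Passing three decoded frames to `decode_stream` yields five frames: each received frame, with one interpolated frame between each consecutive pair. A trained network could be substituted through the `net` parameter, provided it accepts two patches and returns one.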