JP6280679B2

JP6280679B2 - Frame thinning device, frame interpolation device, video encoding device, video decoding device, and programs thereof

Info

Publication number: JP6280679B2
Application number: JP2014033447A
Authority: JP
Inventors: 俊輔岩村; 俊枝三須; 康孝松尾; 慎一境田
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2014-02-24
Filing date: 2014-02-24
Publication date: 2018-02-14
Anticipated expiration: 2034-02-24
Also published as: JP2015159442A

Description

The present invention relates to a technique for transmitting high-frame-rate video, and more particularly to a frame thinning device, a frame interpolation device, a video encoding device, a video decoding device, and programs thereof for improving transmission efficiency and suppressing judder in high-frame-rate video transmission.

Video coding, as typified by HEVC/H.265 and AVC/H.264, requires a variety of computations such as motion prediction, motion compensation, and entropy coding. In the fields of Super Hi-Vision and digital cinema, there is a need for technology that encodes ultra-high-definition video, with two to four times the height and width of conventional HD broadcasts, at frame frequencies of up to 120 Hz. However, such a system demands very high throughput and is difficult to realize in terms of cost and circuit scale. Hereinafter, video that requires such a high frame rate is referred to as "high-frame-rate video".

In video distribution to terminals with limited bandwidth or processing capability, such as mobile terminals, it is known to thin out a fixed number of frames in advance and convert the video to a lower frame rate before encoding and transmission (see, for example, Patent Document 1). However, for video with a lot of motion, such as sports, this causes judder, in which motion appears unnatural. To reduce this judder, a technique is known for converting to a higher frame rate on the receiving side (see, for example, Non-Patent Document 1). In this technique, the receiving side suppresses judder by interpolating new frames using only the received low-frame-rate video, thereby converting it to a higher frame rate.

Meanwhile, as a technique for encoding high-frame-rate video, it is known to run existing low-frame-rate codecs in parallel and assign frames to the codecs frame by frame (see, for example, Non-Patent Document 2). With this technique, video at an integer-multiple frame rate can be encoded easily.

Patent Document 1: JP 2012-244566 A

Non-Patent Document 1: B.-D. Choi, J.-W. Han, C.-S. Kim, and S.-J. Ko, "Motion-Compensated Frame Interpolation Using Bilateral Motion Estimation and Adaptive Overlapped Block Matching Compensation," IEEE Trans. on Circuits and Systems for Video Technology, vol. 17, no. 4, Apr. 2007.
Non-Patent Document 2: S. Sakaida, "Chapter VI: Super Hi-Vision and Its Encoding System," in M. Mrak, M. Grgic, and M. Kunt, "High-Quality Visual Experience," Springer, Jul. 2010.

Judder occurs in techniques that, like that of Patent Document 1, convert high-frame-rate video to a lower frame rate before encoding and transmitting it. To suppress this judder, the technique of Non-Patent Document 1 has the receiving side convert the video to a higher frame rate by interpolating new frames using only the received low-frame-rate video, but the interpolation may fail and the resulting degradation may be noticeable. For example, video with complex motion or heavy noise, or with many occlusions and uncovered regions, cannot be interpolated correctly, and the degradation becomes conspicuous.

Also, when, as in the technique of Non-Patent Document 2, high-frame-rate video is encoded by running existing low-frame-rate codecs in parallel and assigning frames to the codecs frame by frame, multiple codecs are required and the correlation between the codecs is not taken into account. As a result, coding efficiency drops markedly compared with encoding the high-frame-rate video as a single stream.

Therefore, for the transmission of high-frame-rate video, a technique is desired that improves transmission efficiency and suppresses judder without using existing low-frame-rate codecs in parallel.

The present invention has been made in view of the above problems, and its object is to provide a frame thinning device, a frame interpolation device, a video encoding device, a video decoding device, and programs thereof for improving transmission efficiency and suppressing judder in the transmission of a predetermined video such as a high-frame-rate video.

In the present invention, the transmitting side divides the frames of an input video, for example a high-frame-rate video, into a group of frames transmitted directly to the receiving side (main video) and a group of frames not transmitted (sub video). Typically, the main video is encoded with an existing video encoding process and transmitted. Each sub-video frame to be restored is interpolated by bidirectional motion prediction and motion compensation using the decoded main-video frames temporally consecutive with it, so the original sub video need not be transmitted. In addition, because the transmitting and receiving sides perform the same motion prediction, the motion vectors themselves need not be transmitted.

However, although bidirectional motion prediction using decoded main-video frames has the above advantages, it does not refer to the frame being interpolated and may therefore compute an incorrect motion vector.

To suppress this, in a preferred aspect of the present invention, both the transmitting side and the receiving side generate an error map covering each motion within the search range of the bidirectional motion prediction and analyze it to select a plurality of motion vector candidates. The transmitting side then performs motion compensation with each selected candidate to obtain an interpolation result. By comparing each interpolation result with the original sub video, it determines which candidate interpolates the sub video most appropriately, and transmits only the index of that candidate to the receiving side as side information.
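The transmit-side selection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `sad` and `select_candidate_index` are hypothetical, blocks are plain nested lists of pixel values, and the sum of absolute differences (SAD) is an assumed difference measure, since the text only requires choosing the candidate whose interpolation differs least from the original.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks.

    SAD is an assumption for this sketch; the source only speaks of
    "the smallest difference".
    """
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def select_candidate_index(interpolated_blocks, original_block):
    """Return the index of the interpolated block closest to the original.

    interpolated_blocks[i] is the result of motion compensation with the
    i-th motion vector candidate. Only the returned index would be sent
    as side information.
    """
    errors = [sad(block, original_block) for block in interpolated_blocks]
    return min(range(len(errors)), key=errors.__getitem__)


original = [[10, 20], [30, 40]]
candidates = [
    [[0, 0], [0, 0]],      # poor interpolation
    [[10, 21], [30, 40]],  # off by one pixel value
    [[10, 20], [35, 40]],  # off by five
]
best_index = select_candidate_index(candidates, original)
```

Because the receiving side derives the same candidate list, an index of a few bits per block is enough to identify the chosen vector.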

The receiving side performs bidirectional motion prediction using the decoded main video and computes a plurality of motion vector candidates in the same way as the transmitting side; the motion vector actually used for interpolation is then determined by referring to the index received as side information. The restored video is obtained by combining the interpolated video based on the determined motion vector with the decoded main video, following the order in which the transmitting side thinned out the frames. Note that the present invention can also be applied when the main video is transmitted without encoding.
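A minimal sketch of the two receive-side steps above, under stated assumptions: `pick_vector` and `restore_sequence` are hypothetical names, and each frame is carried as a `(frame_index, frame)` pair so that the original order can be recovered. The source does not specify how frame positions are signalled.

```python
def pick_vector(candidates, index):
    """Select the motion vector named by the side-information index.

    `candidates` must be the same ordered list that the transmitting side
    derived from the decoded main video.
    """
    return candidates[index]


def restore_sequence(main, interpolated):
    """Interleave main and interpolated frames by original frame position.

    Both arguments are lists of (frame_index, frame) pairs; carrying an
    explicit index is an assumption made for this sketch.
    """
    merged = sorted(main + interpolated, key=lambda pair: pair[0])
    return [frame for _, frame in merged]


main_frames = [(0, "M0"), (2, "M2"), (4, "M4")]
interp_frames = [(1, "I1"), (3, "I3")]
restored = restore_sequence(main_frames, interp_frames)
```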

This makes it possible to improve transmission efficiency and suppress judder in the transmission of high-frame-rate video.

That is, the frame thinning device of the present invention is a frame thinning device that thins out a group of frames from a predetermined video for transmission, comprising: frame dividing means for dividing the predetermined video into a main video consisting of a group of frames transmitted to a receiving side and a sub video consisting of a group of frames not transmitted; motion vector prediction means for referring to a plurality of main-video frames temporally consecutive with the sub video, predicting by motion prediction a plurality of motion vectors for a prediction target block used to restore the sub video, and determining a plurality of motion vector candidates; frame interpolation means for generating, by motion compensation using the main video and the predicted motion vector candidates, an interpolated video corresponding to each of the motion vector candidates; comparison means for comparing each interpolated video with the sub video and determining the motion vector corresponding to the interpolated video with the smallest difference among the plurality of motion vector candidates; and output means for outputting the index of the determined motion vector as side information to the outside together with the main video.

Further, in the frame thinning device of the present invention, the motion vector prediction means performs motion prediction between the input blocks referenced in the respective consecutive main-video frames, and thereby determines, for the prediction target block within the block interpolated with reference to the input blocks, a plurality of motion vector candidates including a motion vector candidate corresponding to the minimum point of the error and a motion vector candidate corresponding to a local minimum point, other than the minimum point, at which the error is smallest relative to the surrounding errors.

Further, in the frame thinning device of the present invention, the motion vector prediction means comprises: motion prediction means for generating an error map for the prediction target block by performing motion prediction between the input blocks referenced in the respective consecutive main-video frames; minimum point search means for searching for and extracting the minimum point of the entire error map and determining the extracted minimum point coordinates as a first motion vector candidate; threshold setting and processing means for generating a new, thresholded error map for the prediction target block by applying threshold processing with a predetermined value to the error map; region-specific minimum point search means for searching for and extracting the minimum point in each region formed in the error map partitioned by the threshold processing, excluding the region containing the minimum point coordinates, and determining the extracted minimum point coordinates as second and subsequent motion vector candidates; and motion vector storage means for storing a plurality of motion vector candidates including the first motion vector candidate and the second and subsequent motion vector candidates obtained by operating the threshold setting and processing means and the region-specific minimum point search means a specified number of times of one or more.
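One pass of the threshold-and-region search described above might look like the sketch below. Several details are assumptions for illustration: the error map is a 2-D list indexed by (row, column), a "region" is taken to be a 4-connected group of cells whose error is below the threshold, and the function names are hypothetical. The text allows the threshold step to be repeated a specified number of times; only a single pass is shown.

```python
def global_minimum(error_map):
    """Coordinates (row, col) of the smallest error in the whole map."""
    cells = [
        (value, (r, c))
        for r, row in enumerate(error_map)
        for c, value in enumerate(row)
    ]
    return min(cells)[1]


def candidates_from_error_map(error_map, threshold):
    """First candidate: global minimum. Further candidates: the minimum of
    every below-threshold region that does not contain the global minimum.

    Regions are 4-connected components of cells with error < threshold
    (an assumption; the source only says the map is partitioned by
    threshold processing).
    """
    rows, cols = len(error_map), len(error_map[0])
    first = global_minimum(error_map)
    candidates = [first]
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or error_map[r][c] >= threshold:
                continue
            # Flood-fill one below-threshold region starting at (r, c).
            stack, region = [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and error_map[ny][nx] < threshold):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            if first not in region:
                candidates.append(
                    min(region, key=lambda p: error_map[p[0]][p[1]])
                )
    return candidates


example_map = [
    [9, 9, 9, 9, 9],
    [9, 1, 9, 3, 9],
    [9, 2, 9, 4, 9],
    [9, 9, 9, 9, 9],
]
found = candidates_from_error_map(example_map, threshold=5)
```

In this example the map has two valleys separated by a ridge of high error; the left valley holds the global minimum and the right valley contributes the second candidate.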

Furthermore, the frame interpolation device of the present invention is a frame interpolation device that restores a predetermined video by interpolating frames into a thinned video from which a group of frames has been thinned out, comprising: input means for receiving a main video in which frames of the predetermined video have been intermittently thinned out, and side information indicating the index of the motion vector of a prediction target block used to restore the sub video corresponding to the thinned-out group of frames; motion vector prediction means for referring to a plurality of main-video frames temporally consecutive with the sub video, predicting by motion prediction a plurality of motion vectors for the prediction target block used to restore the sub video, and determining a plurality of motion vector candidates; motion vector selection means for referring to the side information and selecting, according to the index, one motion vector for the prediction target block having the plurality of motion vector candidates; frame interpolation means for generating an interpolated video corresponding to the sub video using the main video and the selected motion vector; and frame synthesis means for restoring the predetermined video by performing frame synthesis in which the main video is interpolated with the frames of the interpolated video.

Further, in the frame interpolation device of the present invention, the motion vector prediction means performs motion prediction between the input blocks referenced in the respective consecutive main-video frames, and thereby determines, for the prediction target block within the block interpolated with reference to the input blocks, a plurality of motion vector candidates including a motion vector candidate corresponding to the minimum point of the error and a motion vector candidate corresponding to a local minimum point, other than the minimum point, at which the error is smallest relative to the surrounding errors.

Further, in the frame interpolation device of the present invention, the motion vector prediction means comprises: motion prediction means for generating an error map for the prediction target block by performing motion prediction between the input blocks referenced in the respective consecutive main-video frames; minimum point search means for searching for and extracting the minimum point of the entire error map and determining the extracted minimum point coordinates as a first motion vector candidate; threshold setting and processing means for generating a new, thresholded error map for the prediction target block by applying threshold processing with a predetermined value to the error map; region-specific minimum point search means for searching for and extracting the minimum point in each region formed in the error map partitioned by the threshold processing, excluding the region containing the minimum point coordinates, and determining the extracted minimum point coordinates as second and subsequent motion vector candidates; and motion vector storage means for storing a plurality of motion vector candidates including the first motion vector candidate and the second and subsequent motion vector candidates obtained by operating the threshold setting and processing means and the region-specific minimum point search means a specified number of times of one or more.

Furthermore, the video encoding device of the present invention is a video encoding device that thins out a group of frames from a predetermined video, encodes the result, and transmits it, comprising: frame dividing means for dividing the predetermined video into a main video consisting of a group of frames transmitted to a receiving side and a sub video consisting of a group of frames not transmitted; video encoding means for encoding the main video and outputting it to the outside; local decoding means for generating a locally decoded main video from the encoded main video; motion vector prediction means for referring to a plurality of frames of the locally decoded main video temporally consecutive with the sub video, predicting by motion prediction a plurality of motion vectors for a prediction target block used to restore the sub video, and determining a plurality of motion vector candidates; frame interpolation means for generating, by motion compensation using the locally decoded main video and the predicted motion vector candidates, an interpolated video corresponding to each of the motion vector candidates; comparison means for comparing each interpolated video with the sub video and determining the motion vector corresponding to the interpolated video with the smallest difference among the plurality of motion vector candidates; and output means for outputting the index of the determined motion vector to the outside as side information.

Furthermore, the video decoding device of the present invention is a video decoding device that restores a predetermined video by interpolating frames into a thinned video from which a group of frames has been thinned out and which has been encoded, comprising: video decoding means for receiving, as a main video, a thinned video in which frames of the predetermined video have been intermittently thinned out and encoded, decoding the main video, and generating a decoded main video; input means for receiving side information indicating the index of the motion vector of a prediction target block used to restore the sub video corresponding to the thinned-out group of frames; motion vector prediction means for referring to a plurality of main-video frames temporally consecutive with the sub video, predicting by motion prediction a plurality of motion vectors for the prediction target block used to restore the sub video, and determining a plurality of motion vector candidates; motion vector selection means for referring to the side information and selecting, according to the index, one motion vector for the prediction target block having the plurality of motion vector candidates; frame interpolation means for generating an interpolated video corresponding to the sub video using the decoded main video and the selected motion vector; and frame synthesis means for restoring the predetermined video by performing frame synthesis in which the decoded main video is interpolated with the frames of the interpolated video.

Furthermore, a program according to one aspect of the present invention causes a computer to execute the steps of: dividing a predetermined video into a main video consisting of a group of frames transmitted to a receiving side and a sub video consisting of a group of frames not transmitted; referring to a plurality of main-video frames temporally consecutive with the sub video, predicting by motion prediction a plurality of motion vectors for a prediction target block used to restore the sub video, and determining a plurality of motion vector candidates; generating, by motion compensation using the main video and the predicted motion vector candidates, an interpolated video corresponding to each of the motion vector candidates; comparing each interpolated video with the sub video and determining the motion vector corresponding to the interpolated video with the smallest difference among the plurality of motion vector candidates; and outputting the index of the determined motion vector as side information to the outside together with the main video.

A program according to another aspect of the present invention causes a computer to execute the steps of: receiving a main video in which frames of a predetermined video have been intermittently thinned out, and side information indicating the index of the motion vector of a prediction target block used to restore the sub video corresponding to the thinned-out group of frames; referring to a plurality of main-video frames temporally consecutive with the sub video, predicting by motion prediction a plurality of motion vectors for the prediction target block used to restore the sub video, and determining a plurality of motion vector candidates; referring to the side information and selecting, according to the index, one motion vector for the prediction target block having the plurality of motion vector candidates; generating an interpolated video corresponding to the sub video using the main video and the selected motion vector; and restoring the predetermined video by performing frame synthesis in which the main video is interpolated with the frames of the interpolated video.

A program according to still another aspect of the present invention causes a computer to execute the steps of: dividing a predetermined video into a main video consisting of a group of frames transmitted to a receiving side and a sub video consisting of a group of frames not transmitted; encoding the main video and outputting it to the outside; generating a locally decoded main video from the encoded main video; referring to a plurality of frames of the locally decoded main video temporally consecutive with the sub video, predicting by motion prediction a plurality of motion vectors for a prediction target block used to restore the sub video, and determining a plurality of motion vector candidates; generating, by motion compensation using the locally decoded main video and the predicted motion vector candidates, an interpolated video corresponding to each of the motion vector candidates; comparing each interpolated video with the sub video and determining the motion vector corresponding to the interpolated video with the smallest difference among the plurality of motion vector candidates; and outputting the index of the determined motion vector to the outside as side information.

A program according to still another aspect of the present invention causes a computer to execute the steps of: receiving, as a main video, a thinned video in which frames of a predetermined video have been intermittently thinned out and encoded, decoding the main video, and generating a decoded main video; receiving side information indicating the index of the motion vector of a prediction target block used to restore the sub video corresponding to the thinned-out group of frames; referring to a plurality of main-video frames temporally consecutive with the sub video, predicting by motion prediction a plurality of motion vectors for the prediction target block used to restore the sub video, and determining a plurality of motion vector candidates; referring to the side information and selecting, according to the index, one motion vector for the prediction target block having the plurality of motion vector candidates; generating an interpolated video corresponding to the sub video using the decoded main video and the selected motion vector; and restoring the predetermined video by performing frame synthesis in which the decoded main video is interpolated with the frames of the interpolated video.

According to the present invention, when a predetermined video such as a high-frame-rate video is transmitted with frames thinned out, the amount of information in the transmitted bitstream is reduced and transmission efficiency improves, while the motion prediction and motion compensation according to the present invention reduce judder. In particular, because the transmitting side compares the interpolation results of a plurality of motion vector candidates with the original in advance and transmits only the index of the optimal motion vector together with the main video, the motion vectors themselves need not be transmitted and the amount of side information can be reduced. Further, according to the present invention, when a predetermined video such as a high-frame-rate video is thinned out, encoded, and transmitted, the high-frame-rate video is encoded by a single existing low-frame-rate codec, so coding efficiency can be improved over configurations that use existing low-frame-rate codecs in parallel.

A block diagram outlining a frame thinning device according to one embodiment of the present invention.
A block diagram outlining a frame interpolation device according to one embodiment of the present invention.
A block diagram outlining a preferred example of the bidirectional motion vector prediction unit in the frame thinning device and the frame interpolation device according to one embodiment of the present invention.
A flowchart showing a preferred example of motion prediction processing in the bidirectional motion vector prediction unit according to the present invention.
A diagram showing an example of bidirectional motion prediction by the bidirectional motion prediction unit according to the present invention.
(A) and (B) are diagrams explaining threshold processing of the error map by the bidirectional motion prediction unit according to the present invention.
(A) to (D) are diagrams explaining how the minimum point coordinates of each region are extracted through threshold processing of the error map by the bidirectional motion prediction unit according to the present invention.
A block diagram outlining a video encoding device according to one embodiment of the present invention.
A block diagram outlining a video decoding device according to one embodiment of the present invention.
A diagram showing another example of bidirectional motion prediction by the bidirectional motion prediction unit according to the present invention.

First, the frame thinning device and the frame interpolation device according to the first embodiment of the present invention are described.

[First Embodiment]
(Frame thinning device)
FIG. 1 is a block diagram outlining the frame thinning device 1 according to the first embodiment of the present invention. The frame thinning device 1 thins out a group of frames from a predetermined video for transmission, and includes a frame division unit 11, a bidirectional motion vector prediction unit 12, a frame interpolation unit 13, a comparison unit 14, and a side information encoding unit 15.

The frame division unit 11 divides an input video, such as a high frame rate video, into a group of frames that is transmitted to the receiving side (main video) and a group of frames that is not transmitted (sub video). It outputs the main video to the outside as a main stream and outputs the sub video to the comparison unit 14. This reduces the amount of information in the transmitted main stream. The main video output as the main stream can be temporarily stored in a predetermined memory (not shown), in a form that the bidirectional motion vector prediction unit 12 and the frame interpolation unit 13 can read.
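The division into main and sub video can be sketched as follows. This is a minimal illustration that assumes the even/odd split used as an example later in this description; the function name `split_frames` and the list-based frame representation are hypothetical and not part of the original text.

```python
def split_frames(frames):
    """Split a frame sequence into main video and sub video.

    Assumption for illustration: even-indexed frames form the main video
    (transmitted) and odd-indexed frames form the sub video (not transmitted).
    """
    main_video = frames[0::2]  # frames 0, 2, 4, ...
    sub_video = frames[1::2]   # frames 1, 3, 5, ...
    return main_video, sub_video
```

Other thinning patterns are possible; the only requirement implied by the text is that both sides know which frames were removed.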

The bidirectional motion vector prediction unit 12 refers to main video frames that are temporally contiguous with the sub video frame to be restored (preferably the main video frames located immediately before and after it in time). By bidirectional motion prediction, it predicts a plurality of motion vectors for each prediction target block used to restore that sub video, and outputs the resulting motion vector candidates to the frame interpolation unit 13.

More specifically, the bidirectional motion vector prediction unit 12 performs bidirectional motion prediction between the input blocks referred to in the respective contiguous main video frames (preferably two input blocks at the same coordinates). For the prediction target block within the block interpolated from these input blocks, it determines a plurality of motion vector candidates: the candidate corresponding to the minimum point of the error, and candidates corresponding to local minimum points, that is, points other than the minimum point whose error is smallest relative to their surroundings. The two input blocks referred to in the contiguous main video frames are set in the same way at every coordinate needed to restore the thinned sub video frame, and prediction target blocks are set over the entire area of the interpolated block. This yields, for each prediction target block, a plurality of motion vector candidates sufficient to restore the thinned sub video frame. For further details on the principle of bidirectional motion prediction using the input blocks b1 and b2, see, for example, Non-Patent Document 1. The bidirectional motion vector prediction unit 12 thus predicts motion vectors by referring only to the main video, without using the original sub video. A preferred example of its operation is described in detail later.

The frame interpolation unit 13 performs motion compensation using the main video and the predicted motion vector candidates, generates an interpolated video for each candidate, and outputs the interpolated videos to the comparison unit 14.
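As a rough sketch of motion-compensated interpolation for one candidate, the block below averages the two blocks that a symmetric displacement points to in the preceding and following main video frames. The averaging rule, the function name `interpolate_block`, and the nested-list frame layout are assumptions made for illustration; the text does not specify how the two reference blocks are combined.

```python
def interpolate_block(prev_frame, next_frame, y, x, size, dy, dx):
    """Interpolate one size x size block of the missing frame at (y, x).

    The candidate motion vector (dy, dx) selects the block at (y+dy, x+dx)
    in prev_frame and, point-symmetrically, the block at (y-dy, x-dx) in
    next_frame. Assumption: the two blocks are combined by a rounded average,
    and both displaced blocks lie inside the frames.
    """
    block = []
    for r in range(size):
        row = []
        for c in range(size):
            a = prev_frame[y + dy + r][x + dx + c]
            b = next_frame[y - dy + r][x - dx + c]
            row.append((a + b + 1) // 2)  # rounded integer average
        block.append(row)
    return block
```

Calling this once per candidate gives the set of interpolated blocks that is compared against the original sub video.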

The comparison unit 14 compares each interpolated video with the original sub video and selects the interpolated video that differs least from the original. It thereby determines, among the motion vector candidates, the motion vector corresponding to the selected interpolated video, and outputs the index of that motion vector to the side information encoding unit 15 as side information. More specifically, for each prediction target block in the interpolated video, the comparison unit 14 evaluates the error against the block at the corresponding coordinate position in the original sub video separated by the frame division unit 11, takes the candidate with the smallest error as the optimal motion vector, and outputs its index to the side information encoding unit 15 as side information.
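The selection performed by the comparison unit can be sketched as below. The sum of absolute differences is used as the error measure purely as an assumption, and `best_candidate_index` is a hypothetical name; blocks are nested lists of pixel values.

```python
def best_candidate_index(interpolated_blocks, original_block):
    """Return the index of the interpolated block closest to the original.

    Assumption: closeness is measured by the sum of absolute differences.
    Ties are resolved in favor of the lowest index.
    """
    def sad(block_a, block_b):
        return sum(abs(p - q)
                   for row_a, row_b in zip(block_a, block_b)
                   for p, q in zip(row_a, row_b))

    errors = [sad(block, original_block) for block in interpolated_blocks]
    return errors.index(min(errors))
```

Only this index is sent as side information; the receiver regenerates the same candidate list and looks the vector up by index.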

The side information encoding unit 15 encodes the side information indicating the index of the optimal motion vector of each prediction target block using an existing lossless coding method, and outputs it to the outside as a bitstream (side stream). If the side information is not encoded, the side information encoding unit 15 is unnecessary, and the comparison unit 14 can output the side information directly to the outside as the side stream.

In this way, the frame thinning device 1 thins out a group of frames from a predetermined video and transmits the remaining frames together with predetermined side information.

(Frame interpolation device)
FIG. 2 is a block diagram outlining the frame interpolation device 5 according to the first embodiment of the present invention. The frame interpolation device 5 restores a predetermined video by interpolating frames from a thinned video in which a group of frames has been removed from that video. It includes a bidirectional motion vector prediction unit 51, a side information decoding unit 52, a motion vector selection unit 53, a frame interpolation unit 54, and a frame synthesis unit 55.

The bidirectional motion vector prediction unit 51 receives, as the main stream, a thinned video in which frames have been intermittently removed from a predetermined video such as a high frame rate video (that is, the main video output from the frame thinning device 1). Referring to the main video frames temporally contiguous with the sub video frame to be restored (preferably the main video frames located immediately before and after it in time), it predicts a plurality of motion vectors for each prediction target block used to restore that sub video, and outputs the motion vector candidates to the motion vector selection unit 53.

More specifically, like the bidirectional motion vector prediction unit 12 on the transmission side, the bidirectional motion vector prediction unit 51 performs bidirectional motion prediction between the two input blocks referred to in the respective contiguous main video frames. For the prediction target block within the block interpolated from these input blocks, it determines a plurality of motion vector candidates: the candidate corresponding to the minimum point of the error, and candidates corresponding to local minimum points, that is, points other than the minimum point whose error is smallest relative to their surroundings. A preferred example of the operation of the bidirectional motion vector prediction unit 51 is the same as that of the bidirectional motion vector prediction unit 12 in the frame thinning device 1 and is described in detail later.

The side information decoding unit 52 receives, as the side stream, the side information on the prediction target blocks used to restore the sub video (the encoded side information output from the frame thinning device 1) and decodes it with the decoding process that corresponds to the encoding process on the transmission side. It thereby obtains the side information indicating the index of the optimal motion vector of each prediction target block and outputs it to the motion vector selection unit 53. If the side information is not encoded, the side information decoding unit 52 is unnecessary, and the motion vector selection unit 53 can obtain the side information directly from the side stream.

The motion vector selection unit 53 refers to the side information indicating the index of the optimal motion vector of each prediction target block, selects one motion vector according to that index from the motion vector candidates generated by the bidirectional motion vector prediction unit 51, and outputs it to the frame interpolation unit 54.

The frame interpolation unit 54 uses the received main video and the selected motion vector to restore the frame of the interpolated video corresponding to the sub video, and outputs the restored sub video to the frame synthesis unit 55.

The frame synthesis unit 55 combines the received main video and the restored sub video frame by frame, following the order in which frames were thinned out on the transmission side, and outputs to the outside a restored video that reconstructs the predetermined video, such as the high frame rate video.
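Frame synthesis can be sketched as a simple interleave. This assumes, for illustration only, that every second frame was thinned out starting from the second frame, so that main and restored sub frames alternate; `merge_frames` is a hypothetical name.

```python
def merge_frames(main_video, restored_sub_video):
    """Interleave main and restored sub frames back into display order.

    Assumption: main frames occupy even positions and sub frames odd
    positions of the original sequence.
    """
    merged = []
    for i, main_frame in enumerate(main_video):
        merged.append(main_frame)
        if i < len(restored_sub_video):
            merged.append(restored_sub_video[i])
    return merged
```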

(Configuration of a preferred example of the bidirectional motion vector prediction unit)
FIG. 3 is a block diagram outlining a preferred example of the bidirectional motion vector prediction units 12 and 51 in the frame thinning device 1 and the frame interpolation device 5 according to the first embodiment of the present invention. The bidirectional motion vector prediction units 12 and 51 perform the same processing on the transmission side (frame thinning device 1) and the reception side (frame interpolation device 5). Because the main video input to the bidirectional motion vector prediction unit 51 on the reception side is the same as on the transmission side, the predicted motion vector candidates match on both sides.

The bidirectional motion vector prediction units 12 and 51 are functional units that perform bidirectional motion prediction between two input blocks b1 and b2 and determine a plurality of motion vector candidates. The input blocks are referred to in the main video frames that are temporally contiguous with the frame of the prediction target block used to restore the sub video (preferably the main video frames located immediately before and after it in time). Each unit includes a bidirectional motion prediction unit 31, a minimum point search unit 32, a threshold setting/processing unit 33, a labeling unit 34, a region-specific minimum point search unit 35, and a motion vector storage unit 36.

The bidirectional motion prediction unit 31 performs bidirectional motion prediction between the two input blocks (reference blocks) b1 and b2 referred to in the respective contiguous main video frames, generates an error map, and outputs it to the minimum point search unit 32 and the threshold setting/processing unit 33. The error map, described in detail later with reference to FIG. 5, represents the distribution of errors between the search blocks in the input blocks b1 and b2. The input blocks are set on the main video frames located temporally before and after the prediction target block used to restore the sub video, and the errors arise as the search blocks are moved within b1 and b2 point-symmetrically about the block to be interpolated (the prediction target block).

The minimum point search unit 32 searches for and extracts the minimum point of the entire error map and outputs its coordinates to the motion vector storage unit 36 as the corresponding motion vector candidate. It also outputs the minimum value of the error map as the initial value of the threshold used by the threshold setting/processing unit 33. The minimum point coordinates indicate the position of the search blocks in the input blocks b1 and b2, that is, the point for the prediction target block at which the error between the search blocks in b1 and b2 is smallest.

The threshold setting/processing unit 33 stores the threshold used to extract the minimum point or other local minimum points of the error map, and sets (updates) the threshold to a level shifted from the stored threshold by a predetermined value. It applies threshold processing to the error map with the updated threshold to generate a new thresholded error map, which it outputs to the labeling unit 34 together with the stored minimum point coordinates. A plurality of regions delimited by the threshold appears in the thresholded error map.

This threshold-updating processing is repeated a specified number of times. Each time, the threshold is updated to a level shifted by a predetermined value from the threshold used when the minimum point or other local minimum points of the error map were extracted, and the error map is thresholded with the updated value.

The labeling unit 34 obtains the labels managed by the motion vector storage unit 36 and assigns a label to each region of the thresholded error map, excluding any region that contains stored minimum point coordinates. It outputs the labeled error map to the region-specific minimum point search unit 35. The label may take any form as long as it indicates the index assigned to each motion vector candidate.

The region-specific minimum point search unit 35 searches the labeled error map label by label (region by region), extracts the minimum point of each region, and outputs each region's minimum point coordinates to the motion vector storage unit 36 as the corresponding motion vector candidate. It then notifies the threshold setting/processing unit 33 that the motion vector candidates for the new thresholded error map have been stored. On receiving this notification, the threshold setting/processing unit 33 updates the threshold and repeats the threshold processing up to the specified number of times, thereby controlling the determination of further motion vector candidates.

The motion vector storage unit 36 stores the motion vectors obtained by the minimum point search unit 32 and the region-specific minimum point search unit 35 and outputs them to the outside as the motion vector candidates for the block to be interpolated (the prediction target block) on the frame corresponding to the sub video. The motion vector storage unit 36 also labels and manages the motion vector candidate corresponding to the stored minimum point coordinates. Every candidate can therefore be output as a labeled motion vector and identified by the index of its label.

(Operation of a preferred example of the bidirectional motion vector prediction unit)
The operation of the preferred example of the bidirectional motion vector prediction units 12 and 51 is described in detail below with a more specific example. The description mainly covers the case in which the main video consists of the even-numbered frames of the input video and the sub video consists of the odd-numbered frames.

FIG. 4 is a flowchart showing a preferred example of motion prediction processing in the bidirectional motion vector prediction units 12 and 51. First, the bidirectional motion prediction unit 31 performs bidirectional motion prediction between the two input blocks b1 and b2 referred to in the respective contiguous main video frames and generates an error map for the prediction target block (step S11).

FIG. 5 shows an example of bidirectional motion prediction by the bidirectional motion prediction unit 31. Search ranges centered on the same coordinates as the prediction target block in the sub video frame to be restored are set on the main video frames located temporally before and after that block (input block b1 and input block b2), and the search blocks are moved within b1 and b2 point-symmetrically about the prediction target block. The error (for example, squared error or absolute error) between the search blocks in b1 and b2 is calculated at each search block position (i, j), which yields the error map for the prediction target block.
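The point-symmetric search described above can be sketched as follows. Frames are nested lists of pixel values, the error is the sum of absolute differences (one of the error types the text mentions), and the names `block_sad` and `error_map` as well as the `radius` parameter are hypothetical choices for this illustration.

```python
def block_sad(frame_a, ya, xa, frame_b, yb, xb, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(frame_a[ya + r][xa + c] - frame_b[yb + r][xb + c])
    return total


def error_map(prev_frame, next_frame, y, x, size, radius):
    """Build the error map for the prediction target block at (y, x).

    For each displacement (dy, dx) within +/- radius, the block at
    (y+dy, x+dx) in prev_frame is compared with the point-symmetric block
    at (y-dy, x-dx) in next_frame. The result is stored at
    [dy + radius][dx + radius]. Assumes all displaced blocks lie inside
    both frames.
    """
    emap = []
    for dy in range(-radius, radius + 1):
        row = []
        for dx in range(-radius, radius + 1):
            row.append(block_sad(prev_frame, y + dy, x + dx,
                                 next_frame, y - dy, x - dx, size))
        emap.append(row)
    return emap
```

Note that flat background can produce several displacements with equally small error, which is one reason a single global minimum may not identify the true motion.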

Next, the minimum point search unit 32 searches the error map generated by the bidirectional motion prediction unit 31, extracts the minimum point of the entire map, and takes its coordinates as the corresponding motion vector candidate (step S12). The motion vector candidate corresponding to the minimum point coordinates is stored in the motion vector storage unit 36. The threshold on the error map corresponding to the extracted minimum point is set as the initial value of the threshold used by the threshold setting/processing unit 33.

Next, when the threshold is to be updated and new motion vector candidates extracted (step S13: Yes), the threshold setting/processing unit 33 sets (updates) the threshold (th_1) to a level shifted from the initial value (th_0) by a predetermined value (step width: STEP) (step S14). The threshold setting/processing unit 33 then applies threshold processing to the error map with this threshold to generate a new thresholded error map (step S15).

Next, the labeling unit 34 obtains the labels managed by the motion vector storage unit 36 and assigns a label to each region of the thresholded error map, excluding any region that contains the stored minimum point coordinates, to form a labeled error map (step S16).

Next, the region-specific minimum point search unit 35 searches the labeled error map label by label (region by region), extracts the minimum point of each region, and takes each region's minimum point coordinates as the corresponding motion vector candidate (step S17). These coordinates are stored in the motion vector storage unit 36. The minimum point coordinates of each region correspond to local minimum point coordinates of the original error map. The motion vector storage unit 36 can be configured to treat all motion vector candidates for the prediction target block as determined once it holds a predetermined number of candidates or more, or once it judges that no regions (of a predetermined size) remain to be labeled in the error map.

The threshold processing and labeling applied to the error map, which together extract the minimum point coordinates of each region, are now described in detail. FIGS. 6(A) and 6(B) illustrate the threshold processing of the error map according to the present invention. They show an example of an error map in which, taking the search block position (i, j) in the input block b1 of the first reference frame of the main video as the reference, the error against the search block in the input block b2 of the second reference frame of the main video is mapped as the error amount for the block to be interpolated (the prediction target block) in the sub video frame located, in time and in spatial coordinates, between the search blocks of the first and second reference frames. When threshold processing is applied to an error map that has a plurality of local minimum points including the minimum point (see FIG. 6(A)), a plurality of regions appears in addition to the region containing the minimum point (see FIG. 6(B)). Each of these regions is labeled, and a minimum point search is performed for each label. This makes it possible to search for the local minimum points of the error map with little computation. The threshold setting information may be set per block or per video and transmitted as side information, or the threshold setting procedure may be agreed in advance between the transmission side and the reception side. As one example, the threshold level, shifted by a predetermined value (step width: STEP) from the initial threshold (th_0) obtained in the minimum point search for the prediction target block, is agreed in advance between the two sides. If the predetermined value (step width: STEP) is changed according to the threshold to be updated, this too can be agreed in advance, for example by defining the step width as a function of the threshold value.

Updating the threshold further enables a more accurate search for local minimum points. That is, this threshold-updating processing is repeated a specified number of times: each time, the threshold is updated to a level shifted by a predetermined value from the threshold used when the minimum point search unit 32 extracted the minimum point, and the error map is thresholded with the updated value.

For example, in FIG. 4, when the threshold is to be updated again and further motion vector candidates extracted (step S13: Yes), the threshold (th_2) is set to a level shifted from the threshold (th_1) by the predetermined value (step width: STEP) (step S14). The threshold setting/processing unit 33 then applies threshold processing to the error map with the updated threshold to generate a new thresholded error map (step S15).

Next, the labeling unit 34 obtains the labels managed by the motion vector storage unit 36 and assigns a label to each region of the thresholded error map, excluding any region that contains minimum point coordinates already stored (those stored for thresholds th_0 and th_1), to form a labeled error map (step S16).

Next, the region-specific minimum point search unit 35 searches the labeled error map label by label (region by region), extracts the minimum point of each region, and takes each region's minimum point coordinates as the corresponding motion vector candidate (step S17). The motion vector candidates corresponding to these region-specific minimum point coordinates are stored in the motion vector storage unit 36.

After this threshold-updating processing has been repeated the specified number of times, the motion prediction processing in the bidirectional motion vector prediction units 12 and 51 ends (step S13: No).

An example in which the threshold is updated twice is briefly described with reference to FIG. 7. The bidirectional motion vector prediction units 12 and 51 first search the error map (see FIG. 7(A)) for the minimum value of the entire map (see FIG. 7(B)) and store the minimum point coordinates in the motion vector storage unit 36 as the corresponding motion vector candidate. The threshold on the error map corresponding to the extracted minimum point is set as the threshold th_0. Next, the threshold is updated by adding the predetermined value (step width STEP) to th_0 (th_1 = th_0 + STEP), and an error map subjected to threshold processing and labeling is formed (see FIG. 7(C)). Among the regions formed in this error map, any region containing minimum point coordinates already stored in the motion vector storage unit 36 is treated as an excluded region. The bidirectional motion vector prediction units 12 and 51 then perform a minimum point search in each region other than the excluded regions and store the newly obtained minimum point coordinates in the motion vector storage unit 36 as new motion vector candidates.

The threshold is then updated again by adding the predetermined value (step width STEP) to th_1 (th_2 = th_1 + STEP), and an error map subjected to threshold processing and labeling processing is formed (see FIG. 7(D)). Among the regions formed in this error map, any region containing minimum point coordinates already stored in the motion vector storage unit 36 is treated as an excluded region. The bidirectional motion vector prediction units 12 and 51 perform a minimum point search in each region other than the excluded regions and store the newly obtained minimum point coordinates in the motion vector storage unit 36 as new corresponding motion vector candidates. In this way, the local minimum points of the error map can be found with a small amount of computation. Other existing techniques may be used to search for the local minimum points of the error map, and such techniques can determine corresponding motion vector candidates for every local minimum point. The preferred method described above, however, uses threshold processing on the error map, which allows the transmitting side and the receiving side to arrive at matching local minimum points with high accuracy and also reduces the computational load of the search.
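The threshold-update search described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `find_candidates`, the use of 4-connectivity for labeling, the convention that a region consists of cells whose error is at or below the threshold, and the sample values are all assumptions introduced here.

```python
from collections import deque


def find_candidates(error_map, step, iterations):
    """Return minimum-point coordinates (row, col) in the order they are found.

    Hypothetical helper: error_map is a 2-D list of error values, step is the
    step width STEP, iterations is the specified number of threshold updates.
    """
    rows, cols = len(error_map), len(error_map[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    # Minimum of the whole error map: first candidate, and it defines th_0.
    first = min(cells, key=lambda rc: error_map[rc[0]][rc[1]])
    candidates = [first]
    threshold = error_map[first[0]][first[1]]
    for _ in range(iterations):
        threshold += step  # th_k = th_(k-1) + STEP
        seen = set()
        for start in cells:
            if start in seen or error_map[start[0]][start[1]] > threshold:
                continue
            # Labeling: collect one 4-connected region of cells <= threshold.
            region, queue = [], deque([start])
            seen.add(start)
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and (nr, nc) not in seen
                            and error_map[nr][nc] <= threshold):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            # A region that already holds a stored minimum point is excluded.
            if any(point in candidates for point in region):
                continue
            candidates.append(
                min(region, key=lambda rc: error_map[rc[0]][rc[1]]))
    return candidates


sample_map = [
    [9, 9, 9, 9, 9],
    [9, 1, 9, 3, 9],
    [9, 9, 9, 9, 9],
    [9, 5, 9, 9, 9],
]
print(find_candidates(sample_map, step=2, iterations=2))
# prints [(1, 1), (1, 3), (3, 1)]
```

Because the search is deterministic for a given error map, step width, and iteration count, running it on identical inputs at the transmitting and receiving sides yields the same ordered candidate list, which is what allows a bare index to identify a motion vector.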

As described above, the frame thinning device 1 and the frame interpolation device 5 of the first embodiment of the present invention thin out frames of a high frame rate video before transmission, which reduces the amount of information in the main stream and improves transmission efficiency, and they reduce judder through bidirectional motion prediction and motion compensation. In particular, the transmitting side compares the interpolation results for the plurality of motion vector candidates with the original image in advance and transmits only the index of the optimal motion vector together with the main video. The motion vector itself therefore need not be transmitted, and the amount of side information is reduced.

Next, a video encoding device and a video decoding device according to a second embodiment of the present invention are described.

[Second Embodiment]
(Video encoding device)
FIG. 8 is a block diagram showing an outline of the video encoding device 10 according to the second embodiment of the present invention. The video encoding device 10 thins out a group of frames from a predetermined video, encodes the result, and transmits it. It includes a frame division unit 11, a bidirectional motion vector prediction unit 12, a frame interpolation unit 13, a comparison unit 14, a side information encoding unit 15, a video encoding unit 16, and a local decoding unit 17. In FIG. 8, components similar to those in FIG. 1 are given the same reference numerals.

The frame division unit 11 divides an input video, such as a high frame rate video, into a frame group to be transmitted to the receiving side (main video) and a frame group not to be transmitted (sub video). It outputs the main video to the video encoding unit 16 and the sub video to the comparison unit 14.
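The division into main video and sub video can be illustrated with a short sketch. It assumes uniform thinning in which every `keep_every`-th frame is kept as main video; the function name `split_frames` and the parameter `keep_every` are hypothetical and do not appear in the specification.

```python
def split_frames(frames, keep_every=2):
    """Split a frame sequence into (main, sub).

    Frames whose index is a multiple of keep_every form the main video
    (transmitted); all other frames form the sub video (not transmitted).
    """
    if keep_every < 2:
        raise ValueError("keep_every must be at least 2")
    main = [f for i, f in enumerate(frames) if i % keep_every == 0]
    sub = [f for i, f in enumerate(frames) if i % keep_every != 0]
    return main, sub


frames = ["f0", "f1", "f2", "f3", "f4", "f5"]
print(split_frames(frames))
# prints (['f0', 'f2', 'f4'], ['f1', 'f3', 'f5'])
```

With `keep_every=3`, two of every three frames go to the sub video, which corresponds to the heavier thinning discussed among the modifications.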

The video encoding unit 16 encodes the main video and outputs it as a bit stream (main stream) to the local decoding unit 17 and to the outside.

The local decoding unit 17 generates a locally decoded main video from the main stream and temporarily stores it in a predetermined memory (not shown). The locally decoded main video is stored in a form that the bidirectional motion vector prediction unit 12 and the frame interpolation unit 13 can read.

The bidirectional motion vector prediction unit 12 refers to frames of the locally decoded main video that are temporally continuous with the sub video frame to be restored (preferably, main video frames located temporally before and after it), predicts by bidirectional motion prediction a plurality of motion vectors for the prediction target block used to restore the sub video, and outputs the resulting plurality of motion vector candidates to the frame interpolation unit 13. The configuration and operation of the bidirectional motion vector prediction unit 12 can be the same as in the first embodiment.
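As a rough illustration of how bidirectional motion prediction can yield an error map for one prediction target block, the sketch below evaluates every displacement in a small search window. It assumes a point-symmetric search (the block in the preceding frame is shifted by the negative of the displacement, the block in the following frame by the displacement itself) and a sum of absolute differences as the error measure. The function name `build_error_map` and all parameter names are hypothetical.

```python
def build_error_map(prev, nxt, top, left, size, search):
    """Return a (2*search+1) x (2*search+1) map of matching errors.

    prev and nxt are 2-D lists of pixel values for the main video frames
    before and after the sub video frame. Entry [dy + search][dx + search]
    holds the sum of absolute differences for displacement (dy, dx).
    """
    height, width = len(prev), len(prev[0])
    # Reject windows that leave the frame; Python would otherwise wrap
    # negative indices silently.
    if (top - search < 0 or left - search < 0
            or top + size + search > height
            or left + size + search > width):
        raise ValueError("search window leaves the frame")
    error_map = []
    for dy in range(-search, search + 1):
        row = []
        for dx in range(-search, search + 1):
            total = 0
            for y in range(size):
                for x in range(size):
                    a = prev[top + y - dy][left + x - dx]
                    b = nxt[top + y + dy][left + x + dx]
                    total += abs(a - b)
            row.append(total)
        error_map.append(row)
    return error_map


# A bright 2x2 patch moves two pixels to the right between the two frames.
prev = [[0] * 8 for _ in range(8)]
nxt = [[0] * 8 for _ in range(8)]
for y in (3, 4):
    for x in (2, 3):
        prev[y][x] = 255
    for x in (4, 5):
        nxt[y][x] = 255
emap = build_error_map(prev, nxt, top=3, left=3, size=2, search=1)
print(emap[1][2], emap[1][1])
# prints 0 1020
```

In this toy example the displacement (0, 1) matches perfectly, but so does (0, -1), which pairs two flat background blocks. Ambiguities of this kind are one reason an error map can contain several plausible minima.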

The frame interpolation unit 13 performs motion compensation using the locally decoded main video and the plurality of predicted motion vector candidates, generates an interpolated video for each motion vector candidate, and outputs the interpolated videos to the comparison unit 14.

As in the first embodiment, the comparison unit 14 compares each interpolated video with the original sub video, selects the interpolated video that differs least from the original, determines which of the plurality of motion vector candidates corresponds to the selected interpolated video, and outputs the index of the determined motion vector to the side information encoding unit 15 as side information.
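The comparison step can be sketched as follows. This passage does not name the difference measure, so the sketch assumes a sum of absolute differences; the function names `sad` and `select_best_index` and the sample pixel values are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def select_best_index(interpolated_blocks, original_block):
    """Return the index of the interpolated block closest to the original.

    interpolated_blocks[i] is the block produced with motion vector
    candidate i. Ties are resolved in favor of the lowest index.
    """
    errors = [sad(block, original_block) for block in interpolated_blocks]
    return min(range(len(errors)), key=errors.__getitem__)


original = [[10, 20], [30, 40]]
interpolated = [
    [[0, 0], [0, 0]],      # candidate 0
    [[10, 21], [30, 40]],  # candidate 1
    [[12, 20], [30, 40]],  # candidate 2
]
print(select_best_index(interpolated, original))
# prints 1
```

Only the returned index needs to be sent as side information, since the receiving side can regenerate the same candidate list from the main video.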

As in the first embodiment, the side information encoding unit 15 encodes the side information using an existing lossless coding method and outputs it to the outside as a bit stream (side stream). If the side information is not encoded, the side information encoding unit 15 is unnecessary, and the comparison unit 14 can be configured to output the side information directly to the outside as the side stream.

(Video decoding device)
FIG. 9 is a block diagram showing an outline of the video decoding device 50 according to the second embodiment of the present invention. The video decoding device 50 decodes a video that was encoded after a group of frames was thinned out of a predetermined video, interpolates the missing frames, and restores the predetermined video. It includes a bidirectional motion vector prediction unit 51, a side information decoding unit 52, a motion vector selection unit 53, a frame interpolation unit 54, a frame synthesis unit 55, and a video decoding unit 56. In FIG. 9, components similar to those in FIG. 2 are given the same reference numerals.

The video decoding unit 56 receives, as the main stream, a video that was encoded after frames were intermittently thinned out of a predetermined video such as a high frame rate video (that is, the main video output from the video encoding device 10), decodes it with a decoding process corresponding to the encoding process on the transmitting side, and outputs the result to the bidirectional motion vector prediction unit 51 and the frame synthesis unit 55.

The bidirectional motion vector prediction unit 51 refers to frames of the decoded main video that are temporally continuous with the sub video frame to be restored (preferably, decoded main video frames located temporally before and after it), predicts a plurality of motion vectors for the prediction target block used to restore the sub video, and outputs the resulting plurality of motion vector candidates to the motion vector selection unit 53. The configuration and operation of the bidirectional motion vector prediction unit 51 can be the same as in the first embodiment.

As in the first embodiment, the side information decoding unit 52 decodes the side information transmitted as the side stream (the encoded side information output from the video encoding device 10) with a decoding process corresponding to the encoding process on the transmitting side, thereby obtains side information indicating the index of the optimal motion vector for the prediction target block, and outputs it to the motion vector selection unit 53. If the side information is not encoded, the side information decoding unit 52 is unnecessary, and the motion vector selection unit 53 can be configured to obtain the side information directly from the side stream.

As in the first embodiment, the motion vector selection unit 53 refers to the side information indicating the index of the optimal motion vector for each prediction target block, selects one motion vector according to this index from the plurality of motion vector candidates generated by the bidirectional motion vector prediction unit 51, and outputs it to the frame interpolation unit 54.
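On the receiving side, the selection reduces to a lookup into the locally regenerated candidate lists. The sketch below assumes one index per prediction target block, delivered in block order; the function name `select_motion_vectors` and the sample vectors are hypothetical.

```python
def select_motion_vectors(candidates_per_block, side_info):
    """Pick one motion vector per block using the transmitted indices.

    candidates_per_block[b] is the candidate list regenerated for block b;
    side_info[b] is the index received for that block.
    """
    if len(candidates_per_block) != len(side_info):
        raise ValueError("one index is required per prediction target block")
    selected = []
    for candidates, index in zip(candidates_per_block, side_info):
        if not 0 <= index < len(candidates):
            raise ValueError("index does not match the candidate list")
        selected.append(candidates[index])
    return selected


candidates_per_block = [
    [(0, 0), (2, -1)],
    [(1, 1), (0, 3), (-2, 0)],
]
print(select_motion_vectors(candidates_per_block, [1, 2]))
# prints [(2, -1), (-2, 0)]
```

The range check matters in practice: an index that falls outside the regenerated list indicates that the two sides derived different candidates, for example because of a transmission error.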

As in the first embodiment, the frame interpolation unit 54 uses the received main video and the selected motion vector to restore the frames of the interpolated video corresponding to the sub video, and outputs the restored sub video to the frame synthesis unit 55.

The frame synthesis unit 55 combines the decoded main video and the restored sub video frame by frame, following the order in which the frames were thinned out on the transmitting side, and outputs to the outside a restored video that reproduces the predetermined video, such as the high frame rate video.
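Frame synthesis can be sketched as the inverse of the thinning step. The example assumes uniform thinning in which each main video frame is followed by `keep_every - 1` restored sub video frames; the function name `merge_frames` and the parameter `keep_every` are hypothetical.

```python
from itertools import islice


def merge_frames(main, restored_sub, keep_every=2):
    """Interleave main frames and restored sub frames in display order."""
    if keep_every < 2:
        raise ValueError("keep_every must be at least 2")
    sub_iter = iter(restored_sub)
    merged = []
    for frame in main:
        merged.append(frame)
        # Take the sub frames that were thinned out after this main frame.
        merged.extend(islice(sub_iter, keep_every - 1))
    return merged


print(merge_frames(["f0", "f2", "f4"], ["f1", "f3", "f5"]))
# prints ['f0', 'f1', 'f2', 'f3', 'f4', 'f5']
```

If the thinning pattern is not uniform, the same idea applies, but the receiving side needs the pattern itself, either agreed in advance or delivered as side information.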

Accordingly, the video encoding device 10 and the video decoding device 50 of the second embodiment of the present invention provide the same advantages as the first embodiment. In addition, when frames of a predetermined video such as a high frame rate video are thinned out, encoded, and transmitted in the second embodiment, the high frame rate video is encoded by a single existing low frame rate codec, so coding efficiency can be improved compared with a configuration that uses existing low frame rate codecs in parallel.

The present invention has been described above with reference to specific examples, but it is not limited to the embodiments described and can be modified in various ways without departing from its technical idea. For example, the description used an example in which one frame in every two is thinned out, but the thinning frequency may be set freely. When two frames out of every three are thinned out, the bidirectional motion vector prediction units 12 and 51 likewise predict the motion vectors of the prediction target block from the decoded main video frames located temporally before and after it. In that case, as shown in FIG. 10, the search blocks for bidirectional motion prediction move asymmetrically rather than point-symmetrically, and an error map can be obtained in this way. Similarly, the motion vectors of the prediction target block may be predicted from two decoded main video frames located temporally forward of it. Accordingly, instead of bidirectional motion prediction, the motion vector candidates of a prediction target block can be determined by motion prediction that uses a plurality of main video frames (or decoded main video frames) temporally continuous with the frame of the prediction target block. In other words, the configuration is not limited to referring to main video frames "located temporally before and after" the sub video frame to be thinned out. It can refer to main video frames that are "temporally continuous" with that frame, predict by motion prediction a plurality of motion vectors for the prediction target block used to restore the sub video, and determine a plurality of motion vector candidates.
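The asymmetric movement of the search blocks can be illustrated under a linear-motion assumption, which is introduced here for illustration and is not stated in the text: if a block moves by a constant displacement per frame interval, the matching block in each reference frame is offset in proportion to that frame's temporal distance from the sub video frame. The function name `reference_offsets` and its parameters are hypothetical.

```python
def reference_offsets(vector, dist_prev, dist_next):
    """Return the block offsets in the preceding and following main frames.

    vector is the assumed (dx, dy) displacement per frame interval;
    dist_prev and dist_next are the numbers of frame intervals from the
    sub video frame to the preceding and following main video frames.
    """
    dx, dy = vector
    offset_prev = (-dx * dist_prev, -dy * dist_prev)
    offset_next = (dx * dist_next, dy * dist_next)
    return offset_prev, offset_next


# One frame in two thinned out: equal distances, point-symmetric offsets.
print(reference_offsets((2, -1), 1, 1))
# prints ((-2, 1), (2, -1))

# Two frames in three thinned out: unequal distances, asymmetric offsets.
print(reference_offsets((2, -1), 1, 2))
# prints ((-2, 1), (4, -2))
```

In the second call the sub video frame lies one interval after the preceding main frame and two intervals before the following one, so the two search blocks no longer mirror each other around the prediction target block.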

In the above examples the thinning interval is uniform, but it need not be. In that case, frames may be thinned out according to a pattern agreed in advance between the transmitting side and the receiving side, or the thinning procedure may be transmitted as side information.

In the above examples, the plurality of motion vector candidates is generated by bidirectional motion prediction between two main video frames (decoded main video frames), but it may instead be generated by bidirectional motion prediction among three or more main video frames (decoded main video frames).

In the above examples, the side information is transmitted as a side information stream, but it can also be implemented in a form adapted to an existing transmission system, for example by multiplexing it into the main stream that carries the image signal.

The functions of the components of the frame thinning device 1 and the frame interpolation device 5, and of the video encoding device 10 and the video decoding device 50, in each embodiment can be realized by a computer. A program for causing the computer to realize the components according to the present invention is stored in a memory (not shown) provided inside or outside the computer. Under the control of a central processing unit (CPU) or the like provided in the computer, the program describing the processing for each component's function is read from the memory and executed as appropriate, so that the functions of the components of each device are realized by the computer. The function of each component may also be realized partly in hardware.

According to the present invention, when a predetermined video is transmitted with frames thinned out, the amount of information in the main stream is reduced and transmission efficiency is improved, and judder is reduced by the bidirectional motion prediction and motion compensation according to the present invention. The invention is therefore useful for applications that transmit high frame rate video.

DESCRIPTION OF SYMBOLS
1 Frame thinning device
5 Frame interpolation device
10 Video encoding device
11 Frame division unit
12 Bidirectional motion vector prediction unit
13 Frame interpolation unit
14 Comparison unit
15 Side information encoding unit
16 Video encoding unit
17 Local decoding unit
31 Bidirectional motion prediction unit
32 Minimum point search unit
33 Threshold setting/processing unit
34 Labeling unit
35 Region-specific minimum point search unit
36 Motion vector storage unit
50 Video decoding device
51 Bidirectional motion vector prediction unit
52 Side information decoding unit
53 Motion vector selection unit
54 Frame interpolation unit
55 Frame synthesis unit
56 Video decoding unit

Claims

A frame thinning device that thins out a group of frames from a predetermined video and transmits the result, comprising:
frame dividing means for dividing the predetermined video into a main video consisting of a frame group to be transmitted to a receiving side and a sub video consisting of a frame group not to be transmitted;
motion vector prediction means for referring to a plurality of frames of the main video that are temporally continuous with respect to the sub video, predicting by motion prediction a plurality of motion vectors of a prediction target block for restoring the sub video, and determining a plurality of motion vector candidates;
frame interpolation means for generating an interpolated video for each of the plurality of motion vector candidates by motion compensation using the main video and the plurality of predicted motion vector candidates;
comparing means for comparing each interpolated video with the sub video and determining, among the plurality of motion vector candidates, the motion vector corresponding to the interpolated video with the smallest difference; and
output means for outputting an index of the determined motion vector to the outside as side information together with the main video.

The frame thinning device according to claim 1, wherein the motion vector prediction means performs motion prediction between input blocks referenced in the continuous main video frames and thereby determines, for the prediction target block in the block to be interpolated with reference to the input blocks, a plurality of motion vector candidates including a motion vector candidate corresponding to the minimum point at which the error is smallest and a motion vector candidate corresponding to a local minimum point, other than the minimum point, at which the error is minimal relative to the surrounding errors.

The frame thinning device according to claim 1 or 2, wherein the motion vector prediction means includes:
motion prediction means for generating an error map for the prediction target block by performing motion prediction between input blocks referenced in the continuous main video frames;
minimum point search means for searching for and extracting the minimum point of the entire error map and determining the extracted minimum point coordinates as a first motion vector candidate;
threshold setting/processing means for generating a new, threshold-processed error map for the prediction target block by applying threshold processing with a predetermined value to the error map;
region-specific minimum point search means for searching for and extracting a minimum point from each region, among the regions formed in the error map divided by the threshold processing, other than the region containing the minimum point coordinates, and determining the extracted minimum point coordinates as second and subsequent motion vector candidates; and
motion vector storage means for storing a plurality of motion vector candidates consisting of the first motion vector candidate and the second and subsequent motion vector candidates obtained by operating the threshold setting/processing means and the region-specific minimum point search means a specified number of times, the specified number being one or more.

A frame interpolation device that interpolates frames into a thinned video obtained by thinning out a group of frames from a predetermined video and restores the predetermined video, comprising:
input means for inputting a main video in which frames have been intermittently thinned out of the predetermined video, and side information indicating an index of a motion vector of a prediction target block used to restore a sub video corresponding to the thinned-out frame group;
motion vector prediction means for referring to a plurality of frames of the main video that are temporally continuous with respect to the sub video, predicting by motion prediction a plurality of motion vectors of the prediction target block for restoring the sub video, and determining a plurality of motion vector candidates;
motion vector selection means for referring to the side information and selecting, according to the index, one motion vector for the prediction target block having the plurality of motion vector candidates;
frame interpolation means for generating an interpolated video corresponding to the sub video using the main video and the selected motion vector; and
frame synthesis means for performing frame synthesis that interpolates the main video with the frames of the interpolated video, and restoring the predetermined video.

The frame interpolation device according to claim 4, wherein the motion vector prediction means performs motion prediction between input blocks referenced in the continuous main video frames and thereby determines, for the prediction target block in the block to be interpolated with reference to the input blocks, a plurality of motion vector candidates including a motion vector candidate corresponding to the minimum point at which the error is smallest and a motion vector candidate corresponding to a local minimum point, other than the minimum point, at which the error is minimal relative to the surrounding errors.

The frame interpolation device according to claim 4, wherein the motion vector prediction means includes:
motion prediction means for generating an error map for the prediction target block by performing motion prediction between input blocks referenced in the continuous main video frames;
minimum point search means for searching for and extracting the minimum point of the entire error map and determining the extracted minimum point coordinates as a first motion vector candidate;
threshold setting/processing means for generating a new, threshold-processed error map for the prediction target block by applying threshold processing with a predetermined value to the error map;
region-specific minimum point search means for searching for and extracting a minimum point from each region, among the regions formed in the error map divided by the threshold processing, other than the region containing the minimum point coordinates, and determining the extracted minimum point coordinates as second and subsequent motion vector candidates; and
motion vector storage means for storing a plurality of motion vector candidates consisting of the first motion vector candidate and the second and subsequent motion vector candidates obtained by operating the threshold setting/processing means and the region-specific minimum point search means a specified number of times, the specified number being one or more.

A video encoding device that thins out a group of frames from a predetermined video and encodes the result, comprising:
frame dividing means for dividing the predetermined video into a main video consisting of a frame group to be transmitted to a receiving side and a sub video consisting of a frame group not to be transmitted;
video encoding means for encoding the main video and outputting it to the outside;
local decoding means for generating a locally decoded main video from the encoded main video;
motion vector prediction means for referring to a plurality of frames of the locally decoded main video that are temporally continuous with respect to the sub video, predicting by motion prediction a plurality of motion vectors of a prediction target block for restoring the sub video, and determining a plurality of motion vector candidates;
frame interpolation means for generating an interpolated video for each of the plurality of motion vector candidates by motion compensation using the locally decoded main video and the plurality of predicted motion vector candidates;
comparing means for comparing each interpolated video with the sub video and determining, among the plurality of motion vector candidates, the motion vector corresponding to the interpolated video with the smallest difference; and
output means for outputting an index of the determined motion vector to the outside as side information.

A video decoding device that interpolates frames into a thinned video obtained by thinning out a group of frames from a predetermined video and encoding the result, and restores the predetermined video, comprising:
video decoding means for inputting, as a main video, a thinned video in which frames have been intermittently thinned out of the predetermined video and which has been encoded, performing a decoding process on the main video, and generating a decoded main video;
input means for inputting side information indicating an index of a motion vector of a prediction target block used to restore a sub video corresponding to the thinned-out frame group;
motion vector prediction means for referring to a plurality of frames of the main video that are temporally continuous with respect to the sub video, predicting by motion prediction a plurality of motion vectors of the prediction target block for restoring the sub video, and determining a plurality of motion vector candidates;
motion vector selection means for referring to the side information and selecting, according to the index, one motion vector for the prediction target block having the plurality of motion vector candidates;
frame interpolation means for generating an interpolated video corresponding to the sub video using the decoded main video and the selected motion vector; and
frame synthesis means for performing frame synthesis that interpolates the decoded main video with the frames of the interpolated video, and restoring the predetermined video.

A program for causing a computer to execute:
a step of dividing a predetermined video into a main video consisting of a frame group to be transmitted to a receiving side and a sub video consisting of a frame group not to be transmitted;
a step of referring to a plurality of frames of the main video that are temporally continuous with respect to the sub video, predicting by motion prediction a plurality of motion vectors of a prediction target block for restoring the sub video, and determining a plurality of motion vector candidates;
a step of generating an interpolated video for each of the plurality of motion vector candidates by motion compensation using the main video and the plurality of predicted motion vector candidates;
a step of comparing each interpolated video with the sub video and determining, among the plurality of motion vector candidates, the motion vector corresponding to the interpolated video with the smallest difference; and
a step of outputting an index of the determined motion vector to the outside as side information together with the main video.

A program for causing a computer to execute:
a step of inputting a main video in which frames have been intermittently thinned out of a predetermined video, and side information indicating an index of a motion vector of a prediction target block used to restore a sub video corresponding to the thinned-out frame group;
a step of referring to a plurality of frames of the main video that are temporally continuous with respect to the sub video, predicting by motion prediction a plurality of motion vectors of the prediction target block for restoring the sub video, and determining a plurality of motion vector candidates;
a step of referring to the side information and selecting, according to the index, one motion vector for the prediction target block having the plurality of motion vector candidates;
a step of generating an interpolated video corresponding to the sub video using the main video and the selected motion vector; and
a step of performing frame synthesis that interpolates the main video with the frames of the interpolated video, and restoring the predetermined video.

A program for causing a computer to execute:
a step of dividing a predetermined video into a main video consisting of a frame group to be transmitted to a receiving side and a sub video consisting of a frame group not to be transmitted;
a step of encoding the main video and outputting it to the outside;
a step of generating a locally decoded main video from the encoded main video;
a step of referring to a plurality of frames of the locally decoded main video that are temporally continuous with respect to the sub video, predicting by motion prediction a plurality of motion vectors of a prediction target block for restoring the sub video, and determining a plurality of motion vector candidates;
a step of generating an interpolated video for each of the plurality of motion vector candidates by motion compensation using the locally decoded main video and the plurality of predicted motion vector candidates;
a step of comparing each interpolated video with the sub video and determining, among the plurality of motion vector candidates, the motion vector corresponding to the interpolated video with the smallest difference; and
a step of outputting an index of the determined motion vector to the outside as side information.

A program for causing a computer to execute:
a step of inputting, as a main video, a thinned video in which frames have been intermittently thinned out of a predetermined video and which has been encoded, performing a decoding process on the main video, and generating a decoded main video;
a step of inputting side information indicating an index of a motion vector of a prediction target block used to restore a sub video corresponding to the thinned-out frame group;
a step of referring to a plurality of frames of the main video that are temporally continuous with respect to the sub video, predicting by motion prediction a plurality of motion vectors of the prediction target block for restoring the sub video, and determining a plurality of motion vector candidates;
a step of referring to the side information and selecting, according to the index, one motion vector for the prediction target block having the plurality of motion vector candidates;
a step of generating an interpolated video corresponding to the sub video using the decoded main video and the selected motion vector; and
a step of performing frame synthesis that interpolates the decoded main video with the frames of the interpolated video, and restoring the predetermined video.