JP4315521B2

JP4315521B2 - Video compression encoding method and apparatus, and video compression encoding / decoding system

Info

Publication number: JP4315521B2
Application number: JP14891999A
Authority: JP
Inventors: 修渡辺; 教敬岸田; 智弘上田
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1999-05-27
Filing date: 1999-05-27
Publication date: 2009-08-19
Anticipated expiration: 2019-05-27
Also published as: JP2000341685A

Description

【０００１】
【発明の属する技術分野】
本発明は、 MPEG(Moving Picture Experts Group) 方式に代表される動画像圧縮符号化方法及びその装置、並びに動画像圧縮符号化／復号化システムに関し、より詳しくは逆 3:2プルダウン処理時のタイムスタンプ作成の技術に関する。
【０００２】
【従来の技術】
逆 3:2プルダウン処理は、 3:2プルダウン処理されたNTSC信号から冗長なフィールドを間引く処理である。ところで、 3:2プルダウン処理とは、映画フィルム等の24フレームで１秒相当の動画像情報を30フレームで１秒相当のNTSCテレビジョン信号に変換する技術である。
【０００３】
図９は映画フィルムのフレームをNTSCテレビジョン信号に変換する 3:2プルダウン処理の手順を示す模式図である。図９(a) は原フィルムの 4/24(1/6)秒相当の４フレームを示しており、順にF0, F1, F2, F3フレームとする。この原フィルムの４フレームF0, F1, F2, F3をNTSCテレビジョン信号に直接変換した場合、図９(b) に示すようにそれぞれのフレームがトップフィールド（奇数フィールド）t0, t1, t2, t3とボトムフィールド（偶数フィールド）b0, b1, b2, b3とで構成される。即ち、原フィルムのフレームF0はNTSCテレビジョン信号のフィールドt0, b0に、原フィルムのフレームF1はNTSCテレビジョン信号のフィールドt1, b1に、原フィルムのフレームF2はNTSCテレビジョン信号のフィールドt2, b2に、原フィルムのフレームF3はNTSCテレビジョン信号のフィールドt3, b3にそれぞれ変換される。
【０００４】
そして、原フィルムのフレームF0, F1, F2, F3から得られたNTSCテレビジョン信号のフィールドt0, b0, t1, b1, t2, b2, t3, b3を、図９(c) に示されているように、フィールドt0とb2とを重複して使用することによりt0, b0, t0, b1, t1, b2, t2, b2, t3, b3の配列として10フィールド、即ちf0, f1, f2, f3, f4の５フレームに再構成する。従って、NTSCテレビジョン信号に変換後のフレームf0はフィールドt0, b0で、フレームf1はフィールドt0, b1で、フレームf2はフィールドt1, b2で、フレームf3はフィールドt2, b2で、フレームf4はフィールドt3, b3でそれぞれ構成される。
【０００５】
以上のようにして、24フレームで１秒相当の映画フィルムの動画像情報が30フレーム（60フィールド）で１秒相当のNTSCテレビジョン信号に変換される。ところで、このようにして得られたテレビジョン信号をMPEG-2方式で動画像圧縮符号化する場合には、連続する10フィールドに２フィールド含まれる冗長なフィールドを間引く処理、即ち逆 3:2プルダウン処理を行なう必要がある。
【０００６】
図10は、従来のMPEG-2方式による動画像圧縮符号化装置の構成例を示すブロック図である。図10において、参照符号１及び２は第１及び第２のフィールドメモリを示している。第１のフィールドメモリ１は入力されたデジタルの映像信号を１フィールド分遅延させる。参照符号４は第１のフィールドメモリからの出力を第２のフィールドメモリ２へ入力するか否かを切替えるスイッチを示しており、この切替えスイッチ４の出力、換言すれば第１のフィールドメモリ１の出力を第２のフィールドメモリ２が１フィールド分遅延させる。従って、第２のフィールドメモリ２の出力は第１のフィールドメモリ１への入力とは２フィールドの遅延が生じる。
【０００７】
参照符号３は第１のフィールドメモリ１への入力と第２のフィールドメモリ２からの出力（第１のフィールドメモリ１への入力とは２フィールド遅延している）とを比較して１フィールド分の差分が所定値よりも小さい場合、より具体的には実質的に同一である場合に切替えスイッチ４の出力を無信号に、それ以外の場合に切替えスイッチ４の出力を第１のフィールドメモリ１の出力にして第２のフィールドメモリ２へ入力されるように切替える相関検出回路を示している。
【０００８】
参照符号５は映像信号を圧縮処理するビデオエンコーダを、100 はビデオエンコーダ５の出力データを解析してプレゼンテーションタイムスタンプ（以下、PTS という) を計算する PTS解析回路を、101 はビデオエンコーダ５から出力された圧縮データを PTS解析回路100 が PTSの解析を行なっている間、保持しておくバッファメモリを、９はバッファメモリ101 が保持している圧縮データをパケット化し、それに PTS解析回路100 が算出したPTS を付加して出力する PTS付加回路である。
【０００９】
なお、第１のフィールドメモリ１、第２のフィールドメモリ２、相関検出回路３、切替えスイッチ４により逆 3:2プルダウン処理部50が構成される。この逆 3:2プルダウン処理部50は、入力されるビデオデータの各フィールドを２フィールド前のフィールドと比較し、両者の差分が所定値よりも小さい場合、即ち相関が大きい場合に同一のフィールドであると判定して切替えスイッチ４を無信号側に切替えることにより、冗長なフィールドを除去する逆 3:2プルダウン処理を行なう。
【００１０】
ビデオエンコーダ５には切替えスイッチ４の出力信号が入力されており、切替えスイッチ４が無信号側に切替えられた時点で、ビデオエンコーダ５はこれから入力されるフレームの第１フィールドを復号時にもう一度繰り返して出力するためのRFF(Repeat First Field) フラグを有効にした上でMPEG-2ビデオエレメンタリー圧縮符号化処理を行なう。
【００１１】
MPEG-2のビデオエレメンタリー圧縮符号化処理では動画像の相関を利用した圧縮符号化処理を行なうため、符号化後のフレームの配列（以下、ピクチャ並びという）は原信号のピクチャ並びとは異なる順序となる。その状態を図11の模式図に、ＩまたはＰピクチャが現れる周期が３である場合の代表的な例を示す。なお、Ｉピクチャ(Intra coded picture) はフレーム内符号化ピクチャと称され、１フレームのみで独立して符号化される。また、Ｐピクチャ(Predictive coded picture)は前方向予測符号化ピクチャと称され、Ｉピクチャから前方向予測により符号化される。なおこの他に、前後のＩ及びＰピクチャを参照画像として予測符号化される双方向予測符号化ピクチャと称されるＢピクチャ(Bidirectionally predictive coded picture)も存在する。
【００１２】
図11(a) に示されている 3:2プルダウン処理済NTSC信号、図11(b) に示されている逆 3:2プルダウン処理及び図11(f) に示されている再生画像において、t0，t1…はフレーム０，１…のトップフィールドを、b0，b1…はフレーム０、１…のボトムフィールドをそれぞれ表わしている。また、(c) に示されているビデオ符号化処理及び(e) に示されている復号化処理において、たとえばB0，I2，P5はピクチャタイプ（Ｂ、Ｉ、Ｐ) とフレーム番号（０、２、５）とをそれぞれ示している。
【００１３】
図11(a) に示されている 3:2プルダウン処理済のNTSC信号には４フィールドの周期で１フィールド（t0, b2, t4, b6, t8, b10)が重複して付加されている。この信号をデジタル変換した信号が図10に示されているビデオデータ入力であり、重複している冗長なフィールドが逆 3:2プルダウン処理部50によって除去され、ビデオエンコーダ５によりビデオ符号化処理が行なわれる。
【００１４】
ビデオエンコーダ５によるビデオ符号化処理では、Ｂピクチャはその前後のＩ及びＰピクチャを参照画像として予測符号化されるため、参照すべきＩ及びＰピクチャが先に符号化され、その後にＢピクチャが符号化される。このため、詳細は後述するが、Ｉ，ＰピクチャとＢピクチャの順序が入れ替わる。
【００１５】
PTS解析回路100 は、ビデオエンコーダ５から出力されるエンコードデータを逐次解析してピクチャタイプ（Ｉ，ＰまたはＢ）、 RFFフラグ、フレーム周期を抽出し、これらに基づいて PTSを算出する。
【００１６】
しかし、Ｉ，Ｐピクチャに関しては、それよりも後ろのＢピクチャの RFFフラグの状態が判明しなければ PTSを算出することができない。たとえば、図11(c) に示されているI2ピクチャのPTS を算出するためには、それに先行するB0，B1ピクチャの RFFフラグを確認し、 RFFフラグが有効”１”であれば、１フィールド分のPTS を加算しなければならない。なお、図11に示されている例では、I2ピクチャのPTS は、図11(d) に示されているようにB0ピクチャの RFF フラグが有効であるため、フィールド数換算で５フィールドとなる。
【００１７】
このように、Ｉ，ＰピクチャのPTS は次のＩ，Ｐピクチャの直前のＢピクチャの情報がわからなければ確定しないため、それまでの間、データを保持して遅延出力するバッファメモリ101 が必要となる。
【００１８】
従って、バッファメモリ101 は、 PTS解析回路100 がPTS を確定した時点でデータが読み出されるように制御される。バッファメモリ101 から読み出されたデータは PTS付加回路９でパケット化され、 PTS解析回路100 で算出されているPTS が付加されて出力される。
【００１９】
また、コンピュータなどの汎用計算機と組み合わせて動画像圧縮符号化装置を構成する場合の一例としては、エレメンタリーストリームを作成するビデオエンコーダ５までを、即ち逆 3:2プルダウン処理部50とビデオエンコーダ５とを専用ハードウェアで構成し、それ以降の PTS解析回路100 及び PTS付加回路９の機能をソフトウェア処理により行なう構成が可能である。
【００２０】
しかしこのような場合にも、同様の処理手順が必要であり、 PTS解析回路100 に対応するソフトウェアでエンコード結果を逐次解析しつつ PTS付加回路９に対応するソフトウェアでエンコードデータをパケット化すると共に PTS解析回路100 に対応するソフトウェアで算出したPTS を付加してビデオPES(Packetized Elementary Stream) を出力する。
【００２１】
その後、図示していないオーディオエンコーダでエンコードしてPTS を付加したオーディオPES と、タイムスタンプの基準参照値となるSCR(System Clock Reference) と、前述のビデオPES とを図示していない多重化器で多重化してシステムストリームが作成される。このようにして作成されたシステムストリームは、復号化装置により、ビデオとオーディオのPTS 及びSCR を用いて映像と音声の同期を取りながら再生される。
【００２２】
【発明が解決しようとする課題】
従来の動画像圧縮符号化装置は、以上のように構成されていたので、あるピクチャデータのタイムスタンプ情報を確定する際に、そのピクチャデータよりも後から出力されるピクチャデータの情報が必要であるため、後から出力されるピクチャデータまでの全データを保持しておくためのバッファメモリが必要になる。しかも、そのバッファメモリの容量は最大符号化レートを想定し、最大のＩ，Ｐピクチャ間隔を想定して設定されるため、大容量が必要であるという問題点があった。
【００２３】
また一方では、実際に使用する符号化レート、実際のＩ，Ｐピクチャ間隔が通常は想定した最大値以下になるため、メモリの一部が冗長となって有効に使えないという問題点もあった。
【００２４】
更に、パーソナルコンピュータ等の汎用コンピュータに逆 3:2プルダウン処理部50とビデオエンコーダ５とを専用ハードウェアで構成したボードを接続し、それ以降の PTS解析回路100 及び PTS付加回路９の機能をソフトウェア処理により行なう構成を採る場合においても、コンピュータの内部メモリにバッファメモリ101 に対応するメモリ容量が要求される。
【００２５】
本発明はこのような事情に鑑みてなされたものであり、大容量で冗長なバッファメモリを設ける必要の無い動画像圧縮符号化方法及びその装置、並びに動画像圧縮符号化／復号化システムの提供を目的とする。
【００２６】
【課題を解決するための手段】
本発明の請求項１に記載した動画像圧縮符号化方法は、原動画像の各フレームを１フレームが２フィールドで構成される動画像に変換し、その動画像の連続する所定数のフレームを１単位として各１単位のフレーム中の特定のフィールドを重複させることにより予め時間軸を調整してなる画像信号中の重複したフィールドを検出して一方の冗長なフィールドを削除するステップと、フィールドが削除されたことを示すフィールド削除情報を出力するステップと、冗長なフィールドが削除された後の画像信号を圧縮符号化するステップと、前記画像信号がフレーム単位で圧縮符号化される時点のフィールド数にフィールド削除回数を加えた累積フィールド数を計数すると共に、そのフレームが圧縮符号化された後の他のフレームに対する配列順序を規定する画像データのタイプを特定するピクチャタイプ情報を出力するステップと、計数された累積フィールド数とピクチャタイプ情報とを記憶するステップと、出力されたピクチャタイプ情報に基づいて、ピクチャタイプ情報別に記憶してある累積フィールド数の内の圧縮符号化されたピクチャタイプ情報に対応する累積フィールド数の中で最も古い累積フィールド数を順次選択するステップと、選択した累積フィールド数を基準周波数のクロック数に換算することで時間情報を計数するステップと、選択された累積フィールド数を削除するステップと、計数された時間情報を圧縮符号化された画像信号のフレーム単位に付加するステップとを含むことを特徴とする。
【００２７】
このような請求項１に記載した本発明の動画像圧縮符号化方法では、画像データを圧縮符号化する時点で計数した時間情報と圧縮符号化後のピクチャタイプ情報を保持し、圧縮符号化処理された画像データのフレームのピクチャタイプ情報と同一のピクチャタイプ情報に対応する時間情報が選択されて時刻情報が確定される。従って、ある画像データのフレームの時間情報を計算するために、それ以降のフレームのフィールド削除情報である RFFフラグを検出した後に時間情報を計算するというような複雑な処理を行なわずとも済む。
【００２８】
本発明の請求項２に記載した動画像圧縮符号化方法は、原動画像の各フレームを１フレームが２フィールドで構成される動画像に変換し、その動画像の連続する所定数のフレームを１単位として各１単位のフレーム中の特定のフィールドを重複させることにより予め時間軸を調整してなる画像信号中の重複したフィールドを検出して一方の冗長なフィールドを削除するステップと、フィールドが削除されたことを示すフィールド削除情報を出力するステップと、冗長なフィールドが削除された後の画像信号を圧縮符号化するステップと、時間情報を表すタイムスタンプの基準となるクロックを生成するステップと、生成したクロックを計数することにより、画像データをフレーム単位で圧縮符号化する時点のタイムスタンプを算出すると共に、そのフレームが圧縮符号化された後の他のフレームに対する配列順序を規定する画像データのタイプを特定するピクチャタイプ情報を出力するステップと、算出したタイムスタンプとピクチャタイプ情報とを記憶し、圧縮符号化された後の画像データのピクチャタイプ情報に対応するタイムスタンプを記憶してあるタイムスタンプの内の、圧縮符号化された後の画像データのピクチャタイプ情報と同じピクチャタイプ情報の中で最も古いタイムスタンプから順次選択するステップと、圧縮符号化後の各フレームのデータに選択したタイムスタンプを付加するステップとを含むことを特徴とする。
【００２９】
本発明の請求項３に記載した動画像圧縮符号化装置は、原動画像の各フレームを１フレームが２フィールドで構成される動画像に変換し、その動画像の連続する所定数のフレームを１単位として各１単位のフレーム中の特定のフィールドを重複させることにより予め時間軸を調整してなる画像信号中の重複したフィールドを検出して一方の冗長なフィールドを削除し、フィールドが削除されたことを示すフィールド削除情報及び冗長なフィールドが削除された後の画像信号を出力するフィールド間引き手段と、該フィールド間引き手段から出力される画像信号及びフィールド削除情報を入力し、画像信号を圧縮符号化すると共に、圧縮処理したフィールド数及びフィールド削除情報を出力する圧縮符号化手段と、該圧縮符号化手段へ入力される画像信号の圧縮符号化された後の各フレームの他のフレームに対する配列順序を規定する画像データのタイプを特定するピクチャタイプ情報を出力すると共に、前記フィールド間引き手段への入力時点での累積フィールド数を算出するフィールド数算出手段と、該フィールド数算出手段が算出した累積フィールド数とピクチャタイプ情報とを記憶し、前記圧縮符号化手段から出力されるピクチャタイプ情報に対応する累積フィールド数を記憶してある累積フィールド数の内の、前記圧縮符号化手段が出力するピクチャタイプ情報と同じピクチャタイプ情報の中で最も小さい累積フィールド数から順次選択するフィールド数選択手段と、前記累積フィールド数を時間を表わすタイムスタンプに換算するタイムスタンプ換算手段と、圧縮符号化後の各フレームのデータに前記タイムスタンプ換算手段が換算したタイムスタンプを付加する時間情報付加手段とを備えることを特徴とする。
【００３０】
このような請求項３に記載の動画像圧縮符号化装置では、圧縮符号化手段へ入力される時点でカウントされた累積フィールド数と圧縮符号化後のピクチャタイプ情報とを保持しておき、圧縮符号化処理手段から出力されるピクチャタイプ情報と同一のピクチャタイプ情報に対応する累積フィールド数を選択するという単純な換算で時間情報が算出される。従って、ある画像データのフレームの時間情報を計算するために、それ以降のフレームのフィールド削除情報である RFFフラグを検出した後に時間情報を計算するというような複雑な処理を行なわずとも済む。
【００３１】
本発明の請求項４に記載した動画像圧縮符号化装置は、原動画像の各フレームを１フレームが２フィールドで構成される動画像に変換し、その動画像の連続する所定数のフレームを１単位として各１単位のフレーム中の特定のフィールドを重複させることにより予め時間軸を調整してなる画像信号中の重複したフィールドを検出して一方の冗長なフィールドを削除し、フィールドが削除されたことを示すフィールド削除情報及び冗長なフィールドが削除された後の画像信号を出力するフィールド間引き手段と、該フィールド間引き手段から出力される画像信号及びフィールド削除情報を入力し、画像信号を圧縮符号化すると共に、圧縮処理したフィールド数及びフィールド削除情報を出力する圧縮符号化手段と、時間情報を表わすタイムスタンプの基準となるクロックを生成するクロック生成手段と、前記クロック生成手段が生成するクロックを計数することにより、画像データがフレーム単位で前記圧縮符号化手段へ入力される時点のタイムスタンプを算出すると共に、そのフレームが圧縮符号化された後の他のフレームに対する配列順序を規定する画像データのタイプを特定するピクチャタイプ情報を出力するタイムスタンプ算出手段と、該タイムスタンプ算出手段が算出したタイムスタンプとピクチャタイプ情報とを記憶し、前記圧縮符号化手段から出力されるピクチャタイプ情報に対応するタイムスタンプを記憶してあるタイムスタンプの内の、前記圧縮符号化手段から出力されるピクチャタイプ情報と同じピクチャタイプ情報のタイムスタンプの中で最も古いタイムスタンプから順次選択するタイムスタンプ選択手段と、圧縮符号化後の各フレームのデータに前記タイムスタンプ選択手段が選択したタイムスタンプを付加するタイムスタンプ付加手段とを備えることを特徴とする。
【００３３】
このような請求項４に記載の動画像圧縮符号化装置では、圧縮符号化手段へ入力される時点でカウントされたタイムスタンプと圧縮符号化後のピクチャタイプ情報とを保持しておき、圧縮符号化処理手段から出力されるピクチャタイプ情報と同一のピクチャタイプ情報に対応するタイムスタンプを選択することによりタイムスタンプが確定される。従って、ある画像データのフレームのタイムスタンプを計算するために、それ以降のフレームのフィールド削除情報である RFFフラグを検出した後にタイムスタンプを計算するというような複雑な処理を行なわずとも済む。
【００３４】
本発明の請求項５に記載した動画像圧縮符号化／復号化システムは、原動画像の各フレームを１フレームが２フィールドで構成される動画像に変換し、その動画像の連続する所定数のフレームを１単位として各１単位のフレーム中の特定のフィールドを重複させることにより予め時間軸を調整してなる画像信号中の重複したフィールドを検出して一方の冗長なフィールドを削除し、フィールドが削除されたことを示すフィールド削除情報及び冗長なフィールドが削除された後の画像信号を出力するフィールド間引き手段と、該フィールド間引き手段から出力される画像信号及びフィールド削除情報を入力し、画像信号をフレーム単位で独立して圧縮符号化されるフレーム内符号化ピクチャであるＩピクチャと、該Ｉピクチャとのフレーム間の相関特性を利用して圧縮符号化される前方向予測符号化ピクチャであるＰピクチャと、前記Ｉ及びＰピクチャとのフレーム間の相関特性を利用して圧縮符号化される双方向予測符号化ピクチャであるＢピクチャとに圧縮符号化すると共に、圧縮処理したフィールド数及びフィールド削除情報を出力する圧縮符号化手段と、該圧縮符号化手段へ入力される画像信号の符号化後のＩピクチャまたはＰピクチャの繰返し周期を取得すると共に、そのフレームが圧縮符号化された後の他のフレームに対する配列順序を規定する画像データのタイプを特定するピクチャタイプ情報とフィールド削除情報とを抽出する情報抽出手段と、該情報抽出手段で取得、抽出したＩピクチャまたはＰピクチャの繰返し周期とピクチャタイプ情報とから、当該ピクチャタイプがＩピクチャまたはＰピクチャである場合は、１つ前のＩピクチャまたはＰピクチャの累積フィールド数に繰返し周期分のフィールド数と１つ前のＩピクチャまたはＰピクチャの次のＢピクチャから当該ピクチャまでのフィールド削除回数とを加算して当該ピクチャの累積フィールド数とし、当該ピクチャタイプがＢピクチャでありかつ１つ前のピクチャタイプがＩピクチャまたはＰピクチャである場合は、２つ前のＩピクチャまたはＰピクチャの累積フィールド数に”２”と２つ前のＩピクチャまたはＰピクチャの次のＢピクチャから当該ピクチャまでのＢピクチャフィールド削除回数とを加算して当該ピクチャの累積フィールド数とし、当該ピクチャタイプがＢピクチャでありかつ１つ前のピクチャタイプがＢピクチャである場合は、１つ前のＢピクチャの累積フィールド数に”２”と当該ピクチャのフィールド削除回数とを加算して当該ピクチャの累積フィールド数とするフィールド数予測手段と、累積フィールド数を画像のタイムスタンプに換算するタイムスタンプ換算手段と、前記圧縮符号化手段により圧縮符号化された後のデータに画像のタイムスタンプを付加して画像パケットデータを出力する画像タイムスタンプ付加手段と、音声データに音声のタイムスタンプを付加して音声パケットデータを出力する音声タイムスタンプ付加手段と、前記画像タイムスタンプ付加手段から出力される画像パケットデータと前記音声タイムスタンプ付加手段から出力される音声パケットデータと基準となるタイムスタンプとを多重化して出力する多重化手段と、前記多重化手段で多重化されたデータを画像パケットデータと音声パケットデータに分離するデータ分離手段と、音声パケットデータから音声のタイムスタンプを抽出して音声の同期合わせ処理を行なう音声同期合わせ処理手段と、音声パケットデータを復号して出力する音声復号手段と、画像パケットデータから抽出した画像のタイムスタンプと基準のタイムスタンプとの差分に時間軸のフィルタ処理を行なう時間軸フィルタ手段と、該時間軸フィルタ手段から出力される値で画像の同期合わせ処理を行なう画像同期合わせ処理手段と、画像パケットデータを復号して出力する画像復号手段とを備えたことを特徴とする。
【００３５】
このような請求項５に記載の動画像圧縮符号化／復号化システムでは、符号化時にＩ，Ｐピクチャのタイムスタンプが計算される。従って、それ以降に現れるＢピクチャの RFFフラグを検出した後にタイムスタンプを計算するといった複雑な処理を行なわずとも済む。また、復号化時には安定な同期合わせ処理が行なわれる。
【００３７】
本発明の請求項６に記載した動画像圧縮符号化／復号化システムは、請求項５の動画像圧縮符号化／復号化システムにおいて、前記時間軸フィルタ手段は、画像パケットデータから抽出した画像のタイムスタンプと基準のタイムスタンプとの差分を所定数のピクチャ分加算し、加算結果を前記所定数で除算した商を差分として時間軸のフィルタ処理を行なうべくなしてあり、前記所定数を、フィールド削除情報の繰り返しピクチャ周期と、ＩピクチャまたはＰピクチャの繰り返しピクチャ周期との公倍数とすることを特徴とする。
【００３８】
本発明の請求項７に記載した動画像圧縮符号化／復号化システムは、請求項５の動画像圧縮符号化／復号化システムにおいて、前記時間軸フィルタ手段は、画像パケットデータから抽出した画像のタイムスタンプと基準のタイムスタンプとの差分を所定数のピクチャ分加算し、加算結果を前記所定数で除算した商を差分として時間軸のフィルタ処理を行なうべくなしてあり、前記所定数を、フィールド削除情報の繰り返しピクチャ周期と、ＩピクチャまたはＰピクチャの繰り返しピクチャ周期との最小公倍数とすることを特徴とする。
【００３９】
このような請求項５、６及び７に記載の本発明の動画像圧縮符号化／復号化システムでは、応答の速い同期合わせ処理が行なわれる。
【００４０】
【発明の実施の形態】
以下、本発明に係る動画像圧縮符号化装置及び動画像圧縮符号化／復号化システムを、それぞれの実施の形態を示す図面に基づき具体的に説明する。なお、以下の説明において参照する各図中と、従来例の説明において参照した各図中とで同一符号で示されている部分は同一または相当部分を示す。
【００４１】
実施の形態１．
図１は本発明に係る動画像圧縮符号化装置の構成例を示すブロック図である。図１において、参照符号１及び２は第１及び第２のフィールドメモリを示している。第１のフィールドメモリ１は入力されたデジタルの映像信号を１フィールド分遅延させる。参照符号４は第１のフィールドメモリからの出力を第２のフィールドメモリ２へ入力するか否かを切替えるスイッチを示しており、この切替えスイッチ４の出力、換言すれば第１のフィールドメモリ１の出力を第２のフィールドメモリ２が１フィールド分遅延させる。従って、第２のフィールドメモリ２の出力は第１のフィールドメモリ１への入力とは２フィールドの遅延が生じる。
【００４２】
参照符号３は第１のフィールドメモリ１への入力と第２のフィールドメモリ２からの出力（第１のフィールドメモリ１への入力とは２フィールド遅延している）とを比較して１フィールド分の差分が所定値よりも小さい場合、より具体的には実質的に同一である場合に切替えスイッチ４の出力を無信号に、それ以外の場合に切替えスイッチ４の出力を第１のフィールドメモリ１の出力にして第２のフィールドメモリ２へ入力されるように切替える相関検出回路を示している。
【００４３】
参照符号５は映像信号を圧縮処理するビデオエンコーダを、６はビデオエンコーダ５が出力する圧縮データのフィールド数をカウントするフィールド数カウント回路を、７はフィールド数入替回路を、８はフィールド数入替回路７による処理結果をPTS に換算する PTS換算回路を、９はビデオエンコーダ５が出力する圧縮データに PTS換算回路８が換算したPTS を付加して出力する PTS付加回路をそれぞれ示している。
【００４４】
次に、図１に示されている本発明に係る動画像圧縮符号化装置の動作について説明する。なお、第１のフィールドメモリ１、第２のフィールドメモリ２、相関検出回路３、切替えスイッチ４により逆 3:2プルダウン処理部50が構成される。この逆 3:2プルダウン処理部50は構成及び動作共に従来と同様であり、入力されるビデオデータの各フィールドを２フィールド前のフィールドと比較し、両者の差分が所定値よりも小さい場合、即ち相関が大きい場合に同一のフィールドであると判定して切替えスイッチ４を無信号側に切替えることにより、冗長なフィールドを除去する逆 3:2プルダウン処理を行なう。
【００４５】
ビデオエンコーダ５には切替えスイッチ４の出力信号が入力されており、切替えスイッチ４が無信号側に切替えられた時点で、ビデオエンコーダ５はこれから入力されるフレームの第１フィールドを復号時にもう一度繰り返して出力するためのフィールド削除情報であるRFF(Repeat First Field) フラグを有効にした上でMPEG-2ビデオエレメンタリー圧縮符号化処理を行ない、その結果として生成されるビデオエレメンタリストリームをパケット化しつつ PTS付加回路９へ出力する。
【００４６】
また、ビデオエンコーダ５は、フレーム単位でビデオデータが入力される都度、入力データの符号化後のピクチャタイプ（Ｉ，Ｐ，Ｂ）及び RFFフラグの有無をフィールド数カウント回路６へ出力する。なお、ピクチャタイプはフィールド数入替回路７へも出力される。
【００４７】
フィールド数カウント回路６は図２のフローチャートに示すような処理シーケンスを実行する。まず、フィールド数カウント回路６は累積フィールド数Ｆを初期化し (ステップS11)、次にビデオエンコーダ５へフレーム単位でビデオデータが入力される都度 (ステップS12)、入力データの符号化後のピクチャタイプ（Ｉ，Ｐ，Ｂ）と RFFフラグの有無とをビデオエンコーダ５から取得し (ステップS13)、既にカウントしている現ピクチャの累積フィールド数とピクチャタイプとをフィールド数入替回路７へ出力する (ステップS13)。
【００４８】
そして、フィールド数カウント回路６は、 RFFフラグが有効である場合は (ステップS14 で”YES") 、そのピクチャのフィールド数を”３”としてカウントし (ステップS15)、無効である場合は (ステップS14 で”NO")、”２”としてカウントし (ステップS16)、現ピクチャの累積フィールド数に加算することにより、次に取得するピクチャの累積フィールド数として保持する (ステップS17)。
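The counting sequence of Fig. 2 described above can be sketched in a few lines of Python. This is only an illustrative model, not the circuit itself: the function name `count_fields` and the representation of each picture as a `(picture_type, rff)` tuple are assumptions made for the example. For every frame entering the encoder it first reports the cumulative field count held so far, then adds 3 fields when the RFF flag is set and 2 otherwise.

```python
def count_fields(pictures):
    """Model of the field count circuit 6 (Fig. 2), under the assumptions above.

    pictures: list of (picture_type, rff) tuples in encoder INPUT order,
              where rff is True when the Repeat First Field flag is set.
    Returns a list of (picture_type, cumulative_fields) pairs, where
    cumulative_fields is the count held BEFORE the picture's own fields
    are added (step S13), i.e. the picture's start position in fields.
    """
    cumulative = 0                        # step S11: initialise F
    result = []
    for picture_type, rff in pictures:    # step S12: one frame arrives
        result.append((picture_type, cumulative))  # step S13: output current F
        cumulative += 3 if rff else 2     # steps S14 to S17: add 3 or 2 fields
    return result


if __name__ == "__main__":
    # Hypothetical input: RFF set on every other frame, M = 3.
    frames = [("B", True), ("B", False), ("I", True),
              ("B", False), ("B", True), ("P", False)]
    for picture_type, fields in count_fields(frames):
        print(picture_type, fields)
```

With this hypothetical input the I picture receives a cumulative count of 5 fields, which agrees with the value the prior-art discussion gives for the I2 picture when the RFF flag of B0 is set.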
【００４９】
フィールド数入替回路７には、Ｉ，Ｐピクチャの最大周期分とビデオエンコーダ５でのＩ，Ｐピクチャの最大遅延時間に相当するピクチャ数分のピクチャタイプと累積フィールド数とを格納するレジスタが構成されている。たとえば、Ｉ，Ｐピクチャの最大周期が３であり、Ｉ，Ｐピクチャのエンコード最大遅延時間が0.4 フレームである場合には、両者の加算値”3.4 ”が切り上げられて４ピクチャ分のレジスタがフィールド数入替回路７に構成される。そして、これらのレジスタに、フィールド数カウント回路６から出力されるピクチャの累積フィールド数とピクチャタイプとが順次格納される。
【００５０】
フィールド数入替回路７は図３のフローチャートに示すような処理シーケンスを実行する。フィールド数入替回路７は、ビデオエンコーダ５へフレーム単位でビデオデータが入力される都度 (ステップS21)、ビデオエンコーダ５から出力されるピクチャタイプを取得すると (ステップS22)、そのピクチャが出力されるタイミングにおいて、レジスタに格納されている同一のピクチャタイプで最も古いピクチャに対応する累積フィールド数を出力する (ステップS23)。なお、累積フィールド数を出力したレジスタはクリアされ (ステップS24)、新たな情報を格納するための準備が行なわれる。
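The selection rule of the field count swap circuit 7 (Fig. 3) amounts to "for the picture type the encoder just emitted, hand out the oldest stored count of that type, then free the register". A minimal sketch follows. The class name `FieldCountReorderer` and its methods are hypothetical, and the fixed register depth (four pictures in the example given in the text) is deliberately not modelled; one FIFO per picture type is used instead.

```python
from collections import deque


class FieldCountReorderer:
    """Sketch of the field count swap circuit 7 (assumed software model)."""

    def __init__(self):
        # One first-in-first-out queue per picture type.
        self._queues = {"I": deque(), "P": deque(), "B": deque()}

    def store(self, picture_type, cumulative_fields):
        """Register the count reported when a frame ENTERS the encoder."""
        self._queues[picture_type].append(cumulative_fields)

    def select(self, picture_type):
        """Return and clear the oldest count of this type (steps S23, S24).

        Called when the encoder OUTPUTS a picture of the given type.
        Raises IndexError if no count of that type has been stored.
        """
        return self._queues[picture_type].popleft()


if __name__ == "__main__":
    reorderer = FieldCountReorderer()
    # Input order B0, B1, I2 with hypothetical cumulative counts 0, 3, 5.
    for picture_type, fields in [("B", 0), ("B", 3), ("I", 5)]:
        reorderer.store(picture_type, fields)
    # The encoder emits I2 first, then B0 and B1.
    print([reorderer.select(t) for t in ("I", "B", "B")])
```

Because B pictures keep their relative order through encoding, and so do I and P pictures, a per-type FIFO is enough to restore the right count for each output picture without inspecting later RFF flags.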
【００５１】
PTS換算回路８は、フィールド数入替回路７から出力された累積フィールド数が入力されると、下記の式(1) からPTS を算出して PTS付加回路９へ出力する。 PTSは90kHz のカウント数であり、映像はNTSC信号とし、フレーム周波数は29.97Hz である。
【００５２】
PTS ＝累積フィールド数×１／29.97 ×1/2 ×90000 …(1)
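Equation (1) converts a cumulative field count into 90 kHz clock ticks. The sketch below evaluates it with exact rational arithmetic. Two points are assumptions rather than statements from the text: the frame rate is taken as the exact NTSC value 30000/1001 (the text writes 29.97 Hz), which gives 1501.5 ticks per field, and fractional ticks are truncated, since the text does not say how rounding is handled. The function name `fields_to_pts` is hypothetical.

```python
from fractions import Fraction

PTS_CLOCK_HZ = 90000                      # PTS unit stated in the text
NTSC_FRAME_RATE = Fraction(30000, 1001)   # assumption: exact form of 29.97 Hz


def fields_to_pts(cumulative_fields, frame_rate=NTSC_FRAME_RATE):
    """Evaluate equation (1): PTS = fields x (1/frame_rate) x (1/2) x 90000.

    Returns an integer tick count; truncation is an assumed rounding policy.
    """
    ticks = Fraction(cumulative_fields) * PTS_CLOCK_HZ / (2 * frame_rate)
    return int(ticks)


if __name__ == "__main__":
    for fields in (0, 2, 3, 5):
        print(fields, fields_to_pts(fields))
```

One frame of two fields therefore advances the PTS by 3003 ticks under these assumptions.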
【００５３】
PTS付加回路９は、ビデオエンコーダ５から出力されるビデオエレメンタリストリームをパケット化しつつ、 PTS換算回路８から出力されるPTS を付加することによりビデオPES(Packetized Elementary Stream) を出力する。
【００５４】
実施の形態２．
ところで、上述した実施の形態１では本発明に係る動画像圧縮符号化装置は PTSを算出するための構成として、図１に示すように、フィールド数カウント回路６とフィールド数入替回路７と PTS換算回路８とを備えているが、他の構成を採ることも勿論可能である。そのような構成を有する本発明に係る動画像圧縮符号化装置を実施の形態２として以下に説明する。
【００５５】
図４は PTSを算出するための構成として上述の実施の形態１とは異なる構成を有する本発明に係る動画像圧縮符号化装置の構成例を示すブロック図である。なお、上述の実施の形態１においては、フィールド数カウント回路６とフィールド数入替回路７と PTS換算回路８とで PTSを算出する構成としているが、この実施の形態２においては、 90kHz分周器10と PTSカウント回路11と PTS入替回路12とで PTSを算出する構成としている。
【００５６】
90kHz分周器10はシステムの基準クロックである27MHz のクロックを分周することによりPTS の基準となる90kHz のクロック信号を発生して PTSカウント回路11へ入力する。
【００５７】
PTSカウント回路11では、ビデオエンコーダ５から出力されるエンコード開始情報をトリガとして、 90kHz分周器10から入力されているクロック信号のカウントをスタートする。そして、 PTSカウント回路11は、ビデオエンコーダ５に映像データが入力された時点でフレーム毎に入力データが圧縮符号化された後のピクチャタイプを取得し、取得した時点のカウント値をそのピクチャのPTS としてピクチャタイプと共に PTS入替回路12へ出力する。
【００５８】
PTS入替回路12には、Ｉ，Ｐピクチャの最大周期分とビデオエンコーダ５でのＩ，Ｐピクチャ最大遅延時間とを合わせた数に相当するピクチャのピクチャタイプとPTS とを格納するレジスタが構成されている。たとえば、Ｉ，Ｐピクチャの最大周期が３であり、Ｉ，Ｐピクチャのエンコード最大遅延時間が 0.4フレームである場合には、両者の加算値”3.4 ”を切り上げた４ピクチャ分のレジスタが PTS入替回路12に構成されている。そして、これらの各レジスタに、 PTSカウント回路11から出力される各ピクチャのPTS とピクチャタイプとが順次格納される。
【００５９】
PTS入替回路12は、ビデオエンコーダ５から出力されるピクチャタイプを取得すると、それに対応するピクチャが出力されるタイミングで、レジスタに格納してある同一のピクチャタイプの内の最も古いピクチャに対応するPTS を出力する。なお、 PTS入替回路12はPTS を出力した後は、それに対応するレジスタをクリアして新たな情報を格納できるように準備する。
【００６０】
PTS付加回路９は、ビデオエンコーダ５から出力されるビデオエレメンタリストリームをパケット化しつつ、 PTS入替回路12から出力されたPTS を付加することによりビデオPES(Packetized Elementary Stream) を出力する。
【００６１】
実施の形態３．
ところで、上述した実施の形態１及び２の本発明に係る動画像圧縮符号化装置では、ビデオエンコーダ５へ入力される時点の累積フィールド数またはPTS を取得することにより PTSを算出する構成としているが、他の構成、たとえばビデオエンコーダ５から出力されたデータを解析してPTS を簡易予測するような構成を採ることも可能である。そのような構成を有する本発明に係る動画像圧縮符号化／復号化システムを実施の形態３として以下に詳細に説明する。
【００６２】
図５はビデオエンコーダ５から出力されたデータを解析してPTS を簡易予測するような構成を有する本発明に係る動画像圧縮符号化／復号化システムの構成例を示すブロック図である。なお、逆 3:2プルダウン処理部50、ビデオエンコーダ５及び PTS付加回路９が備えられていることは図１及び図４に示されている実施の形態１及び２と同様であり、また PTS換算回路８が備えられていることは図１に示されている実施の形態１と同様である。
【００６３】
図５において、参照符号13はビデオエンコーダ５によりエンコードされたデータから情報を抽出する情報抽出回路を、14は情報抽出回路13が抽出した情報に基づいて累積フィールド数を予測するフィールド数簡易予測回路をそれぞれ示しており、このフィールド数簡易予測回路14が予測したフィールド数が PTS換算回路8 に与えられてPTS に換算され、その結果が PTS付加回路９へ出力される。
【００６４】
また、参照符号15は音声データを圧縮符号化するオーディオエンコーダを、16はオーディオエンコーダ15が圧縮符号化したオーディオデータをパケット(PES) 化してPTS を付加する PTS付加回路を、26はタイムスタンプの基準参照値となるSCR(System Clock Reference) を発生させる SCR作成回路を、17は PTS付加回路９及び16が作成したビデオPES データとオーディオPES データと SCR作成回路26が作成したSCR とを多重化する多重化回路をそれぞれ示している。
【００６５】
そして、多重化回路17から出力された多重化圧縮データが磁気ディスク等の記録媒体に記録装置18により記録される。
【００６６】
一方、参照符号19は記録装置18が記録媒体から読み出したデータをビデオPES データ、オーディオPES データ及びSCR に分離する分離回路を、20は分離回路19により分離されたオーディオPES データを復号化するオーディオデコーダを、21は分離回路19により分離されたビデオPES データを復号化するビデオデコーダを、22は分離回路19により分離されたSCR から基準の時刻を発生する基準時刻生成回路を、23はオーディオデコーダ20が抽出したオーディオ PTSと基準時刻生成回路22が生成した基準時刻とを比較して差分を出力するオーディオ PTS比較回路を、24はビデオデコーダ21が抽出したビデオ PTSと基準時刻生成回路22が生成した基準時刻とを比較して差分を出力するビデオ PTS比較回路を、25はビデオ PTS比較回路24が出力したビデオ PTSの差分値に時間軸上のフィルタをかけて映像と音声との同期制御情報とする PTS差分フィルタである。
【００６７】
情報抽出回路13は、ビデオエンコーダ５からＩ，Ｐピクチャの周期Ｍを取得し、ビデオエンコーダ５で符号化されたデータからピクチャタイプと RFFフラグとを抽出してフィールド数簡易予測回路14へ出力する。
【００６８】
フィールド数簡易予測回路14では、図６のフローチャートに示すような処理シーケンスで現ピクチャの累積フィールド数を予測する。図６に示されているシーケンスの概略は、まず、フィールド数簡易予測回路14は、Ｉ，Ｐピクチャの周期とビデオエンコーダ５から出力されている現ピクチャのピクチャタイプとから、圧縮符号化処理の前後で生じるピクチャ順序の並べ替えを予測計算した累積フィールド数を算出する。更に、フィールド数簡易予測回路14は、現ピクチャの RFFフラグが有効である場合は既に予測計算してある現ピクチャの累積フィールド数に”１”を加算した結果を新たに現ピクチャの累積フィールド数とする。具体的には以下のようになる。
【００６９】
たとえば、Ｉ，Ｐピクチャの周期Ｍが３である場合に、図６に示すシーケンスにより予測される累積フィールド数を図７及び図８の模式図に示す。なお、図７及び図８は本来は図７の右端と図８の左端とが接続する一枚の図である。
【００７０】
図７及び図８において、動画像圧縮符号化装置へ入力されるNTSC信号がフィルム素材から 3:2プルダウン変換されたものであり、第１のフィールドメモリ１、第２のフィールドメモリ２、相関検出回路３、切替えスイッチ４で構成される逆 3:2プルダウン処理部50において逆 3:2プルダウン処理が行なわれた場合、 RFFフラグが１フレームおきに有効となってビデオエンコーダ５へ入力される（行２）。そして、ビデオエンコーダ５では、入力画像がＢ，Ｂ，Ｉ，Ｂ，Ｂ，Ｐ…のピクチャタイプ順でエンコードされる（行１）。この場合の正しい累積フィールド数は行３に示されているような値となる。
【００７１】
ビデオエンコーダ５での符号化処理では、前後のＩ及びＰピクチャを参照画像としてＢピクチャを予測符号化するため、Ｉ，ＰピクチャとＢピクチャの順序が入れ替わり（行４) 、 RFFフラグ及び正しい累積フィールド数もそれに応じて行５、行６に示されているようになる。
【００７２】
図６に示すシーケンスにより予測された累積フィールド数を行11に示す。図６に示すように、一つ前のＩまたはＰピクチャの累積フィールド数IP（行７) と、２つ前のＩピクチャまたはＰピクチャの累積フィールド数IP1(行８) と、ＩまたはＰピクチャの間に挟まれたＢピクチャの RFFフラグの有効回数BR（行９) と、ＩまたはＰピクチャの間に挟まれた一つ前のＢピクチャの RFFフラグの有効回数BR1(行10) と、ＩまたはＰピクチャの周期Ｍとから、各ピクチャの累積フィールド数Ｆ（行11) が予測される。
【００７３】
図６の予測シーケンスでは、まず前述の各変数の初期設定が行なわれる (ステップS31)。具体的には、予測フィールド数Ｆは”０”に、前Ｉ，Ｐピクチャの累積フィールド数IPは”−２”に、前々Ｉ，Ｐピクチャの累積フィールド数IP1 は”−８”に、Ｉ，Ｐピクチャに挟まれたＢピクチャの RFFフラグの有効回数BRは”０”に、前の連続するＢピクチャの RFFフラグの有効回数BR1 は”０”に、Ｉ，Ｐピクチャの周期Ｍは”３”にそれぞれ初期設定される。
【００７４】
次に、ビデオエンコーダ５のデータ出力からピクチャタイプと RFF フラグとが取得される (ステップS32)。この取得されたピクチャタイプがＩまたはＰピクチャであった場合は (ステップS33 で”YES") 、下記式(2) により一つ前のＩまたはＰピクチャの累積フィールド数IPに現ピクチャと一つ前のＩまたはＰピクチャ間のフィールド数(2×M)を加え、その間の RFF有効回数(BR)を加えることにより累積フィールド数Ｆの予測値が求められる (ステップS34)。
【００７５】
Ｆ＝ IP＋２×Ｍ＋BR …(2)
【００７６】
次に、現ピクチャの RFFフラグが有効な場合にのみ (ステップS35 で”YES") 、累積フィールド数Ｆに”１”が加えられ (ステップS36)、現ピクチャの予測累積フィールド数Ｆとして出力される (ステップS38)。なおこの際、IP1 がIP（行８) に、IPがＦ（行７) に、BR1 がBR（行10) に、 BR が”０”（行９) にそれぞれ置換される (ステップS37)。
【００７７】
次に、ビデオエンコーダ５のデータ出力から取得されたピクチャタイプがＢピクチャであり (ステップS33 で”NO")、しかもＩまたはＰピクチャのすぐ次に取得された場合は (ステップS41 で”YES") 、下記式(3) により二つ前のＩまたはＰピクチャの累積フィールド数IP1 に、現ピクチャと二つ前のＩまたはＰピクチャ間の圧縮符号化前のフィールド数として”２”を加え、更に二つ前のＩまたはＰピクチャと一つ前のＩまたはＰピクチャ間のＢピクチャの RFF有効回数(BR1) を加える (ステップS42)。
【００７８】
Ｆ＝ IP1 ＋２＋BR1 …(3)
【００７９】
そして、現ピクチャの RFFフラグが有効な場合にのみ (ステップS43 で”YES") 、Ｆに”１”が加えられて現ピクチャの累積フィールド数Ｆとして出力されると共に、 RFFフラグが有効な場合はBRに”１”が加えられる (ステップS44)。
【００８０】
次に、ビデオエンコーダ５のデータ出力から取得されたピクチャタイプがＢピクチャであり (ステップS33 で”NO")、しかも一つ前のピクチャがＢピクチャである場合は (ステップS41 で”NO")、累積フィールド数Ｆに”２”が加えられる (ステップS45)。
【００８１】
そして、現ピクチャの RFFフラグが有効な場合にのみ (ステップS43 で”YES") 、Ｆに”１”が加えられて現ピクチャの累積フィールド数Ｆとして出力されると共に、 RFFフラグが有効な場合はBRに”１”が加えられる (ステップS44)。
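The prediction sequence described for Fig. 6 (steps S31 to S45, equations (2) and (3)) can be traced with the following Python sketch. It is one reading of the textual description, not a reproduction of the flowchart itself, which is not available here. In particular it assumes that IP is updated with F after the picture's own RFF increment has been applied; this choice keeps the prediction within one field of the true count on the sample pattern below. The class name `FieldCountPredictor` and the sample RFF pattern are hypothetical.

```python
class FieldCountPredictor:
    """Sketch of the simple field count prediction circuit 14."""

    def __init__(self, m=3):
        # Step S31: initial values as given in the text.
        self.f = 0        # predicted cumulative field count F
        self.ip = -2      # count of the previous I or P picture (IP)
        self.ip1 = -8     # count of the I or P picture before that (IP1)
        self.br = 0       # RFF hits among B pictures since the last I/P (BR)
        self.br1 = 0      # RFF hits in the preceding run of B pictures (BR1)
        self.m = m        # I/P picture period M
        self._prev_was_reference = False

    def predict(self, picture_type, rff):
        """Return the predicted count for one picture, in CODING order."""
        if picture_type in ("I", "P"):
            self.f = self.ip + 2 * self.m + self.br       # equation (2)
            if rff:
                self.f += 1                               # step S36
            # Step S37: shift the reference history.
            self.ip1, self.ip = self.ip, self.f
            self.br1, self.br = self.br, 0
            self._prev_was_reference = True
        else:
            if self._prev_was_reference:
                self.f = self.ip1 + 2 + self.br1          # equation (3)
            else:
                self.f += 2                               # step S45
            if rff:
                self.f += 1                               # step S44
                self.br += 1
            self._prev_was_reference = False
        return self.f


if __name__ == "__main__":
    # Hypothetical coding-order stream I2 B0 B1 P5 B3 B4 P8 B6 B7, with RFF
    # set on even-numbered frames of the display order.
    coded = [("I", True), ("B", True), ("B", False), ("P", False),
             ("B", False), ("B", True), ("P", True), ("B", True), ("B", False)]
    predictor = FieldCountPredictor(m=3)
    print([predictor.predict(t, r) for t, r in coded])
```

For this sample the predictions are 5, 1, 3, 12, 8, 11, 20, 16, 18, while counting fields directly in display order gives 5, 0, 3, 13, 8, 10, 20, 15, 18, so each prediction is off by at most one field.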
【００８２】
このようにしてフィールド数簡易予測回路14が予測した累積フィールド数（行11) は、正しい累積フィールド数に比較して±１フィールドの差分が生じる（行12) 。その理由は、 RFFフラグの有効、無効をそのピクチャの累積フィールド数に反映させることにあり、Ｉ，Ｐピクチャの繰返し周期３フレームと RFFフラグの繰返し周期２フレームとの最小公倍数６フレームにおいて±１フィールドの差分が繰返される。そして、復号時に PTS差分フィルタ25により時間軸上のフィルタをかけることにより、±１フィールド以下の実用上問題の無い高い精度で同期合わせを行なうことができる。
【００８３】
また、このような予測方法を採る場合は、フィルム素材から 3:2プルダウン処理が行なわれずにNTSC信号に変換された画像データ、即ち原信号から直接得られたNTSC信号である画像データが入力された場合には逆3:2 プルダウン処理部50での逆 3:2プルダウン処理は行なわれないため、 RFFフラグは無効のままである。このため、Ｉ，Ｐピクチャの周期とピクチャタイプとから予測計算された累積フィールド数と正しい累積フィールド数との差分は”０”となり、不要なオフセットが生じることはない。
【００８４】
このようにフィールド数簡易予測回路14から出力された現ピクチャの予測累積フィールド数は、 PTS換算回路８で式(1) により、フィールド数からPTS に変換されて PTS付加回路９へ出力される。
【００８５】
PTS付加回路９は、ビデオエンコーダ５からのビデオエレメンタリストリームをパケット化しつつ、 PTS換算回路８から出力されるPTS を付加することによりビデオPES(Packetized Elementary Stream) を出力する。
【００８６】
一方、オーディオデータはオーディオエンコーダ15で圧縮符号化され、 PTS付加回路16でオーディオデータのPTS が付加されてパケット(PES) 化される。
【００８７】
また、タイムスタンプの基準参照値となるSCR(System Clock Reference) が SCR作成回路26により作成されて多重化回路17へ出力される。
【００８８】
ビデオPES データ、オーディオPES データ及びSCR は、多重化回路17で多重化されて一つのデータとされて記録装置18により光磁気ディスク等の記録媒体に記録される。
【００８９】
記録装置18により記録媒体から読み出されたデータは、分離回路19によりビデオPES データ、オーディオPES データ及びSCR に分離され、オーディオPES データはオーディオデコーダ20で復号され、ビデオPES データはビデオデコーダ21で復号され、 SCRは基準時刻生成回路22へ入力される。
【００９０】
ビデオPES データがビデオデコーダ21で復号されると、Ｉ，ＰピクチャはＢピクチャの参照画像になるため、行13に示すように、ピクチャの順序が入替わって符号化する前の並びに戻り、累積フィールド数差分（行14) もそれに応じて入替わる。
【００９１】
基準時刻生成回路22では、分離回路19から入力されるSCR を参照して基準時刻を生成する。基準時刻は図７及び図８に示されている正しい累積フィールド数（行３、行６) に相当するため、行14の累積フィールド数差分は予測累積フィールド数（行11）と基準時刻のフィールド数換算値との差分に等しい。
【００９２】
オーディオ PTS比較回路23では、オーディオデコーダ20が抽出したオーディオPTS と基準時刻とが比較され、その差分値が出力される。オーディオデコーダ20では、その差分値を”０”とするように同期合わせ処理が行なわれる。
【００９３】
同様にビデオ PTS比較回路24では、ビデオデコーダ21が抽出したビデオPTS と基準時刻とが比較されてその差分値が出力される。このビデオ PTS比較回路24から出力される差分値を PTS差分フィルタ25が６ピクチャ分保持し、６ピクチャ分毎の平均値を算出する。この算出結果は図７及び図８の行15に示されている。なお、初期値は”０”である。
【００９４】
この算出結果が１フィールド相当の時間で除算され、その商に対応する数だけのフィールド数の同期合わせ制御がビデオデコーダ21で行なわれる。また、上述の除算の剰余がオーディオPTS 比較回路23へ入力されてオーディオPTS と基準時刻との差分値に加算される。この加算結果はオーディオデコーダ20へ入力され、その差分値が”０”となるように同期合わせ処理が行なわれる。即ち、１フィールド以下の端数は音声の同期合せ処理を制御することにより、音声を映像に同期させる。
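The filtering and splitting steps above (average the video PTS differences over six pictures, divide by one field period, use the quotient for video and the remainder for audio) can be illustrated as follows. This is a sketch under stated assumptions: one field is taken as 1501.5 ticks of the 90 kHz clock (exact NTSC rate), division truncates toward zero, and the function name `filter_pts_differences` is hypothetical.

```python
from fractions import Fraction

FIELD_TICKS = Fraction(3003, 2)   # assumption: 1501.5 ticks of 90 kHz per field


def filter_pts_differences(differences, window=6):
    """Average `window` PTS differences and split the result.

    differences: video PTS minus reference time, in 90 kHz ticks, one value
                 per picture; exactly `window` values are expected.
    Returns (whole_fields, remainder_ticks): the quotient drives video
    synchronisation in field units, the remainder is passed to audio.
    """
    if len(differences) != window:
        raise ValueError("expected exactly %d differences" % window)
    mean = sum(differences, Fraction(0)) / window
    whole_fields = int(mean / FIELD_TICKS)          # truncates toward zero
    remainder = mean - whole_fields * FIELD_TICKS
    return whole_fields, remainder


if __name__ == "__main__":
    # Differences of 0, +1, 0, -1, 0, +1 fields, a pattern whose mean is the
    # 1/6 field average mentioned in the text.
    diffs = [0, FIELD_TICKS, 0, -FIELD_TICKS, 0, FIELD_TICKS]
    print(filter_pts_differences(diffs))
```

With this input the quotient is 0 fields and the remainder is 1001/4 ticks (1/6 of a field), so no video field is skipped or repeated and the residual offset is left to the audio side.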
【００９５】
このような同期合わせ処理が行なわれることにより、映像の符号化時にPTS を簡易予測した場合にも、映像と音声とを同期させることが可能になる。
【００９６】
また、 PTS差分フィルタ25が平均するピクチャ数は６ピクチャ分としてあるが、これは RFFフラグのピクチャ周期である２とＩ，Ｐピクチャ周期であるＭ＝３との公倍数である。このように、 PTS差分フィルタ25が平均するピクチャ数を公倍数とすることにより、累積フィールド数差分の平均値が1/6 フィールド（行15) で一定になり、安定した同期合わせ処理を行なうことができるという効果もある。
【００９７】
更に、 PTS差分フィルタ25が平均するピクチャ数を前述の公倍数の内の最小値、即ち最小公倍数である”６”とすることにより、応答性が迅速な同期合わせ処理を行なうことができるという効果がある。
【００９８】
【発明の効果】
以上に詳述したように本発明の動画像圧縮符号化方法及びその装置、並びに動画像圧縮符号化／復号化システムによれば、圧縮符号化手段から出力されるデータを一時格納するための大量で冗長なメモリを設ける必要が無くなるため、安価で効率的な動画像圧縮符号化装置及び動画像圧縮符号化／復号化システムを得ることができる。
【００９９】
また請求項１に記載の本発明の動画像圧縮符号化方法によれば、画像データを圧縮符号化する時点で計数した時間情報と圧縮符号化後のピクチャタイプ情報を保持し、圧縮符号化処理された画像データのフレームのピクチャタイプ情報と同一のピクチャタイプ情報に対応する時間情報が選択されて時刻情報が確定されるため、ある画像データのフレームの時間情報を計算するために、それ以降のフレームのフィールド削除情報である RFFフラグを検出した後に時間情報を計算するというような複雑な処理が不要になる。
【０１００】
更に請求項２及び３に記載の本発明の動画像圧縮符号化装置によれば、圧縮符号化手段へ入力される時点でカウントされた累積フィールド数と圧縮符号化後のピクチャタイプ情報とを保持しておき、圧縮符号化処理手段から出力されるピクチャタイプ情報と同一のピクチャタイプ情報に対応する累積フィールド数を選択するという単純な換算で時間情報が算出可能になる。このため、ある画像データのフレームの時間情報を計算するために、それ以降のフレームのフィールド削除情報である RFFフラグを検出した後に時間情報を計算するというような複雑な処理が不要になり、簡易な構成の動画像圧縮符号化装置を得ることができる。
【０１０１】
また更に請求項４及び５に記載の本発明の動画像圧縮符号化装置によれば、圧縮符号化手段へ入力される時点でカウントされたタイムスタンプと圧縮符号化後のピクチャタイプ情報とを保持しておき、圧縮符号化処理手段から出力されるピクチャタイプ情報と同一のピクチャタイプ情報に対応するタイムスタンプを選択することによりタイムスタンプが確定される。このため、ある画像データのフレームのタイムスタンプを計算するために、それ以降のフレームのフィールド削除情報である RFFフラグを検出した後にタイムスタンプを計算するというような複雑な処理が不要になり、簡易な構成の動画像圧縮符号化装置を得ることができる。
【０１０２】
更に請求項６に記載の本発明の動画像圧縮符号化／復号化システムによれば、符号化時にはＩ，Ｐピクチャのタイムスタンプを計算するために、それ以降に現れるＢピクチャの RFFフラグを検出した後にタイムスタンプを計算するといった複雑な処理を行なう必要がない動画像圧縮符号化／復号化システムを得ることができる。
【０１０３】
また、請求項７，８及び９に記載の本発明の動画像圧縮符号化／復号化システムによれば、応答の速い同期合わせ処理を行なう動画像圧縮符号化／復号化システムを得ることができる。
【０１０４】
また請求項８及び９に記載の本発明の動画像圧縮符号化／復号化システムによれば、符号化時には安定な同期合わせ処理を行なうことができる動画像圧縮符号化／復号化システムを得ることができる。
【図面の簡単な説明】
【図１】実施の形態１の本発明に係る動画像圧縮符号化装置の構成例を示すブロック図である。
【図２】実施の形態１の本発明に係る動画像圧縮符号化装置のフィールド数カウント回路の処理シーケンスを示すフローチャートである。
【図３】実施の形態１の本発明に係る動画像圧縮符号化装置のフィールド数入替回路の処理シーケンスを示すフローチャートである。
【図４】実施の形態２の本発明に係る動画像圧縮符号化装置の構成例を示すブロック図である。
【図５】実施の形態３の本発明に係る動画像圧縮符号化／復号化システムの構成例を示すブロック図である。
【図６】実施の形態３の本発明に係る動画像圧縮符号化／復号化システムのフィールド数簡易予測回路の処理シーケンスを示すフローチャートである。
【図７】実施の形態３の本発明に係る動画像圧縮符号化／復号化システムのフィールド数簡易予測回路により予測される累積フィールド数を示す模式図である。
【図８】実施の形態３の本発明に係る動画像圧縮符号化／復号化システムのフィールド数簡易予測回路により予測される累積フィールド数を示す模式図である。
【図９】 3:2プルダウン処理の手順を示す模式図である。
【図１０】従来のMPEG-2方式による動画像圧縮符号化装置の構成例を示すブロック図である。
【図１１】従来のMPEG-2方式による動画像圧縮符号化処理を行なった場合のフレームの配列符号化前後の状態を示す模式図である。
【符号の説明】
５ビデオエンコーダ、６フィールド数カウント回路、７フィールド数入替回路、８ PTS換算回路、９ PTS付加回路、10 90kHz分周器、11 PTSカウント回路、12 PTS入替回路、13 情報抽出回路、14 フィールド数簡易予測回路、15 オーディオエンコーダ、16 PTS付加回路、17 多重化回路、18 記録装置、19 分離回路、20 オーディオデコーダ、21 ビデオデコーダ、22 基準時刻生成回路、23 オーディオ PTS比較回路、24 ビデオ PTS比較回路、25 PTS差分フィルタ、26 SCR作成回路、50 逆3:2 プルダウン処理部。
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a moving picture compression encoding method and apparatus, typified by the MPEG (Moving Picture Experts Group) system, and to a moving picture compression encoding/decoding system. More specifically, it relates to a technique for generating time stamps during inverse 3:2 pull-down processing.
[0002]
[Prior art]
Inverse 3:2 pull-down processing thins out the redundant fields from an NTSC signal that has undergone 3:2 pull-down processing. 3:2 pull-down processing, in turn, is a technique for converting moving picture information such as movie film, in which 24 frames correspond to one second, into an NTSC television signal, in which 30 frames correspond to one second.
[0003]
FIG. 9 is a schematic diagram showing the procedure of 3:2 pull-down processing for converting movie film frames into an NTSC television signal. FIG. 9(a) shows four frames of the original film, corresponding to 4/24 (1/6) second, denoted F0, F1, F2, and F3 in order. When these four frames F0, F1, F2, and F3 are converted directly into an NTSC television signal, each frame consists of a top field (odd field) t0, t1, t2, t3 and a bottom field (even field) b0, b1, b2, b3, as shown in FIG. 9(b). That is, frame F0 of the original film is converted into fields t0 and b0 of the NTSC television signal, frame F1 into fields t1 and b1, frame F2 into fields t2 and b2, and frame F3 into fields t3 and b3.
[0004]
Then, as shown in FIG. 9(c), the fields t0, b0, t1, b1, t2, b2, t3, b3 of the NTSC television signal obtained from frames F0, F1, F2, F3 of the original film are rearranged, by using fields t0 and b2 twice, into the 10-field sequence t0, b0, t0, b1, t1, b2, t2, b2, t3, b3, that is, into the five frames f0, f1, f2, f3, f4. Accordingly, after conversion to the NTSC television signal, frame f0 consists of fields t0 and b0, frame f1 of fields t0 and b1, frame f2 of fields t1 and b2, frame f3 of fields t2 and b2, and frame f4 of fields t3 and b3.
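The field arrangement of FIG. 9(c) can be reproduced with a short Python sketch. The function name `pulldown_32` and the string labels such as "t0" are illustrative assumptions. Film frames alternately contribute three and two fields while the field parity keeps alternating between top and bottom.

```python
def pulldown_32(num_film_frames):
    """Return the 3:2 pull-down field sequence for film frames 0..n-1.

    Each field is labelled "t<n>" (top field) or "b<n>" (bottom field),
    where n is the film frame it was taken from.
    """
    fields = []
    parity = "t"                              # NTSC fields alternate t, b, t, b, ...
    for frame in range(num_film_frames):
        repeat = 3 if frame % 2 == 0 else 2   # 3 fields, then 2, then 3, ...
        for _ in range(repeat):
            fields.append(f"{parity}{frame}")
            parity = "b" if parity == "t" else "t"
    return fields


if __name__ == "__main__":
    sequence = pulldown_32(4)
    print(sequence)
    # Pair consecutive fields to obtain the NTSC frames f0..f4.
    print([tuple(sequence[i:i + 2]) for i in range(0, len(sequence), 2)])
```

Four film frames yield ten fields, so 24 film frames yield 60 fields, or 30 NTSC frames, per second.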
[0005]
As described above, the moving picture information of a movie film, in which 24 frames correspond to one second, is converted into an NTSC television signal, in which 30 frames (60 fields) correspond to one second. When a television signal obtained in this way is compression-encoded by the MPEG-2 system, the two redundant fields contained in every ten consecutive fields must be thinned out; that is, inverse 3:2 pull-down processing must be performed.
[0006]
FIG. 10 is a block diagram showing a configuration example of a conventional moving picture compression encoding apparatus based on the MPEG-2 system. In FIG. 10, reference numerals 1 and 2 denote first and second field memories. The first field memory 1 delays the input digital video signal by one field. Reference numeral 4 denotes a changeover switch that selects whether or not the output of the first field memory 1 is input to the second field memory 2; the second field memory 2 delays the output of this changeover switch 4, in other words the output of the first field memory 1, by one field. Accordingly, the output of the second field memory 2 is delayed by two fields relative to the input to the first field memory 1.
[0007]
Reference numeral 3 denotes a correlation detection circuit. It compares the input to the first field memory 1 with the output of the second field memory 2 (which lags that input by two fields). When the difference over one field is smaller than a predetermined value, more specifically when the two fields are substantially identical, it sets the output of the changeover switch 4 to no signal; otherwise it switches the changeover switch 4 so that the output of the first field memory 1 is input to the second field memory 2.
[0008]
Reference numeral 5 denotes a video encoder that compresses the video signal; 100 denotes a PTS analysis circuit that analyzes the output data of the video encoder 5 and calculates a presentation time stamp (hereinafter, PTS); 101 denotes a buffer memory that holds the compressed data output from the video encoder 5 while the PTS analysis circuit 100 analyzes the PTS; and 9 denotes a PTS addition circuit that packetizes the compressed data held in the buffer memory 101, adds the PTS calculated by the PTS analysis circuit 100, and outputs the result.
[0009]
The first field memory 1, the second field memory 2, the correlation detection circuit 3, and the changeover switch 4 constitute an inverse 3:2 pull-down processing unit 50. The inverse 3:2 pull-down processing unit 50 compares each field of the input video data with the field two fields earlier. When the difference between the two is smaller than a predetermined value, that is, when the correlation is high, it judges them to be the same field and switches the changeover switch 4 to the no-signal side, thereby performing inverse 3:2 pull-down processing that removes the redundant fields.
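The comparison performed by the inverse 3:2 pull-down processing unit 50 can be modelled in software roughly as follows. This is a simplified sketch, not the circuit: fields are represented as lists of pixel values, the difference measure (sum of absolute differences) and the default threshold are assumptions, and each field is compared with the input two fields earlier rather than with the contents of a hardware field memory.

```python
def field_difference(a, b):
    """Sum of absolute pixel differences between two equally sized fields."""
    return sum(abs(x - y) for x, y in zip(a, b))


def inverse_pulldown(fields, threshold=1):
    """Drop fields that repeat the field two positions earlier.

    fields: list of fields, each a list of pixel values (assumed model).
    Returns (kept_fields, dropped_indices). A dropped index corresponds to
    the moment the changeover switch goes to the no-signal side, which is
    when the encoder would set the RFF flag.
    """
    kept, dropped = [], []
    for i, field in enumerate(fields):
        if i >= 2 and field_difference(field, fields[i - 2]) < threshold:
            dropped.append(i)      # redundant field: judged identical
        else:
            kept.append(field)
    return kept, dropped


if __name__ == "__main__":
    # Ten hypothetical fields following the pattern t0 b0 t0 b1 t1 b2 t2 b2 t3 b3.
    fields = [[10, 10], [11, 11], [10, 10], [21, 21], [20, 20],
              [31, 31], [30, 30], [31, 31], [40, 40], [41, 41]]
    kept, dropped = inverse_pulldown(fields)
    print(len(kept), dropped)
```

On the ten-field pattern of FIG. 9(c), the third and eighth fields are detected as repeats, leaving the eight fields of the original four film frames.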
[0010]
The output signal of the changeover switch 4 is input to the video encoder 5. When the changeover switch 4 is switched to the no-signal side, the video encoder 5 enables the RFF (Repeat First Field) flag, which causes the first field of the frame about to be input to be output once more at decoding time, and then performs MPEG-2 video elementary compression encoding.
[0011]
MPEG-2 video elementary compression encoding exploits the correlation between moving pictures, so the order of the frames after encoding (hereinafter, the picture order) differs from the picture order of the original signal. The schematic diagram of FIG. 11 shows a typical example of this, for the case where an I or P picture appears with a period of three. An I picture (Intra coded picture) is called an intra-frame coded picture and is encoded independently from a single frame. A P picture (Predictive coded picture) is called a forward predictive coded picture and is encoded from an I picture by forward prediction. In addition, there is the B picture (Bidirectionally predictive coded picture), called a bidirectional predictive coded picture, which is predictively encoded using the preceding and following I and P pictures as reference pictures.
[0012]
In the 3:2 pull-down processed NTSC signal shown in FIG. 11(a), the inverse 3:2 pull-down processing shown in FIG. 11(b), and the reproduced image shown in FIG. 11(f), t0, t1, ... represent the top fields of frames 0, 1, ..., and b0, b1, ... represent the bottom fields of frames 0, 1, .... In the video encoding process shown in (c) and the decoding process shown in (e), labels such as B0, I2, and P5 indicate the picture type (B, I, P) and the frame number (0, 2, 5).
[0013]
In the 3:2 pull-down processed NTSC signal shown in FIG. 11(a), one duplicated field (t0, b2, t4, b6, t8, b10) is added with a period of four fields. The signal obtained by digitizing this signal is the video data input shown in FIG. 10; the duplicated, redundant fields are removed by the inverse 3:2 pull-down processing unit 50, and the video encoder 5 then performs video encoding.
[0014]
In the video encoding performed by the video encoder 5, a B picture is predictively encoded using the preceding and following I and P pictures as reference pictures, so the I and P pictures to be referenced are encoded first and the B picture is encoded afterwards. For this reason, as described in detail later, the order of the I and P pictures and the B pictures is interchanged.
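The reordering described above can be illustrated with a small Python sketch. It assumes a simple closed pattern in which every B picture is emitted immediately after the next I or P picture in display order; the function name `to_coding_order` and the string labels are hypothetical.

```python
def to_coding_order(display_order):
    """Reorder pictures from display order to coding (stream) order.

    display_order: list of labels such as "B0", "I2", "P5", whose first
                   character is the picture type.
    B pictures are held back until the reference picture that follows them
    in display order has been emitted.
    """
    coded, pending_b = [], []
    for picture in display_order:
        if picture[0] == "B":
            pending_b.append(picture)     # wait for the next I or P picture
        else:
            coded.append(picture)         # reference picture goes first
            coded.extend(pending_b)       # then the B pictures that precede it
            pending_b = []
    coded.extend(pending_b)               # trailing B pictures, if any
    return coded


if __name__ == "__main__":
    print(to_coding_order(["B0", "B1", "I2", "B3", "B4", "P5"]))
```

For the period-three example, the display order B0, B1, I2, B3, B4, P5 becomes I2, B0, B1, P5, B3, B4 in the encoded stream, which is why I2 is output before the RFF flags of B0 and B1 are known.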
[0015]
The PTS analysis circuit 100 sequentially analyzes the encoded data output from the video encoder 5 to extract the picture type (I, P or B), the RFF flag, and the frame period, and calculates the PTS based on these.
[0016]
However, the PTS of an I or P picture cannot be calculated until the state of the RFF flags of the B pictures that follow it in the encoded data is known. For example, to calculate the PTS of the I2 picture shown in FIG. 11 (c), the RFF flags of the B0 and B1 pictures, which precede it in the original picture order, must be checked, and a PTS increment of one field must be added for each RFF flag that is valid ("1"). In the example shown in FIG. 11, the RFF flag of the B0 picture is valid as shown in FIG. 11 (d), so the PTS of the I2 picture is 5 in terms of the number of fields.
[0017]
Thus, the PTS of an I or P picture is not determined until the information of the B picture immediately before the next I or P picture is known, so the buffer memory 101 is required to hold and delay the data until then.
[0018]
Therefore, the buffer memory 101 is controlled so that data is read when the PTS analysis circuit 100 determines the PTS. The data read from the buffer memory 101 is packetized by the PTS adding circuit 9, and the PTS calculated by the PTS analyzing circuit 100 is added and output.
[0019]
In addition, a moving image compression encoding apparatus may be configured in combination with a general-purpose computer such as a personal computer. In one such example, the stages up to the video encoder 5 that creates the elementary stream, that is, the inverse 3:2 pull-down processing unit 50 and the video encoder 5, are configured as dedicated hardware, and the subsequent functions of the PTS analysis circuit 100 and the PTS adding circuit 9 are performed by software processing.
[0020]
However, the same processing procedure is necessary even in such a case. Software corresponding to the PTS analysis circuit 100 analyzes the encoding results sequentially, while software corresponding to the PTS adding circuit 9 packetizes the encoded data and adds the PTS calculated by the software corresponding to the PTS analysis circuit 100, thereby outputting a video PES (Packetized Elementary Stream).
[0021]
After that, a multiplexer (not shown) multiplexes an audio PES, which has been encoded by an audio encoder (not shown) and has had a PTS added, an SCR (System Clock Reference) serving as the reference value for the time stamps, and the video PES described above, thereby creating a system stream. The decoding device reproduces the system stream created in this manner while synchronizing video and audio using the video and audio PTSs and the SCR.
[0022]
[Problems to be solved by the invention]
Since the conventional moving image compression coding apparatus is configured as described above, when determining the time stamp information of certain picture data, information on the picture data output after the picture data is necessary. For this reason, a buffer memory for holding all data up to the picture data to be output later is required. In addition, since the capacity of the buffer memory is set assuming the maximum encoding rate and the maximum I / P picture interval, there is a problem that a large capacity is required.
[0023]
On the other hand, since the encoding rate actually used and the actual I / P picture interval are usually less than the assumed maximum values, there is a problem that part of the memory becomes redundant and cannot be used effectively.
[0024]
In addition, even when a general-purpose computer such as a personal computer is connected to a board of dedicated hardware comprising the inverse 3:2 pull-down processing unit 50 and the video encoder 5, and the subsequent functions of the PTS analysis circuit 100 and the PTS adding circuit 9 are performed by software processing, a memory capacity corresponding to the buffer memory 101 is required in the internal memory of the computer.
[0025]
The present invention has been made in view of such circumstances, and its object is to provide a moving image compression encoding method and apparatus, and a moving image compression encoding / decoding system, that do not require a large-capacity and redundant buffer memory.
[0026]
[Means for Solving the Problems]
According to a first aspect of the present invention, there is provided a moving image compression encoding method characterized by including: a step of detecting a duplicated field in an image signal and deleting the one redundant field, the image signal being obtained by converting each frame of an original moving image into a moving image in which one frame is composed of two fields and by duplicating, with a predetermined number of consecutive frames of the moving image taken as one unit, a specific field within each unit so that the time axis is adjusted in advance; a step of outputting field deletion information indicating that a field has been deleted; a step of compression-encoding the image signal after the redundant field has been deleted; a step of counting a cumulative field number, obtained by adding the number of deleted fields to the number of fields at the time the image signal is compression-encoded in units of frames, and outputting picture type information identifying the type of image data that defines the arrangement order of the frame relative to other frames after compression encoding; a step of storing the counted cumulative field number and the picture type information; a step of sequentially selecting, on the basis of the output picture type information, the oldest cumulative field number among the stored cumulative field numbers that correspond to the picture type information of the compression-encoded image data; a step of calculating time information by converting the selected cumulative field number into a number of reference-frequency clocks, and deleting the selected cumulative field number; and a step of adding the calculated time information to each frame of the compression-encoded image signal.
[0027]
In the moving image compression encoding method according to the first aspect of the present invention, the time information counted when the image data is compression-encoded and the picture type information after compression encoding are held, and the time information is determined by selecting the held time information whose picture type information is the same as that of the compression-encoded frame of image data. Therefore, calculating the time information of a frame of image data does not require complicated processing such as first detecting the RFF flag, which is the field deletion information of subsequent frames, and only then calculating the time information.
[0028]
The moving image compression encoding method according to claim 2 of the present invention is characterized by including: a step of detecting a duplicated field in an image signal and deleting the one redundant field, the image signal being obtained by converting each frame of an original moving image into a moving image in which one frame is composed of two fields and by duplicating, with a predetermined number of consecutive frames of the moving image taken as one unit, a specific field within each unit so that the time axis is adjusted in advance; a step of outputting field deletion information indicating that a field has been deleted; a step of compression-encoding the image signal after the redundant field has been deleted; a step of generating a clock that serves as the reference for time stamps representing time information; a step of counting the generated clock to calculate the time stamp at the time the image data is compression-encoded in units of frames, and outputting picture type information identifying the type of image data that defines the arrangement order of the frame relative to other frames after compression encoding; a step of storing the calculated time stamps and the picture type information and, from among the stored time stamps that correspond to the picture type information of the image data after compression encoding, sequentially selecting the oldest time stamp having the same picture type information as the image data after compression encoding; and a step of adding the selected time stamp to the data of each frame after compression encoding.
[0029]
A moving image compression encoding apparatus according to claim 3 of the present invention is characterized by including: field thinning means for detecting a duplicated field in an image signal, the image signal being obtained by converting each frame of an original moving image into a moving image in which one frame is composed of two fields and by duplicating, with a predetermined number of consecutive frames of the moving image taken as one unit, a specific field within each unit so that the time axis is adjusted in advance, for deleting the one redundant field, and for outputting field deletion information indicating that the field has been deleted together with the image signal after the redundant field has been deleted; compression encoding means for receiving the image signal and the field deletion information output from the field thinning means, compression-encoding the image signal, and outputting the number of compression-processed fields and the field deletion information; field number calculating means for outputting picture type information identifying the type of image data that defines the arrangement order, relative to other frames, of each frame of the image signal input to the compression encoding means after compression encoding, and for calculating the cumulative field number at the time of input to the field thinning means; field number selection means for storing the cumulative field number calculated by the field number calculating means and the picture type information, and for sequentially selecting, from among the stored cumulative field numbers corresponding to the picture type information output by the compression encoding means, the smallest cumulative field number having the same picture type information as that output by the compression encoding means; time stamp conversion means for converting the cumulative field number into a time stamp representing time; and time information adding means for adding the time stamp converted by the time stamp conversion means to the data of each frame after compression encoding.
[0030]
In the moving image compression coding apparatus according to claim 3 of the present invention, the cumulative field number counted at the time of input to the compression encoding means and the picture type information after compression encoding are held, and the time information is calculated by the simple conversion of selecting the cumulative field number whose picture type information is the same as that output from the compression encoding means. Therefore, calculating the time information of a frame of image data does not require complicated processing such as first detecting the RFF flag, which is the field deletion information of subsequent frames, and only then calculating the time information.
[0031]
According to a fourth aspect of the present invention, there is provided a moving picture compression encoding apparatus characterized by including: field thinning means for detecting a duplicated field in an image signal, the image signal being obtained by converting each frame of an original moving image into a moving image in which one frame is composed of two fields and by duplicating, with a predetermined number of consecutive frames of the moving image taken as one unit, a specific field within each unit so that the time axis is adjusted in advance, for deleting the one redundant field, and for outputting field deletion information indicating that the field has been deleted together with the image signal after the redundant field has been deleted; compression encoding means for receiving the image signal and the field deletion information output from the field thinning means, compression-encoding the image signal, and outputting the number of compression-processed fields and the field deletion information; clock generation means for generating a clock that serves as the reference for time stamps representing time information; time stamp calculating means for counting the clock generated by the clock generation means to calculate the time stamp at the time the image data is input to the compression encoding means in units of frames, and for outputting picture type information identifying the type of image data that defines the arrangement order of the frame relative to other frames after compression encoding; time stamp selecting means for storing the time stamps calculated by the time stamp calculating means and the picture type information, and for sequentially selecting, from among the stored time stamps corresponding to the picture type information output from the compression encoding means, the oldest time stamp having the same picture type information as that output from the compression encoding means; and time stamp adding means for adding the time stamp selected by the time stamp selecting means to the data of each frame after compression encoding.
[0033]
In the moving image compression coding apparatus according to claim 4 of the present invention, the time stamp counted at the time of input to the compression encoding means and the picture type information after compression encoding are held, and the time stamp is determined by selecting the held time stamp whose picture type information is the same as that output from the compression encoding means. Therefore, calculating the time stamp of a frame of image data does not require complicated processing such as first detecting the RFF flag, which is the field deletion information of subsequent frames, and only then calculating the time stamp.
[0034]
The moving image compression encoding / decoding system according to claim 5 of the present invention is characterized by including: field thinning means for detecting a duplicated field in an image signal, the image signal being obtained by converting each frame of an original moving image into a moving image in which one frame is composed of two fields and by duplicating, with a predetermined number of consecutive frames of the moving image taken as one unit, a specific field within each unit so that the time axis is adjusted in advance, for deleting the one redundant field, and for outputting field deletion information indicating that the field has been deleted together with the image signal after the redundant field has been deleted; compression encoding means for receiving the image signal and the field deletion information output from the field thinning means, compression-encoding the image signal in units of frames into I pictures, which are intra-frame coded pictures compression-encoded independently, P pictures, which are forward predictive coded pictures compression-encoded using the correlation between the frame and an I picture, and B pictures, which are bidirectionally predictive coded pictures compression-encoded using the correlation with I and P pictures, and outputting the number of compression-processed fields and the field deletion information; information extracting means for acquiring the repetition period of I or P pictures after encoding of the image signal input to the compression encoding means, and for extracting the field deletion information and picture type information identifying the type of image data that defines the arrangement order of a frame relative to other frames after compression encoding; field number predicting means for obtaining the cumulative field number of each picture from the repetition period of I or P pictures and the picture type information acquired or extracted by the information extracting means, such that, when the picture type is an I or P picture, the cumulative field number of the picture is obtained by adding, to the cumulative field number of the previous I or P picture, the number of fields corresponding to the repetition period and the number of field deletions from the B picture following the previous I or P picture up to the picture concerned, when the picture type is a B picture and the previous picture type is an I or P picture, the cumulative field number of the picture is obtained by adding, to the cumulative field number of the I or P picture two before, "2" and the number of B-picture field deletions from the B picture following that I or P picture up to the picture concerned, and when the picture type is a B picture and the previous picture type is a B picture, the cumulative field number of the picture is obtained by adding, to the cumulative field number of the previous B picture, "2" and the field deletion count of that picture; time stamp converting means for converting the cumulative field number into an image time stamp; image time stamp adding means for adding the image time stamp to the data compression-encoded by the compression encoding means and outputting image packet data; audio time stamp adding means for adding an audio time stamp to audio data and outputting audio packet data; multiplexing means for multiplexing and outputting the image packet data output from the image time stamp adding means, the audio packet data output from the audio time stamp adding means, and a reference time stamp; data separation means for separating the data multiplexed by the multiplexing means into image packet data and audio packet data; audio synchronization processing means for extracting the audio time stamp from the audio packet data and performing audio synchronization processing; audio decoding means for decoding and outputting the audio packet data; time-axis filter means for performing time-axis filtering on the difference between the image time stamp extracted from the image packet data and the reference time stamp; image synchronization processing means for performing image synchronization processing using the value output from the time-axis filter means; and image decoding means for decoding and outputting the image packet data.
[0035]
In the moving image compression encoding / decoding system according to claim 5 of the present invention, the time stamps of I and P pictures are calculated at the time of encoding. Therefore, complicated processing such as calculating the time stamp only after detecting the RFF flags of the B pictures that appear afterwards is not required. In addition, stable synchronization processing is performed at the time of decoding.
[0037]
The moving image compression encoding / decoding system according to claim 6 of the present invention is the system according to claim 5, characterized in that the time-axis filter means sums, over a predetermined number of pictures, the difference between the image time stamp extracted from the image packet data and the reference time stamp, and performs time-axis filtering by using the quotient of that sum divided by the predetermined number as the difference, the predetermined number being a common multiple of the picture repetition period of the field deletion information and the picture repetition period of I or P pictures.
[0038]
The moving image compression encoding / decoding system according to claim 7 of the present invention is the system according to claim 5, characterized in that the time-axis filter means sums, over a predetermined number of pictures, the difference between the image time stamp extracted from the image packet data and the reference time stamp, and performs time-axis filtering by using the quotient of that sum divided by the predetermined number as the difference, the predetermined number being the least common multiple of the picture repetition period of the field deletion information and the picture repetition period of I or P pictures.
[0039]
In the moving image compression encoding / decoding systems according to claims 5, 6 and 7 of the present invention, synchronization processing with a quick response is performed.
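The time-axis filtering described for claims 6 and 7 can be sketched as an average of time stamp differences over a window of pictures. This is only an illustrative sketch: the function names are hypothetical, the sliding-window form of the average is an assumption (the text specifies only a sum over a predetermined number of pictures divided by that number), and the period values in the usage note are example inputs rather than values fixed by the claims.

```python
from collections import deque
from math import gcd


def filter_window(rff_picture_period, ip_picture_period):
    """Least common multiple of the two repetition periods (claim 7)."""
    return rff_picture_period * ip_picture_period // gcd(
        rff_picture_period, ip_picture_period
    )


def filtered_differences(differences, window):
    """Average the (image time stamp - reference time stamp) differences
    over the most recent `window` pictures.

    Assumption: one filtered value is produced per picture once the
    window is full.
    """
    recent = deque(maxlen=window)
    filtered = []
    for difference in differences:
        recent.append(difference)
        if len(recent) == window:
            filtered.append(sum(recent) / window)
    return filtered
```

For instance, if the field deletion information repeated every 2 pictures and I or P pictures repeated every 3 pictures, `filter_window(2, 3)` would give a window of 6 pictures, over which the periodic fluctuations of the difference average out.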
[0040]
[Detailed Description of the Invention]
Hereinafter, a moving picture compression coding apparatus and a moving picture compression coding / decoding system according to the present invention will be described in detail with reference to the drawings showing the respective embodiments. In the figures referred to in the following description and in the figures referred to in the description of the prior art example, parts denoted by the same reference numerals are the same or equivalent parts.
[0041]
Embodiment 1.
FIG. 1 is a block diagram showing a configuration example of a moving image compression coding apparatus according to the present invention. In FIG. 1, reference numerals 1 and 2 indicate first and second field memories. The first field memory 1 delays the input digital video signal by one field. Reference numeral 4 indicates a changeover switch that switches whether or not the output from the first field memory 1 is input to the second field memory 2. The second field memory 2 delays the output of the changeover switch 4, in other words the output of the first field memory 1, by one field. Therefore, the output of the second field memory 2 is delayed by two fields relative to the input to the first field memory 1.
[0042]
Reference numeral 3 indicates a correlation detection circuit that compares the input to the first field memory 1 with the output of the second field memory 2 (the input to the first field memory 1 delayed by two fields). If the difference over one field is smaller than a predetermined value, more specifically if the two fields are substantially the same, the circuit sets the output of the changeover switch 4 to no signal; otherwise, it switches the changeover switch 4 so that the output of the first field memory 1 is input to the second field memory 2.
[0043]
Reference numeral 5 indicates a video encoder that compresses the video signal, 6 a field number counting circuit that counts the number of fields of the compressed data output from the video encoder 5, 7 a field number replacement circuit, 8 a PTS conversion circuit that converts the processing result of the field number replacement circuit 7 into a PTS, and 9 a PTS adding circuit that adds the PTS converted by the PTS conversion circuit 8 to the compressed data output from the video encoder 5 and outputs the result.
[0044]
Next, the operation of the moving picture compression encoding apparatus according to the present invention shown in FIG. 1 will be described. The first field memory 1, the second field memory 2, the correlation detection circuit 3, and the changeover switch 4 constitute an inverse 3:2 pull-down processing unit 50. The inverse 3:2 pull-down processing unit 50 is the same in configuration and operation as the conventional one. It compares each field of the input video data with the field two fields before, and if the difference between the two is smaller than a predetermined value, that is, if the correlation is large, it determines that they are the same field and switches the changeover switch 4 to the no-signal side, thereby performing the inverse 3:2 pull-down processing that removes redundant fields.
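The field comparison performed by the inverse 3:2 pull-down processing unit 50 can be sketched in software as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, each field is modeled as a flat sequence of pixel values, and the sum of absolute differences is assumed as the difference measure, since the text only requires that the difference be smaller than a predetermined value.

```python
def inverse_pulldown(fields, threshold):
    """Drop each field that nearly repeats the field two positions earlier.

    fields: list of equal-length sequences of pixel values, in input order.
    threshold: the predetermined value; a smaller difference means "same field".
    Returns (kept_fields, dropped_indices).
    """
    kept = []
    dropped = []
    for index, field in enumerate(fields):
        if index >= 2:
            two_before = fields[index - 2]
            # Assumed difference measure: sum of absolute pixel differences.
            difference = sum(abs(a - b) for a, b in zip(field, two_before))
            if difference < threshold:
                dropped.append(index)  # redundant field: output no signal
                continue
        kept.append(field)
    return kept, dropped
```

Applied to the ten-field sequence t0, b0, t0, b1, t1, b2, t2, b2, t3, b3, this sketch would drop the third and eighth fields, leaving the eight fields of the four original frames.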
[0045]
The output signal of the changeover switch 4 is input to the video encoder 5. When the changeover switch 4 has been switched to the no-signal side, the video encoder 5 performs MPEG-2 video elementary compression encoding of the input frame with the RFF (Repeat First Field) flag enabled; the RFF flag is field deletion information that causes the first field of the frame to be output once more at the time of decoding. The resulting video elementary stream is output to the PTS adding circuit 9, which packetizes it.
[0046]
Further, each time video data is input in units of frames, the video encoder 5 outputs the picture type (I, P, B) after the encoding of the input data and the presence / absence of the RFF flag to the field number counting circuit 6. The picture type is also output to the field number replacement circuit 7.
[0047]
The field number counting circuit 6 executes the processing sequence shown in the flowchart of FIG. First, the field number counting circuit 6 initializes the cumulative field number F (step S11). Then, every time video data is input to the video encoder 5 in units of frames (step S12), it obtains from the video encoder 5 the picture type (I, P, B) that the input data will have after encoding and the presence or absence of the RFF flag (step S13), and outputs the cumulative field number already counted for the current picture, together with its picture type, to the field number replacement circuit 7 (step S13).
[0048]
When the RFF flag is valid ("YES" in step S14), the field number counting circuit 6 counts the number of fields of the picture as "3" (step S15); when it is invalid ("NO" in step S14), it counts the number of fields as "2" (step S16). It then adds this count to the cumulative field number of the current picture and holds the result as the cumulative field number of the next picture to be acquired (step S17).
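The counting sequence of steps S11 to S17 can be sketched as follows. The function name and the tuple representation of a frame are hypothetical; the logic follows the description above, in which each picture receives the cumulative field number counted before it and then contributes 3 fields when its RFF flag is valid and 2 otherwise.

```python
def cumulative_field_numbers(frames):
    """Assign each input frame the cumulative field number counted so far.

    frames: iterable of (picture_type, rff_flag) pairs in the order in
    which frames are input to the encoder.
    Returns a list of (picture_type, cumulative_field_number) pairs.
    """
    cumulative = 0  # step S11: initialize the cumulative field number F
    result = []
    for picture_type, rff_flag in frames:
        # Step S13: output the count already accumulated for this picture.
        result.append((picture_type, cumulative))
        # Steps S14 to S17: 3 fields if RFF is valid, otherwise 2.
        cumulative += 3 if rff_flag else 2
    return result
```

With B0 carrying a valid RFF flag and B1 not, the I2 picture that follows receives a cumulative field number of 5, which agrees with the value given for the I2 picture in the description of FIG. 11.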
[0049]
The field number replacement circuit 7 is composed of registers that store picture types and cumulative field numbers for a number of pictures equal to the sum of the maximum period of I and P pictures and the maximum delay time of I and P pictures in the video encoder 5. For example, when the maximum period of I and P pictures is 3 and the maximum encoding delay time of I and P pictures is 0.4 frames, the sum "3.4" is rounded up, and the field number replacement circuit 7 is composed of registers for 4 pictures. The cumulative field number and picture type of each picture output from the field number counting circuit 6 are stored sequentially in these registers.
[0050]
The field number replacement circuit 7 executes the processing sequence shown in the flowchart of FIG. Whenever video data is input to the video encoder 5 in units of frames (step S21), the field number replacement circuit 7 acquires the picture type output from the video encoder 5 (step S22) and, at the timing at which that picture is output, outputs the cumulative field number of the oldest picture of the same picture type stored in the registers (step S23). The register whose cumulative field number has been output is then cleared (step S24) in preparation for storing new information.
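The selection rule of the field number replacement circuit 7 can be sketched with one first-in, first-out queue per picture type. The class and function names are hypothetical, and the unbounded queues are a simplifying assumption; the circuit described above uses a fixed number of registers, whose count is computed here by rounding up as in the "3.4" example.

```python
import math
from collections import deque


def register_count(max_ip_period, max_delay_frames):
    """Number of registers: the sum of the two maxima, rounded up."""
    return math.ceil(max_ip_period + max_delay_frames)


class FieldNumberReplacer:
    """Returns, for each encoded picture, the oldest stored cumulative
    field number of the same picture type."""

    def __init__(self):
        self._stored = {"I": deque(), "P": deque(), "B": deque()}

    def store(self, picture_type, cumulative_fields):
        # Called in input order, as values arrive from the counting circuit.
        self._stored[picture_type].append(cumulative_fields)

    def select(self, picture_type):
        # Steps S23 and S24: output the oldest matching entry and clear it.
        return self._stored[picture_type].popleft()
```

Because pictures of the same type keep their relative order through encoding, taking the oldest entry of the matching type restores the correct cumulative field number for each encoded picture without inspecting the RFF flags of later pictures.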
[0051]
When the cumulative field number output from the field number replacement circuit 7 is input, the PTS conversion circuit 8 calculates the PTS from the following equation (1) and outputs it to the PTS adding circuit 9. Here the PTS is a count of a 90 kHz clock, the video is an NTSC signal, and the frame frequency is 29.97 Hz.
[0052]
PTS = (cumulative field number) × (1 / 29.97) × (1 / 2) × 90000 ... (1)
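Equation (1) can be evaluated as in the following sketch. The function name is hypothetical, and rounding to the nearest integer is an assumption, since the text does not state how fractional counts are handled.

```python
PTS_CLOCK_HZ = 90000     # the PTS is a count of a 90 kHz clock
FRAME_RATE_HZ = 29.97    # NTSC frame frequency used in equation (1)


def pts_from_fields(cumulative_fields, frame_rate_hz=FRAME_RATE_HZ):
    """Convert a cumulative field number to a PTS count per equation (1).

    Assumption: the result is rounded to the nearest integer count.
    """
    seconds = cumulative_fields / frame_rate_hz / 2  # two fields per frame
    return round(seconds * PTS_CLOCK_HZ)
```

One field corresponds to about 1501.5 counts of the 90 kHz clock, so a cumulative field number of 5 gives a PTS of roughly 7508.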
[0053]
The PTS adding circuit 9 outputs a video PES (Packetized Elementary Stream) by adding the PTS output from the PTS conversion circuit 8 while packetizing the video elementary stream output from the video encoder 5.
[0054]
Embodiment 2.
In the first embodiment described above, the moving picture compression coding apparatus according to the present invention is provided, as shown in FIG. 1, with the field number counting circuit 6, the field number replacement circuit 7, and the PTS conversion circuit 8 as the configuration for calculating the PTS, but other configurations can of course be adopted. A moving image compression coding apparatus according to the present invention having such a configuration will be described below as a second embodiment.
[0055]
FIG. 4 is a block diagram showing a configuration example of a moving image compression encoding apparatus according to the present invention whose configuration for calculating the PTS differs from that of the first embodiment. In the first embodiment, the PTS is calculated by the field number counting circuit 6, the field number replacement circuit 7, and the PTS conversion circuit 8. In the second embodiment, the PTS is calculated by a 90 kHz frequency divider 10, a PTS count circuit 11, and a PTS replacement circuit 12.
[0056]
The 90 kHz frequency divider 10 divides the 27 MHz clock, which is the system reference clock, to generate a 90 kHz clock signal that serves as a PTS reference, and inputs it to the PTS count circuit 11.
[0057]
The PTS count circuit 11 starts counting the clock signal input from the 90 kHz frequency divider 10 using the encoding start information output from the video encoder 5 as a trigger. Then, the PTS count circuit 11 acquires the picture type after the input data is compression-coded for each frame at the time when the video data is input to the video encoder 5, and the count value at the time of acquisition is used as the PTS of the picture. To the PTS replacement circuit 12 together with the picture type.
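The combined behavior of the 90 kHz frequency divider 10 and the PTS count circuit 11 can be sketched as follows. The function name and the representation of frame input times as 27 MHz tick counts are hypothetical; the division ratio of 300 follows directly from dividing 27 MHz down to 90 kHz.

```python
SYSTEM_CLOCK_HZ = 27_000_000  # system reference clock
PTS_CLOCK_HZ = 90_000         # PTS reference clock
DIVISION_RATIO = SYSTEM_CLOCK_HZ // PTS_CLOCK_HZ  # 27 MHz / 90 kHz = 300


def latch_pts(frame_input_ticks, start_tick):
    """Return the 90 kHz count latched at each frame input.

    frame_input_ticks: 27 MHz tick values at which frames enter the encoder.
    start_tick: 27 MHz tick value at which encoding starts (the trigger).
    """
    return [(tick - start_tick) // DIVISION_RATIO for tick in frame_input_ticks]
```

A frame input exactly one second after the encoding start would be latched with a PTS of 90000 in this sketch.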
[0058]
The PTS replacement circuit 12 is composed of registers that store the picture type and PTS for a number of pictures equal to the sum of the maximum period of I and P pictures and the maximum delay time of I and P pictures in the video encoder 5. For example, when the maximum period of I and P pictures is 3 and the maximum encoding delay time of I and P pictures is 0.4 frames, the sum "3.4" is rounded up, and the PTS replacement circuit 12 is composed of registers for 4 pictures. The PTS and picture type of each picture output from the PTS count circuit 11 are stored sequentially in these registers.
[0059]
When the PTS replacement circuit 12 acquires the picture type output from the video encoder 5, it outputs, at the timing at which the corresponding picture is output, the PTS of the oldest picture of the same picture type stored in the registers. After outputting the PTS, the PTS replacement circuit 12 clears the corresponding register in preparation for storing new information.
[0060]
The PTS adding circuit 9 outputs a video PES (Packetized Elementary Stream) by adding the PTS output from the PTS replacement circuit 12 while packetizing the video elementary stream output from the video encoder 5.
[0061]
Embodiment 3.
By the way, in the moving picture compression encoding apparatus according to the present invention of Embodiments 1 and 2 described above, the PTS is calculated by obtaining the cumulative field number or PTS at the time of input to the video encoder 5. It is also possible to adopt another configuration, for example, a configuration in which the data output from the video encoder 5 is analyzed and the PTS is simply predicted. A moving image compression encoding / decoding system according to the present invention having such a configuration will be described in detail below as a third embodiment.
[0062]
FIG. 5 is a block diagram showing a configuration example of a moving image compression encoding / decoding system according to the present invention that simply predicts the PTS by analyzing the data output from the video encoder 5. The inverse 3:2 pull-down processing unit 50, the video encoder 5, and the PTS adding circuit 9 are provided in the same manner as in the first and second embodiments shown in FIGS. The PTS conversion circuit 8 is provided in the same manner as in the first embodiment shown in FIG. 1.
[0063]
In FIG. 5, reference numeral 13 is an information extraction circuit for extracting information from data encoded by the video encoder 5, and 14 is a field number simple prediction circuit for predicting the cumulative number of fields based on the information extracted by the information extraction circuit 13. The number of fields predicted by the simple number-of-fields prediction circuit 14 is given to the PTS conversion circuit 8 and converted to PTS, and the result is output to the PTS addition circuit 9.
[0064]
Reference numeral 15 denotes an audio encoder that compresses and encodes audio data; 16, a PTS adding circuit that packetizes (PES) the audio data compression-encoded by the audio encoder 15 and adds a PTS; 26, an SCR creation circuit that generates an SCR (System Clock Reference) serving as the reference value for time stamps; and 17, a multiplexing circuit that multiplexes the video PES data and audio PES data created by the PTS adding circuits 9 and 16 with the SCR created by the SCR creation circuit 26.
[0065]
The multiplexed compressed data output from the multiplexing circuit 17 is recorded by a recording device 18 on a recording medium such as a magnetic disk.
[0066]
On the other hand, reference numeral 19 denotes a separation circuit that separates the data read from the recording medium by the recording device 18 into video PES data, audio PES data, and the SCR; 20, an audio decoder that decodes the audio PES data separated by the separation circuit 19; 21, a video decoder that decodes the video PES data separated by the separation circuit 19; 22, a reference time generation circuit that generates a reference time from the SCR separated by the separation circuit 19; 23, an audio PTS comparison circuit that compares the audio PTS extracted by the audio decoder 20 with the reference time generated by the reference time generation circuit 22 and outputs the difference; 24, a video PTS comparison circuit that compares the video PTS extracted by the video decoder 21 with the reference time generated by the reference time generation circuit 22 and outputs the difference; and 25, a PTS difference filter that filters the video PTS difference value output by the video PTS comparison circuit 24 on the time axis to produce synchronization control information for video and audio.
[0067]
The information extraction circuit 13 acquires the period M of the I and P pictures from the video encoder 5, extracts the picture type and the RFF flag from the data encoded by the video encoder 5, and outputs them to the field number simple prediction circuit 14.
[0068]
The field number simple prediction circuit 14 predicts the cumulative field number of the current picture by the processing sequence shown in the flowchart of FIG. 6. In outline, the circuit first predicts, from the period of the I and P pictures and the picture type of the current picture output from the video encoder 5, a cumulative field number that accounts for the reordering of pictures that occurs before and after compression encoding. Then, when the RFF flag of the current picture is valid, it adds "1" to the cumulative field number already predicted and takes the result as the cumulative field number of the current picture. The details are as follows.
[0069]
For example, when the period M of the I and P pictures is 3, the cumulative field number predicted by the sequence shown in FIG. 6 is as shown in the schematic diagrams of FIGS. 7 and 8. FIGS. 7 and 8 are originally a single diagram in which the right end of FIG. 7 joins the left end of FIG. 8.
[0070]
In FIGS. 7 and 8, the NTSC signal input to the moving image compression encoding apparatus has been produced from film material by 3:2 pull-down conversion. When inverse 3:2 pull-down processing is performed on it in the inverse 3:2 pull-down processing unit 50, which consists of the first field memory 1, the second field memory 2, the correlation detection circuit 3, and the changeover switch 4, the RFF flag becomes valid every other frame, and the signal is input to the video encoder 5 (line 2). The video encoder 5 encodes the input images with the picture types B, B, I, B, B, P, ... in that order (line 1). In this case, the correct cumulative field number takes the values shown in line 3.
[0071]
In the encoding process in the video encoder 5, a B picture is predictively encoded using the preceding and following I and P pictures as reference images, so the order of the I, P, and B pictures is rearranged (line 4). The RFF flag and the correct cumulative field number are rearranged accordingly, as shown in lines 5 and 6.
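The rearrangement from input order to encoded order can be illustrated as follows. This is a sketch only: the function name is hypothetical, the choice of which frame carries the first valid RFF flag is an assumption, and flushing trailing B pictures at the end of the list is a simplification not taken from the text.

```python
def to_coding_order(frames):
    """Move each I or P picture ahead of the B pictures that precede it.

    `frames` is a list of (picture_type, rff) tuples in input order.
    """
    coded, pending_b = [], []
    for frame in frames:
        if frame[0] == "B":
            pending_b.append(frame)   # wait for the next reference picture
        else:
            coded.append(frame)       # the I or P picture is encoded first
            coded.extend(pending_b)   # then the B pictures that refer to it
            pending_b = []
    # Simplification: flush any trailing B pictures unchanged.
    return coded + pending_b


# Input order B, B, I, B, B, P with the RFF flag valid every other frame
# (starting with a valid flag is an assumption).
display = [("B", True), ("B", False), ("I", True),
           ("B", False), ("B", True), ("P", False)]
coded = to_coding_order(display)
print([t for t, _ in coded])  # ['I', 'B', 'B', 'P', 'B', 'B']
print([r for _, r in coded])  # [True, True, False, False, False, True]
```

The RFF flag travels with its picture, which is why the regular every-other-frame pattern at the input is no longer regular in encoded order.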
[0072]
The cumulative field number is predicted by the sequence shown in FIG. 6 as follows. The cumulative field number F of each picture (line 11) is predicted from the cumulative field number IP of the previous I or P picture (line 7), the cumulative field number IP1 of the I or P picture before that (line 8), the number of valid RFF flags BR of the B pictures sandwiched between I or P pictures (line 9), the number of valid RFF flags BR1 of the previous run of B pictures sandwiched between I or P pictures (line 10), and the period M of the I and P pictures.
[0073]
In the prediction sequence of FIG. 6, these variables are first initialized (step S31). Specifically, the predicted cumulative field number F is set to "0", the cumulative field number IP of the previous I or P picture to "−2", the cumulative field number IP1 of the I or P picture before that to "−8", the RFF valid count BR of the B pictures sandwiched between I and P pictures to "0", the RFF valid count BR1 of the previous run of consecutive B pictures to "0", and the period M of the I and P pictures to "3".
[0074]
Next, the picture type and the RFF flag are acquired from the data output by the video encoder 5 (step S32). If the acquired picture type is an I or P picture ("YES" in step S33), a predicted value of the cumulative field number F is obtained by equation (2) below: the number of fields between the current picture and the previous I or P picture (2 × M) and the RFF valid count between them (BR) are added to the cumulative field number IP of the previous I or P picture (step S34).
[0075]
F = IP + 2 x M + BR (2)
[0076]
Next, only when the RFF flag of the current picture is valid ("YES" in step S35), "1" is added to the cumulative field number F (step S36). IP1 is then replaced with IP (line 8), IP with F (line 7), BR1 with BR (line 10), and BR with "0" (line 9) (step S37), and F is output as the predicted cumulative field number of the current picture (step S38).
[0077]
Next, when the picture type acquired from the data output by the video encoder 5 is a B picture ("NO" in step S33) and the picture was acquired immediately after an I or P picture ("YES" in step S41), F is obtained by equation (3) below: "2", the number of fields before compression encoding between the current picture and the I or P picture that precedes it, and the RFF valid count BR1 of the B pictures between the previous I or P picture and the one before it are added to the cumulative field number IP1 of the I or P picture before the previous one (step S42).
[0078]
F = IP1 + 2 + BR1 (3)
[0079]
Then, only when the RFF flag of the current picture is valid ("YES" in step S43), "1" is added to F and "1" is added to BR (step S44). F is output as the cumulative field number of the current picture.
[0080]
Next, when the picture type acquired from the data output by the video encoder 5 is a B picture ("NO" in step S33) and the previous picture is also a B picture ("NO" in step S41), "2" is added to the cumulative field number F (step S45).
[0081]
In this case too, only when the RFF flag of the current picture is valid ("YES" in step S43), "1" is added to F and "1" is added to BR (step S44). F is output as the cumulative field number of the current picture.
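The sequence of steps S31 to S45 can be traced in software as follows. This is a minimal sketch based on the description above, not the circuit of FIG. 6 itself: the class and attribute names are hypothetical, the input RFF pattern is assumed, and the printed values are what this sketch produces, not values read from FIGS. 7 and 8.

```python
class FieldCountPredictor:
    """Hypothetical software model of the field number simple prediction circuit 14."""

    def __init__(self, m=3):
        # Step S31: initial values as given in the description.
        self.m = m          # period M of the I and P pictures
        self.f = 0          # predicted cumulative field number F
        self.ip = -2        # IP: previous I or P picture
        self.ip1 = -8       # IP1: I or P picture before that
        self.br = 0         # BR: valid RFF flags in the current run of B pictures
        self.br1 = 0        # BR1: valid RFF flags in the previous run
        self.prev_type = None

    def predict(self, picture_type, rff):
        """Steps S32 to S45 for one picture taken from the encoder output."""
        if picture_type in ("I", "P"):                  # S33: YES
            self.f = self.ip + 2 * self.m + self.br     # S34, equation (2)
            if rff:                                     # S35
                self.f += 1                             # S36
            self.ip1, self.ip = self.ip, self.f         # S37
            self.br1, self.br = self.br, 0
        else:                                           # S33: NO (B picture)
            if self.prev_type in ("I", "P"):            # S41: YES
                self.f = self.ip1 + 2 + self.br1        # S42, equation (3)
            else:                                       # S41: NO
                self.f += 2                             # S45
            if rff:                                     # S43
                self.f += 1                             # S44
                self.br += 1
        self.prev_type = picture_type
        return self.f                                   # S38


predictor = FieldCountPredictor(m=3)
# Encoded order I, B, B, P, B, B with an assumed RFF pattern.
stream = [("I", True), ("B", True), ("B", False),
          ("P", False), ("B", False), ("B", True)]
predicted = [predictor.predict(t, r) for t, r in stream]
print(predicted)  # [5, 1, 3, 12, 8, 11]
```

Note that when an I or P picture is predicted, BR still holds the count from the B pictures already seen, because the B pictures that actually precede it in input order have not yet arrived in the encoded stream. This is what makes the result a prediction rather than an exact count.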
[0082]
The cumulative field number predicted in this way by the field number simple prediction circuit 14 (line 11) differs from the correct cumulative field number (line 12) by at most ±1 field. This is because the validity of the RFF flag is reflected in the cumulative field number of each picture, so a difference of ±1 field recurs with a period of 6 frames, the least common multiple of the repetition period 3 of the I and P pictures and the repetition period 2 of the RFF flag. By applying a filter on the time axis with the PTS difference filter 25 at decoding, synchronization can be achieved with an accuracy of ±1 field or better, which poses no practical problem.
[0083]
In addition, when such a prediction method is adopted, image data that has been converted to an NTSC signal without being subjected to 3: 2 pull-down processing from film material, that is, image data that is an NTSC signal obtained directly from the original signal is input. In such a case, the inverse 3: 2 pull-down process in the inverse 3: 2 pull-down processing unit 50 is not performed, so the RFF flag remains invalid. For this reason, the difference between the cumulative field number predicted from the period of I and P pictures and the picture type and the correct cumulative field number is “0”, and no unnecessary offset occurs.
[0084]
Thus, the predicted cumulative field number of the current picture output from the field number simple prediction circuit 14 is converted from the field number to PTS by the PTS conversion circuit 8 according to the equation (1) and output to the PTS adding circuit 9.
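Equation (1) is not reproduced in this part of the description, so the following is only a sketch of a field-number-to-PTS conversion under two assumptions: the PTS is counted on a 90 kHz clock, and one NTSC field lasts 1001/60000 second. The function and constant names are hypothetical, and rounding down is a choice made here for illustration.

```python
PTS_CLOCK_HZ = 90_000                  # assumed time stamp clock
FIELD_RATE = (60_000, 1_001)           # assumed NTSC field rate as numerator, denominator


def fields_to_pts(cumulative_fields):
    """Convert a cumulative field number to 90 kHz ticks, rounded down."""
    num, den = FIELD_RATE
    # ticks = fields * clock / field_rate, kept in integer arithmetic
    return cumulative_fields * PTS_CLOCK_HZ * den // num


print(fields_to_pts(2))  # 3003
print(fields_to_pts(5))  # 7507
```

Under these assumptions one field corresponds to 1501.5 ticks, so an odd field number always lands on a half tick and some rounding rule is unavoidable.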
[0085]
The PTS adding circuit 9 packetizes the video elementary stream from the video encoder 5, adds the PTS output from the PTS conversion circuit 8, and outputs a video PES (Packetized Elementary Stream).
[0086]
On the other hand, the audio data is compressed and encoded by the audio encoder 15, and the PTS adding circuit 16 adds the PTS of the audio data to form a packet (PES).
[0087]
Also, an SCR (System Clock Reference) serving as the reference value for time stamps is generated by the SCR creation circuit 26 and output to the multiplexing circuit 17.
[0088]
The video PES data, the audio PES data, and the SCR are multiplexed by the multiplexing circuit 17 into one data, and are recorded on a recording medium such as a magneto-optical disk by the recording device 18.
[0089]
The data read from the recording medium by the recording device 18 is separated into video PES data, audio PES data, and the SCR by the separation circuit 19. The audio PES data is decoded by the audio decoder 20, the video PES data is decoded by the video decoder 21, and the SCR is input to the reference time generation circuit 22.
[0090]
When the video PES data is decoded by the video decoder 21, the picture order is rearranged because the I and P pictures serve as reference pictures for the B pictures. The cumulative field number difference (line 14) is rearranged accordingly.
[0091]
The reference time generation circuit 22 generates a reference time by referring to the SCR input from the separation circuit 19. Since the reference time corresponds to the correct cumulative field number (lines 3 and 6) shown in FIGS. 7 and 8, the cumulative field number difference in line 14 is equal to the difference between the predicted cumulative field number (line 11) and the reference time converted to a field number.
[0092]
The audio PTS comparison circuit 23 compares the audio PTS extracted by the audio decoder 20 with the reference time, and outputs the difference value. The audio decoder 20 performs synchronization adjustment processing so that the difference value is “0”.
[0093]
Similarly, the video PTS comparison circuit 24 compares the video PTS extracted by the video decoder 21 with the reference time and outputs the difference value. The PTS difference filter 25 holds the difference values output from the video PTS comparison circuit 24 for 6 pictures and calculates their average every 6 pictures. The calculation result is shown in line 15 of FIGS. 7 and 8. The initial value is "0".
[0094]
This calculation result is divided by the time corresponding to one field, and the video decoder 21 performs synchronization control for the number of fields given by the quotient. The remainder of the division is input to the audio PTS comparison circuit 23 and added to the difference value between the audio PTS and the reference time. The sum is input to the audio decoder 20, and synchronization processing is performed so that the difference value becomes "0". That is, for the fraction smaller than one field, audio is synchronized with video by adjusting the audio synchronization processing.
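The averaging and the split into a whole-field part and a sub-field remainder can be sketched as follows. The function name and the sample difference values are hypothetical, and the field duration of 3003/2 ticks assumes a 90 kHz time stamp clock and the NTSC field rate. How negative averages are handled is not stated in the description, so this sketch simply follows Python's floor division.

```python
from fractions import Fraction

FIELD_TICKS = Fraction(3003, 2)   # one field in 90 kHz ticks (assumed)


def split_correction(differences, window=6):
    """Average `window` video PTS differences and split the result.

    Returns (whole_fields, remainder): the quotient is the number of fields
    handled by the video decoder, and the remainder is handed to the audio side.
    """
    if len(differences) != window:
        raise ValueError("expected exactly one window of differences")
    average = Fraction(sum(differences), window)
    whole_fields, remainder = divmod(average, FIELD_TICKS)
    return int(whole_fields), remainder


# Six illustrative video PTS differences, in ticks.
fields, rest = split_correction([3003, 3003, 1502, 3003, 3003, 1501])
print(fields, rest)  # 1 1001
```

Exact fractions are used so that the half-tick field duration does not introduce floating-point error into the remainder.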
[0095]
By performing such synchronization processing, video and audio can be synchronized even when PTS is simply predicted during video encoding.
[0096]
The number of pictures averaged by the PTS difference filter 25 is 6, which is a common multiple of 2, the picture period of the RFF flag, and M = 3, the period of the I and P pictures. Setting the number of averaged pictures to such a common multiple keeps the average of the cumulative field number difference constant at 1/6 field (line 15), which also makes stable synchronization processing possible.
[0097]
Further, by setting the number of pictures averaged by the PTS difference filter 25 to the smallest of these common multiples, that is, the least common multiple "6", synchronization processing with a quick response can be performed.
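The effect of choosing a common multiple as the averaging window can be checked numerically. The sketch below uses a hypothetical difference pattern with a period of 6 pictures; the values are chosen only so that the average is 1/6 field and are not taken from FIGS. 7 and 8.

```python
from fractions import Fraction
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def window_averages(pattern, window, count):
    """Averages of `count` consecutive windows over a repeating pattern."""
    averages = []
    for k in range(count):
        chunk = [pattern[(k * window + i) % len(pattern)] for i in range(window)]
        averages.append(Fraction(sum(chunk), window))
    return averages


window = lcm(2, 3)                     # RFF period 2, I/P period M = 3
pattern = [0, 1, 0, -1, 1, 0]          # hypothetical per-picture differences, in fields

stable = window_averages(pattern, window, 3)
unstable = window_averages(pattern, 4, 3)
print(window)                          # 6
print([str(a) for a in stable])        # ['1/6', '1/6', '1/6']
print([str(a) for a in unstable])      # ['0', '1/2', '0']
```

With a window of 6 every window covers exactly one period of the pattern, so the average does not move. A window of 4 drifts across the pattern and the average fluctuates.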
[0098]
[Effects of the invention]
As described above in detail, according to the moving image compression encoding method and apparatus and the moving image compression encoding / decoding system of the present invention, there is no need to provide a large-capacity memory for temporarily storing the data output from the compression encoding means, so that an inexpensive and efficient moving image compression encoding apparatus and moving image compression encoding / decoding system can be obtained.
[0099]
According to the moving image compression encoding method of the present invention as set forth in claim 1, the time information counted at the time of compressing and encoding the image data and the picture type information after the compression encoding are held, and the compression encoding process is performed. Since the time information corresponding to the same picture type information as the picture type information of the frame of the image data is selected and the time information is determined, in order to calculate the time information of the frame of a certain image data, Complicated processing such as calculating time information after detecting the RFF flag, which is frame field deletion information, becomes unnecessary.
[0100]
Furthermore, according to the moving image compression coding apparatus of the present invention as set forth in claims 2 and 3, the cumulative field number counted at the time of input to the compression coding means and the picture type information after compression coding are held. In addition, the time information can be calculated by a simple conversion of selecting the number of accumulated fields corresponding to the same picture type information as the picture type information output from the compression encoding processing means. This eliminates the need for complicated processing such as calculating time information after detecting the RFF flag, which is field deletion information for subsequent frames, in order to calculate time information for a frame of certain image data. It is possible to obtain a moving image compression encoding apparatus having a simple structure.
[0101]
Furthermore, according to the moving image compression coding apparatus of the present invention as set forth in claims 4 and 5, the time stamp counted at the time of input to the compression coding means and the picture type information after compression coding are held. In addition, the time stamp is determined by selecting a time stamp corresponding to the same picture type information as the picture type information output from the compression encoding processing means. This eliminates the need for complicated processing such as calculating the time stamp after detecting the RFF flag, which is the field deletion information for subsequent frames, in order to calculate the time stamp of a frame of certain image data. It is possible to obtain a moving image compression encoding apparatus having a simple structure.
[0102]
Furthermore, according to the moving picture compression encoding / decoding system of the present invention as set forth in claim 6, the RFF flag of the B picture appearing thereafter is detected in order to calculate the time stamp of the I and P pictures at the time of encoding. Then, it is possible to obtain a moving image compression encoding / decoding system that does not need to perform complicated processing such as calculating a time stamp.
[0103]
In addition, according to the moving image compression encoding / decoding system of the present invention as set forth in claims 7, 8 and 9, it is possible to obtain a moving image compression encoding / decoding system that performs fast synchronization processing.
[0104]
Further, according to the moving image compression encoding / decoding system of the present invention as set forth in claims 8 and 9, a moving image compression encoding / decoding system capable of performing stable synchronization processing at the time of encoding can be obtained.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration example of a moving image compression coding apparatus according to the present invention in a first embodiment.
FIG. 2 is a flowchart showing a processing sequence of a field number counting circuit of the moving picture compression coding apparatus according to the first embodiment of the present invention.
FIG. 3 is a flowchart showing a processing sequence of a field number replacement circuit of the moving image compression coding apparatus according to the first embodiment of the present invention.
FIG. 4 is a block diagram illustrating a configuration example of a moving image compression coding apparatus according to the present invention in Embodiment 2.
FIG. 5 is a block diagram illustrating a configuration example of a moving image compression encoding / decoding system according to the present invention in Embodiment 3.
FIG. 6 is a flowchart showing a processing sequence of a simple prediction circuit for the number of fields of a moving image compression encoding / decoding system according to the present invention in Embodiment 3.
FIG. 7 is a schematic diagram showing the cumulative number of fields predicted by the simple number-of-fields prediction circuit of the moving image compression encoding / decoding system according to the present invention in the third embodiment.
FIG. 8 is a schematic diagram showing the cumulative number of fields predicted by the simple number-of-fields prediction circuit of the moving image compression encoding / decoding system according to the present invention in the third embodiment.
FIG. 9 is a schematic diagram showing a procedure of 3: 2 pull-down processing.
FIG. 10 is a block diagram illustrating a configuration example of a moving picture compression encoding apparatus according to a conventional MPEG-2 system.
FIG. 11 is a schematic diagram showing a state before and after frame sequence encoding when a moving image compression encoding process according to the conventional MPEG-2 method is performed.
[Explanation of symbols]
5 video encoder, 6 field count circuit, 7 field number replacement circuit, 8 PTS conversion circuit, 9 PTS adding circuit, 10 90 kHz frequency divider, 11 PTS count circuit, 12 PTS replacement circuit, 13 information extraction circuit, 14 field number simple prediction circuit, 15 audio encoder, 16 PTS adding circuit, 17 multiplexing circuit, 18 recording device, 19 separation circuit, 20 audio decoder, 21 video decoder, 22 reference time generation circuit, 23 audio PTS comparison circuit, 24 video PTS comparison circuit, 25 PTS difference filter, 26 SCR creation circuit, 50 inverse 3:2 pull-down processing unit.

Claims

By converting each frame of the original moving image into a moving image in which one frame is composed of two fields, and overlapping a specific field in each unit frame with a predetermined number of consecutive frames of the moving image as one unit. Detecting a duplicate field in an image signal obtained by adjusting a time axis in advance and deleting one redundant field;
Outputting field deletion information indicating that the field has been deleted;
Compressing and encoding the image signal after the redundant field is removed;
Counting a cumulative field number obtained by adding the number of field deletions to the number of times the image signal has been compression-encoded in frame units, and outputting picture type information identifying the type of image data that defines the arrangement order of the frame relative to other frames after the frame has been compression-encoded;
Storing the counted cumulative field number and picture type information;
Sequentially selecting, based on the output picture type information, the oldest cumulative field number among the stored cumulative field numbers that correspond to the picture type information of the compression-encoded picture;
Counting time information by converting the selected cumulative field number to the reference frequency clock number;
Deleting the selected cumulative field number;
Moving picture compression encoding method characterized by including the step of adding the counted time information to the frame of the compressed encoded image signal.

By converting each frame of the original moving image into a moving image in which one frame is composed of two fields, and overlapping a specific field in each unit frame with a predetermined number of consecutive frames of the moving image as one unit. Detecting a duplicate field in an image signal obtained by adjusting a time axis in advance and deleting one redundant field;
Outputting field deletion information indicating that the field has been deleted;
Compressing and encoding the image signal after the redundant field is removed;
Generating a clock as a reference for a time stamp representing time information;
By counting the generated clocks, a time stamp at the time of compressing and encoding the image data in units of frames is calculated, and at the same time, the image data defining the arrangement order with respect to other frames after the frames are compressed and encoded Outputting picture type information identifying the type; and
Storing the calculated time stamp and picture type information, and sequentially selecting, as the time stamp corresponding to the picture type information of the image data after compression encoding, the oldest of the stored time stamps having the same picture type information as the image data after compression encoding;
Adding a selected time stamp to the data of each frame after compression encoding;
A moving image compression encoding method comprising:

By converting each frame of the original moving image into a moving image in which one frame is composed of two fields, and overlapping a specific field in each unit frame with a predetermined number of consecutive frames of the moving image as one unit. A duplicate field in an image signal obtained by adjusting the time axis in advance is detected, one redundant field is deleted, field deletion information indicating that the field has been deleted, and an image after the redundant field is deleted A field thinning means for outputting a signal;
A compression encoding means for inputting an image signal and field deletion information output from the field thinning means, compressing and encoding the image signal, and outputting the number of compressed fields and field deletion information;
Outputs picture type information that specifies the type of image data that defines the arrangement order of each frame after the compression encoding of the image signal input to the compression encoding means with respect to other frames, and the field thinning means Field number calculating means for calculating the cumulative number of fields at the time of input to
The accumulated field number and the picture type information calculated by the field number calculating means are stored, and the accumulated field number corresponding to the picture type information output from the compression encoding means is stored. Field number selection means for sequentially selecting from the smallest cumulative field number in the same picture type information as the picture type information output by the compression encoding means;
Time stamp conversion means for converting the cumulative field number into a time stamp representing time;
A moving image compression coding apparatus comprising: time information addition means for adding a time stamp converted by the time stamp conversion means to data of each frame after compression encoding.

By converting each frame of the original moving image into a moving image in which one frame is composed of two fields, and overlapping a specific field in each unit frame with a predetermined number of consecutive frames of the moving image as one unit. A duplicate field in an image signal obtained by adjusting the time axis in advance is detected, one redundant field is deleted, field deletion information indicating that the field has been deleted, and an image after the redundant field is deleted A field thinning means for outputting a signal;
A compression encoding means for inputting an image signal and field deletion information output from the field thinning means, compressing and encoding the image signal, and outputting the number of compressed fields and field deletion information;
Clock generating means for generating a clock that is a reference of a time stamp representing time information;
By counting the clocks generated by the clock generation means, a time stamp at the time when the image data is input to the compression encoding means in a frame unit is calculated, and another time after the frame is compression encoded A time stamp calculating means for outputting picture type information for specifying a type of image data defining an arrangement order with respect to a frame;
The time stamp calculated by the time stamp calculating means and picture type information are stored, and the compression code among the time stamps storing time stamps corresponding to the picture type information output from the compression encoding means is stored. A time stamp selecting means for sequentially selecting from the oldest time stamp among the time stamps of the same picture type information as the picture type information output from the converting means;
And a time stamp adding means for adding the time stamp selected by the time stamp selecting means to the data of each frame after the compression coding.

By converting each frame of the original moving image into a moving image in which one frame is composed of two fields, and overlapping a specific field in each unit frame with a predetermined number of consecutive frames of the moving image as one unit. A duplicate field in an image signal obtained by adjusting the time axis in advance is detected, one redundant field is deleted, field deletion information indicating that the field has been deleted, and an image after the redundant field is deleted A field thinning means for outputting a signal;
An image signal output from the field decimation means and field deletion information are input, and an I-picture that is an intra-frame coded picture in which the image signal is independently compression-coded in units of frames, and the frame between the I picture Bidirectional predictive coding that is compression-encoded using a correlation characteristic between frames of a P picture that is a forward-predictive encoded picture that is compression-encoded using the correlation characteristics of the I and P pictures Compression encoding means for compressing and encoding a B picture that is a picture and outputting the number of compressed fields and field deletion information;
Information extracting means for acquiring the repetition period of the I picture or P picture after encoding of the image signal input to the compression encoding means, and for extracting field deletion information and picture type information specifying the type of image data that defines the arrangement order of the frame relative to other frames after the frame is compression-encoded;
Field number predicting means for predicting the cumulative field number of a picture from the repetition period of the I picture or P picture and the picture type information acquired and extracted by the information extracting means, such that:
when the picture type is an I picture or a P picture, the number of fields corresponding to the repetition period and the number of field deletions from the B picture following the previous I picture or P picture up to the picture are added to the cumulative field number of the previous I picture or P picture to give the cumulative field number of the picture;
when the picture type is a B picture and the previous picture type is an I picture or a P picture, "2" and the number of B picture field deletions from the B picture following the previous I picture or P picture up to the picture are added to the cumulative field number of the previous I picture or P picture to give the cumulative field number of the picture; and
when the picture type is a B picture and the previous picture type is a B picture, "2" and the number of field deletions of the picture are added to the cumulative field number of the previous B picture to give the cumulative field number of the picture;
Time stamp converting means for converting the cumulative field number into a time stamp of an image;
Image time stamp adding means for adding an image time stamp to the data after being compressed and encoded by the compression encoding means and outputting image packet data;
Voice time stamp adding means for adding voice time stamp to voice data and outputting voice packet data;
Multiplexing means for multiplexing and outputting the image packet data output from the image time stamp adding means, the audio packet data output from the audio time stamp adding means, and a reference time stamp;
Data separating means for separating the data multiplexed by the multiplexing means into image packet data and audio packet data;
Voice synchronization processing means for extracting voice time stamps from voice packet data and performing voice synchronization processing;
Voice decoding means for decoding and outputting voice packet data;
Time axis filter means for performing time axis filtering on the difference between the time stamp of the image extracted from the image packet data and the reference time stamp;
Image synchronization alignment processing means for performing image synchronization alignment processing with values output from the time axis filter means;
A video compression encoding / decoding system comprising: image decoding means for decoding and outputting image packet data.

The moving image compression encoding / decoding system according to claim 5, wherein the time axis filter means adds the difference between the time stamp of the image extracted from the image packet data and the reference time stamp over a predetermined number of pictures and performs filtering using, as the difference, the quotient obtained by dividing the sum by the predetermined number, and wherein the predetermined number is a common multiple of the repetition picture period of the field deletion information and the repetition picture period of the I picture or P picture.

The moving image compression encoding / decoding system according to claim 5, wherein the time axis filter means adds the difference between the time stamp of the image extracted from the image packet data and the reference time stamp over a predetermined number of pictures and performs filtering using, as the difference, the quotient obtained by dividing the sum by the predetermined number, and wherein the predetermined number is the least common multiple of the repetition picture period of the field deletion information and the repetition picture period of the I picture or P picture.