JP2001061149A

JP2001061149A - Image coder of switching frame structure/field structure

Info

Publication number: JP2001061149A
Application number: JP23453899A
Authority: JP
Inventors: Akio Yoneyama; 暁夫米山; Yasuyuki Nakajima; 康之中島; Hiromasa Yanagihara; 広昌柳原; Masaru Sugano; 勝菅野
Original assignee: KDD Corp
Current assignee: KDDI Corp
Priority date: 1999-08-20
Filing date: 1999-08-20
Publication date: 2001-03-06
Anticipated expiration: 2019-08-20
Also published as: JP3804745B2

Abstract

PROBLEM TO BE SOLVED: To provide a moving image coder that adaptively selects a frame structure or a field structure in response to the features of an input image. SOLUTION: The discrimination result of an interlaced/non-interlaced image discriminating section 2 is output to a reduced feature plane creating section 4. The creating section 4 generates a reduced feature plane reflecting the features of an image that has been discriminated to be an interlaced image. A simple motion search section 6 performs simple motion search processing between two reduced feature planes separated by a fixed time interval and outputs the resulting motion-compensated prediction error as image change amount information. On the basis of the image change amount information, a frame structure/field structure determining section 8 decides on coding with the frame structure when the image change amount is small and with the field structure when it is large, and outputs a picture structure control signal. A moving image coding section 10 codes the received image signal in accordance with the picture structure control signal and outputs a compression-coded moving image.

Description

Technical Field of the Invention

[0001] The present invention relates to a frame structure/field structure switching image coding apparatus, and more particularly to an image coding apparatus using motion-compensated prediction that can encode a digital moving image signal in either a frame structure or a field structure.

Prior Art

[0002] Digital moving image information is represented as a sequence of sampled still images. There are two ways of representing these images: one is called a non-interlaced (or progressive) image, and the other an interlaced image. The structures of non-interlaced and interlaced images are described next.
[0003] FIG. 2 shows, as non-interlaced images, a video in which a circular object moves from left to right. The moving image is sampled every time t1 and is represented by consecutive still images at intervals of t1. The resolution of the moving image equals the resolution of each sampled still image.
[0004] FIG. 3, by contrast, shows the same video as FIG. 2 represented as interlaced images. The video is represented by alternating, every time t2, an image containing only the odd-numbered scanning lines and an image containing only the even-numbered scanning lines. There are two ways of representing such interlaced images.
[0005] One is the field structure, in which only the odd-numbered scanning lines, or only the even-numbered ones, are represented as a single image. The vertical resolution of a field image is 1/2 that of a frame image. The other is the frame structure, in which a single image is created from two consecutive fields. In this case the image consists of lines from the two different fields arranged alternately, one horizontal pixel line at a time.
[0006] FIG. 12 shows a block diagram of a conventional moving image coding apparatus. As illustrated, the conventional apparatus is configured to encode with a picture structure (field structure or frame structure) designated in advance.
[0007] FIG. 13 is a block diagram showing the configuration and operation of the moving image coding apparatus of FIG. 12. In FIG. 13, when input image signal 1 of the first picture is input, each switch has been connected to its corresponding contact by the prediction mode control section 12, and the input signal is fed directly to the orthogonal transformer 3 in order to obtain high coding efficiency. The orthogonal transformer 3 applies an orthogonal transform such as the DCT (discrete cosine transform), and the quantizer 4 quantizes the orthogonal transform coefficients. The quantized coefficients are converted into variable-length codes such as Huffman codes by the first variable-length encoder 5 and input to the video multiplexer 15.
[0008] Meanwhile, the quantized coefficients input to the inverse quantizer 6 are inversely quantized, and the image data are then restored by the inverse orthogonal transformer 7. The restored image data are stored in the frame memory 9. The video multiplexer 15 multiplexes the coded data from the first variable-length encoder 5 and the quantization information 18 from the quantizer 4 and outputs the result as coded video data output 16.
[0009] When input image signal 1 of the next picture is input, the coding mode control section 12 connects each switch to its corresponding contact, and input image signal 1 is input to the prediction signal subtractor 2 and the motion detector 10. The motion detector 10 detects a motion vector from input image signal 1 and the reference image supplied from the frame memory 9, and the motion vector is input to the position shifter 11 and the second variable-length encoder 14. The second variable-length encoder 14 converts the motion vector information into variable-length codes such as Huffman codes and inputs them to the video multiplexer 15.
[0010] The position shifter 11 extracts from the frame memory 9 the image signal specified by the motion vector and outputs it as the motion-compensated prediction signal to the prediction signal subtractor 2 and the local decoding adder 8. The prediction signal subtractor 2 subtracts the motion-compensated prediction signal from input image signal 1, and the resulting prediction error is encoded. To obtain high coding efficiency, the prediction error signal is orthogonally transformed in the orthogonal transformer 3 using the DCT or the like, and the signal quantized by the quantizer 4 is converted into variable-length codes such as Huffman codes by the first variable-length encoder 5. In addition, so that the same prediction signal is used as on the decoding side, the quantized coefficients obtained by the quantizer 4 are inversely quantized by the inverse quantizer 6, and the prediction error signal is locally decoded by the inverse orthogonal transformer 7. The motion-compensated prediction signal is then added to the restored prediction error signal in the local decoding adder 8, and the sum is stored in the frame memory 9.
[0011] In compression coding of moving images using motion-compensated predictive coding, coding efficiency generally rises as the correlation between the images used for motion-compensated prediction rises. For video that is close to a still image, coding efficiency therefore improves when the interval between the image being coded and its reference image is made large, whereas for rapidly changing video, coding efficiency falls as the interval to the reference image grows. This is because the correlation between the coded image and the reference image becomes low and motion-compensated prediction ceases to be effective.
[0012] The difference in coding between the frame and field picture structures lies in the minimum distance between the image being coded and the reference image used for prediction. With the frame structure, the minimum interval to the reference image is time t1 in FIG. 3, whereas with the field structure it can be time t2 in FIG. 3 (half of t1). For video with rapid motion, adopting the field structure therefore lets motion-compensated predictive coding work effectively, and coding efficiency can be improved as a result.
[0013] A moving image compression scheme that supports both structures can use either "field structure" coding, in which each coded picture corresponds to one field image, or "frame structure" coding, in which each coded picture corresponds to one interlaced frame image. Conventionally, however, the structure to be used is designated externally before coding begins, and the designated structure is applied in a fixed manner to the input moving image to produce the coded moving image information.

Problems to Be Solved by the Invention

[0014] In image coding by the conventional method described above, as shown in FIG. 12, coding is performed with a fixed picture structure regardless of the characteristics of the image. Consequently, even when coding fast-moving material for which the field structure would improve coding efficiency, coding continues in the frame structure if the frame structure has been designated in advance, and coding efficiency falls as a result. Conversely, when field structure coding has been designated, coding efficiency does not improve even in cases where the frame structure would be more efficient, because the field structure is used in a fixed manner.
[0015] Furthermore, when it is not known whether the input video is interlaced or non-interlaced, a two-stage approach is required: whether the input image is interlaced is first determined in advance by some separate method, and the picture structure is then switched externally at coding time on the basis of that determination. Such a two-stage approach is not feasible when real-time coding is assumed.
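As a small illustration of the two picture structures described in the background above, the following sketch splits a frame into its two fields and weaves them back together. The function names and the list-of-scan-lines representation are assumptions made for this example and do not come from the patent.

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into its two fields.

    Scan lines are numbered from 1 as in the text, so the odd field holds
    list indices 0, 2, 4, ... and the even field holds indices 1, 3, 5, ...
    """
    odd_field = frame[0::2]
    even_field = frame[1::2]
    return odd_field, even_field


def weave_fields(odd_field, even_field):
    """Rebuild a frame by alternating lines from two fields of equal height."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.append(odd_line)
        frame.append(even_line)
    return frame


# A 4-line frame whose odd and even lines were sampled at different instants.
frame = [[10, 10], [50, 50], [11, 11], [51, 51]]
odd, even = split_fields(frame)
rebuilt = weave_fields(odd, even)
```

Each field has half the vertical resolution of the frame, which is the trade-off the text describes: field structure coding halves the minimum temporal distance to a reference picture at the cost of vertical resolution per picture.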
[0016] An object of the present invention is to solve the problems of the prior art described above and to provide an image coding apparatus that, for an input image about whose characteristics and structure no information is available, automatically discriminates whether the input image is interlaced or non-interlaced, analyzes the characteristics of the input image, and adaptively switches the picture structure used in moving image compression coding between the frame structure and the field structure, thereby improving coding efficiency and coded image quality.

Means for Solving the Problems

[0017] To achieve this object, a first feature of the present invention is that it comprises means for discriminating whether continuously input images are interlaced or non-interlaced, and selects field structure coding when the images are discriminated to be interlaced and frame structure coding otherwise. A second feature is that the correlation of pixel information in the sampled still image is used when detecting an interlaced image. A third feature is that the amount of change of the image is calculated from the input interlaced images and the frame structure/field structure is switched on the basis of the calculated value. A fourth feature is that the amount of change of the input image is calculated using a simple motion search based on the deviation of pixels in units of small blocks.
[0018] According to these features, the drop in coding efficiency caused by changes in the characteristics of the input image, which was unavoidable with the conventional fixed selection of frame or field structure, can be eliminated.
Moreover, because the interlaced/non-interlaced distinction, which previously had to be known before coding, is detected automatically at coding time, efficient coding is possible regardless of the characteristics and structure of the input image.

Embodiments of the Invention

[0019] The present invention is described in detail below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of one embodiment of the present invention. In the following description, the coding apparatus shown in FIG. 13 is assumed as the moving image coding scheme, but the present invention is not limited to it.
[0020] In this embodiment, whether the image is interlaced is discriminated from the continuously input image signal (still image signals); if it is interlaced, a reduced feature plane is created, and coding in the frame structure or the field structure is decided on the basis of the result of a simple motion search using the reduced feature planes.
[0021] In FIG. 1, the interlaced/non-interlaced image discriminating section 2 discriminates whether the continuously input image signal 1 is an interlaced or a non-interlaced image and outputs the result to the reduced feature plane creating section 4 as interlaced/non-interlaced discrimination information 3. For an image discriminated as interlaced by the discriminating section 2, the reduced feature plane creating section 4 creates reduced feature plane information 5 reflecting the image's features and outputs it to the simple motion search section 6. The simple motion search section 6 performs a simple motion search between two reduced feature planes and outputs the resulting motion-compensated prediction error to the frame structure/field structure determining section 8 as image change amount information 7.
[0022] From the image change amount information 7 obtained from the simple motion search section 6, the frame structure/field structure determining section 8 decides on frame structure coding when the amount of change is small and on field structure coding when it is large, and outputs the decision to the moving image coding section 10 as picture structure control signal 9. The moving image coding section 10 encodes the input image signal 1 in accordance with the picture structure control signal 9 from the determining section 8 and outputs coded moving image information 11. In response to the frame structure/field structure designation carried by the picture structure control signal 9, the moving image coding section 10 switches, for example, the operation of the motion compensator 10, the first variable-length encoder 5 and the second variable-length encoder 14 of FIG. 13 to a mode suited to frame structure or field structure coding.
[0023] Next, an example of the configuration and operation of each part in FIG. 1 is described in detail. The interlaced/non-interlaced image discriminating section 2 is described first. Whether an image is interlaced is determined by a calculation using several adjacent pixels taken from the input image information.
FIG. 4 shows the structure of the input frame image information. The image information is formed by an array of pixels arranged uniformly in space. From this image information, as shown in FIG. 5, the values of five vertically consecutive pixels at an arbitrary position are taken, and absolute differences between pairs of these pixels are calculated.
[0024] With the five pixels denoted p(-2) to p(2), where p(0) is the pixel at the vertical center, the absolute differences calculated are those between pixels belonging to the same field, namely the pairs (0, -2), (0, 2) and (-1, 1), and those between pixels belonging to different fields, namely the pairs (0, -1) and (0, 1). First, it is checked whether the condition of equation (1) is satisfied.
[0025]
\[ \max\bigl(d(0,-2),\ d(0,2),\ d(-1,1)\bigr) < \text{threshold} \tag{1} \]
If the condition of equation (1) is satisfied, it is then checked whether the condition of equation (2) is also satisfied.
[0026]
\[ \max\bigl(d(0,-2),\ d(0,2),\ d(-1,1)\bigr) + \text{offset} < \min\bigl(d(0,-1),\ d(0,1)\bigr) \tag{2} \]
Here d(a, b) denotes the absolute difference between the pixel values p(a) and p(b), max(a, b, c) denotes the largest of a, b and c, and min(a, b, c) the smallest. In other words, where the pixel values within the same field are similar, with their maximum absolute difference below the threshold (a constant), it is checked whether the minimum absolute difference between different fields exceeds the maximum absolute difference within the same field plus offset (a constant). This processing is performed at all pixels in the image or at an arbitrary number of positions. If the points satisfying both equations (1) and (2) exceed a fixed proportion of the points satisfying equation (1), the image is discriminated as interlaced, and the result is output frame by frame to the reduced feature plane creating section 4 as interlaced/non-interlaced discrimination information.
[0027] Although this example uses five vertical pixels, the check can be made with any number of pixels of three or more, which is the minimum that allows comparison both between adjacent pixels of the same field and between pixels of different fields. The check may be made at every pixel position in the image. Alternatively, a sampled survey may be carried out in which, for example, a block is defined as 5 pixels vertically by n pixels horizontally and the check is made at a specific or arbitrary position within each block. It is also possible to select entirely arbitrary points at random.
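The interlaced/non-interlaced test of equations (1) and (2) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name, the `step` sampling parameter, the list-of-rows image format and any concrete values chosen for `threshold`, `offset` and `ratio` are assumptions, since the text only states that these are constants.

```python
def is_interlaced(frame, threshold, offset, ratio, step=1):
    """Classify a frame (list of rows of pixel values) as interlaced or not.

    A point passes equation (1) when the same-field differences are all
    below `threshold`; it also passes equation (2) when the smallest
    cross-field difference exceeds the largest same-field difference
    plus `offset`.
    """
    height, width = len(frame), len(frame[0])
    passed_eq1 = 0        # points satisfying equation (1)
    passed_eq1_and_2 = 0  # points satisfying equations (1) and (2)

    for y in range(2, height - 2, step):
        for x in range(0, width, step):
            # p[k + 2] holds p(k) for k = -2 .. 2 (five vertical pixels)
            p = [frame[y + k][x] for k in (-2, -1, 0, 1, 2)]

            def d(a, b):
                return abs(p[a + 2] - p[b + 2])

            same_field = max(d(0, -2), d(0, 2), d(-1, 1))
            if same_field >= threshold:
                continue  # equation (1) not satisfied
            passed_eq1 += 1

            cross_field = min(d(0, -1), d(0, 1))
            if same_field + offset < cross_field:
                passed_eq1_and_2 += 1

    if passed_eq1 == 0:
        return False  # assumption: no usable points means "not interlaced"
    return passed_eq1_and_2 > ratio * passed_eq1
```

Restricting the count to points that already satisfy equation (1) means that strongly textured regions, where same-field pixels differ a lot, do not vote either way; only regions that are smooth within a field can reveal the line-to-line combing typical of interlaced motion.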
[0028] As shown in FIG. 6, the reduced feature plane creating section 4 in FIG. 1 creates, from the original image, a reduced plane that reflects the features of the image. First, the original image is divided into small blocks, and each small block is represented by a representative value. In the present invention, the standard deviation of the pixel values in each small block is used as the representative value. The mean or the median may also be used as the representative value. The pixel values used in this calculation may be the luminance component of the pixels, another component, or the average of these. The size of the small block may also be set arbitrarily. If the small block is ph pixels horizontally by pv pixels vertically, then for an original image of H pixels horizontally by V pixels vertically the reduced feature plane measures H/ph horizontally by V/pv vertically, and the number of samples is 1/(ph × pv) of the number of pixels in the original image. The reduced plane whose representative values are the standard deviations of the small blocks is taken as reduced feature plane information 5.
[0029] Next, the processing of the simple motion search section 6 in FIG. 1 is described. The simple motion search section 6 performs a motion search between two reduced feature planes, using the reduced feature plane information 5 created by the reduced feature plane creating section 4. The temporal distance between the reference plane and the target plane of the simple motion search is a fixed, arbitrarily chosen value. Block-matching motion search or a similar method can be used as the motion search method.
In this case, the horizontal and vertical block sizes on the reduced feature plane may each be any natural number. The block-based motion search can therefore be performed with a block as small as a single sample or as large as one entire reduced feature plane.
[0030] Referring to FIG. 7, let the upper-left coordinates of a block be (k, l), let c(k, l) denote an element of reduced feature plane 1 and r(k, l) an element of reduced feature plane 2, let the block size be N horizontally and M vertically, and let the search range be ±sh horizontally and ±sv vertically. In this simple motion search, the average motion-compensated prediction error per element, E(k, l), is obtained as the minimum error within the search range, as given by equations (3) and (4) in FIG. 14.
[0031] The prediction error E(k, l) may also be obtained by taking the square root after computing the squared error, or absolute differences may be used instead. The prediction error E(k, l) is computed by this simple motion search for every block on reduced feature plane 1, and the total Esum over reduced feature plane 1 is obtained. This motion-compensated prediction error Esum serves as an index of the magnitude of the change between the two images. The value Esum is output to the frame structure/field structure determining section as image change amount information. The frame structure/field structure determining section 8 in FIG. 1 then applies a threshold to the image change amount information input for each frame, selecting the field structure when the value is greater than or equal to the threshold and the frame structure when it is below the threshold, and outputs the decision to the moving image coding section 10 as picture structure control signal 9.
[0032] The moving image coding section 10 in FIG. 1 compresses and encodes the input image signal using the picture structure designated by the picture structure control signal 9 output from the frame structure/field structure determining section 8, and outputs coded moving image information 11. Specifically, the moving image coding section 10 switches, for example, the operation of the motion compensator 10, the first variable-length encoder 5 and the second variable-length encoder 14 of FIG. 13 to a mode suited to frame structure or field structure coding in accordance with that picture structure.
[0033] FIG. 8 is a block diagram showing the configuration of a second embodiment of the present invention. Reference numerals identical to those in FIG. 1 denote identical or equivalent elements.
This embodiment includes a reduced feature plane creating unit 4, a simple motion search unit 6, and a frame structure/field structure determination unit 8. It is characterized in that the frame structure/field structure determination unit 8 selects encoding with the field structure when the image change amount information 7 obtained by the simple motion search processing in the simple motion search unit 6 is large, and selects encoding with the frame structure when it is small.
FIG. 9 is a block diagram showing the configuration of the third embodiment of the present invention. Reference numerals identical to those in FIG. 1 denote the same or equivalent parts. This embodiment includes an interlaced/non-interlaced image discriminating unit 2 and a frame structure/field structure determination unit 8. The interlaced/non-interlaced image discriminating unit 2 determines whether or not the input image is an interlaced image. The embodiment is characterized in that the frame structure/field structure determination unit 8 selects encoding with the field structure when the input image is determined to be an interlaced image, and selects encoding with the frame structure when it is determined to be a non-interlaced image.
FIG. 10 is a block diagram showing the configuration of the fourth embodiment of the present invention. Reference numerals identical to those in FIG. 1 denote the same or equivalent parts. This embodiment is composed of an interlaced/non-interlaced image discriminating unit 2, an interlaced/non-interlaced switching unit 21, a reduced feature plane creating unit 4, a simple motion search unit 6, and a frame structure/field structure determination unit 8. It is characterized in that the interlaced/non-interlaced image discriminating unit 2 discriminates whether the input image is interlaced or non-interlaced based on the first one or more input images, the switch of the interlaced/non-interlaced switching unit 21 is set to "0" or "1" based on this determination, and the interlaced/non-interlaced image discriminating unit 2 does not perform the discrimination for subsequent input images.
FIG. 11 is a block diagram showing the configuration of the fifth embodiment of the present invention. Reference numerals identical to those in FIG. 1 denote the same or equivalent parts. This embodiment is composed of an interlaced/non-interlaced image discriminating unit 2, an interlaced/non-interlaced switching unit 21, and a frame structure/field structure determination unit 8. It is characterized in that the interlaced/non-interlaced image discriminating unit 2 discriminates whether the input image is interlaced or non-interlaced based on the first one or more input images, the interlaced/non-interlaced switching unit 21 sets the switch to "0" or "1" based on this determination, and the interlaced/non-interlaced image discriminating unit 2 does not perform the discrimination for subsequent input images.
As is clear from the above description, conventional coding with a fixed picture structure achieves high coding efficiency only for a limited range of video. In the present invention, the picture structure used for encoding is selected according to the characteristics and changes of the image, so that high coding efficiency can be maintained whatever kind of image is input and even if the image characteristics change partway through. A moving picture coding simulation experiment was performed using the MPEG-2 video coding method, a motion-compensated predictive coding method that can encode in both the frame structure and the field structure. As a result, when compression encoding was performed at a coding rate of 4 Mbit/s, the present invention improved image quality by about 0.4 dB to 1.0 dB in PSNR compared with the case where the picture structure is fixed to the frame structure.
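The processing of the first embodiment described above, from the simple motion search through the threshold decision, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the block size `size`, the search range `search`, the handling of plane borders, and any threshold value are assumptions introduced here. The squared error is used for E(k, l), although the description also permits a square root or absolute differences.

```python
from typing import Sequence

Plane = Sequence[Sequence[float]]  # reduced feature plane as rows of values


def block_error(cur: Plane, ref: Plane, top: int, left: int,
                size: int, dy: int, dx: int) -> float:
    """Squared error between a block of `cur` and the block of `ref` displaced by (dy, dx)."""
    total = 0.0
    for y in range(top, top + size):
        for x in range(left, left + size):
            diff = cur[y][x] - ref[y + dy][x + dx]
            total += diff * diff
    return total


def image_change_amount(cur: Plane, ref: Plane, size: int = 2, search: int = 1) -> float:
    """Esum: sum over all blocks of the minimum error E(k, l) within the search range."""
    height, width = len(cur), len(cur[0])
    esum = 0.0
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Assumption: skip displacements that fall outside the reference plane.
                    inside = (0 <= top + dy and top + dy + size <= height
                              and 0 <= left + dx and left + dx + size <= width)
                    if not inside:
                        continue
                    err = block_error(cur, ref, top, left, size, dy, dx)
                    if best is None or err < best:
                        best = err
            esum += best  # the zero displacement is always valid, so best is set
    return esum


def choose_picture_structure(esum: float, threshold: float) -> str:
    """Field structure when the change amount is at or above the threshold, else frame."""
    return "field" if esum >= threshold else "frame"
```

With two identical reduced feature planes, `image_change_amount` returns 0, so `choose_picture_structure` selects the frame structure for any positive threshold.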

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram showing the configuration of the first embodiment of the present invention.
FIG. 2 is a diagram showing the configuration of an image in a non-interlaced structure.
FIG. 3 is a diagram showing the configuration of an image in an interlaced structure.
FIG. 4 is a diagram showing the configuration of a frame image.
FIG. 5 is a diagram illustrating the pixels used to calculate the absolute difference amount.
FIG. 6 is a diagram illustrating the creation of a reduced feature plane.
FIG. 7 is a diagram illustrating the simple motion search processing.
FIG. 8 is a block diagram showing the configuration of the second embodiment of the present invention.
FIG. 9 is a block diagram showing the configuration of the third embodiment of the present invention.
FIG. 10 is a block diagram showing the configuration of the fourth embodiment of the present invention.
FIG. 11 is a block diagram showing the configuration of the fifth embodiment of the present invention.
FIG. 12 is a block diagram showing a conventional moving picture coding method.
FIG. 13 is a block diagram of a video encoding device using motion compensation.
FIG. 14 is a diagram showing mathematical expressions.
EXPLANATION OF REFERENCE NUMERALS
1: image signal, 2: interlaced/non-interlaced image discriminating unit, 4: reduced feature plane creating unit, 6: simple motion search unit, 8: frame structure/field structure determination unit, 10: video encoding unit, 21: interlaced/non-interlaced switching unit.

Continuation of front page
(72) Inventor: Hiromasa Yanagihara, c/o KDD R&D Laboratories Inc., 2-1-15 Ohara, Kamifukuoka-shi, Saitama
(72) Inventor: Masaru Sugano, c/o KDD R&D Laboratories Inc., 2-1-15 Ohara, Kamifukuoka-shi, Saitama
F-term (reference): 5C059 MA00 MA05 MA23 MC11 ME01 NN01 NN28 PP04 RB02 TA24 TA46 TB05 TB08 TC01 TC12 TC15 TD05 TD11 UA02 UA33

Claims

1. A frame structure/field structure switching type image coding apparatus capable of coding an interlaced image in either a field structure or a frame structure, comprising means for determining whether an input image is an interlaced image or a non-interlaced image, wherein coding in the field structure is selected when the image is determined to be an interlaced image, and coding in the frame structure is selected when the image is determined to be a non-interlaced image.
2. The frame structure/field structure switching type image coding apparatus according to claim 1, wherein the means for determining whether the input image is an interlaced image or a non-interlaced image measures the spatial correlation of pixels consecutive in the vertical direction at a given position, and determines that the image is an interlaced image when the correlation within the same field is higher than the correlation between different fields.
3. The frame structure/field structure switching type image coding apparatus according to claim 2, wherein the spatial correlation of pixels consecutive in the vertical direction is measured by the following method: when the number of pixels satisfying both expressions (1) and (2) is equal to or more than a certain percentage of the number of pixels satisfying expression (1), coding in the field structure is selected.
Max(d(0,-2), d(0,2), d(-1,1)) < threshold ... (1)
(Max(d(0,-2), d(0,2), d(-1,1)) + offset) < Min(d(0,-1), d(0,1)) ... (2)
where d(a, b) represents the absolute difference value of a and b.
4. A frame structure/field structure switching type image coding apparatus capable of coding an interlaced image in either a field structure or a frame structure, comprising means for calculating the correlation between two images separated by a certain time interval among continuously input images, and means for determining, based on the correlation, whether to perform coding in the field structure or the frame structure, wherein coding is performed in the frame structure when the correlation is high and in the field structure when the correlation is low.
5. The frame structure/field structure switching type image coding apparatus according to claim 4, wherein the means for calculating the correlation between the two images comprises means for creating a reduced plane reflecting the features of the continuously input images and means for performing simple motion search processing on the reduced plane, and coding in the field structure is selected when the motion compensation prediction error amount obtained in the simple motion search processing is large.
6. The frame structure/field structure switching type image coding apparatus according to claim 5, wherein the means for creating a reduced plane reflecting the features of the image divides the image into small blocks, calculates a deviation amount for each divided small block, and uses the deviation amount as an element of the reduced plane.
7. The frame structure/field structure switching type image coding apparatus according to claim 4, further comprising means for determining whether an input image is an interlaced image or a non-interlaced image, wherein, only for an image determined to be an interlaced image, the correlation between the two images is detected to analyze the amount of change in the image and to select the field structure or the frame structure.
8. The frame structure/field structure switching type image coding apparatus according to claim 7, further comprising interlaced/non-interlaced image switching setting means, wherein whether the input image is an interlaced image or a non-interlaced image is determined from the first one or more input images, and the interlaced/non-interlaced image switching setting means is set based on the determination result.
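The pixel test of expressions (1) and (2) in claim 3 can be sketched in Python as follows. The sketch assumes that d(a, b) is the absolute difference between the pixels a and b lines away from the pixel under test in the same column, so that d(0,-2), d(0,2) and d(-1,1) compare lines of the same field while d(0,-1) and d(0,1) compare lines of different fields. The claim text itself does not spell this out. The function names and the values of `threshold`, `offset` and `ratio` are hypothetical.

```python
from typing import Sequence

Image = Sequence[Sequence[int]]  # luminance rows, top to bottom


def d(img: Image, x: int, y: int, a: int, b: int) -> int:
    """Absolute difference between the pixels a and b lines away from line y in column x."""
    return abs(img[y + a][x] - img[y + b][x])


def is_interlaced(img: Image, threshold: int = 10, offset: int = 5,
                  ratio: float = 0.5) -> bool:
    """Apply expressions (1) and (2) to every pixel that has two lines above and below."""
    satisfies_1 = 0        # pixels satisfying expression (1)
    satisfies_1_and_2 = 0  # pixels satisfying expressions (1) and (2)
    for y in range(2, len(img) - 2):
        for x in range(len(img[0])):
            same_field = max(d(img, x, y, 0, -2), d(img, x, y, 0, 2),
                             d(img, x, y, -1, 1))
            cross_field = min(d(img, x, y, 0, -1), d(img, x, y, 0, 1))
            if same_field < threshold:                 # expression (1)
                satisfies_1 += 1
                if same_field + offset < cross_field:  # expression (2)
                    satisfies_1_and_2 += 1
    # Interlaced (field structure) when enough of the pixels passing (1) also pass (2).
    return satisfies_1 > 0 and satisfies_1_and_2 >= ratio * satisfies_1
```

An image whose even and odd lines differ strongly while lines of the same parity agree passes both expressions at every tested pixel and is classified as interlaced. A uniform image passes expression (1) only and is classified as non-interlaced.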