JP3942088B2

JP3942088B2 - Image processing apparatus and image processing method

Info

Publication number: JP3942088B2
Application number: JP2002282733A
Authority: JP
Inventors: 毅小山; 郁子草津; 隆夫井上; 慶一池辺; 彰高橋; 隆史牧; 卓児玉; 隆則矢野; 伸青木; 宏幸作山
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2002-09-27
Filing date: 2002-09-27
Publication date: 2007-07-11
Anticipated expiration: 2022-09-27
Also published as: JP2004120497A

Description

【０００１】
【発明の属する技術分野】
本発明は、動画像の符号化データの処理に係り、特に、圧縮符号化された動画像のシーンチェンジ検出に関する。
【０００２】
【従来の技術】
動画像のシーンが変化するフレームを自動的に検出する技術は数多く提案されているが、基本的には連続したフレームの静止画特徴量を比較し、その変化の大きいフレームをシーンチェンジ発生フレームとして検出する。
【０００３】
特許文献１には、予測符号化方式の符号化装置又は復号化装置において、各フレームの予測誤差量や符号数、フレーム内符号化（又はフレーム間符号化）された画素数を計数し、それらの計数値の変化が大きいフレームをシーンチェンジ発生フレームとして検出する如き構成の装置が記載されている。また、特許文献２には、動画像の各フレームの符号量を計数し、その計数値が閾値を越えたときにシーンチェンジとして検出する装置が記載されている。
【０００４】
本発明に関連する画像符号化方式として、ＪＰＥＧ２０００（ＩＳＯ／ＩＥＣＦＣＤ１５４４４−１）とＭｏｔｉｏｎ−ＪＰＥＧ２０００（ＩＳＯ／ＩＥＣＦＣＤ１５４４４−３）がある（例えば、非特許文献１参照）。Ｍｏｔｉｏｎ−ＪＰＥＧ２０００は連続した複数の静止画像のそれぞれをフレームとして動画像を圧縮符号化するが、各フレームの符号化データのフォーマットはＪＰＥＧ２０００に準拠している。
【０００５】
【特許文献１】
特許３０９３４９９号公報
【特許文献２】
特開２０００−７８５８５号公報
【非特許文献１】
野水泰之著、「次世代画像符号化方式ＪＰＥＧ２０００」、
株式会社トリケップス、２００１年２月１３日
【０００６】
【発明が解決しようとする課題】
本発明の目的は、動画像の符号化データのヘッダ情報を利用し、少ない処理量で高速にシーンチェンジを検出する新規な装置及び方法を提供することにある。本発明のもう１つの目的は、動画像の符号化データのヘッダ情報を利用し、簡易な手段もしくは処理により、多様な特徴量に基づいたシーンチェンジ検出が可能な新規な装置及び方法を提供することにある。本発明のもう１つの目的は、動画像の符号化データのヘッダ情報を利用し、高速かつ高精度のシーンチェンジ検出が可能な新規な装置及び方法を提供することにある。
【０００７】
【課題を解決するための手段】
請求項１記載の発明に係る画像処理装置は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力手段と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出手段と、
前記動画像の符号化データ及び前記シーンチェンジ検出手段によるシーンチェンジの検出結果を記憶する記憶手段と、
を有し、
前記シーンチェンジ検出手段は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダ、タイルヘッダ又はパケットヘッダからパケット長の情報を抽出し、抽出したパケット長の情報に基づき解像度レベル毎のパケット長の合計を求める情報取得手段と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得手段により求められた解像度レベル毎のパケット長の合計と、前記現在フレームの直前フレームについて前記情報取得手段により求められた解像度レベル毎のパケット長の合計とを比較することにより、前記現在フレームと前記直前フレームとの解像度レベル毎の符号量の分布の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理手段と、を有することを特徴とする。
【０００８】
請求項２記載の発明に係る画像処理装置は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力手段と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出手段と、
前記動画像の符号化データ及び前記シーンチェンジ検出手段によるシーンチェンジの検出結果を記憶する記憶手段と、
を有し、
前記シーンチェンジ検出手段は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダ、タイルヘッダ又はパケットヘッダから符号パス数の情報を抽出し、抽出した符号パス数の情報に基づきプリシンクト毎の符号パス数の合計を求める情報取得手段と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得手段により求められたプリシンクト毎の符号パス数の合計と、前記現在フレームの直前フレームについて前記情報取得手段により求められたプリシンクト毎の符号パス数の合計とを比較することにより、前記現在フレームと前記直前フレームとのプリシンクト毎の符号量の分布の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理手段と、を有することを特徴とする。
【０００９】
請求項３記載の発明に係る画像処理装置は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力手段と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出手段と、
前記動画像の符号化データ及び前記シーンチェンジ検出手段によるシーンチェンジの検出結果を記憶する記憶手段と、
を有し、
前記シーンチェンジ検出手段は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダ、タイルヘッダ又はパケットヘッダからパケット長の情報を抽出し、抽出したパケット長の情報に基づきコンポーネント毎のパケット長の合計を求める情報取得手段と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得手段により求められたコンポーネント毎のパケット長の合計と、前記現在フレームの直前フレームについて前記情報取得手段により求められたコンポーネント毎のパケット長の合計とを比較することにより、前記現在フレームと前記直前フレームとのコンポーネント毎の符号量の分布の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理手段と、を有することを特徴とする。
【００１０】
請求項４記載の発明に係る画像処理装置は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力手段と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出手段と、
前記動画像の符号化データ及び前記シーンチェンジ検出手段によるシーンチェンジの検出結果を記憶する記憶手段と、
を有し、
前記シーンチェンジ検出手段は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダ、タイルヘッダ又はパケットヘッダからパケット長の情報を抽出し、抽出したパケット長の情報に基づきレイヤ毎のパケット長の合計を求める情報取得手段と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得手段により求められたレイヤ毎のパケット長の合計と、前記現在フレームの直前フレームについて前記情報取得手段により求められたレイヤ毎のパケット長の合計とを比較することにより、前記現在フレームと前記直前フレームとのレイヤ毎の符号量の分布の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理手段と、を有することを特徴とする。
【００１１】
請求項５記載の発明に係る画像処理装置は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力手段と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出手段と、
前記動画像の符号化データ及び前記シーンチェンジ検出手段によるシーンチェンジの検出結果を記憶する記憶手段と、
を有し、
前記シーンチェンジ検出手段は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダ、タイルヘッダ又はパケットヘッダから符号パス数の情報を抽出し、抽出した符号パス数の情報からプリシンクト毎の符号パス数の合計を求め、求めたプリシンクト毎の符号パス数の合計に基づいてプリシンクト単位でのＲＯＩ領域の有無を示す情報を取得する情報取得手段と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得手段により取得されたプリシンクト単位でのＲＯＩ領域の有無を示す情報と、前記現在フレームの直前フレームについて前記情報取得手段により取得されたプリシンクト単位でのＲＯＩ領域の有無を示す情報とを比較することにより、前記現在フレームと前記直前フレームとのＲＯＩ領域の位置の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理手段と、を有することを特徴とする。
【００１２】
請求項６記載の発明に係る画像処理装置は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力手段と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出手段と、
前記動画像の符号化データ及び前記シーンチェンジ検出手段によるシーンチェンジの検出結果を記憶する記憶手段と、
を有し、
前記シーンチェンジ検出手段は、
前記動画像の各フレームについて、その符号化データ中のタイルヘッダからＲＧＮマーカセグメントを抽出することにより、タイル単位でのＲＯＩ領域の有無を示す情報を取得する情報取得手段と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得手段により取得されたタイル単位でのＲＯＩ領域の有無を示す情報と、前記現在フレームの直前フレームについて前記情報取得手段により取得されたタイル単位でのＲＯＩ領域の有無を示す情報とを比較することにより、前記現在フレームと前記直前フレームとのＲＯＩ領域の位置の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理手段と、を有することを特徴とする。
【００１３】
請求項７記載の発明に係る画像処理装置は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力手段と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出手段と、
前記動画像の符号化データ及び前記シーンチェンジ検出手段によるシーンチェンジの検出結果を記憶する記憶手段と、
を有し、
前記シーンチェンジ検出手段は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダから圧縮方法に関連した所定のヘッダ情報を取得する情報取得手段と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得手段により取得された所定のヘッダ情報と、前記現在フレームの直前フレームについて前記情報取得手段により取得された所定のヘッダ情報とを比較することにより、前記現在フレームと前記直前フレームとの圧縮方法が異なると判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理手段と、を有することを特徴とする。
【００１４】
請求項８記載の発明に係る画像処理方法は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力工程と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出工程と、
前記動画像の符号化データ及び前記シーンチェンジ検出工程によるシーンチェンジの検出結果を記憶する記憶工程と、
を有し、
前記シーンチェンジ検出工程は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダ、タイルヘッダ又はパケットヘッダからパケット長の情報を抽出し、抽出したパケット長の情報に基づき解像度レベル毎のパケット長の合計を求める情報取得工程と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得工程により求められた解像度レベル毎のパケット長の合計と、前記現在フレームの直前フレームについて前記情報取得工程により求められた解像度レベル毎のパケット長の合計とを比較することにより、前記現在フレームと前記直前フレームとの解像度レベル毎の符号量の分布の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理工程と、を有することを特徴とする。
【００１５】
請求項９記載の発明に係る画像処理方法は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力工程と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出工程と、
前記動画像の符号化データ及び前記シーンチェンジ検出工程によるシーンチェンジの検出結果を記憶する記憶工程と、
を有し、
前記シーンチェンジ検出工程は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダ、タイルヘッダ又はパケットヘッダから符号パス数の情報を抽出し、抽出した符号パス数の情報に基づきプリシンクト毎の符号パス数の合計を求める情報取得工程と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得工程により求められたプリシンクト毎の符号パス数の合計と、前記現在フレームの直前フレームについて前記情報取得工程により求められたプリシンクト毎の符号パス数の合計とを比較することにより、前記現在フレームと前記直前フレームとのプリシンクト毎の符号量の分布の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理工程と、を有することを特徴とする。
【００１６】
請求項１０記載の発明に係る画像処理方法は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力工程と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出工程と、
前記動画像の符号化データ及び前記シーンチェンジ検出工程によるシーンチェンジの検出結果を記憶する記憶工程と、
を有し、
前記シーンチェンジ検出工程は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダ、タイルヘッダ又はパケットヘッダからパケット長の情報を抽出し、抽出したパケット長の情報に基づきコンポーネント毎のパケット長の合計を求める情報取得工程と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得工程により求められたコンポーネント毎のパケット長の合計と、前記現在フレームの直前フレームについて前記情報取得工程により求められたコンポーネント毎のパケット長の合計とを比較することにより、前記現在フレームと前記直前フレームとのコンポーネント毎の符号量の分布の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理工程と、を有することを特徴とする。
【００１７】
請求項１１記載の発明に係る画像処理方法は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力工程と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出工程と、
前記動画像の符号化データ及び前記シーンチェンジ検出工程によるシーンチェンジの検出結果を記憶する記憶工程と、
を有し、
前記シーンチェンジ検出工程は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダ、タイルヘッダ又はパケットヘッダからパケット長の情報を抽出し、抽出したパケット長の情報に基づきレイヤ毎のパケット長の合計を求める情報取得工程と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得工程により求められたレイヤ毎のパケット長の合計と、前記現在フレームの直前フレームについて前記情報取得工程により求められたレイヤ毎のパケット長の合計とを比較することにより、前記現在フレームと前記直前フレームとのレイヤ毎の符号量の分布の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理工程と、を有することを特徴とする。
【００１８】
請求項１２記載の発明に係る画像処理方法は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力工程と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出工程と、
前記動画像の符号化データ及び前記シーンチェンジ検出工程によるシーンチェンジの検出結果を記憶する記憶工程と、
を有し、
前記シーンチェンジ検出工程は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダ、タイルヘッダ又はパケットヘッダから符号パス数の情報を抽出し、抽出した符号パス数の情報からプリシンクト毎の符号パス数の合計を求め、求めたプリシンクト毎の符号パス数の合計に基づいてプリシンクト単位でのＲＯＩ領域の有無を示す情報を取得する情報取得工程と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得工程により取得されたプリシンクト単位でのＲＯＩ領域の有無を示す情報と、前記現在フレームの直前フレームについて前記情報取得工程により取得されたプリシンクト単位でのＲＯＩ領域の有無を示す情報とを比較することにより、前記現在フレームと前記直前フレームとのＲＯＩ領域の位置の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理工程と、を有することを特徴とする。
【００１９】
請求項１３記載の発明に係る画像処理方法は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力工程と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出工程と、
前記動画像の符号化データ及び前記シーンチェンジ検出工程によるシーンチェンジの検出結果を記憶する記憶工程と、
を有し、
前記シーンチェンジ検出工程は、
前記動画像の各フレームについて、その符号化データ中のタイルヘッダからＲＧＮマーカセグメントを抽出することにより、タイル単位でのＲＯＩ領域の有無を示す情報を取得する情報取得工程と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得工程により取得されたタイル単位でのＲＯＩ領域の有無を示す情報と、前記現在フレームの直前フレームについて前記情報取得工程により取得されたタイル単位でのＲＯＩ領域の有無を示す情報とを比較することにより、前記現在フレームと前記直前フレームとのＲＯＩ領域の位置の違いが所定の程度を越えたと判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理工程と、を有することを特徴とする。
【００２０】
請求項１４記載の発明に係る画像処理方法は、
Motion-JPEG2000 に準拠した動画像の符号化データを入力する符号化データ入力工程と、
前記動画像のシーンチェンジを検出するシーンチェンジ検出工程と、
前記動画像の符号化データ及び前記シーンチェンジ検出工程によるシーンチェンジの検出結果を記憶する記憶工程と、
を有し、
前記シーンチェンジ検出工程は、
前記動画像の各フレームについて、その符号化データ中のメインヘッダから圧縮方法に関連した所定のヘッダ情報を取得する情報取得工程と、
前記動画像の各フレーム（以下、現在フレームと記す）について前記情報取得工程により取得された所定のヘッダ情報と、前記現在フレームの直前フレームについて前記情報取得工程により取得された所定のヘッダ情報とを比較することにより、前記現在フレームと前記直前フレームとの圧縮方法が異なると判定したときに、前記現在フレームをシーンチェンジ発生フレームと判定する比較判定処理工程と、を有することを特徴とする。
【００２１】
【発明の実施の形態】
以下に説明する実施の形態において処理の対象となる符号化データは、Ｍｏｔｉｏｎ−ＪＰＥＧ２０００（ＩＳＯ／ＩＥＣＦＣＤ１５４４４−３）による動画像の符号化データである。各フレームの符号化データのフォーマットはＪＰＥＧ２０００に準拠しているので、以下の説明と関連する範囲でＪＰＥＧ２０００（ＩＳＯ／ＩＥＣＦＣＤ１５４４４−１）について概説する。
【００２２】
図１はＪＰＥＧ２０００の圧縮符号化アルゴリズムを説明するための簡略化されたブロック図である。圧縮符号化の対象となる画像データ（動画像を扱う場合には各フレームの画像データ）は、コンポーネント毎にタイルと呼ばれる重複しない矩形領域に分割され、コンポーネント毎にタイルを単位として処理される。ただし、タイルサイズを画像サイズと同一にすること、つまりタイル分割を行わないことも可能である。
【００２３】
タイル画像は、圧縮率の向上を目的として、ＲＧＢデータやＣＭＹデータからＹＣｒＣｂデータへの色空間変換が施される（ステップＳ１）。この色空間変換が省かれる場合もある。
【００２４】
色空間変換後の各コンポーネントの各タイル画像に対し２次元ウェーブレット変換（離散ウェーブレット変換：ＤＷＴ）が実行される（ステップＳ２）。
【００２５】
図２はデコンポジションレベル数が３の場合のウェーブレット変換の説明図である。図２（ａ）に示すタイル画像（デコンポジションレベル０）に対する２次元ウェーブレット変換により、図２（ｂ）に示すような１ＬＬ，１ＨＬ，１ＬＨ，１ＨＨの各サブバンドに分割される。１ＬＬサブバンドの係数に対し２次元ウェーブレット変換が適用されることにより、図２（ｃ）に示すように２ＬＬ，２ＨＬ，２ＬＨ，２ＨＨのサブバンドに分割される。２ＬＬサブバンドの係数に対し２次元ウェーブレット変換が適用されることにより、図２（ｄ）に示すように３ＬＬ，３ＨＬ，３ＬＨ，３ＨＨのサブバンドに分割される。デコンポジションレベルと解像度レベルとの関係であるが、図２（ｄ）の各サブバンドに括弧で囲んで示した数字が、そのサブバンドの解像度レベルを示している。
【００２６】
このような低周波成分（ＬＬサブバンド係数）の再帰的分割（オクターブ分割）により得られたウェーブレット係数は、サブバンド毎に量子化される（ステップＳ３）。ＪＰＥＧ２０００ではロスレス（可逆）圧縮とロッシー（非可逆）圧縮のいずれも可能であり、ロスレス圧縮の場合には量子化ステップ幅は常に１であり、この段階では量子化されない。
【００２７】
量子化後の各サブバンド係数はエントロピー符号化される（ステップＳ４）。このエントロピー符号化には、ブロック分割、係数モデリング及び２値算術符号化からなるＥＢＣＯＴ（Embedded Block Coding with Optimized Truncation）と呼ばれる符号化方式が用いられ、量子化後の各サブバンド係数のビットプレーンが上位プレーンから下位プレーンへ向かって、コードブロックと呼ばれるブロック毎に符号化される。
【００２８】
最後の２つのステップＳ５，Ｓ６は符号形成プロセスである。まず、ステップＳ５において、ステップＳ４で生成されたコードブロックの符号をまとめてパケットが作成される。次のステップＳ６において、ステップＳ５で生成されたパケットがプログレッション順序に従って並べられるとともに必要なタグ情報が付加されることにより、所定のフォーマットの符号化データが作成される。ＪＰＥＧ２０００では、符号順序制御に関して、解像度レベル、プリシンクト(position)、レイヤ、コンポーネント（ＹＣｒＣｂやＲＧＢなどの色成分）の組み合わせによる５種類のプログレッション順序が定義されている。
【００２９】
このようにして生成されるＪＰＥＧ２０００の符号化データのフォーマットを図３に示す。図３に見られるように、符号化データはその始まりを示すＳＯＣマーカと呼ばれるタグで始まり、その後に符号化パラメータや量子化パラメータ等を記述したメインヘッダ(Main Header)と呼ばれるタグ情報が続き、その後に各タイル毎の符号データが続く。各タイル毎の符号データは、ＳＯＴマーカと呼ばれるタグで始まり、タイルヘッダ(Tile Header)と呼ばれるタグ情報、ＳＯＤマーカと呼ばれるタグ、各タイルの符号列を内容とするタイルデータ（Tile Data）で構成される。最後のタイルデータの後に、終了を示すＥＯＣマーカと呼ばれるタグが置かれる。
【００３０】
ここで、プリシンクト、コードブロック、パケット、レイヤについて簡単に説明する。画像≧タイル≧サブバンド≧プリシンクト≧コードブロックの大きさ関係がある。
【００３１】
プリシンクトとは、サブバンドの矩形領域で、同じデコンポジションレベルのＨＬ，ＬＨ，ＨＨサブバンドの空間的に同じ位置にある３つの領域の組が１つのプリシンクトとして扱われる。ただし、ＬＬサブバンドでは、１つの領域が１つのプリシンクトとして扱われる。プリシンクトのサイズをサブバンドと同じサイズにすることも可能である。また、プリシンクトを分割した矩形領域がコードブロックである。図４にデコンポジションレベル１における１つのプリシンクトとコードブロックを例示した。図中のプリシンクトと記された空間的に同じ位置にある３つの領域の組が１つのプリシンクトとして扱われる。
【００３２】
プリシンクトに含まれる全てのコードブロックの符号の一部（例えば最上位から３ビット目までの３枚のビットプレーンの符号）を取り出して集めたものがパケットである。符号が空（から）のパケットも許される。コードブロックの符号をまとめてパケットを生成し、所望のプログレッション順序に従ってパケットを並べることにより符号化データを形成する。図３の各タイルに関するＳＯＤ以下の部分がパケットの集合である。
【００３３】
全てのプリシンクト（つまり、全てのコードブロック、全てのサブバンド）のパケットを集めると、画像全域の符号の一部（例えば、画像全域のウェーブレット係数の最上位のビットプレーンから３枚目までのビットプレーンの符号）ができるが、これがレイヤである（ただし、次に示す例のように、必ずしも全てのプリシンクトのパケットをレイヤに含めなくともよい）。したがって、伸長時に復号されるレイヤ数が多いほど再生画像の画質は向上する。つまり、レイヤは画質の単位とも言える。全てのレイヤを集めると、画像全域の全てのビットプレーンの符号になる。
【００３４】
デコンポジションレベル数＝２（解像度レベル数＝３）の場合のパケットとレイヤの例を図５に示す。図中の縦長の小さな矩形がパケットであり、その内部に示した数字はパケット番号である。レイヤを濃淡を付けた横長矩形領域として図示してある。すなわち、この例では、パケット番号０〜１６のパケットの符号からなるレイヤ０、パケット番号１７〜３３のパケットの符号からなるレイヤ１、パケット番号３４〜５０のパケットの符号からなるレイヤ２、パケット番号５１〜６７のパケットの符号からなるレイヤ３、パケット番号６８〜８４のパケットの符号からなるレイヤ４、パケット番号８５〜１０１のパケットの符号からなるレイヤ５、パケット番号１０２〜１１８のパケットの符号からなるレイヤ６、パケット番号１１９〜１３５のパケットの符号からなるレイヤ７、パケット番号１３６〜１４８のパケットの符号からなるレイヤ８、及び、残りのパケット番号１４９〜１６１のパケットの符号からなるレイヤ９の１０レイヤに分割されている。なお、パケットとプリシンクトとの対応関係などは、プログレッション順序の違いやレイヤ分割数等により様々に変化するものであり、上に示したレイヤ構成はあくまで一例である。
【００３５】
ＪＰＥＧ２０００においてはＬＲＣＰ、ＲＬＣＰ、ＲＰＣＬ、ＰＣＲＬ、ＣＰＲＬの５つのプログレッション順序が定義されている。ここで、Ｌはレイヤ、Ｒは解像度レベル、Ｃはコンポーネント、Ｐはプリシンクトである。
【００３６】
ＬＲＣＰプログレッションの場合、パケットの配置（符号化時）又はパケットの解釈（復号化時）の順序は、Ｌ，Ｒ，Ｃ，Ｐの順にネストされた次のようなforループで表すことができる。
for(レイヤ)｛
  for(解像度レベル)｛
    for(コンポーネント)｛
      for(プリシンクト)｛
        パケットを配置：符号化時
        パケットを解釈：復号化時
      ｝
    ｝
  ｝
｝
【００３７】
具体例を示せば、画像サイズ＝１００×１００画素（タイル分割なし）、レイヤ数＝２、解像度レベル数＝３（レベル０〜２）、コンポーネント数＝３、プリシンクトサイズ＝３２×３２の場合における３６個のパケットは、図６のような順に配置され、また解釈される。
【００３８】
また、ＲＬＣＰプログレッションの場合には、
for(解像度レベル)｛
  for(レイヤ)｛
    for(コンポーネント)｛
      for(プリシンクト)｛
        パケットを配置：符号化時
        パケットを解釈：復号化時
      ｝
    ｝
  ｝
｝
という順で、パケットの配置（符号化時）又はパケットの解釈（復号化時）がなされる。
他のプログレッション順序の場合も同様のネストされたforループにより、パケットの配置順又は解釈順が決まる。
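The nested loops for the LRCP and RLCP progressions can be sketched in Python as follows. This is only an illustration: the function name `packet_order` and its parameters are hypothetical, and the per-resolution-level precinct counts (1, 1, 4) are an assumption chosen to be consistent with the total of 36 packets stated in paragraph [0037]; the source does not list them.

```python
def packet_order(progression, layers, precincts_per_level, components):
    """Return (layer, resolution, component, precinct) tuples in packet order.

    precincts_per_level[r] is the assumed number of precincts at resolution
    level r. Only the two progressions spelled out in the text are handled.
    """
    order = []
    levels = list(enumerate(precincts_per_level))
    if progression == "LRCP":
        for layer in range(layers):
            for res, count in levels:
                for comp in range(components):
                    for prec in range(count):
                        order.append((layer, res, comp, prec))
    elif progression == "RLCP":
        for res, count in levels:
            for layer in range(layers):
                for comp in range(components):
                    for prec in range(count):
                        order.append((layer, res, comp, prec))
    else:
        raise ValueError("only LRCP and RLCP are sketched here")
    return order


# Example of paragraph [0037]: 2 layers, 3 resolution levels, 3 components.
lrcp = packet_order("LRCP", 2, [1, 1, 4], 3)
rlcp = packet_order("RLCP", 2, [1, 1, 4], 3)
```

Both orders contain the same 36 packets; only the position of each packet in the sequence differs, which is why a header-based detector needs the progression order before it can attribute a packet length to a layer, resolution level, component, or precinct.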
【００３９】
また、ＪＰＥＧ２０００には、選択した領域（ＲＯＩ領域）の画質を他の領域よりも向上させる機能があり、基本仕様ではＲＯＩ領域のウェーブレット係数のビットシフトによるMax Shift方式が規定されている。
【００４０】
本発明においては、符号化データに付加されるメインヘッダ、タイルヘッダ、パケットヘッダの情報を利用するので、ヘッダ情報の概要を以下に示す。
図７にメインヘッダの構成を示す。このメインヘッダの情報中でＳＩＺ，ＣＯＤ，ＱＣＤの各マーカセグメントは必須であるが、他のマーカセグメントはオプションである。
図８にタイルヘッダの構成を示す。（ａ）はタイルデータの先頭に付加されるヘッダであり、（ｂ）はタイル内が複数に分割されている場合に分割されたタイル部分列の先頭に付加されるヘッダである。タイルヘッダでは必須のマーカセグメントはなく、すべてオプションである。
図９に、マーカ及びマーカセグメントの一覧表を示す。
ＳＯＴマーカセグメントの構成を図１０に、ＳＩＺマーカセグメントの構成を図１１に、ＣＯＤマーカセグメントの構成を図１２に、ＣＯＣマーカセグメントの構成を図１３に、ＲＧＮマーカセグメントの構成を図１４に、ＱＣＤマーカセグメントの構成を図１５に、ＱＣＣマーカセグメントの構成を図１６に、ＴＬＭマーカセグメントの構成を図１７に、ＰＬＭマーカセグメントの構成を図１８に、ＰＬＴマーカセグメントの構成を図１９に、ＰＰＭマーカセグメントの構成を図２０に、ＰＰＴマーカセグメントの構成を図２１に、ＣＯＭマーカセグメントの構成を図２２に、それぞれ示す。
【００４１】
なお、パケットは、本体であるパケットデータにパケットヘッダを付加した構成である。パケットヘッダには、パケットデータの長さ、符号パス数、０ビットプレーン数などの情報が含まれている（非特許文献１参照）。
【００４２】
《実施の形態１》図２３は、本発明による画像処理装置の一例を説明するためのブロック図である。この画像処理装置は、単独の装置として実現される形態、デジタルビデオカメラや監視カメラ等の撮像装置、画像編集装置、ビデオ再生装置、画像検索システム、画像データベースシステム等の一部として実現される形態、パソコンなどのコンピュータのハードウェアを利用しプログラムにより実現される形態とがあるが、いずれの形態も本発明に包含される。
【００４３】
この画像処理装置は、符号化データ入力部（符号化データ入力手段）１００、シーンチェンジ検出装置（シーンチェンジ検出手段）１０１Ａ、記憶部１０２、外部インターフェース部１０３を備える。
【００４４】
符号化データ入力部１００は、Ｍｏｔｉｏｎ−ＪＰＥＧ２０００の動画像の符号化データを取り込むための手段である。より具体的には、この符号化データ入力部１００は、有線又は無線の伝送路もしくはネットワークを通じ外部より符号化データを直接取り込むインターフェース手段、記録媒体から符号化データを読み込む手段、被写体を撮影して動画像の符号化データを入力する撮像手段、動画像の圧縮符号化を行うＭｏｔｉｏｎ−ＪＰＥＧ２０００準拠のエンコーダなどである。
【００４５】
シーンチェンジ検出装置１０１Ａは、動画像の各フレームの符号化データの特定のヘッダの情報を利用して動画像のシーンチェンジ（シーンの切り替わり）を自動的に検出するものであり、情報取得部１１１、情報一時記憶部１１２、比較判定処理部１１３からなる。
【００４６】
記憶部１０２は、動画像の符号化データと、シーンチェンジ検出装置１０１Ａによる検出結果の記憶に利用される。シーンチェンジ検出結果の記憶方法としては、動画像ファイルのファイルヘッダにシーンチェンジ検出結果を記述する方法、フレームのメインヘッダのＣＯＭマーカセグメント（図２２）にコメントデータとしてシーンチェンジ検出結果を記述する方法、動画像ファイルと関連付けた別のファイルとしてシーンチェンジ検出結果を記憶する方法など、様々な方法をとることができる。外部インターフェース部１０３は、符号化データ及びシーンチェンジ検出結果を外部に出力するために利用される。
【００４７】
次に、シーンチェンジ検出装置１０１Ａについて詳細に説明する。情報取得部（情報取得手段）１１１は、各フレームの符号化データより、１つ又は複数の特定のヘッダ情報を抽出する。後述のように様々なシーンチェンジ検出モードを選択できるが、ヘッダ情報そのものの比較によりシーンチェンジを検出するモードが選択された場合には、情報取得部１１１は抽出したヘッダ情報をそのまま出力する。ヘッダ情報に基づいて計算される特徴量の比較によりシーンチェンジを検出するモードが選択された場合には、情報取得部１１１は、抽出したヘッダ情報に基づいて特徴量を計算し、その特徴量を出力する。モードによっては、抽出されたヘッダ情報と、計算された特徴量の両方が出力される。
【００４８】
情報取得部１１１より出力されたヘッダ情報及び／又は特徴量は比較判定処理部（比較判定処理手段）１１３に渡されるとともに情報一時記憶部１１２に記憶される。情報一時記憶部１１２には、現在のフレームと直前のフレームのヘッダ情報及び／又は特徴量が保存され、それより前のフレームのヘッダ情報及び／又は特徴量は順次消去される。情報一時記憶部１１２より、現在フレームの直前フレームのヘッダ情報及び／又は特徴量が比較判定処理部１１３へ入力される。比較判定処理部１１３は、入力された現在フレームと直前フレームのヘッダ情報そのものの比較、及び／又は、ヘッダ情報に基づいて計算された特徴量の比較により、シーンチェンジが発生したか否かの判定を行う。なお、情報一時記憶部１１２を情報取得部１１１又は比較判定処理部１１３と一体化してもよいことは明らかである。
【００４９】
このシーンチェンジ検出装置１０１Ａは、様々な検出モードを選択することができる。
以下、各モード別に具体的に説明するが、複数のモードを組み合わせたモードも選択することができる。この組み合わせモードの場合には、情報取得部１１１は各モードで利用される複数のヘッダ情報の抽出、及び／又は特徴量の計算を行い、比較判定処理部１１３は組み合わされた各モードによる比較判定処理を行い、その結果を評価することによりシーンチェンジであるか否かを判断する。例えば、全てのモードの結果がシーンチェンジの場合にシーンチェンジと判断し、又は、多数決により判断する。
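The two ways of evaluating a combined mode mentioned above (declare a scene change only when every mode agrees, or decide by majority) can be sketched as below. The function name `combine_modes`, the rule names, and the treatment of a tie as "no scene change" are assumptions for illustration; the source does not say how ties are resolved.

```python
def combine_modes(results, rule="majority"):
    """Combine per-mode boolean decisions into one scene change decision.

    results: one boolean per selected detection mode (True = scene change).
    rule: "all" requires unanimity, "majority" requires more than half.
    """
    votes = list(results)
    if not votes:
        raise ValueError("at least one mode result is required")
    if rule == "all":
        return all(votes)
    if rule == "majority":
        # A tie is treated as "no scene change" (an assumption).
        return sum(votes) * 2 > len(votes)
    raise ValueError("unknown rule: " + rule)
```

For instance, with three modes reporting `[True, True, False]`, the majority rule reports a scene change while the unanimity rule does not.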
【００５０】
［モードＡ］解像度レベル毎の符号量分布の変化に着目してシーンチェンジを検出するモードである（請求項１，８対応）。比較判定処理部１１３は、現在フレームと直前フレームの解像度レベル毎の符号量分布の違いが所定の程度を越えたときに、現在フレームをシーンチェンジ発生フレームと判定する。このモードでは、情報取得部１１１は、ＰＬＭマーカセグメント（図１８）のＩplm（パケット長リスト）、ＰＬＴマーカセグメント（図１９）のＩplt（パケット長リスト）、ＰＰＭマーカセグメント（図２０）のＩppm（パケットヘッダ集合）、ＰＰＴマーカセグメント（図２１）のＩppt（パケットヘッダ集合）、又は、符号中のパケットヘッダよりパケット長の情報を抽出し、それに基づいて、各解像度レベルのパケット長の合計を求める。なお、パケット長の合計は、必ずしも全ての解像度レベルについて求める必要はなく、特定の一部解像度レベルについて求めるようにしてもよい。
【００５１】
［モードＢ］位置（タイル又はプリシンクト）毎の符号量分布の変化に着目してシーンチェンジを検出するモードである（請求項２，９対応）。比較判定処理部１１３は、現在フレームと直前フレームのプリシンクト毎の符号量分布の違いが所定の程度を越えたときに、現在フレームをシーンチェンジ発生フレームと判定する。このモードでは、次のヘッダ情報の予め指定されたものが利用される。なお、コードブロック毎の符号量分布の変化に着目したシーンチェンジ検出も可能であり、これも本発明に包含される。情報取得部１１１は、
１）ＴＬＭマーカセグメント（図１７）のＰtlm（タイル長）、又は、ＳＯＴマーカセグメント（図１０）のＰsot（タイル長）よりタイル長の情報を抽出する。あるいは、
２）ＰＰＭマーカセグメント（図２０）のＩppm（パケットヘッダ集合）、ＰＰＴマーカセグメント（図２１）のＩppt（パケットヘッダ集合）、又は、符号中のパケットヘッダより符号パス数の情報を抽出し、それに基づいて各プリシンクト毎の符号パス数の合計を求める。
【００５２】
なお、タイル長と各プリシンクト毎の合計符号パス数の両方を抽出することも可能であり、これも本発明に包含される。また、以上の特徴量は、一部のタイル、一部のプリシンクト、又は、一部のコードブロックについて求めるようにしてもよい。
【００５３】
［モードＣ］コンポーネント毎の符号量分布の変化に着目してシーンチェンジを検出するモードである（請求項３，１０対応）。比較判定処理部１１３は、現在フレームと直前フレームのコンポーネント毎の符号量分布の違いが所定の程度を越えたときに、現在フレームをシーンチェンジ発生フレームと判定する。このモードでは、情報取得部１１１は、ＰＬＭマーカセグメント（図１８）のＩplm（パケット長リスト）、ＰＬＴマーカセグメント（図１９）のＩplt（パケット長リスト）、ＰＰＭマーカセグメント（図２０）のＩppm（パケットヘッダ集合）、ＰＰＴマーカセグメント（図２１）のＩppt（パケットヘッダ集合）、又は、符号中のパケットヘッダより各パケットのパケット長の情報を抽出し、それに基づいて各コンポーネントのパケット長の合計を求める。なお、一部のコンポーネントについてのみ特徴量を求めるようにしてもよい。
【００５４】
［モードＤ］レイヤ毎の符号量分布の変化に着目してシーンチェンジを検出するモードである（請求項４，１１対応）。比較判定処理部１１３は、現在フレームと直前フレームのレイヤ毎の符号量分布の違いが所定の程度を越えたときに、現在フレームをシーンチェンジ発生フレームと判定する。このモードでは、情報取得部１１１は、ＰＬＭマーカセグメント（図１８）のＩplm（パケット長リスト）、ＰＬＴマーカセグメント（図１９）のＩplt（パケット長リスト）、ＰＰＭマーカセグメント（図２０）のＩppm（パケットヘッダ集合）、ＰＰＴマーカセグメント（図２１）のＩppt（パケットヘッダ集合）、又は、符号中のパケットヘッダより各パケットのパケット長の情報を抽出し、それに基づいて各レイヤのパケット長の合計を求める。なお、一部のレイヤについてのみ特徴量を求めるようにしてもよい。
【００５５】
［モードＥ］ＲＯＩ領域の位置の変化に着目してシーンチェンジを検出するモードである（請求項５，６，１２，１３対応）。比較判定処理部１１３は、現在フレームと直前フレームのＲＯＩ領域の位置の違いを評価し、その違いが所定の程度を越えたときに、現在フレームをシーンチェンジ発生フレームと判定する。情報取得部１１１は、
１）各タイルのタイルヘッダ中のＲＧＮマーカセグメントの有無を調べ、タイル単位でのＲＯＩ領域の有無を示す情報を取得する（請求項６，１３）。あるいは、
２）ＰＰＭマーカセグメント（図２０）のＩppm（パケットヘッダ集合）、ＰＰＴマーカセグメント（図２１）のＩppt（パケットヘッダ集合）、又は、符号中のパケットヘッダより各パケットの符号パス数を抽出してプリシンクト毎（又はコードブロック毎）の符号パス数の合計を求め、その合計符号パス数に基づいて、プリシンクト（又はコードブロック）単位でＲＯＩ領域の有無を示す情報を取得する（請求項５，１２）。
【００５６】
［モードＦ］圧縮方法の変化に着目してシーンチェンジを検出するモードである（請求項７，１４対応）。比較判定処理部１１３は、現在フレームと直前フレームの圧縮方法が異なるときに、現在フレームをシーンチェンジ発生フレームと判定する。このモードでは、情報取得部１１１は、
１）ＣＯＤマーカセグメント（図１２）のＳgcod（符号スタイル）よりプログレッション順序の情報を抽出する。
２）ＣＯＤマーカセグメント（図１２）のＳgcod（符号スタイル）よりレイヤ数の情報を抽出する。
３）ＱＣＤマーカセグメント（図１５）のＳqcd（量子化スタイル），ＳＰqcd（量子化ステップサイズ）の情報を抽出する。
４）ＣＯＤマーカセグメント（図１２）のＳgcod（符号スタイル）より色変換の情報を抽出する。
５）ＣＯＤマーカセグメント（図１２）のＳＰcod（符号スタイル）又はＣＯＣマーカセグメント（図１３）のＳＰcoc（符号スタイル）よりデコンポジション数の情報を抽出する。
６）ＣＯＤマーカセグメント（図１２）のＳＰcod（符号スタイル）又はＣＯＣマーカセグメント（図１３）のＳＰcoc（符号スタイル）よりコードブロックのサイズの情報を抽出する。
７）ＣＯＤマーカセグメント（図１２）のＳＰcod（符号スタイル）又はＣＯＣマーカセグメント（図１３）のＳＰcoc（符号スタイル）よりウェーブレット変換方法の情報を抽出する。あるいは
８）ＣＯＤマーカセグメント（図１２）のＳＰcod（符号スタイル）又はＣＯＣマーカセグメント（図１３）のＳＰcoc（符号スタイル）よりプリシンクトのサイズの情報を抽出する。
【００５７】
なお、このモードにおいて以上の２つ以上の情報を抽出することも可能であり、これも本発明に包含される。
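As a rough illustration of Mode F, the sketch below reads the fixed-size fields of a COD marker segment and reports a change whenever any of them differs between two frames. The field layout follows the COD segment of JPEG 2000 Part 1 as generally documented (Lcod, Scod, then progression order, number of layers, multiple component transform, number of decomposition levels, code-block width and height exponents, code-block style, and wavelet transformation) and should be checked against the standard before use. The names `CodParams`, `parse_cod`, and `compression_changed` are hypothetical, precinct sizes are ignored, and the input is assumed to start at Lcod with the 0xFF52 marker already consumed.

```python
import struct
from typing import NamedTuple

PROGRESSIONS = ("LRCP", "RLCP", "RPCL", "PCRL", "CPRL")


class CodParams(NamedTuple):
    progression: str
    layers: int
    color_transform: int
    decomposition_levels: int
    code_block_size: tuple
    wavelet: str


def parse_cod(segment):
    """Parse the first 12 bytes of a COD segment body (starting at Lcod)."""
    (_lcod, _scod, prog, layers, mct,
     levels, cbw, cbh, _style, transform) = struct.unpack_from(">HBBHBBBBBB", segment)
    # Code-block dimensions are stored as exponent offsets: size = 2 ** (value + 2).
    size = (1 << ((cbw & 0x0F) + 2), 1 << ((cbh & 0x0F) + 2))
    wavelet = "5-3 reversible" if transform == 1 else "9-7 irreversible"
    return CodParams(PROGRESSIONS[prog], layers, mct, levels, size, wavelet)


def compression_changed(prev_segment, curr_segment):
    """True when any parsed coding parameter differs between two frames."""
    return parse_cod(prev_segment) != parse_cod(curr_segment)


# Two synthetic COD bodies that differ only in progression order.
frame_a = struct.pack(">HBBHBBBBBB", 12, 0, 0, 2, 1, 2, 4, 4, 0, 1)
frame_b = struct.pack(">HBBHBBBBBB", 12, 0, 1, 2, 1, 2, 4, 4, 0, 1)
```

Because the comparison is a plain equality test on a handful of header fields, no entropy decoding is involved, which matches the low processing cost the description attributes to this mode.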
【００５８】
［モードＧ］前記各モードとは異なる画像特徴の変化に着目してシーンチェンジを検出するモードである。比較判定処理部１１３は、現在フレームと直前フレームの画像特徴が異なるときに、現在フレームをシーンチェンジ発生フレームと判定する。このモードでは、情報取得部１１１は、
１）ＳＩＺマーカセグメント（図１１）のＸＴsiz，ＹＴsiz（タイルの水平、垂直方向サイズ）の情報を抽出する。
２）ＳＩＺマーカセグメント（図１１）のＸＴＯsiz，ＹＴＯsiz（タイルの水平、垂直方向オフセット）の情報を抽出する。
３）ＳＩＺマーカセグメント（図１１）のＣsiz（コンポーネント数）の情報を抽出する。
４）ＳＩＺマーカセグメント（図１１）のＳsiz（ビット数）の情報を抽出する。
５）ＳＩＺマーカセグメント（図１１）のＸsiz，Ｙsiz（画像の水平、垂直方向サイズ）の情報を抽出する。
６）ＳＩＺマーカセグメント（図１１）のＸＯsiz，ＹＯsiz（画像の水平、垂直方向オフセット）の情報を抽出する。
７）ＳＩＺマーカセグメント（図１１）のＸＲsiz，ＹＲsiz（水平、垂直方向サンプル数）の情報を抽出する。あるいは
８）ＣＯＭマーカセグメント（図２２）のＣcom（コメントデータ）の情報を抽出する。
【００５９】
なお、このモードにおいて、以上の２つ以上の情報を抽出することも可能であり、これも本発明に包含される。
【００６０】
《実施の形態２》図２４は、本発明による画像処理装置の別の例を説明するためのブロック図である。ここに示す画像処理装置は、シーンチェンジ検出装置１０１Ｂの構成が図２３に示したシーンチェンジ検出装置１０１Ａと相違する。すなわち、シーンチェンジ検出装置１０１Ｂは、情報取得部１１１、情報一時記憶部１１２、比較判定処理部１１３に加え、部分復号化部１１４、画像一時記憶部１１５及び二次判定処理部１１６を有する。情報取得部１１１及び比較判定処理部１１３の動作は前述した通りであるので説明を割愛する。
【００６１】
部分復号化部１１４は、符号化データの一部の符号の復号伸長を行う手段である。例えば、低解像度レベル（レベル０、レベル１など）、あるいは、上位レイヤ（上位の１レイヤ、数レイヤなど）の特定コンポーネント又は全コンポーネントの符号の復号伸長を行う。部分復号化部１１４より出力される画像データは二次判定処理部１１６に入力されるとともに、画像一時記憶部１１５に記憶される。画像一時記憶部１１５は、現在のフレームと直前のフレームの画像データを一時的に記憶するもので、直前のフレームの画像データは二次判定処理部１１６に入力される。
【００６２】
二次判定処理部１１６は、比較判定処理部１１３の判定結果も入力され、比較判定処理部１１３により現在フレームがシーンチェンジ発生フレームと判定された場合にのみ、二次判定処理を行う。この二次判定処理は、部分復号化部１１４より入力された現在フレームの画像データと、画像一時記憶部１１５より入力された直前フレームの画像データとを比較し、その差異が所定の程度を越えたときに現在フレームを最終的にシーンチェンジ発生フレームと判定する処理である。この二次判定処理部１１６の判定結果がシーンチェンジ検出装置１０１Ｂの最終的な判定結果として出力される。
【００６３】
すなわち、シーンチェンジ検出装置１０１Ｂは、比較判定処理部１１３でヘッダ情報及び／又はヘッダ情報に基づいて計算された特徴量の比較によりシーンチェンジの一次判定を行い、二次判定処理部１１６で画像データの比較により、より精密なシーンチェンジの二次判定を行う構成である。このような２段階のシーンチェンジ判定を行うことにより、より高精度なシーンチェンジ検出が可能である。また、一次判定でシーンチェンジとされたフレームに関してのみ画像データの比較による二次判定を行うため、全てのフレーム間で画像データの比較によるシーンチェンジ判定を行う構成に比べ、シーンチェンジ検出速度も向上する。
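The two-stage determination described above can be sketched as a loop in which the costlier image comparison runs only for frames that the header-based comparison has already flagged. Everything named here is hypothetical: `detect_scene_changes`, the per-frame dictionaries, and the two example checks with their thresholds merely stand in for the comparison determination processing unit 113 and the secondary determination processing unit 116.

```python
def detect_scene_changes(frames, primary, secondary):
    """Return indices of frames confirmed as scene changes.

    primary(prev, curr): cheap header-based check.
    secondary(prev, curr): image-based check, evaluated only when the
    primary check has flagged the frame (short-circuit "and").
    """
    changes = []
    for index in range(1, len(frames)):
        prev, curr = frames[index - 1], frames[index]
        if primary(prev, curr) and secondary(prev, curr):
            changes.append(index)
    return changes


secondary_calls = []


def header_check(prev, curr):
    # Assumed feature: total code amount taken from header information.
    return abs(curr["code_amount"] - prev["code_amount"]) > 50


def image_check(prev, curr):
    # Assumed feature: mean level of a partially decoded low-resolution image.
    secondary_calls.append((prev["id"], curr["id"]))
    return abs(curr["mean_level"] - prev["mean_level"]) > 20


samples = [(100, 50), (105, 51), (300, 52), (310, 53), (90, 130)]
frames = [{"id": i, "code_amount": c, "mean_level": m}
          for i, (c, m) in enumerate(samples)]
result = detect_scene_changes(frames, header_check, image_check)
```

In this made-up sequence the header check flags frames 2 and 4, the image check is therefore called only twice, and only frame 4 is confirmed, illustrating how the secondary stage can reject a false positive without being run on every frame pair.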
【００６４】
二次判定には、従来から知られている方法を利用できる。例えば、特許文献１にも記載されているような、画像の小領域毎の色分布を求め、画像間の色分布の相関を調べるような方法を利用することができる。どのような方法を利用するにしても、比較対象となる画像データは一部符号を復号伸長したものであり、全符号を復号伸長した画像データに比べデータ量は遙かに小さいため、処理にそれほどの時間はかからない。したがって、高速かつ高精度のシーンチェンジ検出が可能である。
【００６５】
《実施の形態３》図２５は、本発明による画像処理装置の別の例を説明するためのブロック図である。ここに示す画像処理装置は、静止画特徴量抽出部１２０が追加されている。シーンチェンジ検出装置１０１としては、図２３に示したシーンチェンジ検出装置１０１Ａ、又は、図２４に示したシーンチェンジ検出装置１０１Ｂが用いられる。
【００６６】
静止画特徴量抽出部１２０は、画像検索に利用するための特徴量を検出する手段である。抽出された特徴量は、該当フレームのメインヘッダのＣＯＭマーカセグメント（図２２）にコメントデータとして記述されて記憶部１０２に記憶され、あるいは、動画像ファイルと関連付けた別のファイルとして記憶部１０２に記憶される。この特徴量を外部インターフェース部１０３を介して外部へ出力することもできる。
【００６７】
抽出される静止画特徴量としては、例えば、カラーヒストグラム、代表色、エッジ特徴量、動き（カメラの動き：水平又は垂直方向のパン）の中から１つ以上を指定することができる。
【００６８】
静止画特徴量としてカラーヒストグラム又は代表色が指定された場合、静止画特徴量抽出部１２０は、各フレームの符号化データの低解像度レベル（例えばレベル０）又は上位レイヤ（例えば上位の１レイヤ）の全コンポーネントの符号の復号伸長を行い、得られた画像データよりカラーヒストグラム又は代表色を求める。なお、シーンチェンジ検出装置１０１として図２４に示したシーンチェンジ検出装置１０１Ｂが用いられる場合には、部分復号化部１１４（図２４）を符号化データの一部符号の復号伸長に利用できる。また、シーンチェンジ検出のために低解像度レベル又は上位レイヤの全コンポーネントの符号が復号伸長されるならば、画像一時記憶部１１５に記憶されている画像データを特徴量抽出にそのまま用いることができる。
【００６９】
静止画特徴量としてエッジ特徴量が指定された場合には、静止画特徴量抽出部１２０は、各フレームの符号化データのヘッダ情報を利用して、高解像度レベル（例えば最高解像度レベル）のＨＬ，ＬＨ，ＨＨサブバンドの符号量を位置（コードブロック又はプリシンクト）毎に求める。具体的には、ＰＬＭマーカセグメント（図１８）のＩplm（パケット長リスト）、ＰＬＴマーカセグメント（図１９）のＩplt（パケット長リスト）、ＰＰＭマーカセグメント（図２０）のＩppm（パケットヘッダ集合）、ＰＰＴマーカセグメント（図２１）のＩppt（パケットヘッダ集合）、又は、符号中のパケットヘッダよりパケット長の情報を抽出し、それに基づいて、位置毎に高解像度レベル（例えば最高解像度レベル）のＨＬ，ＬＨ，ＨＨサブバンドのパケット長の合計を符号量として求める。そして、各位置の符号量が最大のサブバンドに対応した方向をエッジ方向として検出し、位置とエッジ方向の組をエッジ特徴量として抽出する。
【００７０】
静止画特徴量として動きが指定された場合には、静止画特徴量抽出部１２０は、各フレームの符号化データのヘッダ情報を利用して、高解像度レベル（例えば最高解像度レベル）のＨＬ，ＬＨサブバンドの符号量を求める。具体的には、ＰＬＭマーカセグメント（図１８）のＩplm（パケット長リスト）、ＰＬＴマーカセグメント（図１９）のＩplt（パケット長リスト）、ＰＰＭマーカセグメント（図２０）のＩppm（パケットヘッダ集合）、ＰＰＴマーカセグメント（図２１）のＩppt（パケットヘッダ集合）、又は、符号中のパケットヘッダよりパケット長の情報を抽出し、それに基づいて、高解像度レベル（例えば最高解像度レベル）のＨＬ，ＬＨサブバンドのパケット長の合計を符号量として求める。カメラを水平方向にパンしながら撮影されたフレームでは、水平方向の手振れが生じた時と同様に、パンしない場合に比べＬＨサブバンドの符号量は一般に減少する。その減少の程度はパンの速度が速いほど激しい。カメラを垂直方向にパンしながら撮影したフレームは、パンしない場合に比べＨＬサブバンドの符号量が一般に減少し、その減少程度はパン速度が速いほど激しい。そこで、静止画特徴量抽出部１２０は、連続したフレーム間のＬＨ，ＨＬサブバンドの符号量に基づいて、パンの有無と方向、さらには速度（低速、中速、高速など）を判定し、それを動きの特徴量として抽出する。なお、低解像度又は上位レイヤの符号を復号伸長した画像データを用い、フレーム間の動きベクトルを抽出してもよく、これも本発明に包含される。
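A minimal sketch of the pan determination described above follows. It assumes that reference code amounts for the LH and HL subbands of a frame taken without panning are available (for example from earlier frames), which the source does not specify; the function name `classify_pan` and the drop ratio of 0.7 are likewise illustrative assumptions.

```python
def classify_pan(baseline_lh, baseline_hl, lh, hl, drop=0.7):
    """Classify camera pan from high-resolution LH/HL subband code amounts.

    Per the description, a horizontal pan generally reduces the LH code
    amount and a vertical pan reduces the HL code amount, relative to a
    frame taken without panning. The baselines must be positive.
    """
    lh_ratio = lh / baseline_lh
    hl_ratio = hl / baseline_hl
    if lh_ratio < drop and lh_ratio <= hl_ratio:
        return "horizontal"
    if hl_ratio < drop:
        return "vertical"
    return "none"
```

A speed class (low, medium, high) could be derived in the same way by splitting the ratio into bands, since the description states that the reduction grows with pan speed.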
【００７１】
ここまで、Ｍｏｔｉｏｎ−ＪＰＥＧ２０００に準拠した動画像の符号化データを対象として本発明の実施の形態を説明したが、連続した複数の静止画像のそれぞれを動画像のフレームとして圧縮符号化し、同様のヘッダ情報を利用可能な他の圧縮符号化アルゴリズムによる符号化データに対しても、本発明を適用可能であることは以上の説明から明白であろう。
【００７２】
また、以上に説明したシーンチェンジ検出装置の情報取得部、比較判定処理部、部分復号化部、二次判定処理部の機能、静止画特徴量抽出部の機能、シーンチェンジ検出装置におけるシーンチェンジ検出手順（方法）は、パソコンやマイクロコンピュータなどのコンピュータ上でプログラムによって実現することも可能である。そのようなプログラム、及び、それが記録されたコンピュータが読み取り可能な各種記録（記憶）媒体も本発明に包含される。
【００７３】
【発明の効果】
以上の説明から明らかなように、本発明によれば、
（１）符号化データのヘッダ情報から求められる解像度レベル毎の符号量分布の変化、位置毎の符号量分布の変化、コンポーネント毎の符号量分布の変化、レイヤ毎の符号量分布の変化、ＲＯＩ領域の位置の変化、その他の圧縮方法などに関連した様々なヘッダ情報そのものの変化などに着目した極めて多様なシーンチェンジ検出が可能である。
（２）シーンチェンジの検出にヘッダ情報そのものの変化を利用する場合には、ヘッダ情報を抽出してフレーム間で比較するだけであるから、極めて簡易な処理もしくは手段によって、様々なヘッダ情報による多様なシーンチェンジ検出を極めて高速に行うことができる。
（３）シーンチェンジの検出にヘッダ情報から求められる符号量分布などの特徴量の変化を利用する場合には、ヘッダ情報の抽出のほかに特徴量を求めるための演算などのための処理もしくは手段が必要になるが、その処理もしくは手段は簡易なものでよく、また、処理にそれほどの時間を要しないため、高速なシーンチェンジ検出が可能である、等々の効果を得られる。
【図面の簡単な説明】
【図１】ＪＰＥＧ２０００のアルゴリズムを説明するためのブロック図である。
【図２】２次元ウェーブレット変換を説明するための図である。
【図３】符号化データのフォーマットを説明するための図である。
【図４】プリシンクトとコードブロックの説明図である。
【図５】パケットとレイヤの例を示す図である。
【図６】パケットの配列順を例示する図である。
【図７】メインヘッダの構成を示す図である。
【図８】タイルヘッダの構成を示す図である。
【図９】マーカ及びマーカセグメントの一覧表を示す図である。
【図１０】ＳＯＴマーカセグメントの構成を示す図である。
【図１１】ＳＩＺマーカセグメントの構成を示す図である。
【図１２】ＣＯＤマーカセグメントの構成を示す図である。
【図１３】ＣＯＣマーカセグメントの構成を示す図である。
【図１４】ＲＧＮマーカセグメントの構成を示す図である。
【図１５】ＱＣＤマーカセグメントの構成を示す図である。
【図１６】ＱＣＣマーカセグメントの構成を示す図である。
【図１７】ＴＬＭマーカセグメントの構成を示す図である。
【図１８】ＰＬＭマーカセグメントの構成を示す図である。
【図１９】ＰＬＴマーカセグメントの構成を示す図である。
【図２０】ＰＰＭマーカセグメントの構成を示す図である。
【図２１】ＰＰＴマーカセグメントの構成を示す図である。
【図２２】ＣＯＭマーカセグメントの構成を示す図である。
【図２３】本発明の実施の形態１を説明するためのブロック図である。
【図２４】本発明の実施の形態２を説明するためのブロック図である。
【図２５】本発明の実施の形態３を説明するためのブロック図である。
【符号の説明】
１００符号化データ入力部
１０１，１０１Ａ，１０１Ｂシーンチェンジ検出装置
１０２記憶部
１１１情報取得部
１１２情報一時記憶部
１１３比較判定処理部
１１４部分復号化部
１１５画像一時記憶部
１１６二次判定処理部
１２０静止画特徴量抽出部
[0001]
[Technical Field of the Invention]
  The present invention relates to processing of encoded data of moving images, and more particularly, to scene change detection of compressed and encoded moving images.
[0002]
[Prior art]
  Many techniques have been proposed for automatically detecting the frame at which the scene of a moving image changes. Basically, they compare still-image feature values of consecutive frames and detect a frame with a large change as a scene change occurrence frame.
[0003]
  Patent Document 1 describes an apparatus configured so that, in an encoder or decoder of a predictive coding scheme, the prediction error amount, the number of codes, and the number of intra-frame coded (or inter-frame coded) pixels of each frame are counted, and a frame in which these counts change greatly is detected as a scene change occurrence frame. Patent Document 2 describes an apparatus that counts the code amount of each frame of a moving image and detects a scene change when the count exceeds a threshold value.
[0004]
  As an image encoding method related to the present invention, there are JPEG2000 (ISO / IEC FCD 15444-1) and Motion-JPEG2000 (ISO / IEC FCD 15444-3) (for example, see Non-Patent Document 1). Motion-JPEG2000 compresses and encodes a moving image using each of a plurality of continuous still images as a frame, and the format of encoded data of each frame conforms to JPEG2000.
[0005]
[Patent Document 1]
        Japanese Patent No. 3093499
[Patent Document 2]
        JP 2000-78585 A
[Non-Patent Document 1]
        Yasuyuki Nomizu, “Next Generation Image Coding JPEG2000”,
        Triqueps, Inc. February 13, 2001
[0006]
[Problems to be solved by the invention]
  An object of the present invention is to provide a novel apparatus and method that use the header information of the encoded data of a moving image to detect scene changes at high speed with a small amount of processing. Another object of the present invention is to provide a novel apparatus and method that use the header information of the encoded data of a moving image to enable, by simple means or processing, scene change detection based on a variety of feature quantities. A further object of the present invention is to provide a novel apparatus and method that use the header information of the encoded data of a moving image to enable scene change detection that is both fast and highly accurate.
[0007]
[Means for Solving the Problems]
  The image processing apparatus according to the invention of claim 1 comprises:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
scene change detection means for detecting a scene change in the moving image; and
storage means for storing the encoded data of the moving image and the scene change detection result produced by the scene change detection means,
wherein
the scene change detection means comprises:
information acquisition means for extracting, for each frame of the moving image, packet length information from the main header, tile header, or packet header in the encoded data of that frame, and obtaining the total packet length for each resolution level based on the extracted packet length information; and
comparison determination processing means for comparing the total packet length for each resolution level obtained by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each resolution level obtained by the information acquisition means for the frame immediately preceding the current frame, and for determining the current frame to be a scene change occurrence frame when it determines that the difference in the code amount distribution over resolution levels between the current frame and the immediately preceding frame exceeds a predetermined degree.
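The comparison in claim 1 can be illustrated with the following sketch. The claim does not define how the "difference in the code amount distribution" is measured, so the L1 distance between normalized per-level totals and the threshold of 0.3 used here are assumptions, as are the function names; the packet lengths are assumed to have been read from the headers already and tagged with their resolution level.

```python
def totals_by_resolution(packets, num_levels):
    """Sum packet lengths per resolution level.

    packets: iterable of (resolution_level, packet_length) pairs.
    """
    totals = [0] * num_levels
    for level, length in packets:
        totals[level] += length
    return totals


def distribution_distance(prev, curr):
    """L1 distance between the normalized distributions (range 0 to 2)."""
    prev_sum = sum(prev) or 1
    curr_sum = sum(curr) or 1
    return sum(abs(p / prev_sum - c / curr_sum) for p, c in zip(prev, curr))


def is_scene_change(prev, curr, threshold=0.3):
    """True when the distribution difference exceeds the assumed threshold."""
    return distribution_distance(prev, curr) > threshold


prev_totals = totals_by_resolution([(0, 400), (1, 900), (2, 2700)], 3)
curr_totals = totals_by_resolution([(0, 410), (1, 2500), (2, 1090)], 3)
```

Normalizing by the frame total makes the measure respond to a shift of code amount between resolution levels rather than to a change in the overall frame size.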
[0008]
  The image processing apparatus according to the invention of claim 2 comprises:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
scene change detection means for detecting a scene change in the moving image; and
storage means for storing the encoded data of the moving image and the scene change detection result produced by the scene change detection means,
wherein
the scene change detection means comprises:
information acquisition means for extracting, for each frame of the moving image, information on the number of coding passes from the main header, tile header, or packet header in the encoded data of that frame, and obtaining the total number of coding passes for each precinct based on the extracted information on the number of coding passes; and
comparison determination processing means for comparing the total number of coding passes for each precinct obtained by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the total number of coding passes for each precinct obtained by the information acquisition means for the frame immediately preceding the current frame, and for determining the current frame to be a scene change occurrence frame when it determines that the difference in the code amount distribution over precincts between the current frame and the immediately preceding frame exceeds a predetermined degree.
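One possible reading of the per-precinct comparison in claim 2 is sketched below: coding-pass counts are totalled per precinct, and the fraction of precincts whose total changed noticeably is taken as the measure of difference. The relative tolerance of 0.25 and both function names are assumptions; the claim leaves the measure and the "predetermined degree" open.

```python
from collections import defaultdict


def passes_by_precinct(records):
    """Total coding passes per precinct.

    records: iterable of (precinct_index, coding_passes) pairs, assumed to
    have been extracted from packet headers beforehand.
    """
    totals = defaultdict(int)
    for precinct, passes in records:
        totals[precinct] += passes
    return dict(totals)


def changed_fraction(prev, curr, tolerance=0.25):
    """Fraction of precincts whose total changed by more than the tolerance."""
    keys = set(prev) | set(curr)
    if not keys:
        return 0.0
    changed = 0
    for key in keys:
        a, b = prev.get(key, 0), curr.get(key, 0)
        if abs(a - b) > tolerance * max(a, b, 1):
            changed += 1
    return changed / len(keys)


prev_passes = passes_by_precinct([(0, 18), (0, 12), (1, 28), (2, 12), (3, 10)])
curr_passes = passes_by_precinct([(0, 12), (1, 11), (2, 29), (3, 31)])
```

A frame would then be flagged when `changed_fraction` exceeds some chosen threshold, for example one half of the precincts.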
[0009]
  The image processing apparatus according to the invention of claim 3 comprises:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
scene change detection means for detecting a scene change in the moving image; and
storage means for storing the encoded data of the moving image and the scene change detection result produced by the scene change detection means,
wherein
the scene change detection means comprises:
information acquisition means for extracting, for each frame of the moving image, packet length information from the main header, tile header, or packet header in the encoded data of that frame, and obtaining the total packet length for each component based on the extracted packet length information; and
comparison determination processing means for comparing the total packet length for each component obtained by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each component obtained by the information acquisition means for the frame immediately preceding the current frame, and for determining the current frame to be a scene change occurrence frame when it determines that the difference in the code amount distribution over components between the current frame and the immediately preceding frame exceeds a predetermined degree.
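To make the extraction step of claim 3 concrete, the sketch below decodes a list of packet lengths as carried in the Iplt or Iplm field of a PLT or PLM marker segment and sums them per component. It assumes the variable-length coding generally documented for JPEG 2000 Part 1, in which each byte carries 7 bits of the length and a set high bit means that another byte follows; this should be verified against the standard. The mapping from packet to component is assumed to be known from the progression order, and both function names are hypothetical.

```python
def decode_packet_lengths(iplt):
    """Decode packet lengths from 7-bit groups with a continuation flag."""
    lengths = []
    value = 0
    for byte in iplt:
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # high bit clear: last byte of this length
            lengths.append(value)
            value = 0
    return lengths


def totals_by_component(lengths, components, num_components):
    """Sum packet lengths per component.

    components[i] is the component index of packet i (assumed known).
    """
    totals = [0] * num_components
    for length, comp in zip(lengths, components):
        totals[comp] += length
    return totals


sample = bytes([0x05, 0x81, 0x00, 0x82, 0x2C])
sample_lengths = decode_packet_lengths(sample)
```

Reading lengths from PLT or PLM segments, when the encoder has written them, avoids walking the packet headers themselves, which fits the stated aim of a small processing load.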
[0010]
  The image processing apparatus according to the invention of claim 4 comprises:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
scene change detection means for detecting a scene change in the moving image; and
storage means for storing the encoded data of the moving image and the scene change detection result produced by the scene change detection means,
wherein
the scene change detection means comprises:
information acquisition means for extracting, for each frame of the moving image, packet length information from the main header, tile header, or packet header in the encoded data of that frame, and obtaining the total packet length for each layer based on the extracted packet length information; and
comparison determination processing means for comparing the total packet length for each layer obtained by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each layer obtained by the information acquisition means for the frame immediately preceding the current frame, and for determining the current frame to be a scene change occurrence frame when it determines that the difference in the code amount distribution over layers between the current frame and the immediately preceding frame exceeds a predetermined degree.
[0011]
  The image processing apparatus according to the invention of claim 5 comprises:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
scene change detection means for detecting a scene change in the moving image; and
storage means for storing the encoded data of the moving image and the scene change detection result produced by the scene change detection means,
wherein
the scene change detection means comprises:
information acquisition means for extracting, for each frame of the moving image, information on the number of coding passes from the main header, tile header, or packet header in the encoded data of that frame, obtaining the total number of coding passes for each precinct from the extracted information, and acquiring, based on the total number of coding passes obtained for each precinct, information indicating the presence or absence of an ROI region on a per-precinct basis; and
comparison determination processing means for comparing the per-precinct information indicating the presence or absence of an ROI region acquired by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the per-precinct information indicating the presence or absence of an ROI region acquired by the information acquisition means for the frame immediately preceding the current frame, and for determining the current frame to be a scene change occurrence frame when it determines that the difference in the position of the ROI region between the current frame and the immediately preceding frame exceeds a predetermined degree.
[0012]
  An image processing apparatus according to the invention of claim 6 comprises:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
scene change detection means for detecting a scene change of the moving image; and
storage means for storing the encoded data of the moving image and a scene change detection result obtained by the scene change detection means,
wherein the scene change detection means includes:
information acquisition means for acquiring, for each frame of the moving image, information indicating the presence or absence of an ROI area in units of tiles by extracting an RGN marker segment from a tile header in the encoded data; and
comparison determination processing means for comparing the information indicating the presence or absence of the ROI area in units of tiles acquired by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the corresponding information acquired by the information acquisition means for the frame immediately preceding the current frame, and for determining the current frame to be a scene change occurrence frame when the difference in the position of the ROI area between the current frame and the immediately preceding frame is determined to exceed a predetermined level.
[0013]
  An image processing apparatus according to the invention of claim 7 comprises:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
scene change detection means for detecting a scene change of the moving image; and
storage means for storing the encoded data of the moving image and a scene change detection result obtained by the scene change detection means,
wherein the scene change detection means includes:
information acquisition means for acquiring, for each frame of the moving image, predetermined header information related to the compression method from the main header in the encoded data; and
comparison determination processing means for comparing the predetermined header information acquired by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the predetermined header information acquired by the information acquisition means for the frame immediately preceding the current frame, and for determining the current frame to be a scene change occurrence frame when the compression method of the current frame is determined to differ from that of the immediately preceding frame.
[0014]
  An image processing method according to the invention of claim 8 comprises:
  an encoded data input step of inputting encoded data of a moving image compliant with Motion-JPEG2000;
a scene change detection step of detecting a scene change of the moving image; and
a storage step of storing the encoded data of the moving image and a scene change detection result obtained in the scene change detection step,
wherein the scene change detection step includes:
an information acquisition step of extracting, for each frame of the moving image, packet length information from the main header, tile header, or packet header in the encoded data, and obtaining the total packet length for each resolution level based on the extracted packet length information; and
a comparison determination processing step of comparing the total packet length for each resolution level obtained in the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each resolution level obtained in the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the difference in code amount distribution for each resolution level between the current frame and the immediately preceding frame is determined to exceed a predetermined level.
[0015]
  An image processing method according to the invention of claim 9 comprises:
  an encoded data input step of inputting encoded data of a moving image compliant with Motion-JPEG2000;
a scene change detection step of detecting a scene change of the moving image; and
a storage step of storing the encoded data of the moving image and a scene change detection result obtained in the scene change detection step,
wherein the scene change detection step includes:
an information acquisition step of extracting, for each frame of the moving image, information on the number of code passes from the main header, tile header, or packet header in the encoded data, and obtaining the total number of code passes for each precinct based on the extracted information; and
a comparison determination processing step of comparing the total number of code passes for each precinct obtained in the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the total number of code passes for each precinct obtained in the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the difference in code amount distribution for each precinct between the current frame and the immediately preceding frame is determined to exceed a predetermined level.
[0016]
  An image processing method according to the invention of claim 10 comprises:
  an encoded data input step of inputting encoded data of a moving image compliant with Motion-JPEG2000;
a scene change detection step of detecting a scene change of the moving image; and
a storage step of storing the encoded data of the moving image and a scene change detection result obtained in the scene change detection step,
wherein the scene change detection step includes:
an information acquisition step of extracting, for each frame of the moving image, packet length information from the main header, tile header, or packet header in the encoded data, and obtaining the total packet length for each component based on the extracted packet length information; and
a comparison determination processing step of comparing the total packet length for each component obtained in the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each component obtained in the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the difference in code amount distribution for each component between the current frame and the immediately preceding frame is determined to exceed a predetermined level.
[0017]
  An image processing method according to the invention of claim 11 comprises:
  an encoded data input step of inputting encoded data of a moving image compliant with Motion-JPEG2000;
a scene change detection step of detecting a scene change of the moving image; and
a storage step of storing the encoded data of the moving image and a scene change detection result obtained in the scene change detection step,
wherein the scene change detection step includes:
an information acquisition step of extracting, for each frame of the moving image, packet length information from the main header, tile header, or packet header in the encoded data, and obtaining the total packet length for each layer based on the extracted packet length information; and
a comparison determination processing step of comparing the total packet length for each layer obtained in the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each layer obtained in the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the difference in code amount distribution for each layer between the current frame and the immediately preceding frame is determined to exceed a predetermined level.
[0018]
  An image processing method according to the invention of claim 12 comprises:
  an encoded data input step of inputting encoded data of a moving image compliant with Motion-JPEG2000;
a scene change detection step of detecting a scene change of the moving image; and
a storage step of storing the encoded data of the moving image and a scene change detection result obtained in the scene change detection step,
wherein the scene change detection step includes:
an information acquisition step of extracting, for each frame of the moving image, information on the number of code passes from the main header, tile header, or packet header in the encoded data, obtaining the total number of code passes for each precinct based on the extracted information, and acquiring information indicating the presence or absence of an ROI area in units of precincts based on the obtained total number of code passes for each precinct; and
a comparison determination processing step of comparing the information indicating the presence or absence of the ROI area in units of precincts acquired in the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the corresponding information acquired in the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the difference in the position of the ROI area between the current frame and the immediately preceding frame is determined to exceed a predetermined level.
[0019]
  An image processing method according to the invention of claim 13 comprises:
  an encoded data input step of inputting encoded data of a moving image compliant with Motion-JPEG2000;
a scene change detection step of detecting a scene change of the moving image; and
a storage step of storing the encoded data of the moving image and a scene change detection result obtained in the scene change detection step,
wherein the scene change detection step includes:
an information acquisition step of acquiring, for each frame of the moving image, information indicating the presence or absence of an ROI area in units of tiles by extracting an RGN marker segment from a tile header in the encoded data; and
a comparison determination processing step of comparing the information indicating the presence or absence of the ROI area in units of tiles acquired in the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the corresponding information acquired in the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the difference in the position of the ROI area between the current frame and the immediately preceding frame is determined to exceed a predetermined level.
[0020]
  An image processing method according to the invention of claim 14 comprises:
  an encoded data input step of inputting encoded data of a moving image compliant with Motion-JPEG2000;
a scene change detection step of detecting a scene change of the moving image; and
a storage step of storing the encoded data of the moving image and a scene change detection result obtained in the scene change detection step,
wherein the scene change detection step includes:
an information acquisition step of acquiring, for each frame of the moving image, predetermined header information related to the compression method from the main header in the encoded data; and
a comparison determination processing step of comparing the predetermined header information acquired in the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the predetermined header information acquired in the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the compression method of the current frame is determined to differ from that of the immediately preceding frame.
[0021]
DETAILED DESCRIPTION OF THE INVENTION
  In the embodiment described below, encoded data to be processed is encoded data of a moving image according to Motion-JPEG2000 (ISO / IEC FCD 15444-3). Since the format of the encoded data of each frame conforms to JPEG2000, JPEG2000 (ISO / IEC FCD 15444-1) will be outlined in a range related to the following description.
[0022]
  FIG. 1 is a simplified block diagram for explaining a JPEG2000 compression encoding algorithm. Image data to be compression-encoded (image data of each frame in the case of handling moving images) is divided into non-overlapping rectangular areas called tiles for each component, and processed for each component in units of tiles. However, it is possible to make the tile size the same as the image size, that is, not to perform tile division.
[0023]
  The tile image is subjected to color space conversion from RGB data or CMY data to YCrCb data for the purpose of improving the compression rate (step S1). This color space conversion may be omitted.
[0024]
  A two-dimensional wavelet transform (discrete wavelet transform: DWT) is performed on each tile image of each component after color space conversion (step S2).
[0025]
  FIG. 2 is an explanatory diagram of the wavelet transform when the number of decomposition levels is 3. The tile image (decomposition level 0) shown in FIG. 2A is divided into the 1LL, 1HL, 1LH, and 1HH subbands shown in FIG. 2B by a two-dimensional wavelet transform. Applying the two-dimensional wavelet transform to the coefficients of the 1LL subband divides it into the 2LL, 2HL, 2LH, and 2HH subbands shown in FIG. 2C. Applying the two-dimensional wavelet transform to the coefficients of the 2LL subband divides it into the 3LL, 3HL, 3LH, and 3HH subbands shown in FIG. 2D. As for the relationship between the decomposition level and the resolution level, the number in parentheses on each subband in FIG. 2D indicates the resolution level of that subband.
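The relationship between decomposition level and resolution level can be made concrete with a short sketch. The function name `resolution_level` and the string labels such as "2HL" are illustrative assumptions, not part of the original description; the mapping itself follows the usual JPEG2000 convention in which the deepest LL subband is resolution level 0 and the subbands of decomposition level 1 form the highest resolution level.

```python
def resolution_level(subband: str, num_decomposition_levels: int) -> int:
    """Map a subband label such as '2HL' to its resolution level.

    Hypothetical helper: labels are '<decomposition level><orientation>'.
    """
    level = int(subband[:-2])      # decomposition level, e.g. 2 for '2HL'
    orientation = subband[-2:]     # 'LL', 'HL', 'LH' or 'HH'
    if orientation == "LL":
        # Only the LL subband of the deepest level remains after the
        # recursive (octave) division; it is resolution level 0.
        if level != num_decomposition_levels:
            raise ValueError("only the deepest LL subband is kept")
        return 0
    return num_decomposition_levels - level + 1


for name in ["3LL", "3HL", "2LH", "1HH"]:
    print(name, resolution_level(name, 3))
```

With three decomposition levels this yields four resolution levels (0 to 3), which agrees with the later example where two decomposition levels give three resolution levels.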
[0026]
  The wavelet coefficients obtained by this recursive division (octave division) of the low frequency component (LL subband coefficients) are quantized for each subband (step S3). JPEG2000 supports both lossless (reversible) compression and lossy (irreversible) compression. In the case of lossless compression, the quantization step width is always 1, so no quantization is performed at this stage.
[0027]
  Each subband coefficient after quantization is entropy-coded (step S4). This entropy coding uses a method called EBCOT (Embedded Block Coding with Optimized Truncation), which consists of block division, coefficient modeling, and binary arithmetic coding. The bit planes of each quantized subband coefficient are coded from the upper plane to the lower plane, in units of blocks called code blocks.
[0028]
  The last two steps S5 and S6 are a code forming process. First, in step S5, a packet is created by combining the codes of the code blocks generated in step S4. In the next step S6, the packets generated in step S5 are arranged according to the progression order, and necessary tag information is added to create encoded data of a predetermined format. In JPEG2000, five types of progression orders are defined for the code order control by combinations of resolution levels, precincts (positions), layers, and components (color components such as YCrCb and RGB).
[0029]
  The format of the JPEG2000 encoded data generated in this way is shown in FIG. 3. As shown in FIG. 3, the encoded data starts with a tag called the SOC marker indicating its beginning, followed by tag information called the main header (Main Header), which describes encoding parameters, quantization parameters, and the like, and then by the code data for each tile. The code data for each tile starts with a tag called the SOT marker and consists of tag information called the tile header (Tile Header), a tag called the SOD marker, and tile data (Tile Data) containing the code string of the tile. A tag called the EOC marker, indicating the end, is placed after the last tile data.
[0030]
  Here, the precinct, the code block, the packet, and the layer will be briefly described. There is a size relationship of image ≧ tile ≧ subband ≧ precinct ≧ code block.
[0031]
  A precinct is a rectangular region of subbands, and a set of three regions at the same spatial position of HL, LH, and HH subbands having the same decomposition level is treated as one precinct. However, in the LL subband, one area is treated as one precinct. It is also possible to make the precinct size the same as the subband. A rectangular area obtained by dividing the precinct is a code block. FIG. 4 illustrates one precinct and code block at the decomposition level 1. A set of three regions at the same spatial position, denoted as precinct in the figure, is treated as one precinct.
[0032]
  A packet is a collection of a part of codes of all code blocks included in the precinct (for example, codes of three bit planes from the most significant bit to the third bit). Packets with an empty code are allowed. The code of the code block is collected to generate a packet, and the encoded data is formed by arranging the packet according to a desired progression order. The portion below SOD for each tile in FIG. 3 is a set of packets.
[0033]
  When the packets of all precincts (that is, all code blocks and all subbands) are collected, a part of the code for the entire image area is obtained (for example, the codes of the bit planes from the most significant bit plane to the third bit plane of the wavelet coefficients of the entire image area). This is a layer (however, as in the example below, a layer need not include the packets of all precincts). Therefore, the more layers are decoded at the time of decompression, the better the quality of the reproduced image. In other words, the layer can be regarded as a unit of image quality. When all layers are collected, the codes of all bit planes of the entire image are obtained.
[0034]
  FIG. 5 shows an example of packets and layers when the number of decomposition levels = 2 (the number of resolution levels = 3). In the figure, each small vertically long rectangle is a packet, and the number shown inside it is the packet number. The layers are illustrated as shaded horizontally long rectangular regions. In this example, the code is divided into ten layers: layer 0 consisting of the codes of packets 0 to 16, layer 1 consisting of the codes of packets 17 to 33, layer 2 consisting of the codes of packets 34 to 50, layer 3 consisting of the codes of packets 51 to 67, layer 4 consisting of the codes of packets 68 to 84, layer 5 consisting of the codes of packets 85 to 101, layer 6 consisting of the codes of packets 102 to 118, layer 7 consisting of the codes of packets 119 to 135, layer 8 consisting of the codes of packets 136 to 148, and layer 9 consisting of the codes of the remaining packets 149 to 161. Note that the correspondence between packets and precincts varies with the progression order, the number of layers, and the like, and the layer configuration shown above is merely an example.
[0035]
  In JPEG2000, five progression orders of LRCP, RLCP, RPCL, PCRL, and CPRL are defined. Here, L is a layer, R is a resolution level, C is a component, and P is a precinct.
[0036]
  In the case of LRCP progression, the order of packet arrangement (when encoding) or packet interpretation (when decoding) can be expressed by the following for loop nested in the order of L, R, C, and P.
        for (layer) {
          for (resolution level) {
            for (component) {
              for (precinct) {
                place packet      : when encoding
                interpret packet  : when decoding
              }
            }
          }
        }
[0037]
  As a specific example, when the image size = 100 × 100 pixels (no tile division), the number of layers = 2, the number of resolution levels = 3 (levels 0 to 2), the number of components = 3, and the precinct size = 32 × 32, the 36 packets are arranged and interpreted in the order shown in FIG. 6.
[0038]
  In the case of RLCP progression, packet arrangement (when encoding) or packet interpretation (when decoding) is performed in the order given by the following nested for loop.
        for (resolution level) {
          for (layer) {
            for (component) {
              for (precinct) {
                place packet      : when encoding
                interpret packet  : when decoding
              }
            }
          }
        }
For the other progression orders, a similar nested for loop determines the arrangement order or interpretation order of the packets.
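The nested loops for the five progression orders can be sketched generically as follows. This is a minimal illustration, not the codec's actual packet sequencer: the function name `packet_order` is hypothetical, and the sketch assumes the same number of precincts at every resolution level, which real codestreams generally do not guarantee.

```python
from itertools import product


def packet_order(progression, layers, resolutions, components, precincts):
    """Yield packet coordinates in the order implied by `progression`.

    `progression` is one of "LRCP", "RLCP", "RPCL", "PCRL", "CPRL".
    The first letter is the outermost loop, the last letter the innermost.
    Simplifying assumption: every resolution level has `precincts` precincts.
    """
    sizes = {"L": layers, "R": resolutions, "C": components, "P": precincts}
    ranges = [range(sizes[axis]) for axis in progression]
    # itertools.product varies the last range fastest, exactly like the
    # innermost loop of the nested for loops.
    for indices in product(*ranges):
        yield dict(zip(progression, indices))


for packet in packet_order("RLCP", layers=2, resolutions=2,
                           components=1, precincts=1):
    print(packet)
```

Swapping the progression string is enough to see how the same set of packets is reordered, for example resolution-major for RLCP versus layer-major for LRCP.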
[0039]
  In addition, JPEG2000 has a function of improving the image quality of a selected region (ROI region) as compared with other regions, and the basic specification defines a Max Shift method by bit shift of wavelet coefficients in the ROI region.
[0040]
  In the present invention, since information on the main header, tile header, and packet header added to the encoded data is used, an outline of the header information is shown below.
  FIG. 7 shows the configuration of the main header. The SIZ, COD, and QCD marker segments are essential in the main header information, but the other marker segments are optional.
  FIG. 8 shows the configuration of the tile header. (A) is a header added to the head of tile data, and (b) is a header added to the head of the divided tile subsequence when the inside of the tile is divided into a plurality. There is no mandatory marker segment in the tile header, all are optional.
  FIG. 9 shows a list of markers and marker segments.
  The configuration of the SOT marker segment is shown in FIG. 10, the SIZ marker segment in FIG. 11, the COD marker segment in FIG. 12, the COC marker segment in FIG. 13, the RGN marker segment in FIG. 14, the QCD marker segment in FIG. 15, the QCC marker segment in FIG. 16, the TLM marker segment in FIG. 17, the PLM marker segment in FIG. 18, the PLT marker segment in FIG. 19, the PPM marker segment in FIG. 20, the PPT marker segment in FIG. 21, and the COM marker segment in FIG. 22.
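As an illustration of how these marker segments can be located without decoding any image data, the following sketch walks the main header of a codestream. The function name `main_header_segments` is hypothetical; the marker codes are those assigned in JPEG2000 Part 1, and the sketch assumes a well-formed codestream in which each segment's two-byte length field counts itself but not the marker.

```python
import struct

# Marker codes as assigned in JPEG2000 Part 1.
SOC, SOT = 0xFF4F, 0xFF90
MARKER_NAMES = {
    0xFF51: "SIZ", 0xFF52: "COD", 0xFF53: "COC", 0xFF5E: "RGN",
    0xFF5C: "QCD", 0xFF5D: "QCC", 0xFF5F: "POC", 0xFF55: "TLM",
    0xFF57: "PLM", 0xFF60: "PPM", 0xFF64: "COM",
}


def main_header_segments(codestream: bytes):
    """Return [(marker name, payload bytes)] for the main header."""
    (marker,) = struct.unpack_from(">H", codestream, 0)
    if marker != SOC:
        raise ValueError("codestream does not start with an SOC marker")
    pos, segments = 2, []
    while pos + 4 <= len(codestream):
        marker, length = struct.unpack_from(">HH", codestream, pos)
        if marker == SOT:          # first tile header: main header is over
            break
        # The length field covers itself (2 bytes) plus the payload.
        payload = codestream[pos + 4 : pos + 2 + length]
        segments.append((MARKER_NAMES.get(marker, hex(marker)), payload))
        pos += 2 + length
    return segments


# Synthetic bytes for illustration only (not a valid SIZ payload).
demo = (b"\xff\x4f" + b"\xff\x51\x00\x04" + b"ab"
        + b"\xff\x64\x00\x05" + b"xyz" + b"\xff\x90\x00\x0a")
print(main_header_segments(demo))
```

Because only marker and length fields are read, the cost of this scan is independent of the size of the compressed image data.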
[0041]
  A packet consists of the packet data, which is its main body, with a packet header added. The packet header includes information such as the length of the packet data, the number of code passes, and the number of zero bit planes (see Non-Patent Document 1).
[0042]
  [Embodiment 1] FIG. 23 is a block diagram for explaining an example of an image processing apparatus according to the present invention. This image processing apparatus may be realized as a stand-alone device, as a part of an imaging device such as a digital video camera or a surveillance camera, an image editing device, a video playback device, an image search system, an image database system, or the like, or by a program using computer hardware such as a personal computer. All of these forms are included in the present invention.
[0043]
  The image processing apparatus includes an encoded data input unit (encoded data input means) 100, a scene change detection apparatus (scene change detection means) 101A, a storage unit 102, and an external interface unit 103.
[0044]
  The encoded data input unit 100 is a means for taking in the encoded data of a Motion-JPEG2000 moving image. More specifically, the encoded data input unit 100 is, for example, an interface unit that takes in encoded data directly from the outside through a wired or wireless transmission path or network, a unit that reads encoded data from a recording medium, or a combination of an imaging unit that captures a subject and inputs a moving image and a Motion-JPEG2000 compliant encoder that compresses and encodes the moving image.
[0045]
  The scene change detection apparatus 101A automatically detects a scene change of a moving image using specific header information in the encoded data of each frame of the moving image. It comprises an information acquisition unit 111, an information temporary storage unit 112, and a comparison determination processing unit 113.
[0046]
  The storage unit 102 is used for storing the encoded data of moving images and the detection results of the scene change detection apparatus 101A. Various methods can be used to store the scene change detection result, such as describing it in the file header of the moving image file, describing it as comment data in the COM marker segment (FIG. 22) of the main header of the frame, or storing it as a separate file associated with the moving image file. The external interface unit 103 is used to output the encoded data and the scene change detection result to the outside.
[0047]
  Next, the scene change detection apparatus 101A will be described in detail. The information acquisition unit (information acquisition means) 111 extracts one or more specific pieces of header information from the encoded data of each frame. As will be described later, various scene change detection modes can be selected. When a mode that detects a scene change by comparing the header information itself is selected, the information acquisition unit 111 outputs the extracted header information as it is. When a mode that detects a scene change by comparing feature amounts calculated from the header information is selected, the information acquisition unit 111 calculates the feature amount from the extracted header information and outputs the feature amount. Depending on the mode, both the extracted header information and the calculated feature amount are output.
[0048]
  The header information and/or feature amount output from the information acquisition unit 111 is input to the comparison determination processing unit (comparison determination processing means) 113 and is also stored in the information temporary storage unit 112. The information temporary storage unit 112 holds the header information and/or feature amounts of the current frame and the immediately preceding frame, and those of earlier frames are deleted in sequence. The header information and/or feature amount of the frame immediately preceding the current frame is input from the information temporary storage unit 112 to the comparison determination processing unit 113. The comparison determination processing unit 113 determines whether a scene change has occurred by comparing the header information itself of the current frame and the immediately preceding frame and/or by comparing the feature amounts calculated from the header information. The information temporary storage unit 112 may, of course, be integrated with the information acquisition unit 111 or the comparison determination processing unit 113.
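The data flow among the three units can be sketched as a simple loop that keeps only the previous frame's information. This is an illustrative sketch: the name `detect_scene_changes` and the callback `differs` are assumptions standing in for the comparison determination processing, whose concrete criterion depends on the selected mode.

```python
def detect_scene_changes(frame_features, differs):
    """Return the indices of frames judged to be scene change frames.

    frame_features: iterable of per-frame header information or feature
                    amounts (whatever the information acquisition yields).
    differs:        hypothetical callback (previous, current) -> bool that
                    plays the role of the comparison determination.
    """
    previous = None            # plays the role of the temporary storage
    changes = []
    for index, current in enumerate(frame_features):
        if previous is not None and differs(previous, current):
            changes.append(index)
        previous = current     # information of older frames is discarded
    return changes


# Toy feature: one number per frame, compared against a fixed threshold.
features = [100, 102, 300, 305]
print(detect_scene_changes(features, lambda a, b: abs(a - b) > 10))
```

Only two frames' worth of information is ever held, which mirrors the sequential deletion described for the information temporary storage unit 112.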
[0049]
  The scene change detection apparatus 101A can select from various detection modes.
Each mode is described concretely below, but a mode that combines several modes can also be selected. In such a combination mode, the information acquisition unit 111 extracts the pieces of header information used in each of the combined modes and/or calculates the feature amounts, and the comparison determination processing unit 113 performs the comparison determination processing of each combined mode and evaluates the results to determine whether a scene change has occurred. For example, a scene change may be declared only when every mode reports a scene change, or the decision may be made by majority vote.
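The two ways of evaluating a combination of modes mentioned here, unanimity and majority vote, can be sketched as follows. The function name `combine_mode_results` and the `rule` parameter are illustrative assumptions; the description leaves the evaluation method open.

```python
def combine_mode_results(votes, rule="all"):
    """Combine per-mode scene change decisions (True/False) into one.

    rule="all":      scene change only if every mode reports one.
    rule="majority": scene change if more than half of the modes report one.
    """
    votes = list(votes)
    if not votes:
        raise ValueError("at least one mode result is required")
    if rule == "all":
        return all(votes)
    if rule == "majority":
        return sum(votes) * 2 > len(votes)
    raise ValueError(f"unknown rule: {rule}")


print(combine_mode_results([True, True, False], rule="all"))       # False
print(combine_mode_results([True, True, False], rule="majority"))  # True
```

The unanimity rule favors precision (fewer false detections), while the majority rule tolerates one dissenting mode.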
[0050]
  [Mode A] This mode detects a scene change by focusing on a change in the code amount distribution for each resolution level (corresponding to claims 1 and 8). The comparison determination processing unit 113 determines that the current frame is a scene change occurrence frame when the difference in code amount distribution for each resolution level between the current frame and the immediately preceding frame exceeds a predetermined level. In this mode, the information acquisition unit 111 extracts the packet length information of each packet from Iplm (packet length list) of the PLM marker segment (FIG. 18), Iplt (packet length list) of the PLT marker segment (FIG. 19), Ippm (packet header set) of the PPM marker segment (FIG. 20), Ippt (packet header set) of the PPT marker segment (FIG. 21), or the packet headers in the code, and obtains the total packet length for each resolution level based on the extracted information. The total packet length need not be obtained for all resolution levels; it may be obtained only for some specific resolution levels.
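A minimal sketch of the Mode A feature and comparison is given below. The description does not specify how the "difference in code amount distribution" is measured, so the L1 distance between normalized distributions and the default threshold used here are assumptions chosen only for illustration; the function names and the `(resolution_level, packet_length)` input format are likewise hypothetical.

```python
from collections import defaultdict


def totals_by_resolution(packets):
    """Sum packet lengths per resolution level.

    packets: iterable of (resolution_level, packet_length) pairs, assumed
             to have been read from PLM/PLT/PPM/PPT or packet headers.
    """
    totals = defaultdict(int)
    for level, length in packets:
        totals[level] += length
    return dict(totals)


def distribution_distance(prev_totals, cur_totals):
    """L1 distance between the two normalized distributions (assumption)."""
    keys = set(prev_totals) | set(cur_totals)

    def normalize(totals):
        whole = sum(totals.values()) or 1
        return {k: totals.get(k, 0) / whole for k in keys}

    p, c = normalize(prev_totals), normalize(cur_totals)
    return sum(abs(p[k] - c[k]) for k in keys)


def is_scene_change(prev_packets, cur_packets, threshold=0.5):
    """True when the distribution differs by more than the threshold."""
    distance = distribution_distance(totals_by_resolution(prev_packets),
                                     totals_by_resolution(cur_packets))
    return distance > threshold


prev = [(0, 100), (1, 100), (0, 50)]
print(totals_by_resolution(prev))
print(is_scene_change(prev, prev))
```

The same two functions apply unchanged if the key is a component or a layer instead of a resolution level, since only the grouping key differs.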
[0051]
  [Mode B] This mode detects a scene change by focusing on a change in the code amount distribution for each position (tile or precinct) (corresponding to claims 2 and 9). The comparison determination processing unit 113 determines that the current frame is a scene change occurrence frame when the difference in code amount distribution for each precinct between the current frame and the immediately preceding frame exceeds a predetermined level. Scene change detection focusing on a change in the code amount distribution for each code block is also possible and is also included in the present invention. In this mode, the information acquisition unit 111 uses the following header information:
  1) it extracts tile length information from Ptlm (tile length) of the TLM marker segment (FIG. 17) or Psot (tile length) of the SOT marker segment (FIG. 10); or
  2) it extracts information on the number of code passes from Ippm (packet header set) of the PPM marker segment (FIG. 20), Ippt (packet header set) of the PPT marker segment (FIG. 21), or the packet headers in the code, and obtains the total number of code passes for each precinct based on that information.
[0052]
  Both the tile length and the total number of code passes for each precinct may be extracted, and this is also included in the present invention. The above feature amounts may be obtained for only some tiles, some precincts, or some code blocks.
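Option 1) of Mode B can be sketched as follows, assuming the JPEG 2000 Part 1 layout of the SOT marker segment (marker 0xFF90, then Lsot, Isot, Psot, TPsot, TNsot) and assuming the caller already knows the byte offset of the first SOT marker. The function name and the handling of Psot = 0 are assumptions of this example, not the method of the specification.

```python
import struct

SOT_MARKER = 0xFF90  # start-of-tile-part marker code


def tile_lengths(codestream, first_sot_offset):
    """Walk the SOT marker segments and return {tile_index: total bytes}.

    Psot is the length of a tile-part counted from the first byte of its SOT
    marker, so adding it to the current offset reaches the next SOT marker
    without decoding any packet data.
    """
    totals = {}
    pos = first_sot_offset
    while pos + 12 <= len(codestream):
        marker, _lsot, isot, psot, _tpsot, _tnsot = struct.unpack_from(
            ">HHHIBB", codestream, pos
        )
        if marker != SOT_MARKER:
            break
        if psot == 0:
            # Psot = 0 means the tile-part runs up to the EOC marker; this
            # sketch assumes the codestream ends with the 2-byte EOC marker.
            psot = len(codestream) - pos - 2
        totals[isot] = totals.get(isot, 0) + psot
        pos += psot
    return totals
```

The per-tile totals of two consecutive frames can then be compared in the same way as any other code amount distribution.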
[0053]
  [Mode C] This mode detects a scene change by focusing on a change in the code amount distribution for each component (corresponding to claims 3 and 10). The comparison determination processing unit 113 determines that the current frame is a scene change occurrence frame when the difference in the code amount distribution for each component between the current frame and the immediately preceding frame exceeds a predetermined level. In this mode, the information acquisition unit 111 extracts the packet length information of each packet from Iplm (packet length list) of the PLM marker segment (FIG. 18), Iplt (packet length list) of the PLT marker segment (FIG. 19), Ippm (packet header set) of the PPM marker segment (FIG. 20), Ippt (packet header set) of the PPT marker segment (FIG. 21), or the packet headers in the code, and obtains the total packet length for each component from it. Note that the feature amount may be obtained for only some of the components.
[0054]
  [Mode D] This mode detects a scene change by focusing on a change in the code amount distribution for each layer (corresponding to claims 4 and 11). The comparison determination processing unit 113 determines that the current frame is a scene change occurrence frame when the difference in the code amount distribution for each layer between the current frame and the immediately preceding frame exceeds a predetermined level. In this mode, the information acquisition unit 111 extracts the packet length information of each packet from Iplm (packet length list) of the PLM marker segment (FIG. 18), Iplt (packet length list) of the PLT marker segment (FIG. 19), Ippm (packet header set) of the PPM marker segment (FIG. 20), Ippt (packet header set) of the PPT marker segment (FIG. 21), or the packet headers in the code, and obtains the total packet length for each layer from it. Note that the feature amount may be obtained for only some of the layers.
[0055]
  [Mode E] This mode detects a scene change by focusing on a change in the position of the ROI area (corresponding to claims 5, 6, 12, and 13). The comparison determination processing unit 113 evaluates the difference in the position of the ROI area between the current frame and the immediately preceding frame, and determines that the current frame is a scene change occurrence frame when the difference exceeds a predetermined level. The information acquisition unit 111 performs one of the following:
  1) The tile header of each tile is checked for the presence or absence of an RGN marker segment, and information indicating the presence or absence of the ROI area is obtained in units of tiles (corresponding to claims 6 and 13). Or
  2) The number of code passes of each packet is extracted from Ippm (packet header set) of the PPM marker segment (FIG. 20), Ippt (packet header set) of the PPT marker segment (FIG. 21), or the packet headers in the code, the total number of code passes is calculated for each precinct (or code block), and information indicating the presence or absence of the ROI area is acquired in units of precincts (or code blocks) based on the number of passes (corresponding to claims 5 and 12).
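Option 1) of Mode E can be sketched as follows. The representation of each tile header as a set of marker names, the function names, and the rule of counting tiles whose ROI flag changed are all assumptions introduced for illustration; the specification leaves the evaluation of the positional difference open.

```python
def roi_map(tile_headers):
    """Return one boolean per tile: True if the tile header holds an RGN segment.

    tile_headers: list with one set of marker names per tile (a hypothetical
    representation of already parsed tile headers).
    """
    return ["RGN" in markers for markers in tile_headers]


def roi_position_changed(prev_headers, cur_headers, max_changed_tiles=1):
    """Return True when more tiles changed ROI status than the allowed count.

    max_changed_tiles is a hypothetical stand-in for the predetermined level.
    """
    prev_map = roi_map(prev_headers)
    cur_map = roi_map(cur_headers)
    if len(prev_map) != len(cur_map):
        # Assumption of this sketch: a different tile count counts as a change.
        return True
    changed = sum(p != c for p, c in zip(prev_map, cur_map))
    return changed > max_changed_tiles
```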
[0056]
  [Mode F] This mode detects a scene change by focusing on a change in the compression method (corresponding to claims 7 and 14). The comparison determination processing unit 113 determines that the current frame is a scene change occurrence frame when the compression method of the current frame differs from that of the immediately preceding frame. In this mode, the information acquisition unit 111 performs one of the following:
  1) Progression order information is extracted from Sgcod (code style) of the COD marker segment (FIG. 12).
  2) Information on the number of layers is extracted from Sgcod (code style) of the COD marker segment (FIG. 12).
  3) Information of Sqcd (quantization style) and SPqcd (quantization step size) of the QCD marker segment (FIG. 15) is extracted.
  4) Color conversion information is extracted from Sgcod (code style) of the COD marker segment (FIG. 12).
  5) Information on the number of decompositions is extracted from the SPcod (code style) of the COD marker segment (FIG. 12) or the SPcoc (code style) of the COC marker segment (FIG. 13).
  6) Code block size information is extracted from the SPcod (code style) of the COD marker segment (FIG. 12) or the SPcoc (code style) of the COC marker segment (FIG. 13).
  7) Information on the wavelet transform method is extracted from the SPcod (code style) of the COD marker segment (FIG. 12) or the SPcoc (code style) of the COC marker segment (FIG. 13). Or
  8) Precinct size information is extracted from the SPcod (code style) of the COD marker segment (FIG. 12) or the SPcoc (code style) of the COC marker segment (FIG. 13).
[0057]
  In this mode, two or more of the above items of information may be extracted, and this is also included in the present invention.
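Because Mode F compares header values directly, it reduces to an equality test between two frames. The sketch below assumes the relevant COD, COC, and QCD fields have already been parsed into a dictionary per frame; the key names are hypothetical labels for the eight items listed above.

```python
# Hypothetical keys for the header items 1) to 8) listed above.
COMPRESSION_FIELDS = (
    "progression_order",
    "num_layers",
    "quantization_style",
    "color_transform",
    "decomposition_levels",
    "code_block_size",
    "wavelet_transform",
    "precinct_size",
)


def changed_fields(prev_header, cur_header, fields=COMPRESSION_FIELDS):
    """List the compression-related fields whose values differ between frames."""
    return [f for f in fields if prev_header.get(f) != cur_header.get(f)]


def compression_method_changed(prev_header, cur_header, fields=COMPRESSION_FIELDS):
    """Return True when any monitored header field differs."""
    return bool(changed_fields(prev_header, cur_header, fields))
```

Passing a shorter tuple as `fields` restricts the comparison to a chosen subset of the items.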
[0058]
  [Mode G] This mode detects a scene change by focusing on a change in image characteristics other than those used in the modes above. The comparison determination processing unit 113 determines that the current frame is a scene change occurrence frame when the image characteristics of the current frame differ from those of the immediately preceding frame. In this mode, the information acquisition unit 111 performs one of the following:
  1) Information on XTsiz and YTsiz (horizontal and vertical sizes of tiles) of the SIZ marker segment (FIG. 11) is extracted.
  2) Information on XTOsiz and YTOsiz (horizontal and vertical offsets of tiles) of the SIZ marker segment (FIG. 11) is extracted.
  3) Information on Csiz (number of components) of the SIZ marker segment (FIG. 11) is extracted.
  4) Information on Ssiz (number of bits) of the SIZ marker segment (FIG. 11) is extracted.
  5) Information on Xsiz and Ysiz (image horizontal and vertical size) of the SIZ marker segment (FIG. 11) is extracted.
  6) Information of XOsiz and YOsiz (horizontal and vertical offsets of the image) of the SIZ marker segment (FIG. 11) is extracted.
  7) Information on XRsiz and YRsiz (number of samples in the horizontal and vertical directions) of the SIZ marker segment (FIG. 11) is extracted. Or
  8) Information of Ccom (comment data) of the COM marker segment (FIG. 22) is extracted.
[0059]
  In this mode as well, two or more of the above items of information may be extracted, and this is also included in the present invention.
[0060]
  [Embodiment 2] FIG. 24 is a block diagram for explaining another example of an image processing apparatus according to the present invention. This image processing apparatus differs from the one shown in FIG. 23 in that the scene change detection apparatus 101A is replaced by a scene change detection apparatus 101B having a different configuration. That is, the scene change detection apparatus 101B includes a partial decoding unit 114, an image temporary storage unit 115, and a secondary determination processing unit 116 in addition to the information acquisition unit 111, the information temporary storage unit 112, and the comparison determination processing unit 113. Since the operations of the information acquisition unit 111 and the comparison determination processing unit 113 are as described above, their description is omitted.
[0061]
  The partial decoding unit 114 is means for decoding and decompressing a part of the encoded data. For example, it decodes and decompresses a specific component or all components of a low resolution level (level 0, level 1, etc.) or of upper layers (the uppermost layer, the upper several layers, etc.). The image data output from the partial decoding unit 114 is input to the secondary determination processing unit 116 and is also stored in the image temporary storage unit 115. The image temporary storage unit 115 temporarily stores the image data of the current frame and the immediately preceding frame, and the image data of the immediately preceding frame is input to the secondary determination processing unit 116.
[0062]
  The determination result of the comparison determination processing unit 113 is also input to the secondary determination processing unit 116, which performs secondary determination processing only when the comparison determination processing unit 113 has determined that the current frame is a scene change occurrence frame. The secondary determination processing compares the image data of the current frame input from the partial decoding unit 114 with the image data of the immediately preceding frame input from the image temporary storage unit 115 and, when the difference exceeds a predetermined level, finally determines the current frame to be a scene change occurrence frame. The determination result of the secondary determination processing unit 116 is output as the final determination result of the scene change detection apparatus 101B.
[0063]
  That is, in the scene change detection apparatus 101B, the comparison determination processing unit 113 performs a primary determination of a scene change by comparing the header information and/or the feature amounts calculated from the header information, and the secondary determination processing unit 116 performs a more precise secondary determination of the scene change by comparing image data. This two-stage determination makes it possible to detect scene changes with higher accuracy. In addition, because the secondary determination compares image data only for frames judged to be scene changes in the primary determination, the scene change detection speed is also improved compared with a configuration that compares image data between all frames.
[0064]
  A conventionally known method can be used for the secondary determination. For example, the method described in Patent Document 1, which obtains a color distribution for each small region of an image and examines the correlation of the color distributions between images, can be used. Whichever method is used, the image data to be compared is obtained by decoding and decompressing only part of the code, so its data amount is much smaller than that of image data obtained by decoding and decompressing all of the code, and the comparison does not take much time. Scene changes can therefore be detected at high speed and with high accuracy.
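The two-stage determination described above can be sketched as a simple loop. The callables `primary_check`, `partial_decode`, and `image_difference` are hypothetical placeholders for the comparison determination processing unit 113, the partial decoding unit 114, and the comparison performed by the secondary determination processing unit 116. For brevity, this sketch decodes frames only when the primary determination fires, whereas the apparatus described above stores the partially decoded image of each frame in the image temporary storage unit 115.

```python
def detect_scene_changes(frames, primary_check, partial_decode,
                         image_difference, threshold):
    """Return the indices of frames finally judged to be scene changes.

    primary_check(prev, cur)  -> bool, header-based primary determination
    partial_decode(frame)     -> low-resolution or upper-layer image data
    image_difference(a, b)    -> numeric difference between two images
    threshold                 -> hypothetical predetermined level
    """
    changes = []
    prev = None
    for index, frame in enumerate(frames):
        if prev is not None and primary_check(prev, frame):
            # Secondary determination runs only for primary candidates.
            difference = image_difference(
                partial_decode(prev), partial_decode(frame)
            )
            if difference > threshold:
                changes.append(index)
        prev = frame
    return changes
```

A frame whose header features differ but whose partially decoded image is close to that of the preceding frame is rejected at the second stage.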
[0065]
  [Embodiment 3] FIG. 25 is a block diagram for explaining another example of an image processing apparatus according to the present invention. In the image processing apparatus shown here, a still image feature amount extraction unit 120 is added. As the scene change detection device 101, the scene change detection device 101A shown in FIG. 23 or the scene change detection device 101B shown in FIG. 24 is used.
[0066]
  The still image feature amount extraction unit 120 is means for extracting feature amounts to be used for image search. An extracted feature amount is either written as comment data in the COM marker segment (FIG. 22) of the main header of the corresponding frame and stored in the storage unit 102, or stored in the storage unit 102 as a separate file associated with the moving image file. The feature amount can also be output to the outside via the external interface unit 103.
[0067]
  As the extracted still image feature amount, for example, one or more of color histogram, representative color, edge feature amount, and motion (camera motion: horizontal or vertical pan) can be designated.
[0068]
  When a color histogram or a representative color is designated as a still image feature amount, the still image feature amount extraction unit 120 decodes and decompresses the codes of a low resolution level (for example, level 0) or of upper layers (for example, the uppermost layer) of the encoded data of each frame, and obtains the color histogram or representative color from the resulting image data. When the scene change detection device 101B shown in FIG. 24 is used as the scene change detection device 101, the partial decoding unit 114 (FIG. 24) can be used for decoding and decompressing this part of the encoded data. Further, if the codes of all components of the low resolution level or the upper layers are decoded and decompressed for scene change detection, the image data stored in the image temporary storage unit 115 can be used as it is for feature amount extraction.
[0069]
  When an edge feature amount is designated as a still image feature amount, the still image feature amount extraction unit 120 uses the header information of the encoded data of each frame to obtain the code amounts of the HL, LH, and HH subbands at a high resolution level (for example, the highest resolution level) for each position (code block or precinct). Specifically, packet length information is extracted from Iplm (packet length list) of the PLM marker segment (FIG. 18), Iplt (packet length list) of the PLT marker segment (FIG. 19), Ippm (packet header set) of the PPM marker segment (FIG. 20), Ippt (packet header set) of the PPT marker segment (FIG. 21), or the packet headers in the code, and, based on the extracted information, the total packet length of each of the HL, LH, and HH subbands at the high resolution level (for example, the highest resolution level) is obtained for each position as the code amount. Then, the direction corresponding to the subband with the maximum code amount at each position is detected as the edge direction, and the set of position and edge direction is extracted as the edge feature amount.
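The selection of the dominant subband at each position can be sketched as follows. The input layout (a dictionary of per-position code amounts) and the function name are assumptions of this example. The sketch returns the subband name rather than an edge direction, because the mapping from subband to direction depends on the subband naming convention in use.

```python
SUBBANDS = ("HL", "LH", "HH")


def dominant_subbands(code_amounts):
    """Return {position: name of the subband with the largest code amount}.

    code_amounts: dict mapping a position (for example a precinct index) to a
    dict of total packet lengths keyed by "HL", "LH", and "HH". Ties are
    resolved in the order HL, LH, HH, which is an arbitrary choice here.
    """
    return {
        position: max(SUBBANDS, key=lambda band: amounts[band])
        for position, amounts in code_amounts.items()
    }
```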
[0070]
  When motion is designated as a still image feature amount, the still image feature amount extraction unit 120 uses the header information of the encoded data of each frame to obtain the code amounts of the HL and LH subbands at a high resolution level (for example, the highest resolution level). Specifically, packet length information is extracted from Iplm (packet length list) of the PLM marker segment (FIG. 18), Iplt (packet length list) of the PLT marker segment (FIG. 19), Ippm (packet header set) of the PPM marker segment (FIG. 20), Ippt (packet header set) of the PPT marker segment (FIG. 21), or the packet headers in the code, and the total packet length of each of the HL and LH subbands at the high resolution level (for example, the highest resolution level) is obtained from it as the code amount. In a frame shot while the camera is panned horizontally, the code amount of the LH subband is generally lower than when no panning is performed, as in the case of horizontal camera shake, and the faster the pan, the greater the decrease. Likewise, in a frame shot while the camera is panned vertically, the code amount of the HL subband is generally lower than when no panning is performed. The still image feature amount extraction unit 120 therefore determines the presence or absence of a pan, its direction, and its speed (low, medium, high, etc.) from the changes in the code amounts of the LH and HL subbands between consecutive frames, and extracts the result as a motion feature amount. Motion vectors between frames may also be extracted using image data obtained by decoding and decompressing the codes of a low resolution level or of upper layers, and this is also included in the present invention.
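The pan determination can be sketched as below, following the relationship stated above (a horizontal pan lowers the LH code amount, a vertical pan lowers the HL code amount). The use of a relative drop against a reference frame, the threshold values, and the three speed classes are assumptions chosen for this example.

```python
def classify_pan(reference, current, drop_threshold=0.2,
                 medium_threshold=0.4, high_threshold=0.7):
    """Classify a pan from the LH and HL code amounts of two frames.

    reference, current: dicts with "LH" and "HL" code amounts; reference is a
    frame assumed to contain no pan. All thresholds are hypothetical.
    Returns (direction, speed), or ("none", None) when no pan is detected.
    """
    def relative_drop(band):
        if reference[band] == 0:
            return 0.0
        return 1.0 - current[band] / reference[band]

    lh_drop = relative_drop("LH")
    hl_drop = relative_drop("HL")
    drop = max(lh_drop, hl_drop)
    if drop < drop_threshold:
        return ("none", None)
    # Per the text: the LH amount falls for a horizontal pan, HL for a vertical pan.
    direction = "horizontal" if lh_drop >= hl_drop else "vertical"
    if drop < medium_threshold:
        speed = "low"
    elif drop < high_threshold:
        speed = "medium"
    else:
        speed = "high"
    return (direction, speed)
```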
[0071]
  The embodiments of the present invention have been described above for encoded data of a moving image compliant with Motion-JPEG2000. However, it will be apparent from the above description that the present invention can also be applied to data encoded by any other compression encoding algorithm that compresses and encodes each of a plurality of consecutive still images as a frame of a moving image and makes similar header information available.
[0072]
  In addition, the functions of the information acquisition unit, the comparison determination processing unit, the partial decoding unit, the secondary determination processing unit, and the still image feature amount extraction unit described above, as well as the scene change detection procedure (method) of the scene change detection device, can also be realized by a program on a computer such as a personal computer or a microcomputer. Such a program and the various computer-readable recording (storage) media on which the program is recorded are also included in the present invention.
[0073]
[Effects of the invention]
  As is clear from the above description, according to the present invention,
(1) An extremely wide variety of scene changes can be detected by focusing on a change in the code amount distribution for each resolution level, for each position, for each component, or for each layer obtained from the header information of the encoded data, on a change in the position of the ROI area, or on a change in the various items of header information themselves, such as those related to the compression method.
(2) When a change in the header information itself is used for scene change detection, the header information only needs to be extracted and compared between frames, so scene change detection can be performed at extremely high speed.
(3) When a change in a feature amount such as a code amount distribution obtained from the header information is used for scene change detection, processing or means for calculating the feature amount is required in addition to the extraction of the header information. However, that processing or means is simple and the processing does not take much time, so high-speed scene change detection is possible.
[Brief description of the drawings]
FIG. 1 is a block diagram for explaining an algorithm of JPEG2000.
FIG. 2 is a diagram for explaining a two-dimensional wavelet transform.
FIG. 3 is a diagram for explaining a format of encoded data.
FIG. 4 is an explanatory diagram of a precinct and a code block.
FIG. 5 is a diagram illustrating an example of a packet and a layer.
FIG. 6 is a diagram illustrating an arrangement order of packets.
FIG. 7 is a diagram illustrating a configuration of a main header.
FIG. 8 is a diagram illustrating a configuration of a tile header.
FIG. 9 is a diagram showing a list of markers and marker segments.
FIG. 10 is a diagram showing a configuration of an SOT marker segment.
FIG. 11 is a diagram illustrating a configuration of an SIZ marker segment.
FIG. 12 is a diagram showing a configuration of a COD marker segment.
FIG. 13 is a diagram showing a configuration of a COC marker segment.
FIG. 14 is a diagram showing a configuration of an RGN marker segment.
FIG. 15 is a diagram illustrating a configuration of a QCD marker segment.
FIG. 16 is a diagram showing a configuration of a QCC marker segment.
FIG. 17 is a diagram showing a configuration of a TLM marker segment.
FIG. 18 is a diagram illustrating a configuration of a PLM marker segment.
FIG. 19 is a diagram showing a configuration of a PLT marker segment.
FIG. 20 is a diagram illustrating a configuration of a PPM marker segment.
FIG. 21 is a diagram showing a configuration of a PPT marker segment.
FIG. 22 is a diagram illustrating a configuration of a COM marker segment.
FIG. 23 is a block diagram for explaining the first embodiment of the present invention.
FIG. 24 is a block diagram for explaining the second embodiment of the present invention.
FIG. 25 is a block diagram for explaining the third embodiment of the present invention.
[Explanation of symbols]
  100 Encoded data input section
  101, 101A, 101B Scene change detection device
  102 storage unit
  111 Information acquisition unit
  112 Information temporary storage unit
  113 Comparison determination processing unit
  114 Partial decoding unit
  115 Image temporary storage unit
  116 Secondary determination processing unit
  120 Still image feature amount extraction unit

Claims

An image processing apparatus comprising:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
  scene change detection means for detecting a scene change of the moving image; and
  storage means for storing the encoded data of the moving image and a scene change detection result obtained by the scene change detection means,
wherein
  the scene change detection means includes:
  information acquisition means for extracting, for each frame of the moving image, packet length information from the main header, tile header, or packet header in the encoded data of the frame, and obtaining the total packet length for each resolution level based on the extracted packet length information; and
  comparison determination processing means for comparing the total packet length for each resolution level obtained by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each resolution level obtained by the information acquisition means for the frame immediately preceding the current frame, and determining that the current frame is a scene change occurrence frame when the difference in code amount distribution for each resolution level between the current frame and the immediately preceding frame is determined to exceed a predetermined level.

An image processing apparatus comprising:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
  scene change detection means for detecting a scene change of the moving image; and
  storage means for storing the encoded data of the moving image and a scene change detection result obtained by the scene change detection means,
wherein
  the scene change detection means includes:
  information acquisition means for extracting, for each frame of the moving image, information on the number of code passes from the main header, tile header, or packet header in the encoded data of the frame, and obtaining the total number of code passes for each precinct based on the extracted information on the number of code passes; and
  comparison determination processing means for comparing the total number of code passes for each precinct obtained by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the total number of code passes for each precinct obtained by the information acquisition means for the frame immediately preceding the current frame, and determining that the current frame is a scene change occurrence frame when the difference in code amount distribution for each precinct between the current frame and the immediately preceding frame is determined to exceed a predetermined level.

An image processing apparatus comprising:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
  scene change detection means for detecting a scene change of the moving image; and
  storage means for storing the encoded data of the moving image and a scene change detection result obtained by the scene change detection means,
wherein
  the scene change detection means includes:
  information acquisition means for extracting, for each frame of the moving image, packet length information from the main header, tile header, or packet header in the encoded data of the frame, and obtaining the total packet length for each component based on the extracted packet length information; and
  comparison determination processing means for comparing the total packet length for each component obtained by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each component obtained by the information acquisition means for the frame immediately preceding the current frame, and determining that the current frame is a scene change occurrence frame when the difference in code amount distribution for each component between the current frame and the immediately preceding frame is determined to exceed a predetermined level.

An image processing apparatus comprising:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
  scene change detection means for detecting a scene change of the moving image; and
  storage means for storing the encoded data of the moving image and a scene change detection result obtained by the scene change detection means,
wherein
  the scene change detection means includes:
  information acquisition means for extracting, for each frame of the moving image, packet length information from the main header, tile header, or packet header in the encoded data of the frame, and obtaining the total packet length for each layer based on the extracted packet length information; and
  comparison determination processing means for comparing the total packet length for each layer obtained by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each layer obtained by the information acquisition means for the frame immediately preceding the current frame, and determining that the current frame is a scene change occurrence frame when the difference in code amount distribution for each layer between the current frame and the immediately preceding frame is determined to exceed a predetermined level.

An image processing apparatus comprising:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
  scene change detection means for detecting a scene change of the moving image; and
  storage means for storing the encoded data of the moving image and a scene change detection result obtained by the scene change detection means,
wherein
  the scene change detection means includes:
  information acquisition means for extracting, for each frame of the moving image, information on the number of code passes from the main header, tile header, or packet header in the encoded data of the frame, obtaining the total number of code passes for each precinct from the extracted information on the number of code passes, and acquiring information indicating the presence or absence of an ROI area in units of precincts based on the obtained total number of code passes for each precinct; and
  comparison determination processing means for comparing the information indicating the presence or absence of the ROI area in units of precincts acquired by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the information indicating the presence or absence of the ROI area in units of precincts acquired by the information acquisition means for the frame immediately preceding the current frame, and determining that the current frame is a scene change occurrence frame when the difference in the position of the ROI area between the current frame and the immediately preceding frame is determined to exceed a predetermined level.

An image processing apparatus comprising:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
  scene change detection means for detecting a scene change of the moving image; and
  storage means for storing the encoded data of the moving image and a scene change detection result obtained by the scene change detection means,
wherein
  the scene change detection means includes:
  information acquisition means for acquiring, for each frame of the moving image, information indicating the presence or absence of an ROI area in units of tiles by extracting an RGN marker segment from the tile header in the encoded data of the frame; and
  comparison determination processing means for comparing the information indicating the presence or absence of the ROI area in units of tiles acquired by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the information indicating the presence or absence of the ROI area in units of tiles acquired by the information acquisition means for the frame immediately preceding the current frame, and determining that the current frame is a scene change occurrence frame when the difference in the position of the ROI area between the current frame and the immediately preceding frame is determined to exceed a predetermined level.

An image processing apparatus comprising:
  encoded data input means for inputting encoded data of a moving image compliant with Motion-JPEG2000;
  scene change detection means for detecting a scene change of the moving image; and
  storage means for storing the encoded data of the moving image and a scene change detection result obtained by the scene change detection means,
wherein
  the scene change detection means includes:
  information acquisition means for acquiring, for each frame of the moving image, predetermined header information related to the compression method from the main header in the encoded data of the frame; and
  comparison determination processing means for comparing the predetermined header information acquired by the information acquisition means for each frame of the moving image (hereinafter referred to as the current frame) with the predetermined header information acquired by the information acquisition means for the frame immediately preceding the current frame, and determining that the current frame is a scene change occurrence frame when the compression method of the current frame is determined to differ from that of the immediately preceding frame.

An image processing method comprising:
  an encoded data input step of inputting encoded data of a moving image compliant with Motion-JPEG2000;
  a scene change detection step of detecting a scene change of the moving image; and
  a storage step of storing the encoded data of the moving image and a scene change detection result obtained in the scene change detection step,
wherein
  the scene change detection step includes:
  an information acquisition step of extracting, for each frame of the moving image, packet length information from the main header, tile header, or packet header in the encoded data of the frame, and obtaining the total packet length for each resolution level based on the extracted packet length information; and
  a comparison determination processing step of comparing the total packet length for each resolution level obtained in the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each resolution level obtained in the information acquisition step for the frame immediately preceding the current frame, and determining that the current frame is a scene change occurrence frame when the difference in code amount distribution for each resolution level between the current frame and the immediately preceding frame is determined to exceed a predetermined level.

An image processing method comprising:
  an encoded data input step of inputting encoded data of a moving image compliant with Motion-JPEG2000;
  a scene change detection step of detecting a scene change of the moving image; and
  a storage step of storing the encoded data of the moving image and a scene change detection result obtained in the scene change detection step,
wherein
  the scene change detection step includes:
  an information acquisition step of extracting, for each frame of the moving image, information on the number of code passes from the main header, tile header, or packet header in the encoded data of the frame, and obtaining the total number of code passes for each precinct based on the extracted information on the number of code passes; and
  a comparison determination processing step of comparing the total number of code passes for each precinct obtained in the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the total number of code passes for each precinct obtained in the information acquisition step for the frame immediately preceding the current frame, and determining that the current frame is a scene change occurrence frame when the difference in code amount distribution for each precinct between the current frame and the immediately preceding frame is determined to exceed a predetermined level.

An image processing method comprising:
  an encoded data input step of inputting encoded data of a moving image conforming to Motion-JPEG2000;
  a scene change detection step of detecting a scene change of the moving image; and
  a storage step of storing the encoded data of the moving image and the scene change detection result obtained by the scene change detection step,
wherein the scene change detection step includes:
  an information acquisition step of, for each frame of the moving image, extracting packet length information from the main header, tile header, or packet header in the encoded data, and obtaining the total packet length for each component based on the extracted packet length information; and
  a comparison determination processing step of comparing the total packet length for each component obtained by the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each component obtained by the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the difference in code amount distribution across components between the current frame and the immediately preceding frame is determined to exceed a predetermined degree.
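
One way to obtain packet lengths from a tile header without decoding any packet is to read a PLT marker segment, which in the JPEG 2000 codestream syntax stores each length as a sequence of 7-bit groups with a continuation bit. The sketch below decodes a single PLT segment and sums the lengths per component. The `component_of_packet` mapping is a hypothetical input that the caller would derive from the progression order; it is not something the claim specifies.

```python
import struct
from typing import List, Sequence

PLT_MARKER = 0xFF58  # packet length marker used in tile-part headers


def parse_plt_segment(segment: bytes) -> List[int]:
    """Decode the packet lengths stored in one PLT marker segment."""
    marker, seg_len, _zplt = struct.unpack(">HHB", segment[:5])
    if marker != PLT_MARKER:
        raise ValueError("not a PLT marker segment")
    lengths: List[int] = []
    value, pending = 0, False
    # Lplt counts itself but not the 2-byte marker, so the segment ends at 2 + Lplt.
    for byte in segment[5:2 + seg_len]:
        value = (value << 7) | (byte & 0x7F)  # low 7 bits carry the length
        pending = bool(byte & 0x80)           # high bit set: more bytes follow
        if not pending:
            lengths.append(value)
            value = 0
    if pending:
        raise ValueError("packet length truncated")
    return lengths


def totals_per_component(lengths: Sequence[int], component_of_packet: Sequence[int],
                         num_components: int) -> List[int]:
    """Sum packet lengths per component using a caller-supplied (assumed) mapping."""
    if len(lengths) != len(component_of_packet):
        raise ValueError("one component index is required per packet")
    totals = [0] * num_components
    for length, component in zip(lengths, component_of_packet):
        totals[component] += length
    return totals
```

The resulting per-component totals of consecutive frames can then be compared in the manner the claim describes.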

An image processing method comprising:
  an encoded data input step of inputting encoded data of a moving image conforming to Motion-JPEG2000;
  a scene change detection step of detecting a scene change of the moving image; and
  a storage step of storing the encoded data of the moving image and the scene change detection result obtained by the scene change detection step,
wherein the scene change detection step includes:
  an information acquisition step of, for each frame of the moving image, extracting packet length information from the main header, tile header, or packet header in the encoded data, and obtaining the total packet length for each layer based on the extracted packet length information; and
  a comparison determination processing step of comparing the total packet length for each layer obtained by the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the total packet length for each layer obtained by the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the difference in code amount distribution across layers between the current frame and the immediately preceding frame is determined to exceed a predetermined degree.
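
Grouping packet lengths by layer requires knowing which layer each packet belongs to. The sketch below assumes a single tile coded in LRCP progression order (layer, then resolution level, then component, then precinct) with the same number of precincts at every resolution level. These are simplifying assumptions, not requirements stated in the claim, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def lrcp_position(packet_index: int, num_res: int, num_comp: int,
                  num_prec: int) -> Tuple[int, int, int, int]:
    """Map a packet's sequence number to (layer, resolution, component, precinct)."""
    layer, rest = divmod(packet_index, num_res * num_comp * num_prec)
    res, rest = divmod(rest, num_comp * num_prec)
    comp, prec = divmod(rest, num_prec)
    return layer, res, comp, prec


def totals_per_layer(packet_lengths: Sequence[int], num_layers: int, num_res: int,
                     num_comp: int, num_prec: int) -> List[int]:
    """Total packet length for each layer under the LRCP assumption."""
    if len(packet_lengths) != num_layers * num_res * num_comp * num_prec:
        raise ValueError("unexpected number of packets for the given structure")
    totals = [0] * num_layers
    for index, length in enumerate(packet_lengths):
        layer, _res, _comp, _prec = lrcp_position(index, num_res, num_comp, num_prec)
        totals[layer] += length
    return totals
```

The same position mapping could be used to accumulate totals per resolution level or per component by indexing on a different element of the returned tuple.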

An image processing method comprising:
  an encoded data input step of inputting encoded data of a moving image conforming to Motion-JPEG2000;
  a scene change detection step of detecting a scene change of the moving image; and
  a storage step of storing the encoded data of the moving image and the scene change detection result obtained by the scene change detection step,
wherein the scene change detection step includes:
  an information acquisition step of, for each frame of the moving image, extracting information on the number of coding passes from the main header, tile header, or packet header in the encoded data, obtaining the total number of coding passes for each precinct from the extracted information, and acquiring, based on the obtained total number of coding passes for each precinct, information indicating the presence or absence of an ROI region in units of precincts; and
  a comparison determination processing step of comparing the information indicating the presence or absence of an ROI region in units of precincts acquired by the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the information indicating the presence or absence of an ROI region in units of precincts acquired by the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the difference in the position of the ROI region between the current frame and the immediately preceding frame is determined to exceed a predetermined degree.
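
The claim leaves open how ROI presence is inferred from the coding pass totals. A minimal sketch, assuming that precincts containing ROI coefficients carry noticeably more coding passes than the rest, is a simple threshold followed by a count of precincts whose ROI flag changed. The threshold `min_passes`, the limit `max_changed`, and the function names are hypothetical.

```python
from typing import List, Sequence


def roi_mask(pass_totals: Sequence[int], min_passes: int) -> List[bool]:
    """Flag a precinct as containing ROI when its coding pass total is high enough.

    The thresholding rule is an assumption made for illustration.
    """
    return [total >= min_passes for total in pass_totals]


def roi_position_changed(cur_mask: Sequence[bool], prev_mask: Sequence[bool],
                         max_changed: int) -> bool:
    """True when more than max_changed precincts switched ROI state between frames."""
    if len(cur_mask) != len(prev_mask):
        raise ValueError("both frames must have the same number of precincts")
    changed = sum(c != p for c, p in zip(cur_mask, prev_mask))
    return changed > max_changed
```

For example, with four precincts and `max_changed=1`, an ROI that moves from the two outer precincts to the two inner ones changes four flags and would be reported as a scene change.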

An image processing method comprising:
  an encoded data input step of inputting encoded data of a moving image conforming to Motion-JPEG2000;
  a scene change detection step of detecting a scene change of the moving image; and
  a storage step of storing the encoded data of the moving image and the scene change detection result obtained by the scene change detection step,
wherein the scene change detection step includes:
  an information acquisition step of, for each frame of the moving image, acquiring information indicating the presence or absence of an ROI region in units of tiles by extracting an RGN marker segment from a tile header in the encoded data; and
  a comparison determination processing step of comparing the information indicating the presence or absence of an ROI region in units of tiles acquired by the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the information indicating the presence or absence of an ROI region in units of tiles acquired by the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the difference in the position of the ROI region between the current frame and the immediately preceding frame is determined to exceed a predetermined degree.
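
Checking a tile header for an RGN marker segment only requires walking the marker segments between the SOT and SOD markers. The sketch below assumes that each tile is stored as a single tile-part and that the caller has already split the codestream into byte strings that each begin at an SOT marker. The function names are hypothetical.

```python
import struct
from typing import List, Sequence

SOT = 0xFF90  # start of tile-part
SOD = 0xFF93  # start of data (end of the tile-part header)
RGN = 0xFF5E  # region of interest marker


def tile_has_roi(tile_part: bytes) -> bool:
    """Return True if the tile-part header contains an RGN marker segment."""
    pos = 0
    while pos + 2 <= len(tile_part):
        (marker,) = struct.unpack_from(">H", tile_part, pos)
        if pos == 0 and marker != SOT:
            raise ValueError("tile-part must start with an SOT marker")
        if marker == SOD:
            return False  # header ended without an RGN segment
        if marker == RGN:
            return True
        if pos + 4 > len(tile_part):
            break
        # Each segment length counts its own two bytes but not the marker.
        (seg_len,) = struct.unpack_from(">H", tile_part, pos + 2)
        pos += 2 + seg_len
    raise ValueError("SOD marker not found")


def roi_flags(tile_parts: Sequence[bytes]) -> List[bool]:
    """Per-tile ROI presence flags for one frame."""
    return [tile_has_roi(tile_part) for tile_part in tile_parts]
```

The flag lists of the current and immediately preceding frames can then be compared tile by tile, for instance by counting the tiles whose flag differs.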

An image processing method comprising:
  an encoded data input step of inputting encoded data of a moving image conforming to Motion-JPEG2000;
  a scene change detection step of detecting a scene change of the moving image; and
  a storage step of storing the encoded data of the moving image and the scene change detection result obtained by the scene change detection step,
wherein the scene change detection step includes:
  an information acquisition step of, for each frame of the moving image, acquiring predetermined header information related to the compression method from the main header in the encoded data; and
  a comparison determination processing step of comparing the predetermined header information acquired by the information acquisition step for each frame of the moving image (hereinafter referred to as the current frame) with the predetermined header information acquired by the information acquisition step for the frame immediately preceding the current frame, and determining the current frame to be a scene change occurrence frame when the comparison shows that the compression method of the current frame differs from that of the immediately preceding frame.
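
The claim does not say which header fields constitute the "predetermined header information". As one possible reading, the sketch below takes the progression order, number of layers, number of decomposition levels, and wavelet transformation type from the COD marker segment of the main header and flags a scene change when that tuple differs between consecutive frames. The choice of fields and the function names are assumptions.

```python
import struct
from typing import Tuple

SOC = 0xFF4F  # start of codestream
SOT = 0xFF90  # start of tile-part (end of the main header)
COD = 0xFF52  # coding style default


def coding_style(main_header: bytes) -> Tuple[int, int, int, int]:
    """Return (progression order, layers, decomposition levels, transformation)."""
    (marker,) = struct.unpack_from(">H", main_header, 0)
    if marker != SOC:
        raise ValueError("codestream must start with an SOC marker")
    pos = 2
    while pos + 4 <= len(main_header):
        marker, seg_len = struct.unpack_from(">HH", main_header, pos)
        if marker == SOT:
            break
        if marker == COD:
            body = main_header[pos + 4:pos + 2 + seg_len]
            # Scod, progression order, number of layers, MCT flag, decomposition levels
            _scod, progression, layers, _mct, levels = struct.unpack_from(">BBHBB", body, 0)
            transformation = body[9]  # follows code-block width, height and style
            return progression, layers, levels, transformation
        pos += 2 + seg_len
    raise ValueError("COD marker segment not found in main header")


def compression_method_changed(cur_header: bytes, prev_header: bytes) -> bool:
    """True when the selected coding parameters differ between the two frames."""
    return coding_style(cur_header) != coding_style(prev_header)
```

Because only a few bytes of each frame's main header are read, this check is inexpensive compared with any analysis of decoded pixel data.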