JP2001169287A

JP2001169287A - Device and method for detecting scene change of compressed moving image and recording medium recording program therefor

Info

Publication number: JP2001169287A
Application number: JP2000230768A
Authority: JP
Inventors: Yukiko Inoue; 由紀子井上; Koji Arimura; 耕治有村; Atsushi Ikeda; 淳池田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1999-10-01
Filing date: 2000-07-31
Publication date: 2001-06-22
Anticipated expiration: 2020-07-31
Also published as: JP4350877B2

Abstract

PROBLEM TO BE SOLVED: To provide a technique capable of detecting a scene change even when frame structure images and field structure images coexist. SOLUTION: The device comprises an image structure determination unit 1 that determines the image structure of the input compressed moving image; a feature amount extraction unit 2 that, when the determination result is a frame structure image, extracts a feature amount from twice as much data in the vertical direction of the image as is used for a field structure image; a data memory 6 that records the data extracted by the feature amount extraction unit; an extracted data comparison unit 3 that compares extracted data to obtain the amount of change in the video; and a scene change determination unit 4 that determines a scene change using the amount of change obtained by the extracted data comparison unit.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD The present invention relates to a scene change detection device that detects scene transitions in a compressed moving image, and to related technology.

[0002]

DESCRIPTION OF THE RELATED ART In recent years, with the spread of digital video cameras and the advent of digital broadcasting, opportunities to handle compressed moving images such as MPEG and DV have been increasing. Large volumes of older analog video are also sometimes archived as digital compressed moving images. Techniques for editing such compressed moving images in their encoded form, without decoding them, are being put to practical use.

[0003] Such editing makes it essential to have a technique that automatically and quickly detects scene changes (the positions where the video or scene changes) in a compressed moving image, that is, in a bitstream. This is because the position of the start of each detected scene, and a representative image of each scene delimited by scene change detection, are useful as an index of the video content and greatly aid content search and editing.

[0004] Next, MPEG2 encoding, which is widely used as a compressed moving image format, is described. Like MPEG1 and H.261, MPEG2 encoding uses motion vectors and the DCT (Discrete Cosine Transform). Within a frame, the data is separated into luminance (Y) and color difference (Cb, Cr) components and is encoded in units of 16*16-pixel macroblocks.

[0005] To encode each macroblock, either motion-compensated prediction, which performs motion prediction from a reference image, or intra coding, which encodes using only the data being encoded, is selected.

[0006] Of these, motion-compensated prediction achieves high coding efficiency when the temporal correlation between frames is high. It obtains a prediction error signal from the difference between the macroblock being encoded and the macroblock data obtained by motion prediction from the reference image, thereby compressing information both temporally and spatially. In motion-compensated prediction, this prediction error signal is transformed into the spatial frequency domain by the DCT in units of 8*8-pixel blocks.

[0007] Intra coding, on the other hand, divides the data of the block being encoded into 8*8-pixel blocks and simply DCT-encodes each of these blocks.

[0008] The units of encoding are described next. MPEG2 also supports interlaced images, and a picture can be encoded in either of two units: the frame structure or the field structure.

[0009] In the frame structure, encoding is performed on a frame in which the two fields, odd and even, are interleaved. In the field structure, encoding is performed on a single field, either the odd field or the even field.

[0010] In this specification, an image encoded with the frame structure is called a "frame structure image", and an image encoded with the field structure is called a "field structure image".

[0011] Motion compensation is described next. As noted above, MPEG2 has a frame structure and a field structure. Motion-compensated prediction for a frame structure image can use frame prediction, field prediction, or dual-prime prediction. Motion-compensated prediction for a field structure image can use field prediction, 16*8 MC prediction, or dual-prime prediction. In every prediction mode other than frame prediction, it is possible to select whether the referenced field is the odd field or the even field.

[0012] Next, the encoding method is described with reference to FIG. 15. In a frame structure image, two types of DCT can be used for encoding: the frame DCT and the field DCT.

[0013] In the frame DCT, the luminance signal of a macroblock is divided into four blocks such that each block is made up of frame lines (the lines of both fields, interleaved), as shown in FIG. 15(a), and the DCT is applied to each block.

[0014] In the field DCT, the luminance signal of a macroblock is divided into four blocks such that each block is made up of the lines of a single field, as shown in FIG. 15(b), and the DCT is applied to each block.
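
The two decompositions of FIG. 15 can be sketched as follows. This is only an illustrative sketch: the function names and the representation of a macroblock as a 16*16 list of luminance rows are assumptions made here, and the DCT itself is omitted.

```python
def _quarters(upper, lower):
    """Cut two 8-line halves into four 8x8 blocks (left/right of each half)."""
    return [
        [row[:8] for row in upper],
        [row[8:] for row in upper],
        [row[:8] for row in lower],
        [row[8:] for row in lower],
    ]

def split_frame_dct(mb):
    """Frame DCT layout: each block holds 8 consecutive frame lines."""
    return _quarters(mb[:8], mb[8:])

def split_field_dct(mb):
    """Field DCT layout: each block holds lines of one field only."""
    first_field = mb[0::2]   # lines 0, 2, 4, ...
    second_field = mb[1::2]  # lines 1, 3, 5, ...
    return _quarters(first_field, second_field)

# Hypothetical macroblock whose two fields differ strongly:
# even lines are 0, odd lines are 100.
mb = [[0 if y % 2 == 0 else 100] * 16 for y in range(16)]
frame_blocks = split_frame_dct(mb)
field_blocks = split_field_dct(mb)
```

With this test pattern, every frame DCT block mixes the two values line by line, while every field DCT block is uniform, which illustrates why the field DCT suits pictures whose two fields differ greatly.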

[0015] Either of these two types of DCT may be used for encoding, but it is generally known that coding efficiency improves when the field DCT is used where the image data of the odd field and the even field differ greatly. In particular, when two scenes are mixed in one field, using the field DCT raises the compression ratio.

[0016] However, the field DCT requires the frame structure to be separated into two fields, so it is slower to process than the frame DCT. By choosing between the two types of DCT according to these properties, the coding efficiency of frame structure images (interlaced images) can be improved. For the color difference signals in the 4:2:0 format, the frame DCT is always used. In a field structure image, a macroblock consists of the signal of only one field, so the field DCT is always performed.

[0017] With the above in mind, conventional scene change detection techniques are described below. Conventionally, scene change detection has used feature amounts such as (1) a color histogram of the image, (2) the data size of the compressed moving image, and (3) the difference between block data at the same position in two frames. (1) When a color histogram is used, the colors in one frame, or in each region into which the frame is divided, are collected into a histogram; this histogram is taken as the feature amount of the frame and is compared with the feature amounts of the preceding and following frames to obtain a similarity (see, for example, Japanese Patent Application Laid-Open No. 7-59108). (2) When the data size of the compressed moving image is used, the property that compression is poor at a scene change is exploited: the data sizes of adjacent frames are compared, and a scene change is determined when the difference exceeds a predetermined threshold (see, for example, Japanese Patent Application Laid-Open No. 7-121555).
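
The conventional features (1) and (2) can be sketched as follows. This is a minimal sketch and not the method of the cited publications: the bin count, the use of histogram intersection as the similarity, and the use of the absolute size difference are all assumptions chosen for illustration.

```python
def histogram(pixels, bins=8, max_value=256):
    """Feature (1): coarse histogram of one frame's pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return counts

def histogram_similarity(h1, h2):
    """Histogram intersection in [0, 1] (an assumed similarity measure)."""
    total = sum(h1)
    if total == 0:
        return 1.0
    return sum(min(a, b) for a, b in zip(h1, h2)) / total

def size_based_scene_changes(frame_sizes, threshold):
    """Feature (2): flag frame i when its coded size differs from that of
    frame i-1 by more than the threshold."""
    return [
        i
        for i in range(1, len(frame_sizes))
        if abs(frame_sizes[i] - frame_sizes[i - 1]) > threshold
    ]
```

Both functions work on whole frames, which is why approaches of this kind cannot locate a scene change that falls between the two fields of a single frame.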

[0018] However, methods (1) and (2) can detect scene changes only in units of frames, so a scene change that falls between the odd field and the even field of one frame (that is, between the two fields) cannot be detected accurately.

[0019] To solve this problem, Japanese Patent Application Laid-Open No. 9-322120 proposes a method for detecting scene changes from video data encoded with a field prediction scheme, without decoding it. In a frame for which prediction is performed, this method calculates multiple inter-field similarities based on the reference field selection signal, which indicates whether the odd field or the even field of the reference frame is selected for prediction, and detects scene changes using the results.

[0020]

PROBLEMS TO BE SOLVED BY THE INVENTION However, because this method depends on the field prediction scheme, it cannot be applied to video that does not use the field prediction scheme (an inter-frame prediction scheme), or to video in which field-predicted video is mixed with video that uses other prediction schemes.

[0021] In addition, with (3), if only the difference between the DC components of the DCT at the same position is used as the data, the positions may not correspond. In a frame structure image, two types of DCT, the frame DCT and the field DCT, can be used for encoding. Consequently, if block data is compared without undoing the DCT, and one of the two pieces of data was encoded with the frame DCT while the other was encoded with the field DCT, then 8*8 pixels of image data end up being compared with 8*8 pixels taken from only the odd or even field of 8*16 pixels of data.

[0022] To remedy this, one frame of a frame structure image would have to be compared with the equivalent of one frame of field structure images (an odd field image and an even field image). However, the comparison could then be made only once the data of both field structure images was available, which makes the processing complicated and slow.

[0023] Accordingly, a first object of the present invention is to provide a technique capable of detecting scene changes even when field structure images and frame structure images are mixed.

[0024] A second object of the present invention is to provide a technique that can also detect a scene change located between the fields of a frame structure image.

[0025] A third object of the present invention is to provide a technique capable of quickly detecting the target scene changes when the interval between a start point and an end point is known in advance.

[0026]

MEANS FOR SOLVING THE PROBLEMS The present invention takes as input a compressed moving image in which field structure images and frame structure images are mixed, and detects scene changes in that compressed moving image. For the first object, in claims 1, 2, 6, 8, 12, and 13, when the compressed moving image is a frame structure image, twice as much data in the vertical direction is extracted as the corresponding data in a field structure image.

[0027] For the second object, claims 3, 9, and 14 provide a unit that counts the number of field-DCT-coded blocks.

[0028] For the third object, in claims 4, 10, and 15, a pair of scene changes whose spacing matches a specified interval is detected from among the multiple detected scene changes, and those scene changes are output as the result.
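
The selection of a pair of scene changes by interval can be sketched as follows. This is only an illustrative sketch: the function name, the representation of scene changes as frame numbers, and the `tolerance` parameter are assumptions introduced here and are not taken from the claims.

```python
def find_pairs_with_interval(scene_changes, interval, tolerance=0):
    """Return (start, end) pairs of detected scene changes whose spacing
    matches the specified interval, within an assumed tolerance."""
    positions = sorted(scene_changes)
    pairs = []
    for i, start in enumerate(positions):
        for end in positions[i + 1:]:
            gap = end - start
            if gap > interval + tolerance:
                break  # later positions are even farther away
            if abs(gap - interval) <= tolerance:
                pairs.append((start, end))
    return pairs

# Hypothetical detected positions (frame numbers) and a known interval.
pairs = find_pairs_with_interval([10, 55, 460, 910, 1200], interval=450)
```

In this hypothetical input, only the pairs (10, 460) and (460, 910) are exactly 450 frames apart, so the other detected scene changes are not reported.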

[0029]

BEST MODE FOR CARRYING OUT THE INVENTION Scene change detection in the present invention is entirely independent of the prediction scheme. Scene changes can therefore be detected whatever the prediction scheme is, and whether or not prediction is used at all. In all the embodiments below, the input compressed moving image is described as MPEG2, but the same effect is obtained with any compression scheme that uses the DCT and allows field and frame structures to be mixed, so techniques based on such compression schemes are also within the scope of the present invention. The present invention takes as input a compressed moving image in which field structure images and frame structure images are mixed, and detects scene changes in that compressed moving image.

[0030] (First Embodiment) Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram of a scene change detection device according to the first embodiment of the present invention.

[0031] As shown in FIG. 1, a bitstream encoded in accordance with MPEG2 is input to this scene change detection device as the input compressed moving image. The device outputs, as its detection result, information indicating the positions of the scene changes detected in the bitstream. As described in the related art section, the coding scheme, the unit of coding (frame or field structure), the DCT type, and so on of the input compressed moving image may take various forms, and not just one kind but several kinds may be mixed along the time axis.

[0032] In particular, it does not matter if field structure images and frame structure images alternate along the time axis. The above points apply equally to the other embodiments described later.

[0033] As shown in FIG. 1, the input compressed moving image is first input to the image structure determination unit 1. The image structure determination unit 1 determines whether the image currently being input is a field structure image or a frame structure image by referring to information in a specific area of the bitstream. The determination result and the contents of the bitstream are then output to the feature amount extraction unit 2 at the next stage.
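
For an MPEG2 video elementary stream, one way to make this determination is to read the 2-bit picture_structure field of the picture coding extension. The sketch below is an assumption about how such a check could be written, since the text only says that a specific area of the bitstream is referred to; the function name is hypothetical, and the byte layout follows the MPEG2 video syntax as understood here (extension start code 0x000001B5, identifier 8, then four 4-bit f_code values and the 2-bit intra_dc_precision before picture_structure).

```python
PICTURE_STRUCTURE = {1: "top field", 2: "bottom field", 3: "frame"}

def picture_structures(stream: bytes):
    """Yield the picture structure of every picture coding extension found."""
    start_code = b"\x00\x00\x01\xb5"  # extension_start_code
    pos = stream.find(start_code)
    while pos != -1:
        body = stream[pos + 4 : pos + 7]
        # The high nibble of the first byte is the extension identifier;
        # the value 8 marks a picture coding extension.
        if len(body) == 3 and body[0] >> 4 == 8:
            # picture_structure is the low 2 bits of the third byte.
            yield PICTURE_STRUCTURE.get(body[2] & 0x03, "reserved")
        pos = stream.find(start_code, pos + 4)
```

A real parser would also track picture headers and sequence boundaries; this sketch only shows where the frame/field decision can be read without decoding any picture data.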

[0034] Next, the feature amount, the change amount, and the operation of the feature amount extraction unit 2 are described using FIGS. 5 and 6 as an example. In FIG. 5, older images on the time axis are on the left and newer images are on the right, and the t-th image is written as image t. In this example, images t-2 to t+1 and t+4 to t+5 are field structure images, and images t+2 to t+3 are frame structure images. That is, the structure changes between images t+1 and t+2 and between images t+3 and t+4.

[0035] For field structure images such as those in FIG. 6(a) (top field) and FIG. 6(b) (bottom field), the block data Dt of one block is used, and the block data Dt is used directly as the feature amount dt (dt = Dt).

[0036] For a frame structure image such as that in FIG. 6(c), the block data of two vertically adjacent blocks (block data Dtu and block data Dtb) is used, and their average is used as the feature amount dt (dt = (Dtu + Dtb) / 2).

[0037] The block data Dt, Dtu, and Dtb must be the same kind of data, but the average luminance of the block, or any of various other values that represent the image within the block, can be used.

[0038] As shown in FIGS. 6(a), 6(b), and 6(c), a frame structure image is twice as tall as a field structure image. To compare a field structure image with a frame structure image, the block data Dt at block coordinates (x, y) in the field structure image is therefore compared against the pair consisting of the block data Dtu at block coordinates (x, 2*y) and the block data Dtb at block coordinates (x, 2*y+1).
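
The feature amounts of paragraphs [0035] to [0038] can be sketched as follows. The function names and the representation of block data as a grid indexed `[y][x]` (for example, one average luminance value per block) are assumptions made for illustration.

```python
def field_features(blocks):
    """Field structure image: dt(x, y) = Dt(x, y)."""
    return [row[:] for row in blocks]

def frame_features(blocks):
    """Frame structure image: dt(x, y) = (Dtu + Dtb) / 2, where Dtu is the
    block at (x, 2*y) and Dtb is the block at (x, 2*y + 1)."""
    return [
        [(upper + lower) / 2 for upper, lower in zip(blocks[2 * y], blocks[2 * y + 1])]
        for y in range(len(blocks) // 2)
    ]

# Hypothetical block data: a frame picture has twice as many block rows
# as a field picture of the same video.
field_grid = [[20, 30], [60, 70]]
frame_grid = [[10, 20], [30, 40], [50, 60], [70, 80]]
```

Both functions return grids of the same size for the same video, so the feature amount at (x, y) of a field structure image can be compared directly with the feature amount at (x, y) of a frame structure image.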

[0039] Next, let dt(x, y) and ds(x, y) be the feature amounts at block coordinates (x, y) in image t and image s (typically s = t+1), which lie at different positions on the time axis. The first change amount R(t, s) between image t and image s is defined as follows. This first change amount R(t, s) is close to a simple difference between image t and image s.

[0040]

(Equation 1)

[0041] Similarly, the second change amount Q(t, s) is defined as follows. The second change amount Q(t, s) is close to a differential value of image t and image s.

[0042]

(Equation 2)

[0043] By comparing the first change amount R(t, s) with a first threshold set empirically in advance, positions where a large change appears in the image can be picked out and detected as scene changes. However, in some video, such as video with rapid motion, the scene stays the same but changes continuously, and using the first change amount R(t, s) alone may detect too many scene changes. In such cases, additionally comparing the second change amount Q(t, s) with a preset second threshold makes it possible to detect as scene changes only those positions where the change is both discontinuous and large, which improves the reliability of the detected scene changes.

[0044] Depending on the moving image, only one of the first change amount R(t, s) and the second change amount Q(t, s) may be sufficient, or both may be evaluated with appropriate weights applied to them. The two expressions above are of course merely examples; equivalent expressions, or any other expressions capable of determining a scene change, may be substituted.
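
Because Equations 1 and 2 are not reproduced in this text, the sketch below substitutes assumed expressions of the kind the preceding paragraph permits: R is taken as the sum of absolute feature differences, and Q as the increase in R relative to the previous image pair. These forms, the function names, and the sample values are assumptions for illustration only and are not the patent's equations.

```python
def first_change(dt, ds):
    """Assumed stand-in for R(t, s): sum of |dt(x, y) - ds(x, y)|."""
    return sum(
        abs(a - b)
        for row_t, row_s in zip(dt, ds)
        for a, b in zip(row_t, row_s)
    )

def detect(features, r_threshold, q_threshold):
    """Return the indices s at which both assumed change amounts exceed
    their thresholds when image s is compared with image s - 1."""
    hits = []
    previous_r = 0.0
    for s in range(1, len(features)):
        r = first_change(features[s - 1], features[s])
        q = r - previous_r  # assumed stand-in for Q: jump in R
        if r > r_threshold and q > q_threshold:
            hits.append(s)
        previous_r = r
    return hits

# Hypothetical one-block images: a cut at index 3, then steady fast motion.
features = [[[v]] for v in [10, 12, 14, 80, 82, 100, 130, 160, 190]]
r_only = detect(features, r_threshold=20, q_threshold=float("-inf"))
r_and_q = detect(features, r_threshold=20, q_threshold=20)
```

With these sample values, thresholding R alone also flags the steady motion at indices 6 to 8, whereas requiring the assumed Q to exceed its threshold as well leaves only the abrupt change at index 3.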

[0045] To obtain the first change amount R(t, s) and the second change amount Q(t, s), the change amounts may be calculated by comparison with the comparison target image after all the extracted data for one input image has been obtained, as shown in FIG. 3. However, it is preferable, as shown in FIG. 4, to compare the data of each block (or group of blocks) with the data at the corresponding position in the comparison target image as soon as that data is extracted, because the approach of FIG. 4 is faster.
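
The two orderings can be contrasted in a short sketch. The function names are hypothetical, and the sum of absolute differences is an assumed change measure used only to make the comparison concrete.

```python
def change_after_full_extraction(block_source, reference):
    """FIG. 3 style: buffer all extracted data for the image, then compare."""
    extracted = list(block_source)  # wait until the whole image is extracted
    return sum(abs(value - reference[i]) for i, value in enumerate(extracted))

def change_during_extraction(block_source, reference):
    """FIG. 4 style: compare each block as soon as it is extracted."""
    total = 0
    for i, value in enumerate(block_source):
        total += abs(value - reference[i])  # no per-image buffer is needed
    return total

# Hypothetical feature values for the comparison target and the new image.
reference = [10, 20, 30]
result = change_during_extraction(iter([12, 20, 25]), reference)
```

Both orderings yield the same change amount; the second simply avoids holding one image's worth of extracted data before the comparison starts.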

[0046] In this way, scene changes can be detected uniformly from a sequence of video in which field structure images and frame structure images are mixed along the time axis.

[0047] As shown in FIG. 1, the feature amount dt extracted by the feature amount extraction unit 2 and image information, such as the number of the frame from which the feature amount dt was extracted, are stored in association with each other in the first data memory 6. The first data memory 6 corresponds to a storage area and may be replaced by other storage means such as a hard disk device.

[0048] When the feature amount extraction unit 2 extracts a new feature amount dt and records it in the first data memory 6, the extracted data comparison unit 3, which follows the feature amount extraction unit 2, refers to this new feature amount dt and to a feature amount recorded earlier in the first data memory 6 (typically the immediately preceding one) and obtains the change amount between them. This change amount may be only one of the first change amount R(t, s) and the second change amount Q(t, s), but preferably both are obtained.

[0049] The extracted data comparison unit 3 then records the obtained change amount in the second data memory 7 as comparison result information, and also records image information, such as the number of the frame at which the change amount occurred, in the second data memory 7 in association with the comparison result information.

[0050] The scene change determination data input unit 5 holds the thresholds used for scene change determination and outputs them to the scene change determination unit 4. These thresholds are the first threshold, for the first change amount R(t, s), and the second threshold, for the second change amount Q(t, s).

[0051] When the extracted data comparison unit 3 has made a comparison, the scene change determination unit 4 refers to the comparison result information recorded in the second data memory 7. If it exceeds the threshold input from the scene change determination data input unit 5, the scene change determination unit 4 determines that a scene change has occurred at this position and outputs, as the detection result, the image information associated with the comparison result information, such as the frame number (that is, the position in the bitstream). Otherwise, the scene change determination unit 4 may output no detection result, or may output a detection result indicating that no scene change was detected.

[0052] FIG. 2 is a flowchart of the scene change detection device according to the first embodiment of the present invention. The operation of the scene change detection device of this embodiment is described next with reference to FIG. 2. First, when the input compressed moving image reaches the image structure determination unit 1, the unit determines whether the current image is a frame structure image or a field structure image (step 1). For a frame structure image, the feature amount extraction unit 2 extracts, as the feature amount, a value computed from the data of two vertically adjacent blocks of the image and records it in the first data memory 6 (step 2). For a field structure image, the feature amount extraction unit 2 extracts, as the feature amount, a value computed from the data of one block and records it in the first data memory 6 (step 3). In other words, a frame structure image uses twice as much data in the vertical direction as a field structure image.

[0053] In step 4, the extracted data comparison unit 3 compares the current feature amount with an earlier feature amount and records the comparison result information in the second data memory 7. In step 5, the scene change determination unit 4 compares this comparison result information with the threshold input from the scene change determination data input unit 5 and, if a scene change is determined, outputs the position at which the scene change occurred as the detection result.
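
Steps 1 to 5 of FIG. 2 can be put together in a short sketch. The `Picture` class, the function names, the single threshold, and the use of a sum of absolute differences as the change amount are assumptions for illustration; the dictionaries merely stand in for the first data memory 6 and the second data memory 7.

```python
from dataclasses import dataclass

@dataclass
class Picture:
    number: int     # frame number (position in the bitstream)
    structure: str  # "frame" or "field", as decided in step 1
    blocks: list    # block data indexed [y][x]

def extract_features(picture):
    """Steps 1 to 3: two vertically adjacent blocks for a frame structure
    image, one block for a field structure image."""
    b = picture.blocks
    if picture.structure == "frame":
        return [
            [(u + l) / 2 for u, l in zip(b[2 * y], b[2 * y + 1])]
            for y in range(len(b) // 2)
        ]
    return [row[:] for row in b]

def detect_scene_changes(pictures, threshold):
    feature_memory = {}  # stands in for the first data memory 6
    result_memory = {}   # stands in for the second data memory 7
    detected = []
    previous = None
    for picture in pictures:
        features = extract_features(picture)
        feature_memory[picture.number] = features
        if previous is not None:
            # Step 4: compare with the previously recorded feature amounts.
            change = sum(
                abs(a - b)
                for row_t, row_s in zip(feature_memory[previous], features)
                for a, b in zip(row_t, row_s)
            )
            result_memory[picture.number] = change
            # Step 5: compare the change amount with the threshold.
            if change > threshold:
                detected.append(picture.number)
        previous = picture.number
    return detected
```

For example, a sequence that switches from field structure images to frame structure images and back is handled by the same loop, because every picture yields a feature grid of the same size.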

[0054] In this embodiment, the feature amount dt is the block data Dt of one block for a field structure image and the average of the block data Dtu and Dtb of two vertically adjacent blocks for a frame structure image, but other methods may be used as long as the data of field structure images and the data of frame structure images are brought to an equivalent level for comparison. For example, twice the block data of one block of a field structure image (dt = 2*Dt) and the sum of the block data of two vertically adjacent blocks of a frame structure image (dt = Dtu + Dtb) may be used as the feature amounts dt and compared.
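
That the two conventions are equivalent up to scale can be checked with a short calculation. Here image t is taken to be a field structure image and image s a frame structure image; the symbols D_{su} and D_{sb} for the upper and lower blocks of image s are notation assumed for this illustration, and the comparison is assumed to use an absolute difference.

```latex
\begin{aligned}
\bigl|\, 2 D_t - (D_{su} + D_{sb}) \,\bigr|
  &= 2 \,\Bigl|\, D_t - \frac{D_{su} + D_{sb}}{2} \,\Bigr|
\end{aligned}
```

Under that assumption, every block difference is exactly twice the one obtained with the averaging convention, so the same decisions result if the threshold is doubled as well.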

[0055] Also, in this embodiment, data is extracted in units of one block from a field structure image and in units of two blocks from a frame structure image. However, scene changes can be detected in the same way as long as the data extracted from the frame structure image is twice as much data in the vertical direction, located at the position corresponding to the data extracted from the field structure image. That is, when the data of N blocks of a field structure image is extracted as a feature amount, data from 2*N blocks at the corresponding position, twice as much in the vertical direction, should be extracted as the feature amount from the frame structure image.

[0056] As a result, scene changes can be detected uniformly even from a compressed moving image in which frame structure images and field structure images are mixed. Scene changes can also be detected without special handling regardless of whether the frame DCT or the field DCT is used within a frame structure image. Moreover, because this method does not depend on the prediction scheme, it does not matter what the prediction scheme is, and the desired scene changes can be detected whether or not prediction is used.

【００５７】（第２の実施の形態）図７は、本発明の第
２の実施の形態におけるシーンチェンジ検出装置のブロ
ック図、図８は、本発明の第２の実施の形態におけるシ
ーンチェンジ検出装置のフローチャートである。(Second Embodiment) FIG. 7 is a block diagram of a scene change detection device according to a second embodiment of the present invention, and FIG. 8 is a flowchart of the scene change detection device according to the second embodiment.

【００５８】図７では、第１の実施の形態に係る図１に
対し、特徴量抽出部２と画像構造判定部１の位置を入れ
換えてある。また、本形態の特徴量抽出部２は、入力さ
れた圧縮動画像がフレーム構造画像であってもフィール
ド構造画像であっても、画面全体について、１ブロック
ずつのブロックデータから、特徴量を抽出する点が異な
る（ステップ１０）。In FIG. 7, the positions of the feature amount extraction unit 2 and the image structure determination unit 1 are swapped relative to FIG. 1 of the first embodiment. The feature amount extraction unit 2 of this embodiment also differs in that it extracts the feature amount from the block data of one block at a time over the entire screen, regardless of whether the input compressed moving image is a frame structure image or a field structure image (step 10).

【００５９】そして、図８に示すように、抽出データ比
較部３の前段にある、画像構造判定部１は、今回入力し
た画像の構造を調べ、フレーム構造画像の場合、抽出デ
ータ比較部３は、画像の縦方向に上下２ブロック分のデ
ータを用いてデータを比較する（ステップ１２）。一
方、フィールド構造画像ならば、抽出データ比較部３
は、１ブロック分のデータを用いてデータを比較する
（ステップ１３）。ここでの比較は、第１の実施の形態
と同様である。Then, as shown in FIG. 8, the image structure determination unit 1, located in the stage preceding the extracted data comparison unit 3, examines the structure of the currently input image. For a frame structure image, the extracted data comparison unit 3 compares data using the data of the upper and lower two blocks in the vertical direction of the image (step 12). For a field structure image, the extracted data comparison unit 3 compares data using the data of one block (step 13). The comparison itself is the same as in the first embodiment.

【００６０】さて、図５の例でいえば、画像ｔと画像ｔ
＋１とを比較するとき、両画像ともフィールド構造画像
であるので、画面上の同じ位置にあるブロックのデータ
が比較される。In the example of FIG. 5, when image t and image t+1 are compared, both are field structure images, so the data of the blocks at the same position on the screen are compared.

【００６１】画像ｔ＋１と画像ｔ＋２の比較であれば、
画像ｔ＋２はフレーム構造画像であるので、画像ｔ＋１
におけるブロック座標（ｘ，ｙ）から得られるデータ
と、画像ｔ＋２のデータは、ブロック座標（ｘ，２＊
ｙ）とブロック座標（ｘ，２＊ｙ＋１）から得られるデ
ータとが、比較される。When image t+1 and image t+2 are compared, image t+2 is a frame structure image, so the data obtained from block coordinates (x, y) in image t+1 is compared with the data obtained from block coordinates (x, 2 * y) and (x, 2 * y + 1) in image t+2.

【００６２】そして、例えば、画像ｔ＋１におけるブロ
ック座標（ｘ，ｙ）から得られるデータをＡ、ブロック
座標（ｘ，２＊ｙ）とブロック座標（ｘ，２＊ｙ＋１）
から得られるデータをそれぞれＢ、Ｃとすると、データ
ＡとデータＢ、Ｃの平均値との差分の絶対値を、変化量
とする。この変化量を画像全体において求めることで、
２枚の画像の変化量が得られる。For example, let A be the data obtained from block coordinates (x, y) in image t+1, and let B and C be the data obtained from block coordinates (x, 2 * y) and (x, 2 * y + 1), respectively. The absolute value of the difference between data A and the average of data B and C is taken as the amount of change. Computing this amount of change over the entire image gives the amount of change between the two images.
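
The comparison of a field structure image with a frame structure image can be sketched as below. The function name and the list-of-rows layout are hypothetical, each block is assumed to contribute one scalar value, and the per-block changes are assumed to be summed over the image, which is consistent with the maximum change amount computed in the third embodiment but is not stated in this paragraph.

```python
def change_amount(field_img, frame_img):
    """Total change between a field image (H rows x W blocks) and a frame
    image (2H rows x W blocks): sum of |A - (B + C) / 2| over all blocks."""
    height = len(field_img)
    width = len(field_img[0])
    if len(frame_img) != 2 * height:
        raise ValueError("frame image must have twice as many block rows")
    total = 0.0
    for y in range(height):
        for x in range(width):
            a = field_img[y][x]          # block (x, y) of the field image
            b = frame_img[2 * y][x]      # block (x, 2*y) of the frame image
            c = frame_img[2 * y + 1][x]  # block (x, 2*y + 1) of the frame image
            total += abs(a - (b + c) / 2.0)
    return total
```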

【００６３】なお、ここでは、フレーム構造画像のデー
タとして縦方向に２ブロック分の平均値を用いたが、こ
れは単なる和でも良く、その場合、対応するフィールド
構造画像のデータは、２倍にするなど、データの正規化
を行えば良い。具体的には、データＡの２倍の値とデー
タＢ、Ｃの和との差分の絶対値を変化量とすると良い。Here, the average of two blocks in the vertical direction is used as the data of the frame structure image, but a simple sum may be used instead. In that case, the data should be normalized, for example by doubling the data of the corresponding field structure image. Specifically, the absolute value of the difference between twice the value of data A and the sum of data B and C may be used as the amount of change.
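
As a small check of the two variants (function names are hypothetical), the sum-based change is exactly twice the average-based change, since |2A - (B + C)| = 2 * |A - (B + C) / 2|. A threshold tuned for one variant would therefore need to be scaled by the same factor for the other.

```python
def change_avg(a, b, c):
    """Change using the average of the two frame blocks."""
    return abs(a - (b + c) / 2.0)


def change_sum(a, b, c):
    """Change using the sum of the two frame blocks and twice the field data."""
    return abs(2 * a - (b + c))
```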

【００６４】その他の点は、第１の実施の形態と同様で
ある。The other points are the same as in the first embodiment.

【００６５】（第３の実施の形態）図９は、本発明の第
３の実施の形態におけるシーンチェンジ検出装置のブロ
ック図、図１０は、本発明の第３の実施の形態における
シーンチェンジ検出装置のフローチャートである。(Third Embodiment) FIG. 9 is a block diagram of a scene change detection device according to a third embodiment of the present invention, and FIG. 10 is a flowchart of the scene change detection device according to the third embodiment.

【００６６】本形態では、図１と図９とを比較すれば明
らかなように、フィールドＤＣＴ符号化ブロック数カウ
ント部８と、第３のデータ用メモリ９を追加している。
このフィールドＤＣＴ符号化ブロック数カウント部８
は、画像構造判定部１による判定結果が、フレーム構造
画像であった場合、その符号化がフレームＤＣＴを用い
ているのかフィールドＤＣＴを用いているのかを判定
し、１フィールド中のフィールドＤＣＴ符号化が行われ
ているマクロブロック数（ブロック数でも良い）をカウ
ントする。そして、フィールドＤＣＴ符号化ブロック数
カウント部８がカウントしたフィールドＤＣＴ符号化ブ
ロック数は、第３のデータ用メモリ９に記録される。In this embodiment, as is apparent from a comparison of FIG. 1 and FIG. 9, a field DCT coded block number counting unit 8 and a third data memory 9 are added. When the determination result of the image structure determination unit 1 is a frame structure image, the field DCT coded block number counting unit 8 determines whether the encoding uses frame DCT or field DCT, and counts the number of macroblocks (the number of blocks may be used instead) in one field that are field DCT coded. The number of field DCT coded blocks counted by the field DCT coded block number counting unit 8 is then recorded in the third data memory 9.

【００６７】したがって、図１０に示すように、まず、
入力圧縮動画像が、画像構造判定部１に至ると、この判
定部１は、現在の画像が、フレーム構造画像／フィール
ド構造画像のいずれであるかを判定する（ステップ２
０）。フレーム構造画像であれば、特徴量抽出部２は、
画像の縦方向上下２ブロック分のブロックデータを用い
た値を特徴量として抽出し、第１のデータ用メモリ６に
記録し（ステップ２１）、フィールドＤＣＴ符号化ブロ
ック数カウント部８がフィールドＤＣＴ符号化ブロック
数をカウントし、このブロック数が第３のデータ用メモ
リ９に記録される（ステップ２３）。なお、ステップ２
１，２３の順序は入れ換えても差し支えない。Therefore, as shown in FIG. 10, when the input compressed moving image reaches the image structure determination unit 1, the determination unit 1 first determines whether the current image is a frame structure image or a field structure image (step 20). If it is a frame structure image, the feature amount extraction unit 2 extracts, as the feature amount, a value computed from the block data of the upper and lower two blocks in the vertical direction of the image and records it in the first data memory 6 (step 21). The field DCT coded block number counting unit 8 then counts the number of field DCT coded blocks, and this number is recorded in the third data memory 9 (step 23). The order of steps 21 and 23 may be swapped.

【００６８】一方、フィールド構造画像であれば、特徴
量抽出部２は、１ブロック分のブロックデータを用いた
値を特徴量として抽出し、第１のデータ用メモリ６に記
録する（ステップ２２）。つまり、フレーム構造画像で
はフィールド構造画像の縦方向２倍のデータを用いる。On the other hand, if the image is a field structure image, the feature amount extraction unit 2 extracts, as the feature amount, a value computed from the block data of one block and records it in the first data memory 6 (step 22). In other words, a frame structure image uses twice as much data in the vertical direction as a field structure image.

【００６９】そして、ステップ２４にて、抽出データ比
較部３が、今回の特徴量と、それ以前の特徴量とを比較
して、比較結果情報を第２のデータ用メモリ７に記録す
る。次に、ステップ２５にて、シーンチェンジ判定部４
は、この比較結果情報をシーンチェンジ判定用データ入
力部５から入力する、閾値と比較し、シーンチェンジと
判定できるかどうか検討する。さらに、ステップ２６に
て、シーンチェンジ判定部４は、第３のデータ用メモリ
９に記録した、ブロック数と閾値とを比較して、シーン
チェンジと判定できるかどうか検討する。そして、シー
ンチェンジ判定部４は、ステップ２５又はステップ２６
のいずれかで、シーンチェンジと判定したら、発生した
位置を検出結果として出力する。Then, in step 24, the extracted data comparison unit 3 compares the current feature amount with the preceding feature amount and records the comparison result information in the second data memory 7. Next, in step 25, the scene change determination unit 4 compares this comparison result information with a threshold input from the scene change determination data input unit 5 and examines whether a scene change can be determined. Further, in step 26, the scene change determination unit 4 compares the number of blocks recorded in the third data memory 9 with a threshold and examines whether a scene change can be determined. If the scene change determination unit 4 determines a scene change in either step 25 or step 26, it outputs the position where the scene change occurred as the detection result.

【００７０】さて、図１１のように、二つのシーンがフ
ィールドで混ざっているような場合には、入力画像と直
前直後の画像との変化量が小さくなってしまう場合があ
り、検出漏れを起こす原因となっていた。しかし、従来
の技術の項で述べたように、このような入力画像におい
てフィールドＤＣＴを行うと圧縮率が高くなる。When two scenes are mixed between the fields of a frame, as in FIG. 11, the amount of change between the input image and the images immediately before and after it can become small, which has been a cause of missed detections. However, as described in the related art section, performing field DCT on such an input image raises the compression ratio.

【００７１】そのため、フレーム内でフィールドＤＣＴ
が多く用いられている場合には、フレーム内の奇数フィ
ールドと偶数フィールドの相関が低いと見做すことがで
きる。このため、特徴量抽出部２は、フィールドＤＣＴ
符号化が行われている数をカウントし、これをフレーム
内の第３の変化量として、比較検討対象に追加する。Therefore, when field DCT is used heavily within a frame, the correlation between the odd field and the even field of the frame can be regarded as low. For this reason, the feature amount extraction unit 2 counts the number of blocks that are field DCT coded and adds this count to the items under comparison as a third amount of change within the frame.

【００７２】抽出データ比較部３は、第１，第２の実施
の形態と同様であるが、シーンチェンジ判定部４は、第
１，第２の実施の形態における判定に加えて、第３の変
化量と、この第３の変化量のために予め設定された、第
３の閾値とを比較した場合に、第３の変化量が第３の閾
値よりも大きい時には、フレームのフィールド間にシー
ンチェンジがあると判断する。The extracted data comparison unit 3 is the same as in the first and second embodiments. The scene change determination unit 4, in addition to the determination of the first and second embodiments, compares the third amount of change with a third threshold set in advance for it, and when the third amount of change is larger than the third threshold, determines that there is a scene change between the fields of the frame.
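
The additional test of this embodiment can be sketched as follows. The macroblock representation (a dict with a boolean `field_dct` key) and both function names are hypothetical, and parsing the compressed stream is outside the sketch. The strict "greater than" for the third threshold follows the text; using the same comparison for the ordinary change amount is an assumption.

```python
def count_field_dct(macroblocks):
    """Third amount of change: number of macroblocks coded with field DCT."""
    return sum(1 for mb in macroblocks if mb["field_dct"])


def is_scene_change(change, change_threshold, field_dct_count, third_threshold):
    """Report a scene change if either test fires (step 25 or step 26)."""
    by_change = change > change_threshold           # step 25 (comparison assumed)
    by_field_dct = field_dct_count > third_threshold  # step 26
    return by_change or by_field_dct
```

With this OR combination, a frame whose two fields belong to different scenes can be flagged even when its change relative to neighboring images stays below the ordinary threshold.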

【００７３】以上、説明したように、フィールドＤＣＴ
が用いられた数をカウントすることにより、従来検出が
非常に困難であった、フレーム構造画像の２つのフィー
ルド間に存在するシーンチェンジ（図１１に例示してい
る）を検出できる。As described above, by counting the number of blocks for which field DCT is used, it is possible to detect a scene change existing between the two fields of a frame structure image (illustrated in FIG. 11), which has conventionally been very difficult to detect.

【００７４】さらに、本形態では、シーンチェンジ判定
用データ入力部５が、シーンチェンジ判定部４に出力す
る閾値について、次の工夫がなされている。即ち、画像
の最大変化量を基準（１００％）として、閾値は、この
基準の所定パーセントと定める。Further, in this embodiment, the following refinement is applied to the threshold that the scene change determination data input unit 5 outputs to the scene change determination unit 4. Namely, taking the maximum amount of change of an image as the reference (100%), the threshold is defined as a predetermined percentage of this reference.

【００７５】例えば、比較画像Ａ、Ｂにおいて、比較す
る１ブロックのデータの取り得る最小値が０であり最大
値が２５５であるとすると、１ブロックの最大変化量は
２５５である。そして、画像の比較に用いたブロック数
が１３２０であるとすると、画像全体の最大変化量は、
２５５＊１３２０＝３３６６００となる。この変化量を
基準（１００％）とする。また、ここでの閾値は、例え
ば３％（１００９８）〜１０％（３３６６０）程度が好
適である。For example, if the minimum possible value of the data of one block compared between comparison images A and B is 0 and the maximum is 255, the maximum amount of change of one block is 255. If the number of blocks used to compare the images is 1320, the maximum amount of change of the entire image is 255 * 1320 = 336600. This amount of change is taken as the reference (100%). The threshold here is preferably, for example, about 3% (10098) to 10% (33660).

【００７６】勿論、使用するデータの数やデータの最大
変化量が変われば、それに伴って閾値は変化するが、閾
値と基準の比率は一定とする。Of course, if the number of data to be used or the maximum change amount of the data changes, the threshold value changes accordingly, but the ratio between the threshold value and the reference is fixed.
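
The percentage-based threshold can be reproduced with a few lines of arithmetic. The function names are hypothetical; the numbers are the ones given in the example above.

```python
def max_change(max_block_change, num_blocks):
    """Reference (100%): largest possible change over the whole image."""
    return max_block_change * num_blocks


def threshold_from_percent(max_block_change, num_blocks, percent):
    """Threshold as a fixed percentage of the reference."""
    return max_change(max_block_change, num_blocks) * percent / 100.0
```

For 1320 blocks with a per-block maximum of 255, the reference is 336600 and the 3% and 10% thresholds are 10098 and 33660. If the number of blocks changes, the absolute threshold changes with it while the percentage stays the same.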

【００７７】これにより、画像サイズ（縦×横）が変わ
ったり、判定に使用されるデータの種類が変わったりし
ても、検出のばらつきを抑制して、ほぼ一様な検出結果
を得ることができる。As a result, even if the image size (height × width) changes or the type of data used for determination changes, variation in detection is suppressed and substantially uniform detection results can be obtained.

【００７８】（第４の実施の形態）図１２は、本発明の
第４の実施の形態におけるシーンチェンジ検出装置のブ
ロック図、図１３は、本発明の第４の実施の形態におけ
るシーンチェンジ検出装置のフローチャートである。(Fourth Embodiment) FIG. 12 is a block diagram of a scene change detection device according to a fourth embodiment of the present invention, and FIG. 13 is a flowchart of the scene change detection device according to the fourth embodiment.

【００７９】本形態では、図１と図１２とを比較すれば
明らかなように、シーンチェンジ判定部４が検出結果を
ダイレクトに出力するのではなく、シーンチェンジ判定
部４がシーンチェンジと判定したシーンチェンジ位置情
報を、一旦、第４のデータ用メモリ１１に格納するよう
にしている。In this embodiment, as is apparent from a comparison of FIG. 1 and FIG. 12, the scene change determination unit 4 does not output the detection result directly. Instead, the position information of each scene change determined by the scene change determination unit 4 is first stored in the fourth data memory 11.

【００８０】また、シーンチェンジ判定用データ入力部
５には、基準となるシーンチェンジからターゲットシー
ンまでの時間軸上の間隔が設定される。そして、シーン
チェンジ間隔検索部１０を追加している。このシーンチ
ェンジ間隔検索部１０は、第４のデータ用メモリ１１に
記録されたシーンチェンジ位置情報同士の時間軸上の間
隔を求め、求めた間隔と、シーンチェンジ判定用データ
入力部５から与えられる間隔とを、比較する。In addition, the interval on the time axis from a reference scene change to the target scene is set in the scene change determination data input unit 5, and a scene change interval search unit 10 is added. The scene change interval search unit 10 obtains the intervals on the time axis between the pieces of scene change position information recorded in the fourth data memory 11, and compares each obtained interval with the interval given by the scene change determination data input unit 5.

【００８１】したがって、図１３に示すように、先の実
施の形態と同様に、シーンチェンジ判定部４は、シーン
チェンジを探す（ステップ３０）。そして、シーンチェ
ンジ判定部４が、シーンチェンジを見つけると、このシ
ーンチェンジ位置情報を第４のデータ用メモリ１１に格
納する（ステップ３１）。Therefore, as shown in FIG. 13, the scene change judging section 4 looks for a scene change as in the previous embodiment (step 30). When the scene change determining section 4 finds a scene change, the scene change position information is stored in the fourth data memory 11 (step 31).

【００８２】そして、シーンチェンジ間隔検索部１０
は、第４のデータ用メモリ１１をアクセスして、シーン
チェンジ間の間隔を調べ（ステップ３２）、シーンチェ
ンジ判定用データ入力部５から与えられた間隔と一致す
るシーンチェンジの組が見つかると、見つかったシーン
チェンジの先頭と末尾とからなる、組の位置情報を、検
出結果として出力する（ステップ３３）。The scene change interval search unit 10 then accesses the fourth data memory 11 and examines the intervals between scene changes (step 32). When it finds a pair of scene changes whose interval matches the interval given by the scene change determination data input unit 5, it outputs the position information of the pair, consisting of the first and last scene changes found, as the detection result (step 33).

【００８３】例えば、５分の映像の中から、３０秒のシ
ーンを検出したい場合、映像全体からシーンチェンジ検
出を行い、その結果を、第４のデータ用メモリ１１に記
録する。その後、記録されたデータの中から、ちょうど
３０秒間隔になっているシーンチェンジの組を探し出
し、それを検出結果として出力する。For example, when it is desired to detect a 30-second scene from a 5-minute video, a scene change is detected from the entire video, and the result is recorded in the fourth data memory 11. After that, a set of scene changes at exactly 30-second intervals is searched for from the recorded data, and is output as a detection result.

【００８４】図１４を例にとると、シーンチェンジ１〜
シーンチェンジ４が検出された場合に、シーンチェンジ
１とシーンチェンジ４がちょうど３０秒間隔であれば、
シーンチェンジ１とシーンチェンジ４の組が、検出結果
として出力される。Taking FIG. 14 as an example, when scene changes 1 to 4 are detected and scene change 1 and scene change 4 are exactly 30 seconds apart, the pair of scene change 1 and scene change 4 is output as the detection result.
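
The interval search over the recorded scene change positions (steps 32 and 33) can be sketched as below. The function name, the use of seconds as the position unit, and the `tol` parameter are assumptions; the text itself speaks of an exact match, which corresponds to the default `tol=0.0`.

```python
def find_pairs(positions, interval, tol=0.0):
    """Return (start, end) pairs of recorded scene change positions whose
    spacing equals `interval`, within an optional tolerance.
    `positions` is assumed to be sorted in ascending order."""
    pairs = []
    for i, start in enumerate(positions):
        for end in positions[i + 1:]:
            if abs((end - start) - interval) <= tol:
                pairs.append((start, end))
    return pairs
```

For four recorded scene changes at 10, 18, 25 and 40 seconds and a 30-second target, only the pair (10, 40) is reported, mirroring scene changes 1 and 4 in the example.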

【００８５】このことにより、例えば、テレビ放送など
から得られた大量な映像からＣＭ部分のみを取り出した
り、放送時間長の決まったニュースや番組を取り出した
りすることが可能になる。As a result, for example, it is possible to extract only a CM portion from a large amount of video obtained from a television broadcast or the like, or to extract news or a program with a fixed broadcast time length.

【００８６】さらに、あるシーンチェンジが見つかった
場合、それから与えられた間隔までのシーンチェンジ判
定を省略でき、無駄な検出動作を極力省いて、処理時間
を短縮できる。Further, when a certain scene change is found, it is possible to omit the scene change determination up to a given interval, to reduce unnecessary detection operation as much as possible, and to shorten the processing time.

【００８７】例えば、長い映像の中から、１５秒のＣＭ
だけを検出したい場合には、ターゲット時間として１５
秒を与える。シーンチェンジ間隔検索部１０では、演算
によってシーンチェンジとして検出されたフレームか
ら、１５秒後のフレームにシーンチェンジが検出される
かを判定し、１５秒後にシーンチェンジが検出された場
合にのみ、そのフレームと１５秒後のフレームを要求さ
れたシーンチェンジとして出力する。この１５秒間がタ
ーゲットのシーンとして検出されることになる。続けて
ターゲットシーンを検索するには、最後に検出されたシ
ーンチェンジからまた１５秒後にシーンチェンジが存在
するかを判定し、シーンチェンジが検出されなければ次
のフレームからシーンチェンジ判定を継続する。検出さ
れればそこがターゲットシーンとなる。For example, to detect only 15-second commercials (CMs) in a long video, 15 seconds is given as the target time. The scene change interval search unit 10 determines whether a scene change is detected in the frame 15 seconds after a frame detected as a scene change by computation, and only when a scene change is detected 15 seconds later does it output that frame and the frame 15 seconds later as the requested scene changes. These 15 seconds are detected as the target scene. To continue searching for target scenes, it determines whether a scene change exists a further 15 seconds after the last detected scene change. If none is detected, scene change determination continues from the next frame; if one is detected, that span becomes a target scene.

【００８８】図１４を例にとると、シーンチェンジ１が
検出され、そのちょうど１５秒後にシーンチェンジ４が
あったとすると、この１５秒間に存在する、シーンチェ
ンジ２とシーンチェンジ３との判定（無駄な判定）をス
キップすることができる。Taking FIG. 14 as an example, if scene change 1 is detected and scene change 4 occurs exactly 15 seconds later, the (wasted) determination of scene change 2 and scene change 3, which lie within those 15 seconds, can be skipped.

【００８９】このように、ここでは、一つ目のシーンチ
ェンジが見つかってから、ターゲットの間隔として与え
られた時間だけ後のフレームがシーンチェンジであるか
どうかを判定している。したがって、検出されたターゲ
ットシーンの中にあるシーンチェンジ判定のための処理
を省くことができ、処理時間を短縮できる。In this way, once the first scene change is found, it is determined whether the frame located later by the time given as the target interval is a scene change. Therefore, the processing for scene change determination inside a detected target scene can be omitted, and the processing time can be shortened.
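
The skip-ahead search can be sketched as follows. This is one reading of the description, not a definitive implementation: `is_scene_change` is a hypothetical callback standing in for the per-frame determination, positions are frame indices, and when the look-ahead test fails the scan is assumed to resume at the frame after the last detected scene change.

```python
def find_targets(num_frames, interval, is_scene_change):
    """Return (start, end) frame pairs that are scene changes exactly
    `interval` frames apart, skipping the frames inside each target."""
    targets = []
    i = 0
    while i < num_frames:
        if not is_scene_change(i):
            i += 1
            continue
        start = i
        # Jump ahead by the target interval; chain while the far frame
        # is also a scene change.
        while start + interval < num_frames and is_scene_change(start + interval):
            targets.append((start, start + interval))
            start += interval
        i = start + 1  # resume scanning after the last detected scene change
    return targets
```

Frames lying strictly inside a detected target scene are never passed to the detector, which is where the saving in processing time comes from.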

【００９０】ここで、本明細書にいう「圧縮動画像のシ
ーンチェンジ検出プログラムをコンピュータ読み取り可
能に記録した記録媒体」には、複数の記録媒体にプログ
ラムを分散して配布する場合を含む。また、このプログ
ラムが、オペレーティングシステムの一部であるか否か
を問わず、種々のプロセスないしスレッド（ＤＬＬ、Ｏ
ＣＸ、ＡｃｔｉｖｅＸ等（マイクロソフト社の商標を含
む））に機能の一部を肩代わりさせている場合には、肩
代わりさせた機能に係る部分が、記録媒体に格納されて
いない場合も含む。Here, the “recording medium on which a program for detecting a scene change of a compressed moving image is recorded in a computer-readable manner” referred to in this specification includes the case where the program is distributed across a plurality of recording media. In addition, regardless of whether this program is part of an operating system, when part of its functions are delegated to various processes or threads (DLL, OCX, ActiveX, etc. (including trademarks of Microsoft Corporation)), the case where the portion relating to the delegated functions is not stored on the recording medium is also included.

【００９１】図１、図７、図９及び図１２（以下「図１
等」という）には、スタンドアロン形式のシステムを例
示したが、サーバー／クライアント形式にしても良い。
つまり、１つの端末機のみに、本明細書に出現する全て
の要素が含まれている場合の他、１つの端末機がクライ
アントであり、これが接続可能なサーバないしネットワ
ーク上に、全部又は一部の要素が実存していても、差し
支えない。FIG. 1, FIG. 7, FIG. 9 and FIG. 12 (hereinafter "FIG. 1 etc.") illustrate stand-alone systems, but a server/client configuration may also be used. That is, besides the case where a single terminal contains all the elements appearing in this specification, it is also acceptable for one terminal to be a client while all or some of the elements reside on a server or network to which it can connect.

【００９２】さらには、図１等のほとんどの要素をサー
バー側で持ち、クライアント側では、例えば、ＷＷＷブ
ラウザだけにしても良い。この場合、各種の情報は、通
常サーバ上にあり、基本的にネットワークを経由してク
ライアントに配布されるものだが、必要な情報が、サー
バ上にあるときは、そのサーバの記憶装置が、ここにい
う「記録媒体」となり、クライアント上にあるときは、
そのクライアントの記録装置が、ここにいう「記録媒
体」となる。Further, most of the elements of FIG. 1 etc. may be held on the server side, with the client side having, for example, only a WWW browser. In this case, the various kinds of information normally reside on the server and are basically distributed to the client via the network. When the necessary information is on the server, the storage device of that server is the “recording medium” referred to here, and when it is on the client, the recording device of that client is the “recording medium” referred to here.

【００９３】さらに、この「圧縮動画像のシーンチェン
ジ検出プログラム」には、コンパイルされて機械語にな
ったアプリケーションの他、上述のプロセスないしスレ
ッドにより解釈される中間コードとして実存する場合
や、少なくともリソースとソースコードとが「記録媒
体」上に格納され、これらから機械語のアプリケーショ
ンを生成できるコンパイラ及びリンカが「記録媒体」に
ある場合や、少なくともリソースとソースコードとが
「記録媒体」上に格納され、これらから中間コードのア
プリケーションを生成できるインタープリタが「記録媒
体」にある場合なども含む。Further, this "program for detecting a scene change of a compressed moving image" includes, in addition to an application compiled into machine language, the case where it exists as intermediate code interpreted by the above-described processes or threads; the case where at least resources and source code are stored on the “recording medium” together with a compiler and linker that can generate a machine-language application from them; and the case where at least resources and source code are stored on the “recording medium” together with an interpreter that can generate an intermediate-code application from them.

【００９４】[0094]

【発明の効果】本発明では、フィールド構造画像とフレ
ーム構造画像とが混在する圧縮動画像を入力し、入力し
た圧縮動画像におけるシーンチェンジを検出するもので
ある。したがって、画像構造の如何を気にすることな
く、シーンチェンジを検出でき、圧縮動画像を検索する
際の、重要なインデックスを得ることができる。そし
て、請求項１、２、７、８、１２、１３の構成によれ
ば、フレーム構造画像とフィールド構造画像が時間軸上
で混在していても、一律にシーンチェンジを検出でき
る。また、フレーム構造画像内でフレームＤＣＴを用い
られようとフィールドＤＣＴを用いられようと、特別な
配慮なしに、シーンチェンジを検出できる。According to the present invention, a compressed moving image in which field structure images and frame structure images are mixed is input, and scene changes in the input compressed moving image are detected. Therefore, scene changes can be detected without regard to the image structure, and an important index for searching compressed moving images can be obtained. According to the configurations of claims 1, 2, 7, 8, 12 and 13, scene changes can be detected uniformly even when frame structure images and field structure images are mixed on the time axis. In addition, scene changes can be detected without special consideration whether frame DCT or field DCT is used within a frame structure image.

【００９５】請求項３、９、１４の構成によれば、フィ
ールド符号化ブロック数をカウントし、このカウント数
と閾値とを比較することによって、従来検出が非常に困
難であった、フィールド間に存在するシーンチェンジを
も検出できる。According to the configurations of claims 3, 9 and 14, by counting the number of field coded blocks and comparing this count with a threshold, scene changes existing between fields, which have conventionally been very difficult to detect, can also be detected.

【００９６】請求項４、１０、１５の構成によれば、長
い入力動画像から一部のターゲットを容易に取り出し得
るし、無駄なシーンチェンジ判定を極力省略でき、その
結果、処理時間を短縮できる。According to the configurations of claims 4, 10 and 15, a partial target can easily be extracted from a long input moving image, and unnecessary scene change determination can be omitted as far as possible, so that the processing time can be shortened.

【００９７】請求項５、１１、１６の構成によれば、画
像の大きさや、データの種類が変わっても、検出のばら
つきが少なく、一様の検出結果を得ることができる。According to the configurations of claims 5, 11 and 16, even if the image size or the type of data changes, there is little variation in detection and uniform detection results can be obtained.

[Brief description of the drawings]

【図１】本発明の第１の実施の形態におけるシーンチェ
ンジ検出装置のブロック図FIG. 1 is a block diagram of a scene change detection device according to a first embodiment of the present invention.

【図２】本発明の第１の実施の形態におけるシーンチェ
ンジ検出装置のフローチャートFIG. 2 is a flowchart of a scene change detection device according to the first embodiment of the present invention.

【図３】同詳細フローチャートFIG. 3 is a detailed flowchart of the same.

【図４】同詳細フローチャートFIG. 4 is a detailed flowchart of the same.

【図５】本発明の第１の実施の形態における入力圧縮動
画像のモデル図FIG. 5 is a model diagram of an input compressed moving image according to the first embodiment of the present invention.

【図６】（ａ）本発明の第１の実施の形態におけるフィ
ールド構造画像の例示図（トップフィールド）（ｂ）同
フィールド構造画像の例示図（ボトムフィールド）
（ｃ）同フレーム構造画像の例示図 FIG. 6(a) is an example of a field structure image (top field) according to the first embodiment of the present invention; FIG. 6(b) is an example of the field structure image (bottom field); FIG. 6(c) is an example of the frame structure image.

【図７】本発明の第２の実施の形態におけるシーンチェ
ンジ検出装置のブロック図FIG. 7 is a block diagram of a scene change detection device according to a second embodiment of the present invention.

【図８】本発明の第２の実施の形態におけるシーンチェ
ンジ検出装置のフローチャートFIG. 8 is a flowchart of a scene change detection device according to a second embodiment of the present invention.

【図９】本発明の第３の実施の形態におけるシーンチェ
ンジ検出装置のブロック図FIG. 9 is a block diagram of a scene change detection device according to a third embodiment of the present invention.

【図１０】本発明の第３の実施の形態におけるシーンチ
ェンジ検出装置のフローチャートFIG. 10 is a flowchart of a scene change detection device according to a third embodiment of the present invention.

【図１１】本発明の第３の実施の形態におけるフィール
ド間に存在するシーンチェンジの例示図FIG. 11 is a view showing an example of a scene change existing between fields according to the third embodiment of the present invention.

【図１２】本発明の第４の実施の形態におけるシーンチ
ェンジ検出装置のブロック図FIG. 12 is a block diagram of a scene change detection device according to a fourth embodiment of the present invention.

【図１３】本発明の第４の実施の形態におけるシーンチ
ェンジ検出装置のフローチャートFIG. 13 is a flowchart of a scene change detection device according to a fourth embodiment of the present invention.

【図１４】本発明の第４の実施の形態におけるシーンチ
ェンジのモデル図FIG. 14 is a model diagram of a scene change according to the fourth embodiment of the present invention.

【図１５】（ａ）従来のフレームＤＣＴにおけるマクロ
ブロック構造の説明図（ｂ）従来のフィールドＤＣＴにおけるマクロブロック
構造の説明図 FIG. 15(a) is an explanatory diagram of the macroblock structure in conventional frame DCT; FIG. 15(b) is an explanatory diagram of the macroblock structure in conventional field DCT.

[Explanation of symbols]

１ 画像構造判定部: image structure determination unit
２ 特徴量抽出部: feature amount extraction unit
３ 抽出データ比較部: extracted data comparison unit
４ シーンチェンジ判定部: scene change determination unit
５ シーンチェンジ判定用データ入力部: scene change determination data input unit
６ 第１のデータ用メモリ: first data memory
７ 第２のデータ用メモリ: second data memory
８ フィールドＤＣＴ符号化ブロック数カウント部: field DCT coded block number counting unit
９ 第３のデータ用メモリ: third data memory
１０ シーンチェンジ間隔検索部: scene change interval search unit
１１ 第４のデータ用メモリ: fourth data memory

フロントページの続き (72)発明者 池田 淳 大阪府門真市大字門真1006番地 松下電器産業株式会社内 Ｆターム(参考） 5C059 KK36 MA00 MA23 NN28 NN43 PP05 PP06 PP07 TA64 TB07 TB08 TC14 TC43 TD12 UA38 UA39 5J064 AA01 BA16 BB03 BC01 BD02
Continuation of the front page: (72) Inventor: Atsushi Ikeda, c/o Matsushita Electric Industrial Co., Ltd., 1006 Oaza Kadoma, Kadoma-shi, Osaka. F-terms (reference): 5C059 KK36 MA00 MA23 NN28 NN43 PP05 PP06 PP07 TA64 TB07 TB08 TC14 TC43 TD12 UA38 UA39; 5J064 AA01 BA16 BB03 BC01 BD02

Claims


1. A scene change detection device for a compressed moving image, comprising: an image structure determination unit that determines the image structure of an input compressed moving image; a feature amount extraction unit that, when the determination result of the image structure determination unit is a frame structure image, extracts a feature amount based on block data covering twice the extent, in the vertical direction of the image, of that used for a field structure image; a storage area that records the block data extracted by the feature amount extraction unit; an extracted data comparison unit that compares the extracted block data to obtain an amount of change of the video; and a scene change determination unit that determines a scene change using the amount of change obtained by the extracted data comparison unit.

2. A scene change detection device for a compressed moving image, comprising: a feature amount extraction unit that extracts a feature amount based on the block data of one block at a time, irrespective of the image structure of the input compressed moving image; a storage area that records the block data extracted by the feature amount extraction unit; an extracted data comparison unit that, when the image from which the feature amount was extracted is a frame structure image, compares feature amounts using twice as much block data in the vertical direction of the image as for a field structure image to obtain an amount of change; and a scene change determination unit that determines a scene change using the amount of change obtained by the extracted data comparison unit.

3. A scene change detection device for a compressed moving image, comprising: a field DCT coded block number counting unit that, when the image is a frame structure image, counts the number of blocks subjected to field DCT coding; and a scene change determination unit that compares this number of blocks with a threshold to determine a scene change existing between fields.

4. A scene change detection device for a compressed moving image, comprising: a scene change determination unit that determines a scene change; and a scene change interval search unit that searches, among the scene changes detected by the scene change determination unit, for scene changes existing at the start point and end point of a specified interval.

5. A scene change detection device for a compressed moving image, wherein the thresholds used by the scene change determination unit as criteria for determining a scene change include a threshold determined based on the maximum amount of change of an image.

6. A method for detecting a scene change in a compressed moving image, comprising: inputting a compressed moving image in which a field structure image and a frame structure image are mixed; and detecting a scene change in the input compressed moving image.

7. A method for detecting a scene change of a compressed moving image, comprising: an image structure determination step of determining the image structure of an input compressed moving image; a feature amount extraction step of, when the determination result of the image structure determination step is a frame structure image, extracting a feature amount based on block data covering twice the extent, in the vertical direction of the image, of that used for a field structure image; a storage area for recording the block data extracted in the feature amount extraction step; an extracted data comparison step of comparing the extracted block data to obtain an amount of change of the video; and a scene change determination step of determining a scene change using the amount of change obtained in the extracted data comparison step.

8. A method for detecting a scene change of a compressed moving image, comprising: a feature amount extraction step of extracting a feature amount based on the block data of one block at a time, irrespective of the image structure of the input compressed moving image; a storage area for recording the block data extracted in the feature amount extraction step; an extracted data comparison step of, when the image from which the feature amount was extracted is a frame structure image, comparing feature amounts using twice as much block data in the vertical direction of the image as for a field structure image to obtain an amount of change; and a scene change determination step of determining a scene change using the amount of change obtained in the extracted data comparison step.

9. A method for detecting a scene change of a compressed moving image, comprising: a field DCT coded block number counting step of counting the number of field DCT coded blocks when the image is a frame structure image; and a scene change determination step of comparing this number of blocks with a threshold to determine a scene change existing between fields.

10. A method for detecting a scene change of a compressed moving image, comprising: a scene change determination step of determining a scene change; and a scene change interval search step of searching, among the scene changes detected in the scene change determination step, for scene changes existing at the start point and end point of a specified interval.

11. The method for detecting a scene change of a compressed moving image according to claim 7, wherein, in the scene change determining step, the threshold values used as criteria for determining a scene change include a threshold determined based on a maximum change amount of the image.
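
A threshold derived from the maximum change amount, as in claim 11, might be combined with a fixed threshold as in the following sketch. The scaling factor `ratio`, the fixed `base_threshold`, and the rule that both thresholds must be exceeded are hypothetical choices; the claim states only that the thresholds include one determined from the maximum change amount.

```python
from typing import List


def detect_scene_changes(
    changes: List[float], base_threshold: float, ratio: float = 0.5
) -> List[int]:
    """Return the indices whose change amount exceeds both thresholds."""
    if not changes:
        return []
    # Threshold tied to the largest change amount observed (assumed form).
    adaptive_threshold = ratio * max(changes)
    return [
        i
        for i, c in enumerate(changes)
        if c > base_threshold and c > adaptive_threshold
    ]
```

Tying one threshold to the maximum change amount lets the decision scale with the content: a video whose largest change is small is not judged by a threshold tuned for much larger changes.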

12. A recording medium on which is recorded, in a computer-readable manner, a program for detecting a scene change of a compressed moving image, the program comprising: an image structure judging step of judging the image structure of an input compressed moving image; a feature amount extraction step of, when the judgment result of the image structure judging step is a frame structure image, extracting a feature amount based on twice as much block data in the vertical direction of the image as for a field structure image; a storage area for recording the block data extracted in the feature amount extraction step; an extracted data comparison step of comparing the extracted block data to obtain a change amount of the video; and a scene change determining step of determining a scene change using the change amount obtained in the extracted data comparison step.

13. A recording medium on which is recorded, in a computer-readable manner, a program for detecting a scene change of a compressed moving image, the program comprising: a feature amount extracting step of extracting a feature amount based on one block of block data, irrespective of the image structure of the input compressed moving image; a storage area for recording the block data extracted in the feature amount extracting step; an extracted data comparison step of, when the image from which the feature amount was extracted is a frame structure image, comparing feature amounts using twice as much block data in the vertical direction of the image as for a field structure image to obtain a change amount; and a scene change determining step of determining a scene change using the change amount obtained in the extracted data comparison step.

14. A recording medium on which is recorded, in a computer-readable manner, a scene change detection program for a compressed moving image, the program comprising: a field DCT coded block number counting step of counting the number of field DCT coded blocks when the image is a frame structure image; and a scene change determining step of comparing the number of blocks with a threshold value to determine a scene change existing between fields.

15. A recording medium on which is recorded, in a computer-readable manner, a program for detecting a scene change of a compressed moving image, the program comprising: a scene change judging step of judging a scene change; and a scene change interval search step of searching, from among the scene changes detected in the scene change judging step, for the scene changes existing at the start point and the end point of a specified interval.

16. The recording medium according to claim 12, on which is recorded, in a computer-readable manner, a program for detecting a scene change of a compressed moving image, wherein, in the scene change determining step, the threshold values used as criteria for determining a scene change include a threshold determined based on a maximum change amount of the image.