JP2002269569A - Animation database

Info

Publication number: JP2002269569A
Application number: JP2001072594A
Authority: JP
Inventors: Masajiro Iwasaki; 雅二郎岩崎
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2001-03-14
Filing date: 2001-03-14
Publication date: 2002-09-20

Abstract

PROBLEM TO BE SOLVED: To provide a moving image (animation) database that processes at high speed and enables identification, search, and classification of scenes and moving images by extracting feature amounts after dividing each frame image. SOLUTION: This moving image database analyzes feature amounts of frames to identify change points of a moving image. Simple cuts are removed from the moving image using the degree of difference in feature amounts between frames, and each frame of the moving image from which the cuts have been removed is divided along a fixed direction. The set of feature amounts extracted from the individual regions is defined as a divided image feature amount; divided image feature amounts corresponding to a plurality of division directions are calculated and analyzed to identify the change points of the moving image.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to technology for structuring, identifying, searching, and classifying moving images using feature amounts extracted from moving image information.

[0002]

[Prior Art] Japanese Patent No. 2914170 discloses a video change point detection method that detects pans, wipes, and dissolves as change points of a moving image.

[0003]

[Problems to Be Solved by the Invention] However, because that method provides no functions for search, classification, or identification, scenes and moving images cannot be classified, searched, or identified at high speed and with high accuracy.

[0004] The present invention has been made to solve the above problems of the prior art. Its object is to provide a moving image database that processes at high speed by extracting feature amounts after dividing each frame image, and that enables identification, search, and classification of scenes and moving images.

[0005]

[Means for Solving the Problems] The invention according to claim 1 is a moving image database for analyzing feature amounts of frames to identify change points of a moving image, characterized in that: simple cuts are removed from the moving image using the degree of difference in feature amounts between frames; each frame of the moving image from which the cuts have been removed is divided along a fixed direction, and the set of feature amounts extracted from the individual regions is taken as a divided image feature amount; divided image feature amounts corresponding to a plurality of division directions are calculated; and the divided image feature amounts corresponding to the plurality of division directions are analyzed to identify the change points of the moving image.

[0006] The invention according to claim 2 is the invention according to claim 1, characterized in that the frame is divided into grid-like regions.

[0007] The invention according to claim 3 is the invention according to claim 1 or 2, characterized in that, when the degree of difference of the divided image feature amount corresponding to one division direction is markedly smaller than the degrees of difference of the divided image feature amounts corresponding to the other division directions, the change point is identified as camera work, and the division direction corresponding to that divided image feature amount is identified as the direction of the camera work.

[0008] The invention according to claim 4 is the invention according to claim 1 or 2, characterized in that, when the degree of difference of the divided image feature amount corresponding to one division direction is markedly larger than the degrees of difference of the divided image feature amounts corresponding to the other division directions, the change point is identified as a wipe, and the division direction corresponding to that divided image feature amount is identified as the direction of the wipe.

[0009] The invention according to claim 5 is the invention according to claim 1 or 2, characterized in that, when the degrees of difference of the divided image feature amounts change almost uniformly across all regions, the change point is identified as a dissolve.

[0010] The invention according to claim 6 is a moving image database for classifying moving images by analyzing feature amounts of the frames constituting a moving image within a specific time section, characterized in that the feature amounts used are those of a plurality of scene representative frames extracted from the frames constituting the moving image by Nearest Neighbor clustering.

[0011] The invention according to claim 7 is a moving image database for classifying moving images by analyzing feature amounts of the frames constituting a moving image within a specific time section, characterized in that the feature amount used is a statistic of the degree of difference in feature amounts between adjacent frames among the frames constituting the moving image.

[0012] The invention according to claim 8 is a moving image database for classifying moving images by analyzing feature amounts of the frames constituting a moving image within a specific time section, characterized in that the feature amount used is a statistic of the degree of difference in feature amounts between each frame constituting the moving image and the first frame.

[0013] The invention according to claim 9 is the invention according to any one of claims 6 to 8, characterized in that a scene of the moving image is used as the specific time section.

[0014] The invention according to claim 10 is the invention according to any one of claims 6 to 8, characterized in that a fixed time interval is used as the specific time section.

[0015] The invention according to claim 11 is a moving image database for classifying moving images by analyzing feature amounts of the frames constituting a moving image, characterized in that the feature amount used is a statistic of the time lengths of the scenes constituting the moving image.

[0016] The invention according to claim 12 is the invention according to claim 6, characterized in that a statistic of the degree of difference in feature amounts between adjacent scene representative frames is used as a moving image feature amount.

[0017] The invention according to claim 13 is the invention according to claim 6, characterized in that a statistic of the degree of difference in feature amounts between one scene representative frame and the other scene representative frames is used as a moving image feature amount.

[0018] The invention according to claim 14 is the invention according to claim 6, characterized in that the scene representative frames are arranged in chronological order to form a digest of the moving image.

[0019]

[Embodiments of the Invention] An embodiment of the moving image database according to the present invention will be described below with reference to the drawings. The moving image database comprises the following functions: "moving image analysis", "moving image feature amount extraction", "management of moving images, moving image attributes, and moving image feature amounts", "moving image classification, search, and identification", and "moving image list display". FIG. 1 is a functional configuration diagram showing an embodiment of the moving image database according to the present invention; reference numerals 1 to 5 denote the functional blocks for these five functions, in the order listed.

[0020] The moving image analysis process using the moving image database according to the present invention will now be described. The process is applied to the frames of the moving image sequentially, in chronological order, by the following procedure.

(1) Let frame i be the preceding frame and frame i+d the succeeding frame. Here d is an integer of 1 or more; a larger d gives a coarser examination, that is, lower accuracy.
(2) Perform displacement point detection. If the frame is not a displacement point, go to (7).
(3) Perform cut detection. If a cut is detected, go to (7).
(4) Perform camera work detection. If camera work is detected, go to (7).
(5) Perform wipe detection. If a wipe is detected, go to (7).
(6) Perform dissolve detection. If a dissolve is detected, go to (7).
(7) Add d to i and return to (1).

The detection processes for displacement points, cuts, camera work, wipes, and dissolves are each described below.
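As a reading aid, steps (1) to (7) can be sketched as a simple loop. This is only an illustrative sketch, not the patent's implementation: the function name `analyze_frames`, the representation of detectors as callables, and the choice to key each label by the index of the succeeding frame i+d are all assumptions made here.

```python
from typing import Callable, Dict, List, Sequence, Tuple, TypeVar

F = TypeVar("F")  # whatever type represents one frame (hypothetical)
Detector = Callable[[F, F], bool]


def analyze_frames(
    frames: Sequence[F],
    d: int,
    is_displacement_point: Detector,
    detectors: List[Tuple[str, Detector]],
) -> Dict[int, str]:
    """Walk the frames with stride d and label detected change points.

    `detectors` is an ordered list such as
    [("cut", ...), ("camera_work", ...), ("wipe", ...), ("dissolve", ...)].
    Labels are keyed by the index of the succeeding frame i + d (an assumption).
    """
    if d < 1:
        raise ValueError("d must be an integer of 1 or more")
    labels: Dict[int, str] = {}
    i = 0
    while i + d < len(frames):                            # step (1)
        preceding, succeeding = frames[i], frames[i + d]
        if is_displacement_point(preceding, succeeding):  # step (2)
            for label, detect in detectors:               # steps (3) to (6), in order
                if detect(preceding, succeeding):
                    labels[i + d] = label
                    break                                 # first match wins
        i += d                                            # step (7)
    return labels
```

Because the displacement point test runs first, the more expensive detectors are only evaluated for frame pairs that already differ noticeably.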

[0021] Displacement point detection will be described. This process detects displacement points using a wide-area moving image feature amount, by the following procedure. A general image feature amount such as a color histogram can be used as the wide-area moving image feature amount. First, the wide-area moving image feature amount F_i of frame i and the wide-area moving image feature amount F_{i+d} of frame i+d are extracted. Next, the distance (degree of difference) between F_i and F_{i+d} is calculated. If the calculated distance exceeds the threshold Tg, the frame is treated as a displacement point.
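A minimal sketch of this test follows, assuming frames are nested lists of 8-bit RGB tuples, the feature is a normalized joint RGB histogram, and the distance is the L1 distance. The text names only "a color histogram" as one possible feature and does not specify a distance measure, so the bin count, the L1 metric, and all function names are assumptions.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]           # 8-bit (R, G, B), an assumed representation
Frame = Sequence[Sequence[Pixel]]      # rows of pixels


def color_histogram(frame: Frame, bins: int = 4) -> List[float]:
    """Normalized joint RGB histogram with bins**3 cells."""
    hist = [0.0] * (bins ** 3)
    count = 0
    for row in frame:
        for r, g, b in row:
            idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
            hist[idx] += 1.0
            count += 1
    return [h / count for h in hist] if count else hist


def l1_distance(f1: Sequence[float], f2: Sequence[float]) -> float:
    """Sum of absolute differences; ranges from 0 to 2 for normalized histograms."""
    return sum(abs(a - b) for a, b in zip(f1, f2))


def is_displacement_point(frame_i: Frame, frame_i_plus_d: Frame, t_g: float) -> bool:
    """True when the distance between F_i and F_{i+d} exceeds the threshold Tg."""
    return l1_distance(color_histogram(frame_i), color_histogram(frame_i_plus_d)) > t_g
```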

[0022] Next, cut detection will be described. Cut detection also uses the wide-area moving image feature amount described above and detects cut points by the following procedure. First, the wide-area image feature amount F_i of frame i and the wide-area image feature amount F_{i+d} of frame i+d are extracted. Next, the distance (degree of difference) between F_i and F_{i+d} is calculated. If the calculated distance exceeds the threshold Tc, the frame is treated as a simple cut point. Note that Tg < Tc.
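Since Tg < Tc, the same wide-area distance sorts each frame pair into one of three cases. The sketch below makes that cascade explicit; the function name and the string labels are hypothetical.

```python
def classify_wide_area_distance(distance: float, t_g: float, t_c: float) -> str:
    """Classify one wide-area feature distance using the two thresholds Tg < Tc.

    Returns "cut" above Tc, "candidate" between Tg and Tc (a displacement point
    that is not a simple cut, to be examined by the later detectors), and
    "none" otherwise. The label strings are illustrative only.
    """
    if not t_g < t_c:
        raise ValueError("expected Tg < Tc")
    if distance > t_c:
        return "cut"
    if distance > t_g:
        return "candidate"
    return "none"
```

Only the "candidate" pairs need the divided image feature amounts described next, which is consistent with removing simple cuts before the directional analysis.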

[0023] Next, camera work detection will be described. Camera work detection uses divided image feature amounts to detect the camera work direction. A divided image feature amount is not a feature amount of the whole frame image; it is the set of feature amounts extracted from the individual regions obtained by dividing the frame image into several regions. FIG. 2 shows examples of region division: (a) is the feature amount that identifies vertical panning, (b) horizontal panning, (c) and (d) diagonal panning, (e) zooming in and out, and (f) rotation. A general color histogram or the like can be used as the feature amount of each region, but the average color of the region is used to achieve high-speed processing.
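The following sketch shows one way a divided image feature amount based on average color could be computed. FIG. 2 is not reproduced here, so the exact region shapes are unknown; dividing the frame into horizontal bands or vertical bands is an assumption, as are the function names. Under that assumption, horizontal bands change little when the image shifts sideways, so they would plausibly play the role of the horizontal pan feature, and vertical bands the role of the vertical pan feature.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]
Color = Tuple[float, float, float]
Frame = Sequence[Sequence[Pixel]]


def mean_color(pixels: Sequence[Pixel]) -> Color:
    """Average color of a non-empty list of pixels."""
    n = len(pixels)
    return (
        sum(p[0] for p in pixels) / n,
        sum(p[1] for p in pixels) / n,
        sum(p[2] for p in pixels) / n,
    )


def row_band_feature(frame: Frame, bands: int) -> List[Color]:
    """Mean color of each of `bands` horizontal bands (requires height >= bands)."""
    height = len(frame)
    feature = []
    for k in range(bands):
        top, bottom = k * height // bands, (k + 1) * height // bands
        feature.append(mean_color([p for row in frame[top:bottom] for p in row]))
    return feature


def column_band_feature(frame: Frame, bands: int) -> List[Color]:
    """Mean color of each of `bands` vertical bands (requires width >= bands)."""
    width = len(frame[0])
    feature = []
    for k in range(bands):
        left, right = k * width // bands, (k + 1) * width // bands
        feature.append(mean_color([p for row in frame for p in row[left:right]]))
    return feature
```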

[0024] The camera work direction is detected by the following procedure. First, the divided image feature amounts of frame i and of frame i+d are extracted. Next, the distance between each pair of corresponding divided image feature amounts is calculated. Only when exactly one of the calculated distances is smaller than the threshold Tc1 preset for each feature amount, and the other distances are all larger than Tc2 (Tc2 > Tc1), is the camera work direction judged to be the direction corresponding to that divided image feature amount, and the camera work direction type is attached to frame i+d. For example, during a horizontal pan, the distance between the horizontal-pan divided image feature amounts of frame i and frame i+d becomes smaller than the distances for the other divided image feature amounts.
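The decision rule can be sketched as below. The text says the thresholds are preset for each feature amount; for brevity this sketch uses a single Tc1 and a single Tc2 for all directions, which is a simplification. The function name and the direction keys are hypothetical.

```python
from typing import Dict, Optional


def detect_camera_work(
    distances: Dict[str, float], t_c1: float, t_c2: float
) -> Optional[str]:
    """Return the camera work direction, or None if the rule does not fire.

    `distances` maps a direction name (e.g. "pan_horizontal") to the distance
    between the corresponding divided image feature amounts of frames i and i+d.
    The rule fires only when exactly one distance is below Tc1 and every other
    distance is above Tc2, with Tc2 > Tc1.
    """
    if not t_c2 > t_c1:
        raise ValueError("expected Tc2 > Tc1")
    small = [name for name, dist in distances.items() if dist < t_c1]
    if len(small) != 1:
        return None
    others_large = all(
        dist > t_c2 for name, dist in distances.items() if name != small[0]
    )
    return small[0] if others_large else None
```

For example, `detect_camera_work({"pan_horizontal": 0.1, "pan_vertical": 0.9, "zoom": 0.8}, 0.2, 0.5)` selects `"pan_horizontal"`, while a frame pair whose distances are all moderate yields `None`.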

[0025] In the moving image analysis using divided image feature amounts described above, camera work can be recognized when one distance is markedly smaller than the others, and the direction of the camera work can be identified from the feature amount that gives the smallest distance.

[0026] The frame may also be divided in a grid pattern, and the divided image feature amounts may be derived from the feature amounts extracted from the individual grid regions (grid feature amounts). FIG. 3(a) shows an example of regions divided in a grid pattern, and FIG. 3(b) shows an example of calculating the diagonal pan feature amount from the grid feature amounts.

[0027] Deriving the divided image feature amounts from the grid feature amounts allows them to be calculated even faster.
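One reason the grid approach can be faster is that each pixel is visited once to build the grid, after which every directional feature is a cheap combination of cell values. The sketch below illustrates this for horizontal and vertical bands only; the diagonal case of FIG. 3(b) is not reproduced because the figure is unavailable. It assumes equal-sized cells (so that averaging cell means equals the band mean), and all names are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]
Color = Tuple[float, ...]
Frame = Sequence[Sequence[Pixel]]


def average(colors: Sequence[Sequence[float]]) -> Color:
    """Channel-wise average of a non-empty list of 3-channel colors."""
    n = len(colors)
    return tuple(sum(c[ch] for c in colors) / n for ch in range(3))


def grid_feature(frame: Frame, rows: int, cols: int) -> List[List[Color]]:
    """Mean color of each cell of a rows x cols grid (single pass over pixels)."""
    height, width = len(frame), len(frame[0])
    grid = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        grid_row = []
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            cell = [frame[y][x] for y in range(top, bottom) for x in range(left, right)]
            grid_row.append(average(cell))
        grid.append(grid_row)
    return grid


def row_bands(grid: List[List[Color]]) -> List[Color]:
    """Horizontal band feature: average of the cells in each grid row."""
    return [average(row) for row in grid]


def column_bands(grid: List[List[Color]]) -> List[Color]:
    """Vertical band feature: average of the cells in each grid column."""
    return [average([row[c] for row in grid]) for c in range(len(grid[0]))]
```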

[0028] Next, wipe detection will be described. A wipe normally switches scenes across a line on the screen, so the image on that line changes greatly along the time axis. Therefore, when the distance for only one region is markedly large among the distances of the divided image feature amounts, the wipe direction is judged to be the direction corresponding to that divided image feature amount, and the wipe and its direction are attached to the frame.
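The text does not quantify "markedly large". The sketch below follows the per-direction wording of claim 4 and assumes, purely for illustration, that the largest distance must be at least `ratio` times the second largest; both the `ratio` parameter and the function name are hypothetical.

```python
from typing import Dict, Optional


def detect_wipe(distances: Dict[str, float], ratio: float = 3.0) -> Optional[str]:
    """Return the wipe direction when one distance clearly dominates the rest.

    `distances` maps a division direction to the distance between the
    corresponding divided image feature amounts of frames i and i+d.
    `ratio` is an assumed stand-in for "markedly larger".
    """
    if len(distances) < 2:
        return None
    ranked = sorted(distances.items(), key=lambda item: item[1], reverse=True)
    (top_name, top), (_, second) = ranked[0], ranked[1]
    if top > 0 and top >= ratio * second:
        return top_name
    return None
```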

[0029] Next, dissolve detection will be described. When the variance of the distances over all regions of the divided image feature amount, or of the grid feature amount, is smaller than a certain threshold, that is, when all regions change almost uniformly, the change is determined to be a dissolve.
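A minimal sketch of this variance test is given below, using the population variance from Python's `statistics` module. The function and parameter names are assumptions. Note that an unchanged frame pair also has near-zero variance; in the overall procedure such pairs never reach this step because they fail the displacement point test first.

```python
from statistics import pvariance
from typing import Sequence


def is_dissolve(region_distances: Sequence[float], variance_threshold: float) -> bool:
    """True when the per-region distances vary little, i.e. change is uniform.

    `region_distances` holds one distance per region between frames i and i+d.
    """
    return pvariance(region_distances) < variance_threshold
```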

[0030] In the processing described above, a section delimited by cuts, wipes, and dissolves (that is, by change points other than camera work) is treated as a scene. For simplicity, however, the moving image may instead be divided into scenes at fixed time intervals before the subsequent processing is performed.

[0031] Next, the scene representative frame extraction process will be described. Because a scene is a set of similar frames, any single frame in the scene could serve as the scene representative frame. In practice, however, a section whose beginning and end differ completely but which changes gradually can still form a single scene, so multiple scene representative frames need to be extracted. Multiple scene representative frames are therefore extracted by Nearest Neighbor clustering, one of the clustering algorithms.

[0032] Nearest Neighbor clustering will be described. It extracts multiple scene representative frames by the following procedure.

(1) Let the first frame be the first representative frame.
(2) Set i = 2 and j = 1.
(3) Compare the feature amount of the j-th representative frame with that of the i-th frame. If the distance exceeds a predetermined threshold, make that frame the (j+1)-th representative frame and add 1 to j.
(4) Add 1 to i, and return to (3) unless the last frame of the scene has been reached.
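The four steps translate almost directly into code. In the sketch below the per-frame features and the distance function are supplied by the caller; the function name and the use of 0-based indices (so the "first frame" is index 0) are choices made here, not part of the text.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")  # type of one frame's feature amount


def representative_frames(
    features: Sequence[T],
    threshold: float,
    distance: Callable[[T, T], float],
) -> List[int]:
    """Return 0-based indices of the scene representative frames.

    Each frame is compared with the most recent representative only; when the
    distance exceeds `threshold`, that frame becomes the next representative.
    """
    if not features:
        return []
    representatives = [0]                       # step (1)
    for i in range(1, len(features)):           # steps (2) and (4)
        latest = features[representatives[-1]]
        if distance(latest, features[i]) > threshold:   # step (3)
            representatives.append(i)
    return representatives
```

With scalar features `[0.0, 0.1, 0.6, 0.7, 1.5]`, an absolute-difference distance, and a threshold of 0.4, the representatives are the frames at indices 0, 2, and 4: a gradual drift produces a new representative each time it accumulates past the threshold.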

[0033] The Nearest Neighbor clustering described above makes it possible to extract multiple appropriate frames that represent a scene.

[0034] Next, the moving image feature amount extraction process using the moving image database according to the present invention will be described. This process extracts the moving image feature amounts used to classify moving images and scenes by clustering similar feature amounts, and to perform processing such as search, classification, and identification of moving images.

[0035] Scene feature amounts will be described. Scene feature amounts consist of scene representative frame image feature amounts and motion feature amounts. A scene representative frame image feature amount is an image feature amount extracted from a scene representative frame described above. General image feature amounts can be used here, such as a color histogram or the average colors extracted from regions divided in a grid pattern.

[0036] The motion feature amount is a statistic of the distances, in image feature space, between the image feature amounts of the frames constituting the scene. As one example, for image feature amounts (for example, average colors) extracted from each region of each frame divided in a grid pattern, the region-wise distance between adjacent frames is obtained, and statistics such as the mean and variance of these region-wise distances over all adjacent frame pairs in the scene are taken as the motion feature amount of the scene. As another example, using the same grid-based image feature amounts, the region-wise distance between the first frame and each frame in the scene is obtained, and statistics such as the mean and variance of these distances are taken as the motion feature amount of the scene.
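Both variants of the motion feature amount can be sketched with one function. The sketch assumes each frame has already been reduced to a list of per-region average colors, uses Euclidean distance between colors (the text does not fix a distance measure), and uses population variance; the function name and the `reference` parameter are hypothetical. It relies on `math.dist`, available from Python 3.8.

```python
from math import dist
from statistics import fmean, pvariance
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]
GridFeature = Sequence[Color]   # one average color per region, in a fixed order


def motion_feature(
    scene: Sequence[GridFeature], reference: str = "adjacent"
) -> List[Tuple[float, float]]:
    """Per-region (mean, variance) of color distances over a scene.

    reference="adjacent": distances between each pair of adjacent frames.
    reference="first":    distances between the first frame and each later frame.
    """
    if len(scene) < 2:
        raise ValueError("a scene needs at least two frames")
    if reference not in ("adjacent", "first"):
        raise ValueError("reference must be 'adjacent' or 'first'")
    result = []
    for r in range(len(scene[0])):
        if reference == "adjacent":
            ds = [dist(scene[t][r], scene[t + 1][r]) for t in range(len(scene) - 1)]
        else:
            ds = [dist(scene[0][r], scene[t][r]) for t in range(1, len(scene))]
        result.append((fmean(ds), pvariance(ds)))
    return result
```

Because the statistics are kept per region, a scene in which only one part of the picture moves yields large values for those regions and near-zero values elsewhere.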

[0037] Because the motion feature amounts described above are extracted per region, they can express motion at each position within the moving image. In addition, moving images and scenes can be classified by clustering similar scene feature amounts or moving image feature amounts. This enables highly accurate classification, search, and identification of scenes and moving images. Furthermore, by associating a specific feature amount with a moving image or scene that has some meaning recognizable to a person, moving images or scenes similar to that feature amount, for example dramas or variety shows, can be identified automatically.

[0038] Since one moving image is composed of a plurality of scenes, the set of scene feature amounts described above constitutes one moving image feature amount. In addition, a "scene length feature amount" and a "representative scene motion feature amount" can also be used as moving image feature amounts. The scene length feature amount uses statistics such as the mean and variance of the scene time lengths as the moving image feature amount. The representative scene motion feature amount is obtained by calculating the region-wise distances between the feature amounts of all adjacent scene representative frames arranged in time series, and using statistics such as the mean and variance of those distances as the moving image feature amount.
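As a small worked example of the scene length feature amount, the sketch below derives scene durations from a list of scene boundary frame indices and a frame rate, then takes their mean and population variance. The boundary-list input format, the frame rate parameter, and the function name are assumptions for illustration.

```python
from statistics import fmean, pvariance
from typing import Sequence, Tuple


def scene_length_feature(boundaries: Sequence[int], fps: float) -> Tuple[float, float]:
    """Return (mean, variance) of scene durations in seconds.

    `boundaries` lists the frame index where each scene starts, followed by
    the index one past the last frame, e.g. [0, 30, 120] describes two scenes.
    """
    if len(boundaries) < 2:
        raise ValueError("need at least one scene (two boundaries)")
    durations = [
        (end - start) / fps for start, end in zip(boundaries, boundaries[1:])
    ]
    return fmean(durations), pvariance(durations)
```

For boundaries `[0, 30, 120]` at 30 frames per second the scenes last 1 s and 3 s, giving a mean of 2.0 and a variance of 1.0.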

[0039] Likewise, the region-wise distances between the feature amounts of the first scene representative frame and each of the scene representative frames arranged in time series may be calculated, and statistics such as the mean and variance of those distances may also be used as a moving image feature amount.

[0040] Next, the moving image list display process using the moving image database according to the present invention will be described. The digest of a moving image is represented by the sequence of scene representative frames described above; if the number of images is small, multiple moving images can be displayed on one screen. If similar moving images have been classified as described above, each class yields a representative moving image of the database, and displaying the digests of the representative moving images on screen makes the contents of the database easy to grasp.

[0041]

[Effects of the Invention] According to the inventions of claims 1 to 4, the presence and direction of camera work and wipes in frames can be identified at high speed.

[0042] According to the invention of claim 5, the presence of a dissolve in frames can be identified at high speed.

[0043] According to the invention of claim 6, multiple appropriate frames that represent a scene can be extracted.

[0044] According to the invention of claim 6, 7, 8, or 11, an appropriate motion feature amount of a scene can be extracted.

[0045] According to the invention of claim 14, an appropriate digest of a moving image that is easy for the user to grasp can be generated.

[Brief description of the drawings]

FIG. 1 is a functional configuration diagram showing an embodiment of the moving image database according to the present invention.

FIG. 2 is a diagram showing examples of dividing a frame image into regions.

FIG. 3 is a diagram in which (a) shows an example of dividing a frame image into grid-like regions and (b) shows an example of calculating the diagonal pan feature amount from the grid feature amounts.

[Explanation of symbols]

1: Functional block for moving image analysis
2: Functional block for moving image feature amount extraction
3: Functional block for management of moving images, moving image attributes, and moving image feature amounts
4: Functional block for moving image classification, search, and identification
5: Functional block for moving image list display

Claims

[Claims]

1. A moving image database for analyzing feature amounts of frames to identify change points of a moving image, wherein simple cuts are removed from the moving image using the degree of difference in feature amounts between frames; each frame of the moving image from which the cuts have been removed is divided along a fixed direction, and the set of feature amounts extracted from the individual regions is taken as a divided image feature amount; divided image feature amounts corresponding to a plurality of division directions are calculated; and the divided image feature amounts corresponding to the plurality of division directions are analyzed to identify the change points of the moving image.

2. The moving image database according to claim 1, wherein the frame is divided into a grid-like area.

3. The moving image database according to claim 1 or 2, wherein, when the degree of difference of the divided image feature amount corresponding to one division direction is markedly smaller than the degrees of difference of the divided image feature amounts corresponding to the other division directions, the change point is identified as camera work, and the division direction corresponding to that divided image feature amount is identified as the direction of the camera work.

4. The moving image database according to claim 1 or 2, wherein, when the degree of difference of the divided image feature amount corresponding to one division direction is markedly larger than the degrees of difference of the divided image feature amounts corresponding to the other division directions, the change point is identified as a wipe, and the division direction corresponding to that divided image feature amount is identified as the direction of the wipe.

5. The moving image database according to claim 1 or 2, wherein the change point is identified as a dissolve when the degrees of difference of the divided image feature amounts change almost uniformly across all regions.

6. A moving image database for classifying moving images by analyzing feature amounts of the frames constituting a moving image within a specific time section, wherein the feature amounts used are those of a plurality of scene representative frames extracted from the frames constituting the moving image by Nearest Neighbor clustering.

7. A moving image database for classifying moving images by analyzing feature amounts of the frames constituting a moving image within a specific time section, wherein the feature amount used is a statistic of the degree of difference in feature amounts between adjacent frames among the frames constituting the moving image.

8. A moving image database for classifying moving images by analyzing feature amounts of the frames constituting a moving image within a specific time section, wherein the feature amount used is a statistic of the degree of difference in feature amounts between each frame constituting the moving image and the first frame.

9. The moving image database according to any one of claims 6 to 8, wherein a scene of the moving image is used as the specific time section.

10. The moving image database according to any one of claims 6 to 8, wherein a fixed time interval is used as the specific time section.

11. A moving image database for classifying moving images by analyzing feature amounts of the frames constituting a moving image, wherein the feature amount used is a statistic of the time lengths of the scenes constituting the moving image.

12. The moving image database according to claim 6, wherein a statistic of the degree of difference between the characteristic amounts between adjacent scene representative frames is used as the moving image characteristic amount.

13. The moving image database according to claim 6, wherein a statistic of a feature amount difference between one scene representative frame and another scene representative frame is used as a moving image feature amount.

14. The moving image database according to claim 6, wherein scene representative frames are arranged in chronological order to form a digest of the moving image.