JP3534592B2 - Representative image generation device

Info

Publication number: JP3534592B2
Application number: JP30954897A
Authority: JP
Inventors: 等加藤
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1997-10-24
Filing date: 1997-10-24
Publication date: 2004-06-07
Anticipated expiration: 2017-10-24
Also published as: JPH11136637A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a representative image generating apparatus for use in moving image editing apparatuses, multimedia editing apparatuses, moving image databases, moving image file management apparatuses, and similar systems equipped with a randomly accessible storage device. In particular, when automatically determining the frame to be shown in a list display for editing, the apparatus does not choose the frame mechanically and uniformly; instead, it selects and displays an image frame that accurately represents the contents of the moving image data.

[0002]

[Prior Art] Known conventional representative image generating approaches include, for example, a method that uses as the index still image the contents of the first frame of the target moving image material, or of a frame at a fixed time after the beginning (for example, the fifth frame from the beginning) (Japanese Patent Laid-Open No. 6-89549), and a method that detects video cut changes and uses the image of the detected frame (or of a frame before or after it) as the index still image (Japanese Patent Laid-Open Nos. 5-183862 and 9-200687).

[0003] However, the conventional representative image generating apparatuses described above have the problem that the selected image does not accurately show the contents of the moving image (that is, what kind of video it is).

[0004] The present invention solves the above conventional problem, and its object is to provide a representative image generating apparatus capable of presenting an image that serves as an index accurately representing the contents of a moving image.

[0005]

[Means for Solving the Problem] To solve the above problem, the present invention is characterized by comprising: moving image input means that reads out and provides a moving image captured and stored in advance; a frame memory that stores the provided moving image in frame units; cut detection means that reads out the frame-unit image data stored in the frame memory and detects cuts; cut length calculation means that calculates the cut length of each cut; cut length storage means that stores the calculated cut lengths; scene composition means that composes video scenes by grouping cuts whose pictures and the like resemble one another; scene length calculation means that calculates the length of each composed scene; scene length storage means that stores the calculated scene lengths; longest scene determination means that determines, from the stored scene lengths, the scene having the longest scene length as the longest scene; index still image determination means that determines a representative frame image from the longest scene as the index still image; and index still image output means that outputs the determined index still image.

[0006] With this configuration, a representative image generating apparatus capable of presenting an image that serves as an index accurately representing the contents of a moving image can be provided.

[0007]

[Embodiments of the Invention] The invention according to claim 1 of the present invention is a representative image generating apparatus characterized by comprising: moving image input means that reads out and provides a moving image captured and stored in advance; a frame memory that stores the provided moving image in frame units; cut detection means that reads out the frame-unit image data stored in the frame memory and detects cuts; cut length calculation means that calculates the cut length of each cut; cut length storage means that stores the calculated cut lengths; scene composition means that composes video scenes by grouping cuts whose pictures (colors, arrangement of objects) and the like resemble one another; scene length calculation means that calculates the length of each composed scene; scene length storage means that stores the calculated scene lengths; longest scene determination means that determines, from the stored scene lengths, the scene having the longest scene length as the longest scene; index still image determination means that determines a representative frame image from the longest scene as the index still image; and index still image output means that outputs the determined index still image. Its effect is that the picture of the scene on which the greatest emphasis was placed in terms of the allocation of video time, which reflects the creator's intent during shooting and editing, can be used as the index still image.

[0008]

[0009] The invention according to claim 2 of the present invention is a representative image generating apparatus characterized by comprising: moving image input means that reads out and provides a moving image captured and stored in advance; a frame memory that stores the provided moving image in frame units; fixed-interval frame extraction means that reads out the frame-unit image data stored in the frame memory one frame per fixed-interval section; scene composition means that composes scenes by grouping sections of video that resemble one another among the plurality of extracted frames; scene length calculation means that calculates the length of each composed scene; scene length storage means that stores the calculated scene lengths; longest scene determination means that determines, from the stored scene lengths, the scene having the longest scene length as the longest scene; index still image determination means that determines a representative frame image from the longest scene as the index still image; and index still image output means that outputs the determined index still image. Its effect is that the picture of the scene on which the greatest emphasis was placed in terms of the allocation of video time, which reflects the creator's intent during shooting and editing, can be used as the index still image.

[0010]

[0011]

[0012]

[0013]

[0014]

[0015] The invention according to claim 3 of the present invention is a representative image generating apparatus characterized by comprising: moving image input means that reads out and provides a moving image captured and stored in advance; a frame memory that stores the provided moving image in frame units; cut detection means that reads out the frame-unit image data stored in the frame memory and detects cuts; cut length calculation means that calculates the cut length of each cut; cut length storage means that stores the calculated cut lengths; longest scene determination means that, based on the stored cut lengths, treats each short-cut section in which cuts of short cut length occur consecutively as one scene and determines the longest short-cut section among those scenes as the longest scene; index still image determination means that determines a representative frame image from the longest scene as the index still image; and index still image output means that outputs the determined index still image. Its effect is that the picture of the scene on which the greatest emphasis was placed in terms of the number of cuts during shooting and editing can be used as the index still image, that is, the representative image.
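The short-cut section selection in claim 3 can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how short a cut must be to count as "short", how many consecutive short cuts form a section, or whether "longest" is measured in frames or in number of cuts. The names `longest_short_cut_section`, `short_threshold`, and `min_cuts` are hypothetical, and the sketch measures a section by its total frame count.

```python
from typing import List, Optional, Tuple


def longest_short_cut_section(
    cut_lengths: List[int],
    short_threshold: int,
    min_cuts: int = 2,
) -> Optional[Tuple[int, int]]:
    """Return (first_cut, last_cut) of the longest run of consecutive short cuts.

    Assumptions (not from the source): a cut is "short" when its length in
    frames is <= short_threshold, a section needs at least min_cuts cuts,
    and section length is the total number of frames in the run.
    """
    best: Optional[Tuple[int, int]] = None
    best_frames = 0
    i, n = 0, len(cut_lengths)
    while i < n:
        if cut_lengths[i] > short_threshold:
            i += 1
            continue
        # Extend the run while the cuts stay short.
        j = i
        while j < n and cut_lengths[j] <= short_threshold:
            j += 1
        run_frames = sum(cut_lengths[i:j])
        if j - i >= min_cuts and run_frames > best_frames:
            best, best_frames = (i, j - 1), run_frames
        i = j
    return best


# Cut lengths in frames; cuts 1-2 and cuts 4-7 are runs of short cuts.
section = longest_short_cut_section([120, 10, 12, 300, 8, 9, 11, 10, 200], 15)
```

The representative frame would then be taken from the cuts in the returned range; how that frame is chosen within the section is left open here, as it is in the claim.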

[0016] The invention according to claim 4 of the present invention is a representative image generating apparatus characterized by comprising: moving image input means that reads out and provides a moving image captured and stored in advance; a frame memory that stores the provided moving image in frame units; fixed-interval frame extraction means that reads out the frame-unit image data stored in the frame memory one frame per fixed-interval section; scene composition means that composes scenes by grouping sections of video that resemble one another among the plurality of extracted frames; scene length calculation means that calculates the length of each composed scene; scene length storage means that stores the calculated scene lengths; short-scene section determination means that, based on the stored scene lengths, determines the longest of the short-scene sections, in which scenes of short scene length occur consecutively, as the longest short-scene section; index still image determination means that determines a representative frame image from the longest short-scene section as the index still image; and index still image output means that outputs the determined index still image. Its effect is that the picture of the short-scene section on which the greatest emphasis was placed in terms of the number of scenes during shooting and editing can be used as the index still image, that is, the representative image.

[0017] Embodiments of the present invention are described below with reference to the drawings.

[0018] (First Embodiment) FIG. 1 shows the configuration of a representative image generating apparatus according to the first embodiment of the present invention. In FIG. 1, the representative image generating apparatus comprises: moving image input means 1 that reads out and provides a moving image captured and stored in advance; a frame memory 2 that stores the provided moving image in frame units; cut detection means 3 that reads out the frame-unit image data stored in the frame memory 2 and detects cuts (each a single coherent unit of video data); cut length calculation means 4 that calculates the cut length of each cut; cut length storage means 5 that stores the calculated cut lengths; longest cut determination means 6 that determines the cut having the longest cut length among the stored cut lengths; index still image determination means 7 that determines, from the cut having the longest cut length, a frame image suitable as an index; and index still image output means 8 that outputs the determined index still image. In this configuration, the moving image taken out from the moving image input means 1 is input to both the frame memory 2 and the index still image determination means 7.

[0019] Referring to FIG. 2, the process by which the representative image is determined is described using an example of a moving image that is taken out from the moving image input means 1 and consists of a plurality of cuts beginning at the frame image start point. The cut detection means 3 identifies the start point of each cut by focusing on cut change points. The longest cut determination means 6 identifies the cut having the longest cut length among the identified cuts. The index still image determination means 7 determines as the representative image the frame image located at a fixed proportion from the beginning of that cut, for example at a position about 30% of the cut length from its beginning (slightly less than one third of the longest cut). This choice rests on an empirical rule of video production: in a moving image such as the example above, the frame image at a fixed proportion from the beginning of the longest cut, for example about 30% from its beginning, is often a frame image suitable for representing the contents of the moving image. As for cut detection, a technique that detects cut change points from the rate of change of the correlation coefficient of feature amounts between adjacent frame images is publicly known, so that technique may be used.
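The first embodiment's selection rule can be sketched as follows. This is a minimal illustration, not the patented implementation: the cut detector here simply thresholds the correlation coefficient between adjacent frames' feature vectors, which is a simplification of the publicly known technique based on the rate of change of that coefficient. The function names, the per-frame feature vectors, and the `min_corr` threshold are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Correlation coefficient between two equal-length feature vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Flat vectors have no defined correlation; treat identical ones as similar.
        return 1.0 if list(a) == list(b) else 0.0
    return cov / math.sqrt(var_a * var_b)


def detect_cut_starts(features: List[Sequence[float]], min_corr: float = 0.5) -> List[int]:
    """Frame indices where a new cut starts (frame 0 always starts a cut)."""
    starts = [0]
    for i in range(1, len(features)):
        if pearson(features[i - 1], features[i]) < min_corr:
            starts.append(i)
    return starts


def representative_frame(cut_starts: List[int], total_frames: int, percent: int = 30) -> int:
    """Frame at `percent` of the way into the longest cut."""
    bounds = cut_starts + [total_frames]
    lengths = [bounds[i + 1] - bounds[i] for i in range(len(cut_starts))]
    longest = max(range(len(lengths)), key=lengths.__getitem__)
    # Integer arithmetic keeps the offset exact.
    return cut_starts[longest] + lengths[longest] * percent // 100


# Toy feature vectors: 3 frames of one shot, 7 of another, 2 of the first again.
features = [[1, 2, 3, 4]] * 3 + [[4, 3, 2, 1]] * 7 + [[1, 2, 3, 4]] * 2
starts = detect_cut_starts(features)
frame = representative_frame(starts, len(features))
```

When several cuts tie for the longest, this sketch picks the earliest one; the text does not say how ties should be resolved.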
A frame image at a nearby position (a little less than one-third of the longest cut) is determined as a representative image. The reason for deciding so is that in the above moving picture example, a frame image having a certain proportion from the beginning of the cut having the longest cut length in the moving picture, for example, 30% from the beginning of the cut having the longest cut length. It is based on the empirical rule in video production that frame images in the vicinity are often frame images suitable for representing the contents of the moving image. As a method of detecting a cut, a technique of detecting a change point of a cut based on a change rate of a correlation coefficient of a feature amount between adjacent frame images is known, and thus the technique may be used.

[0020] Thus, the representative image generating apparatus of the first embodiment of the present invention can use, as the index still image, the picture of the scene on which the greatest emphasis was placed in terms of the allocation of video time, which reflects the creator's intent during shooting and editing. A representative image generating apparatus capable of presenting an index image that accurately represents the contents of a moving image can therefore be provided.

[0021] (Second Embodiment) FIG. 3 shows the configuration of a representative image generating apparatus according to the second embodiment of the present invention. In FIG. 3, the representative image generating apparatus comprises: moving image input means 1 that reads out and provides a moving image captured and stored in advance; a frame memory 2 that stores the provided moving image in frame units; cut detection means 3 that reads out the frame-unit image data stored in the frame memory 2 and detects cuts (each a single coherent unit of video data); cut length calculation means 4 that calculates the cut length of each cut; cut length storage means 5 that stores the calculated cut lengths; scene composition means 9 that composes video scenes by grouping cuts whose pictures (colors, arrangement of objects) and the like resemble one another; scene length calculation means 10 that calculates the length of each composed scene; scene length storage means 11 that stores the calculated scene lengths; longest scene determination means 12 that determines, from the stored scene lengths, the scene having the longest scene length as the longest scene; index still image determination means 7 that determines a representative frame image from the longest scene as the index still image; and index still image output means 8 that outputs the determined index still image. In this configuration, the moving image taken out from the moving image input means 1 is input to the frame memory 2, the scene composition means 9, and the index still image determination means 7.

[0022] Referring to FIG. 4, the process by which the representative image is determined is described using an example of a moving image that is taken out from the moving image input means 1 and consists of a plurality of cuts beginning at the frame image start point. The cut detection means 3 identifies the start point of each cut by focusing on cut change points. The scene composition means 9 composes video scenes by grouping cuts whose pictures (colors, arrangement of objects, and so on) resemble one another among the plurality of cuts. The index still image determination means 7 determines as the representative image the frame image located at a fixed proportion from the beginning of the scene having the longest scene length among the composed scenes, for example at a position about 30% from the beginning of that scene (slightly less than one third of the longest scene). This choice rests on an empirical rule of video production: in a moving image such as the example above, the frame image at a fixed proportion from the beginning of the longest scene, for example about 30% from its beginning, is often a frame image suitable for representing the contents of the moving image. Determining which cuts resemble one another ultimately comes down to determining the similarity between frame images, and since techniques for determining the similarity between frame images are publicly known, such a technique is used.
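One way to picture the scene composition step is the sketch below. It assumes, beyond what the text states, that only consecutive cuts are merged and that each cut is summarized by a small feature vector (for example a coarse color histogram) compared with an L1 distance. The names `compose_scenes`, `index_frame_from_longest_scene`, and `max_distance` are hypothetical; any publicly known frame similarity measure could take the place of `l1_distance`.

```python
from typing import List, Sequence, Tuple

Scene = Tuple[int, int]  # (start_frame, length_in_frames)


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences; small values mean similar pictures."""
    return sum(abs(x - y) for x, y in zip(a, b))


def compose_scenes(
    cut_lengths: List[int],
    cut_features: List[Sequence[float]],
    max_distance: float,
) -> List[Scene]:
    """Merge consecutive cuts whose features are within max_distance."""
    scenes: List[Scene] = []
    start, length = 0, 0
    for i, cut_len in enumerate(cut_lengths):
        dissimilar = i > 0 and l1_distance(cut_features[i - 1], cut_features[i]) > max_distance
        if dissimilar:
            # Close the current scene and open a new one at this cut.
            scenes.append((start, length))
            start, length = start + length, 0
        length += cut_len
    if cut_lengths:
        scenes.append((start, length))
    return scenes


def index_frame_from_longest_scene(scenes: List[Scene], percent: int = 30) -> int:
    """Frame at `percent` of the way into the longest scene."""
    start, length = max(scenes, key=lambda s: s[1])
    return start + length * percent // 100


cut_lengths = [50, 60, 200, 40, 30]
cut_features = [[10, 0], [9, 1], [0, 10], [5, 5], [5, 4]]
scenes = compose_scenes(cut_lengths, cut_features, max_distance=3)
frame = index_frame_from_longest_scene(scenes)
```

In this toy input the first two cuts merge into one scene and the last two into another, leaving the 200-frame cut as the longest scene.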

[0023] Thus, the representative image generating apparatus of the second embodiment of the present invention can use, as the index still image, the picture of the scene on which the greatest emphasis was placed in terms of the allocation of video time, which reflects the creator's intent during shooting and editing. A representative image generating apparatus capable of presenting an index image that accurately represents the contents of a moving image can therefore be provided.

[0024] (Third Embodiment) FIG. 5 shows the configuration of a representative image generating apparatus according to the third embodiment of the present invention. In FIG. 5, the representative image generating apparatus comprises: moving image input means 1 that reads out and provides a moving image captured and stored in advance; a frame memory 2 that stores the provided moving image in frame units; fixed-interval frame extraction means 13 that reads out the frame-unit image data stored in the frame memory 2 one frame per fixed-interval section; scene composition means 9 that composes scenes by grouping sections of video that resemble one another among the plurality of extracted frames; scene length calculation means 10 that calculates the length of each composed scene; scene length storage means 11 that stores the calculated scene lengths; longest scene determination means 12 that determines, from the stored scene lengths, the scene having the longest scene length as the longest scene; index still image determination means 7 that determines a representative frame image from the longest scene as the index still image; and index still image output means 8 that outputs the determined index still image. In this configuration, the moving image taken out from the moving image input means 1 is input to the frame memory 2, the scene composition means 9, and the index still image determination means 7.

[0025] The process by which the representative image is determined is described for frame images taken out one per fixed-interval section, starting from the frame image start point of the moving image taken out from the moving image input means 1. The fixed-interval frame extraction means 13 takes out one frame image per fixed-interval section. The scene composition means 9 composes scenes by grouping sections whose images resemble one another among the extracted frame images. The index still image determination means 7 determines as the representative image the frame image located at a fixed proportion from the beginning of the scene having the longest scene length among the composed scenes, for example at a position about 30% from the beginning of that scene (slightly less than one third of the longest scene). This choice rests on an empirical rule of video production: in a moving image such as the example above, the frame image at a fixed proportion from the beginning of the longest scene, for example about 30% from its beginning, is often a frame image suitable for representing the contents of the moving image. As noted above, techniques for determining the similarity between frames are publicly known, so such a technique may be used.
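The fixed-interval variant can be sketched in the same spirit. Here each sampled frame stands for its whole interval section, and adjacent sections are merged when their sampled frames look alike. The sampling interval, the single-value "brightness" feature, and the names `sample_indices`, `scenes_from_samples`, and `max_distance` are illustrative assumptions rather than details from the text.

```python
from typing import List, Sequence, Tuple

Scene = Tuple[int, int]  # (start_frame, length_in_frames)


def sample_indices(total_frames: int, interval: int) -> List[int]:
    """One frame index per fixed-interval section, starting at frame 0."""
    return list(range(0, total_frames, interval))


def scenes_from_samples(
    indices: List[int],
    features: List[Sequence[float]],
    total_frames: int,
    max_distance: float,
) -> List[Scene]:
    """Merge adjacent sections whose sampled frames are similar (L1 distance)."""
    if not indices:
        return []
    scenes: List[Scene] = []
    start = indices[0]
    for k in range(1, len(indices)):
        dist = sum(abs(x - y) for x, y in zip(features[k - 1], features[k]))
        if dist > max_distance:
            # A dissimilar sample closes the current scene at this section boundary.
            scenes.append((start, indices[k] - start))
            start = indices[k]
    scenes.append((start, total_frames - start))
    return scenes


indices = sample_indices(total_frames=100, interval=10)
# Hypothetical one-value feature per sampled frame (e.g. average brightness).
features = [[1], [1], [1], [9], [9], [9], [9], [9], [2], [2]]
scenes = scenes_from_samples(indices, features, total_frames=100, max_distance=2)
start, length = max(scenes, key=lambda s: s[1])
frame = start + length * 30 // 100  # about 30% into the longest scene
```

Because only one frame per section is compared, scene boundaries are located only to the nearest section, which trades boundary accuracy for fewer comparisons than per-frame cut detection.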

[0026] Thus, the representative image generating apparatus of the third embodiment of the present invention can use, as the index still image, the picture of the scene on which the greatest emphasis was placed in terms of the allocation of video time, which reflects the creator's intent during shooting and editing. A representative image generating apparatus capable of presenting an index image that accurately represents the contents of a moving image can therefore be provided.

[0027] (Fourth Embodiment) FIG. 6 shows the configuration of a representative image generating apparatus according to the fourth embodiment of the present invention. In FIG. 6, the representative image generating apparatus comprises: moving image input means 1 that reads out and provides a moving image captured and stored in advance; a frame memory 2 that stores the provided moving image in frame units; feature parameter extraction means 14 that reads out all or some of the frames of the target moving image stored in the frame memory 2 and extracts a frame feature amount (some parameter value) from them; feature amount calculation means 15 that calculates the magnitude of the feature amount from the extracted feature parameter; maximum feature amount storage means 16 that stores the largest of the calculated feature amount magnitudes; index still image determination means 7 that determines the frame image having the largest feature amount to be the frame image to be used as the index; and index still image output means 8 that outputs the determined index still image. In this configuration, the moving image taken out from the moving image input means 1 is input to both the frame memory 2 and the index still image determination means 7.

[0028] The process by which the representative image is determined is described for the consecutive frame images taken out from the moving image input means 1. The feature parameter extraction means 14 extracts a frame feature amount (some parameter value) from all or some of the frames of the target moving image that were read out. Next, the feature amount calculation means 15 calculates the magnitude of the feature amount from the extracted feature parameter. The maximum feature amount storage means 16 then stores the largest of the calculated feature amount magnitudes. The index still image determination means 7 determines the representative image by taking the frame image exhibiting the largest feature amount to be the frame image to be used as the index.

[0029] Thus, the representative image generating apparatus of the fourth embodiment of the present invention can use as the index still image the frame image having the largest feature amount, for example the frame with the greatest audio volume, a frame with a large amount of motion, the frame with the most complex picture, or a frame with an extreme amount of red. A representative image generating apparatus capable of presenting an index image that accurately represents the contents of a moving image can therefore be provided. Although the above example determines the index image by focusing on the frame image having the largest feature amount, the index image may conversely be determined by focusing on the frame image having the smallest feature amount. In the case of compressed video such as MPEG, feature amounts such as frame motion amount and complexity can often be obtained from feature parameters contained in the compressed video data, in which case processing time can be reduced substantially.
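The extreme-feature selection of the fourth embodiment reduces to an arg-max (or arg-min) over per-frame feature values, as the sketch below shows. The `red_fraction` feature is only one hypothetical way to quantify "a frame with an extreme amount of red"; frames are assumed to be lists of `(r, g, b)` tuples, and both function names are made up for this example.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b), each 0-255


def red_fraction(pixels: Sequence[Pixel]) -> float:
    """Hypothetical feature: share of pixels whose red channel dominates."""
    red = sum(1 for r, g, b in pixels if r > g and r > b)
    return red / len(pixels)


def frame_with_extreme_feature(values: List[float], use_minimum: bool = False) -> int:
    """Index of the frame with the largest (or smallest) feature value."""
    if not values:
        raise ValueError("no frames to choose from")
    pick = min if use_minimum else max
    # Ties resolve to the earliest frame because min/max keep the first hit.
    return pick(range(len(values)), key=values.__getitem__)


frames = [
    [(10, 10, 10), (200, 0, 0)],   # half red
    [(200, 0, 0), (250, 10, 10)],  # all red
    [(0, 0, 0), (0, 0, 0)],        # no red
]
values = [red_fraction(f) for f in frames]
most_red = frame_with_extreme_feature(values)
least_red = frame_with_extreme_feature(values, use_minimum=True)
```

For compressed video, the per-frame values would come from parameters already present in the stream instead of being computed from decoded pixels; that extraction is outside this sketch.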

[0030] (Fifth Embodiment) FIG. 7 shows the configuration of a representative image generating apparatus according to the fifth embodiment of the present invention. In FIG. 7, the representative image generating apparatus comprises: moving image input means 1 that reads out and provides a moving image captured and stored in advance; a frame memory 2 that stores the provided moving image in frame units; feature parameter extraction means 14 that reads out all or some of the frames of the target moving image stored in the frame memory 2 and extracts a frame feature amount (some parameter value) from them; feature amount calculation means 15 that calculates the magnitude of the feature amount from the extracted feature parameter; feature amount judgment means 17 that judges the point at which the calculated feature amount magnitude reaches or exceeds a prescribed value; index still image determination means 7 that determines the frame image that exceeded the prescribed value to be the frame image to be used as the index; and index still image output means 8 that outputs the determined index still image. In this configuration, the moving image taken out from the moving image input means 1 is input to both the frame memory 2 and the index still image determination means 7.

[0031] The process by which the representative image is determined is described below for the consecutive frame images taken from the moving image input means 1. The feature parameter extraction means 14 scans the target moving image in order, starting from a frame (for example, the first frame), and extracts a frame feature (some parameter value, for example luminance information). Next, the feature amount calculation means 15 calculates the feature amount, for example the magnitude of the luminance information, from the extracted feature parameter. The feature amount determination means 17 then determines the point at which the calculated feature amount, for example the magnitude of the luminance information, exceeds a preset specified value. The index still image determination means 7 determines the frame image at the point where the specified value (threshold) is exceeded to be a frame image suitable as an index, and thus the representative image. That image is then output.
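
The scan described for the fifth embodiment can be illustrated with a minimal Python sketch. It assumes that each frame is a grid of 8-bit luminance values and that the feature amount is the mean luminance; the patent leaves the feature open ("some parameter value"), so the names `mean_luminance` and `first_frame_over_threshold` and the threshold value are hypothetical.

```python
from typing import Optional, Sequence

# Assumed layout: a frame is a list of rows of 8-bit luminance values.
Frame = Sequence[Sequence[int]]


def mean_luminance(frame: Frame) -> float:
    """Feature amount of one frame: average luminance over all pixels."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count


def first_frame_over_threshold(frames: Sequence[Frame],
                               threshold: float) -> Optional[int]:
    """Scan frames in order and return the index of the first frame whose
    feature amount exceeds the threshold, or None if no frame does."""
    for index, frame in enumerate(frames):
        if mean_luminance(frame) > threshold:
            return index
    return None


# Hypothetical clip: a black leader, a dim frame, then bright content.
black = [[0, 0], [0, 0]]
dim = [[10, 20], [30, 20]]
bright = [[200, 180], [220, 200]]
print(first_frame_over_threshold([black, dim, bright, bright], 64.0))  # prints 2
```

Returning the first frame over the threshold, rather than the brightest frame, mirrors the "point at which the value is exceeded" wording of the text.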

[0032] The first image of a moving image, or the first image at a scene change point partway through it, is often a black frame with no content. Because the representative image generation device of the fifth embodiment uses as the representative image the frame image at the point where a frame feature, for example luminance information, exceeds a certain threshold, it avoids generating an all-black index still image. A representative image generation device that can present an index image accurately representing the contents of the moving image can thus be provided.

[0033] (Sixth Embodiment) FIG. 8 shows the configuration of a representative image generation device according to the sixth embodiment of the present invention. In FIG. 8, the representative image generation device comprises: moving image input means 1, which reads out and provides a moving image that has been captured and stored in advance; a frame memory 2, which stores the provided moving image in frame units; object detection means 18, which detects an object while scanning the frame-unit image data stored in the frame memory 2 in order (or at intervals); detected object determination means 19, which, when an object is detected, determines whether the object was detected in a specific region of the frame image (for example, near the center); index still image determination means 7, which determines the frame image at the point when the detected object determination means 19 detected the object to be the frame image to be indexed; and index still image output means 8, which outputs the determined index still image. In this configuration, the moving image taken from the moving image input means 1 is input to both the frame memory 2 and the index still image determination means 7.

[0034] The process by which the representative image is determined is described below for the consecutive frame images taken from the moving image input means 1. The object detection means 18 scans the target moving image in order (or at intervals), starting from a frame (for example, the first frame), and detects an object based on, for example, a change in the integrated value of the brightness density data. The detected object determination means 19 determines whether the region in which the object was detected lies in a specific region of the image (for example, near the center). Based on that determination result, the index still image determination means 7 determines the frame image at that point to be a frame image suitable as an index, and thus the representative image. The index still image output means 8 outputs the determined frame image.
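
One way to picture the sixth embodiment is the Python sketch below. It is an illustration under stated assumptions, not the patented detector: the frame is split into a 3 x 3 grid, an "object" is taken to be any cell whose integrated luminance changes by at least `min_change` between consecutive frames, and the "specific region" is the center cell. All function names and parameters are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed layout: a frame is a list of rows of luminance values.
Frame = Sequence[Sequence[int]]


def block_sums(frame: Frame, grid: int = 3) -> List[List[int]]:
    """Integrated luminance of each cell of a grid x grid partition."""
    height, width = len(frame), len(frame[0])
    sums = [[0] * grid for _ in range(grid)]
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            sums[y * grid // height][x * grid // width] += value
    return sums


def changed_blocks(prev: Frame, cur: Frame, min_change: int,
                   grid: int = 3) -> List[Tuple[int, int]]:
    """Cells whose integrated luminance changed by at least min_change."""
    before, after = block_sums(prev, grid), block_sums(cur, grid)
    return [(r, c) for r in range(grid) for c in range(grid)
            if abs(after[r][c] - before[r][c]) >= min_change]


def first_frame_with_central_object(frames: Sequence[Frame], min_change: int,
                                    grid: int = 3) -> Optional[int]:
    """Index of the first frame in which a change is detected in the
    center cell, or None if that never happens."""
    center = (grid // 2, grid // 2)
    for index in range(1, len(frames)):
        if center in changed_blocks(frames[index - 1], frames[index],
                                    min_change, grid):
            return index
    return None


# Hypothetical 3 x 3 frames: empty, object in a corner, object in the center.
empty = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
corner = [[255, 0, 0], [0, 0, 0], [0, 0, 0]]
centered = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
print(first_frame_with_central_object([empty, corner, centered], 100))  # prints 2
```

The corner frame is skipped because its change falls outside the center cell, which is the behavior the detected object determination step is meant to provide.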

[0035] The first image of a moving image, or the first image at a scene change point partway through it, is often a frame that does not show the person or object that is the subject of the moving image. When an object is determined to be present in a specific region of the image (for example, near the center), however, an image of the person or object that is the subject has been detected. By providing detected object determination means that determines whether an image of a person or object is present in the specific region, the representative image generation device of the sixth embodiment can use as the representative image the frame image at the point when the person or object appears at the specific position in the frame. A representative image generation device that can present an index image accurately representing the contents of the moving image can thus be provided.

[0036] (Seventh Embodiment) FIG. 9 shows the configuration of a representative image generation device according to the seventh embodiment of the present invention. In FIG. 9, the representative image generation device according to the seventh embodiment comprises: moving image input means 1, which reads out and provides a moving image that has been captured and stored in advance; a frame memory 2, which stores the provided moving image in frame units; telop detection means 20, which detects superimposed captions and telops while scanning the frame-unit image data stored in the frame memory 2 in order (or at intervals); telop information storage means 21, which stores the detected captions and telops; index still image determination means 7, which determines the frame image containing the caption or telop at the point when the telop was detected to be the image to be indexed; index still image processing means 22, which processes the determined index still image based on the telop information stored in the telop information storage means 21, for example by enlarging the telop so that it remains reliably legible in the index still image; and index still image output means 8, which outputs the processed index still image. In this configuration, the moving image taken from the moving image input means 1 is input to both the frame memory 2 and the index still image determination means 7.

[0037] The process by which the representative image is determined is described below for the frame images taken from the moving image input means 1. The telop detection means 20 scans the target moving image in order (or at intervals), starting from a certain frame (the first frame), and detects superimposed captions and telops. Telop detection techniques are themselves well known and are not described in detail here. Next, the telop information storage means 21 stores the detected captions and telops. Once a telop has been stored, the index still image determination means 7 determines the frame image at the point when the telop was detected to be the image to be indexed. The determined index still image is passed to the index still image processing means 22, which processes it based on the telop information stored in the telop information storage means 21, for example by enlarging the telop so that it remains reliably legible in the index still image. The index still image output means 8 then outputs the index still image.
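
The processing step of the seventh embodiment can be sketched in Python as follows. The sketch assumes that telop detection has already produced a bounding box (the text treats detection as known art) and that enlargement is done by nearest-neighbor pixel replication; the name `enlarge_region`, the box convention, and the integer scale factor are all hypothetical, and how the enlarged telop is composited back into the index still image is left open, as it is in the text.

```python
from typing import List, Sequence, Tuple

# Assumed layout: a frame is a list of rows of pixel values.
Frame = Sequence[Sequence[int]]
# Assumed box convention: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]


def enlarge_region(frame: Frame, box: Box, factor: int) -> List[List[int]]:
    """Crop the telop region and enlarge it by an integer factor using
    nearest-neighbor replication of each pixel."""
    top, left, bottom, right = box
    enlarged: List[List[int]] = []
    for row in frame[top:bottom]:
        # Repeat each pixel horizontally, then repeat the widened row vertically.
        wide = [value for value in row[left:right] for _ in range(factor)]
        enlarged.extend(list(wide) for _ in range(factor))
    return enlarged


# Hypothetical frame whose bottom-left two pixels hold the telop.
frame = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(enlarge_region(frame, (2, 0, 3, 2), 2))  # prints [[7, 7, 8, 8], [7, 7, 8, 8]]
```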

[0038] With the representative image generation device of the seventh embodiment, a frame image into which the video editor has inserted an explanatory caption becomes the index still image. When the moving image is news footage, a news program, or educational video material, an image well suited as the representative image can therefore be generated easily. A representative image generation device that can present an index image accurately representing the contents of the moving image can thus be provided.

[0039] (Eighth Embodiment) FIG. 10 shows the configuration of a representative image generation device according to the eighth embodiment of the present invention. In FIG. 10, the representative image generation device according to the eighth embodiment comprises: moving image input means 1, which reads out and provides a moving image that has been captured and stored in advance; a frame memory 2, which stores the provided moving image in frame units; cut detection means 3, which reads out the frame-unit image data stored in the frame memory 2 and detects cuts (each a coherent unit of video data); cut length calculation means 4, which calculates the cut length of each cut; cut length storage means 5, which stores the calculated cut lengths; longest cut determination means 6, which determines the cut with the longest cut length among the stored cut lengths; camera operation extraction means 23, which extracts camera operations such as zoom, pan, and tilt in the cut with the longest cut length; camera operation information storage means 24, which stores the frames before and after the frame images to which a camera operation was applied; index still image determination means 7, which selects one frame from the stored preceding and following frame images and determines the image to be indexed; and index still image output means 8, which outputs the determined index still image. In this configuration, the moving image taken from the moving image input means 1 is input to both the frame memory 2 and the camera operation extraction means 23.

[0040] The process by which the representative image is determined is described below for the consecutive frame images taken from the moving image input means 1. The camera operation extraction means 23 extracts camera operations such as zoom, pan, and tilt in the cut with the longest cut length. The camera operation information storage means 24 stores the frame images before and after the frame images to which a camera operation was applied. Techniques for detecting camera operations such as zoom, pan, and tilt are themselves well known and are not described in detail here. Next, the index still image determination means 7 selects one frame from the stored preceding and following frames, for example the frame image immediately after a zoom-in, and determines it to be the frame image suitable as an index. The index still image output means 8 then outputs the index still image.
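
The selection in the eighth embodiment can be sketched in Python as below. The sketch assumes that cut detection and camera operation detection have already been done (both are treated as known art in the text) and that their results are available as frame ranges; the `CameraOp` record, its `kind` labels, and the function names are hypothetical. It picks the frame immediately after the first zoom-in inside the longest cut, which is the example given in the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

# Assumed convention: a cut is (start_frame, end_frame), end exclusive.
Cut = Tuple[int, int]


@dataclass(frozen=True)
class CameraOp:
    kind: str   # hypothetical labels such as "zoom_in", "pan", "tilt"
    start: int  # first frame of the operation
    end: int    # last frame of the operation (inclusive)


def longest_cut(cuts: Sequence[Cut]) -> Cut:
    """Cut with the greatest length; the earliest one wins a tie."""
    return max(cuts, key=lambda cut: cut[1] - cut[0])


def frame_after_zoom_in(cuts: Sequence[Cut],
                        ops: Sequence[CameraOp]) -> Optional[int]:
    """Frame immediately after the first zoom-in that lies inside the
    longest cut, or None if the longest cut contains no such zoom-in."""
    start, end = longest_cut(cuts)
    for op in ops:
        if op.kind == "zoom_in" and start <= op.start and op.end + 1 < end:
            return op.end + 1
    return None


# Hypothetical clip: three cuts, with one zoom-in inside the longest cut.
cuts = [(0, 30), (30, 150), (150, 200)]
ops = [CameraOp("pan", 5, 20),
       CameraOp("zoom_in", 60, 89),
       CameraOp("zoom_in", 160, 170)]
print(frame_after_zoom_in(cuts, ops))  # prints 90
```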

[0041] With the representative image generation device of the eighth embodiment, the index still image is a picture from the scene on which the most emphasis was placed during shooting and editing, both in terms of cut duration and number of cuts and in terms of the effect of camera operations. A representative image generation device that can present an index image accurately representing the contents of the moving image can thus be provided.

[0042] (Ninth Embodiment) FIG. 11 shows the configuration of a representative image generation device according to the ninth embodiment of the present invention. In FIG. 11, the representative image generation device according to the ninth embodiment comprises: moving image input means 1, which reads out and provides a moving image that has been captured and stored in advance; a frame memory 2, which stores the provided moving image in frame units; cut detection means 3, which reads out the frame-unit image data stored in the frame memory 2 and detects cuts (each a coherent unit of video data); cut length calculation means 4, which calculates the cut length of each cut; cut length storage means 5, which stores the calculated cut lengths; longest scene determination means 25, which, from the stored cut lengths, treats each short-cut section in which cuts of short cut length continue as one scene and determines the longest short-cut section among those scenes to be the longest scene, that is, the scene with the longest scene length; index still image determination means 7, which determines a representative frame image from the longest scene as the index still image; and index still image output means 8, which outputs the determined index still image. In this configuration, the moving image taken from the moving image input means 1 is input to both the frame memory 2 and the index still image determination means 7.

[0043] The process by which the representative image is determined is described below with reference to FIG. 12, using an example of several cuts that begin at the first frame image of the moving image taken from the moving image input means 1. The cut detection means 3 identifies the start of each cut by focusing on the cut change points. The cut length calculation means 4 then calculates the length of each cut. From the stored cut lengths, the longest scene determination means 25 treats each short-cut section in which cuts of short cut length continue as one scene, and determines the longest short-cut section among those scenes to be the longest scene, that is, the scene with the longest scene length. The index still image determination means 7 then determines as the representative image the frame image located at a fixed proportion from the start of the longest scene, for example the frame image located about 30% of the way in from the start of the longest scene. This choice rests on a rule of thumb: in a moving image like the example above, the longest short-cut section, that is, the longest scene, shows the scene on which the most emphasis was placed in terms of the number of cuts during shooting and editing, and creators use such sections heavily to impress the scene on viewers.
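
The ninth embodiment's selection can be sketched in Python as follows. Several details are assumptions, because the text does not fix them: a cut counts as "short" when its length is at most `short_max` frames, a short-cut section needs at least two consecutive short cuts, "longest" is measured in total frames, and the 30% position is computed with integer arithmetic. The function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# A section is (start_frame, length_in_frames).
Section = Tuple[int, int]


def short_cut_sections(cut_lengths: Sequence[int], short_max: int,
                       min_cuts: int = 2) -> List[Section]:
    """Maximal runs of at least min_cuts consecutive cuts whose length
    is at most short_max; each run is treated as one scene."""
    sections: List[Section] = []
    position = 0                  # first frame of the current cut
    run_start: Optional[int] = None
    run_cuts = 0
    for length in cut_lengths:
        if length <= short_max:
            if run_start is None:
                run_start, run_cuts = position, 0
            run_cuts += 1
        else:
            if run_start is not None and run_cuts >= min_cuts:
                sections.append((run_start, position - run_start))
            run_start = None
        position += length
    if run_start is not None and run_cuts >= min_cuts:
        sections.append((run_start, position - run_start))
    return sections


def representative_frame(cut_lengths: Sequence[int], short_max: int,
                         percent: int = 30) -> Optional[int]:
    """Frame located percent % of the way into the longest short-cut
    section, or None if there is no such section."""
    sections = short_cut_sections(cut_lengths, short_max)
    if not sections:
        return None
    start, length = max(sections, key=lambda section: section[1])
    return start + length * percent // 100


# Hypothetical cut lengths in frames: two bursts of short cuts between long cuts.
lengths = [300, 20, 30, 400, 25, 25, 30, 20, 500]
print(short_cut_sections(lengths, 60))    # prints [(300, 50), (750, 100)]
print(representative_frame(lengths, 60))  # prints 780
```

Measuring the section by number of cuts instead of total frames would be an equally plausible reading of the rule of thumb; only the `key` passed to `max` would change.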

[0044] With the representative image generation device of the ninth embodiment, a picture from the scene on which the most emphasis was placed in terms of the number of cuts during shooting and editing can be used as the index still image, that is, the representative image. When the moving image is a film or drama, a representative image generation device that can present an index image accurately representing the contents of the moving image can thus be provided.

[0045] (Tenth Embodiment) FIG. 13 shows the configuration of a representative image generation device according to the tenth embodiment of the present invention. In FIG. 13, the representative image generation device of the tenth embodiment comprises: moving image input means 1, which reads out and provides a moving image that has been captured and stored in advance; a frame memory 2, which stores the provided moving image in frame units; fixed-interval frame extraction means 13, which reads out the frame-unit image data stored in the frame memory 2 one frame per fixed-interval section; scene composition means 9, which composes scenes by grouping sections of similar video among the extracted frames; scene length calculation means 10, which calculates the length of each composed scene; scene length storage means 11, which stores the calculated scene lengths; short scene section determination means 26, which, from the stored scene lengths, determines the longest of the short scene sections, in which scenes of short scene length continue, to be the longest short scene section; index still image determination means 7, which determines a representative frame image from the longest short scene section as the index still image; and index still image output means 8, which outputs the determined index still image. In this configuration, the moving image taken from the moving image input means 1 is input to the frame memory 2, the scene composition means 9, and the index still image determination means 7.

[0046] The process by which the representative image is determined is described below for frame images taken one per fixed-interval section, starting from the first frame image of the moving image taken from the moving image input means 1. The fixed-interval frame extraction means 13 extracts one frame image per fixed-interval section. The scene composition means 9 composes scenes by grouping sections of similar images among the extracted frame images. The scene length calculation means 10 calculates the length of each scene. The short scene section determination means 26 then determines the longest of the short scene sections in which scenes of short scene length continue. The index still image determination means 7 then determines as the representative image the frame image located at a fixed proportion from the start of the longest short scene section, for example the frame image located about 30% of the way in from the start of the determined short scene section. This choice rests on a rule of thumb: in a moving image like the example above, the longest short scene section shows the short scene section on which the most emphasis was placed in terms of the number of scenes during shooting and editing, and creators use such sections heavily to impress it on viewers. Techniques for judging the similarity between frames, as used above, are well known, and any such technique may be used.
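
The tenth embodiment's pipeline can be sketched in Python as below. The sketch makes several assumptions that the text leaves open: each frame is summarized by one scalar feature (for example its mean luminance), two sampled frames are "similar" when their features differ by at most `max_diff`, a scene is "short" when it spans at most `short_max` frames, a single short scene already counts as a short scene section, and the 30% position uses integer arithmetic. All names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# A scene or section is (start_frame, length_in_frames).
Span = Tuple[int, int]


def compose_scenes(features: Sequence[float], interval: int,
                   max_diff: float) -> List[Span]:
    """Sample one frame per interval and group consecutive samples with
    similar features into scenes."""
    samples = list(range(0, len(features), interval))
    if not samples:
        return []
    scenes: List[Span] = []
    scene_start = samples[0]
    for prev, cur in zip(samples, samples[1:]):
        if abs(features[cur] - features[prev]) > max_diff:
            scenes.append((scene_start, cur - scene_start))
            scene_start = cur
    scenes.append((scene_start, len(features) - scene_start))
    return scenes


def longest_short_scene_section(scenes: Sequence[Span],
                                short_max: int) -> Optional[Span]:
    """Longest run of consecutive scenes that are each at most short_max long."""
    best: Optional[Span] = None
    run: Optional[Span] = None
    for start, length in scenes:
        if length <= short_max:
            run = (run[0], run[1] + length) if run else (start, length)
            if best is None or run[1] > best[1]:
                best = run
        else:
            run = None
    return best


def representative_frame(features: Sequence[float], interval: int,
                         max_diff: float, short_max: int,
                         percent: int = 30) -> Optional[int]:
    """Frame located percent % of the way into the longest short scene section."""
    section = longest_short_scene_section(
        compose_scenes(features, interval, max_diff), short_max)
    if section is None:
        return None
    start, length = section
    return start + length * percent // 100


# Hypothetical clip: a long scene, four 10-frame scenes, another long scene.
clip = [0.0] * 100 + [50.0] * 10 + [100.0] * 10 + [150.0] * 10 + [200.0] * 10 + [0.0] * 100
print(representative_frame(clip, interval=5, max_diff=10.0, short_max=20))  # prints 112
```

Because only one frame per interval is examined, scene boundaries are located to within `interval` frames, which is the trade-off the fixed-interval extraction accepts in exchange for less work.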

[0047] With the representative image generation device of the tenth embodiment, a picture from the short scenes on which the most emphasis was placed in terms of the number of scenes during shooting and editing can be used as the index still image, that is, the representative image. When the moving image is a film or drama, a representative image generation device that can present an index image accurately representing the contents of the moving image can thus be provided.

[0048] The embodiments above describe examples in which a still image serves as the representative image, but the representative image need not be limited to a still image; a moving image can also be used as the representative image.

[0049]

[Effects of the Invention] As described above, the present invention has the excellent effect of providing a representative image generation device that can present, as an index, an image accurately representing the contents of a moving image.

[0050] For completeness, the effects are stated below for each claim. According to the inventions of claims 1 and 2, a picture from the scene on which the most emphasis was placed in terms of the allocation of video time, which reflects the creator's intent during shooting and editing, can be used as the index still image, and a representative image generation device that can present an index image accurately representing the contents of the moving image can be provided.

[0051]

[0052]

[0053]

[0054]

[0055]

[0056] According to the invention of claim 3, a picture from the scene on which the most emphasis was placed in terms of the number of cuts during shooting and editing can be used as the index still image, that is, the representative image, and when the moving image is a film or drama, a representative image generation device that can present an index image accurately representing the contents of the moving image can be provided.

[0057] According to the invention of claim 4, a picture from the short scene section on which the most emphasis was placed in terms of the number of scenes during shooting and editing can be used as the index still image, that is, the representative image, and when the moving image is a film or drama, a representative image generation device that can present an index image accurately representing the contents of the moving image can be provided.

[Brief description of drawings]

FIG. 1 is a diagram showing the configuration of the representative image generation device according to the first embodiment of the present invention.

FIG. 2 is a diagram for explaining the operation of the representative image generation device according to the first embodiment of the present invention.

FIG. 3 is a diagram showing the configuration of the representative image generation device according to the second embodiment of the present invention.

FIG. 4 is a diagram for explaining the operation of the representative image generation device according to the second embodiment of the present invention.

FIG. 5 is a diagram showing the configuration of the representative image generation device according to the third embodiment of the present invention.

FIG. 6 is a diagram showing the configuration of the representative image generation device according to the fourth embodiment of the present invention.

FIG. 7 is a diagram showing the configuration of the representative image generation device according to the fifth embodiment of the present invention.

FIG. 8 is a diagram showing the configuration of the representative image generation device according to the sixth embodiment of the present invention.

FIG. 9 is a diagram showing the configuration of the representative image generation device according to the seventh embodiment of the present invention.

FIG. 10 is a diagram showing the configuration of the representative image generation device according to the eighth embodiment of the present invention.

FIG. 11 is a diagram showing the configuration of the representative image generation device according to the ninth embodiment of the present invention.

FIG. 12 is a diagram for explaining the operation of the representative image generation device according to the ninth embodiment of the present invention.

FIG. 13 is a diagram showing the configuration of the representative image generation device according to the tenth embodiment of the present invention.

[Explanation of symbols]

1 Moving image input means
2 Frame memory
3 Cut detection means
4 Cut length calculation means
5 Cut length storage means
6 Longest cut determination means
7 Index still image determination means
8 Index output means
9 Scene composition means
10 Scene length calculation means
11 Scene length storage means
12 Longest scene determination means
13 Fixed-interval frame extraction means
14 Feature parameter extraction means
15 Feature amount calculation means
16 Maximum feature amount storage means
17 Feature amount determination means
18 Object detection means
19 Detected object determination means
20 Telop extraction means
21 Telop information storage means
22 Index still image processing means
23 Camera operation extraction means
24 Camera operation information storage means
25 Longest scene determination means
26 Short scene section determination means

Continuation of the front page

(56) References: JP-A-8-191411 (JP, A); JP-A-8-32924 (JP, A); JP-A-9-233422 (JP, A); JP-A-7-111630 (JP, A); JP-A-6-149902 (JP, A); JP-A-6-89549 (JP, A); JP-A-6-113253 (JP, A)

(58) Fields investigated (Int.Cl.⁷, DB name): H04N 5/76 - 5/956

Claims

(57) [Claims]

1. A representative image generation device comprising: moving image input means capable of reading out and providing a moving image that has been captured and stored in advance; a frame memory for storing the provided moving image in frame units; cut detection means for reading out the frame-unit image data stored in the frame memory and detecting cuts; cut length calculation means for calculating the cut length of each cut; cut length storage means for storing the calculated cut lengths; scene composition means for composing video scenes by grouping, for example, cuts whose pictures resemble one another; scene length calculation means for calculating the length of each composed scene; scene length storage means for storing the calculated scene lengths; longest scene determination means for determining, from the stored scene lengths, the scene having the longest scene length as the longest scene; index still image determination means for determining a representative frame image from the longest scene as an index still image; and index still image output means for outputting the determined index still image.

2. A representative image generation device comprising: moving image input means capable of reading and providing a moving image that has been captured and stored in advance; a frame memory for storing the provided moving image in frame units; constant-interval section frame extraction means for reading out the frame-unit image data stored in the frame memory, one frame for each constant-interval section; scene composition means for composing a scene by collecting similar video sections from among the plurality of extracted frames; scene length calculation means for calculating the length of each composed scene; scene length storage means for storing the calculated scene lengths; longest scene determination means for determining, from the stored scene lengths, the scene having the longest scene length as the longest scene; index still image determination means for determining a representative frame image from the longest scene as an index still image; and index still image output means for outputting the determined index still image.
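For illustration only, the constant-interval variant of claim 2 can be sketched as follows: one frame is read per fixed-length section, consecutive similar samples are grouped into a scene, and the scene with the most samples is selected. The feature representation, the similarity measure, the threshold and all names are assumptions made for this sketch, not details from the specification.

```python
from typing import List, Tuple

Frame = List[float]  # assumed per-frame feature vector


def distance(a: Frame, b: Frame) -> float:
    """Sum of absolute differences between two feature vectors (assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def sample_frames(frames: List[Frame], interval: int) -> List[Tuple[int, Frame]]:
    """Read one frame (the first) from each section of `interval` frames."""
    return [(i, frames[i]) for i in range(0, len(frames), interval)]


def longest_sampled_scene(frames: List[Frame], interval: int,
                          scene_threshold: float) -> Tuple[int, int]:
    """Return (first frame number, number of samples) of the longest scene.

    `frames` must be non-empty. A scene here is a run of consecutive
    samples in which each sample is similar to the one before it.
    """
    samples = sample_frames(frames, interval)
    scenes = [[samples[0]]]
    for sample in samples[1:]:
        previous_frame = scenes[-1][-1][1]
        if distance(previous_frame, sample[1]) <= scene_threshold:
            scenes[-1].append(sample)
        else:
            scenes.append([sample])
    longest = max(scenes, key=len)  # the first scene wins a tie
    return longest[0][0], len(longest)


frames = [[0.0]] * 4 + [[50.0]] * 8
print(longest_sampled_scene(frames, interval=2, scene_threshold=5.0))
```

Because only one frame per section is compared, the scene length is known only to the resolution of the sampling interval (here, about 4 samples x 2 frames).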

3. A representative image generation device comprising: moving image input means capable of reading and providing a moving image that has been captured and stored in advance; a frame memory for storing the provided moving image in frame units; cut detection means for reading the frame-unit image data stored in the frame memory and detecting cuts; cut length calculation means for calculating the cut length of each cut; cut length storage means for storing the calculated cut lengths; longest scene determination means for treating, on the basis of the stored cut lengths, each short-cut section in which short cuts occur consecutively as one scene, and determining the longest of these short-cut sections as the longest scene having the longest scene length; index still image determination means for determining a representative frame image from the longest scene as an index still image; and index still image output means for outputting the determined index still image.
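For illustration only, the determination of the longest short-cut section in claim 3 can be sketched as follows. The limit below which a cut counts as "short", the use of total frame count as the section length, and the treatment of a single isolated short cut as a section are assumptions of this sketch; the function name is hypothetical.

```python
from typing import List, Optional, Tuple


def longest_short_cut_section(cut_lengths: List[int],
                              short_limit: int) -> Optional[Tuple[int, int, int]]:
    """Find the longest run of consecutive short cuts.

    Returns (first cut number, last cut number, total length in frames),
    or None when no cut is short. A cut is "short" when its length is at
    most `short_limit` (an assumed criterion).
    """
    best = None
    start = None
    total = 0
    for i, length in enumerate(cut_lengths):
        if length <= short_limit:
            if start is None:          # a new run of short cuts begins here
                start, total = i, 0
            total += length
            if best is None or total > best[2]:
                best = (start, i, total)
        else:
            start = None               # a long cut ends the current run
    return best


cut_lengths = [120, 10, 12, 8, 200, 9, 11, 150]
print(longest_short_cut_section(cut_lengths, short_limit=15))
```

In this toy input, cuts 1 to 3 form a 30-frame section and cuts 5 to 6 a 20-frame section, so the index still image would be taken from the former.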

4. A representative image generation device comprising: moving image input means capable of reading and providing a moving image that has been captured and stored in advance; a frame memory for storing the provided moving image in frame units; constant-interval section frame extraction means for reading out the frame-unit image data stored in the frame memory, one frame for each constant-interval section; scene composition means for composing a scene by collecting similar video sections from among the plurality of extracted frames; scene length calculation means for calculating the length of each composed scene; scene length storage means for storing the calculated scene lengths; short scene section determination means for determining, from the stored scene lengths, the longest of the short-scene sections in which short scenes occur consecutively as the longest scene section; index still image determination means for determining a representative frame image from the longest scene section as an index still image; and index still image output means for outputting the determined index still image.