JP2003348528A

JP2003348528A - Representative image selection method, representative image selection apparatus, representative image selection program and recording medium for representative image selection program

Info

Publication number: JP2003348528A
Application number: JP2002147886A
Authority: JP
Inventors: Yukinori Minamida; 幸紀南田; Yukinobu Taniguchi; 行信谷口; Haruhiko Kojima; 治彦児島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2002-05-22
Filing date: 2002-05-22
Publication date: 2003-12-05

Abstract

PROBLEM TO BE SOLVED: To select a frame in which the motion of a subject in a video is easy to recognize and which is free of blur, in order to create a representative image from the video in a video database system, a video indexing system, or a digital video editing system. SOLUTION: An image change amount in a video section is calculated by an image change amount calculating means 111, and the frame at which the image change amount reaches a local minimum is found by an image change amount minimum frame calculating means 112. Finding this frame detects an image that is likely to show a characteristic posture at a turning point of the motion when a person or an animal moves actively. Then, with that frame as a reference, a representative image candidate selecting means 113 selects candidates for the representative image of the video section. COPYRIGHT: (C)2004,JPO

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a method of selecting a frame in which the motion of a subject in a video is easy to understand and which is free of blur, in order to create a representative image from the video in a video database system, a video indexing system, a digital video editing system, or the like.

[0002]

[Prior Art] In recent years, as computers and random access storage devices such as hard disk drives have become faster and cheaper, it has become common to convert video into computer-readable digital data and to record and handle it on a random access storage device. Handling video in this way, by so-called video digitization, has various advantages over recording and handling moving images on magnetic tape or the like.

[0003] One advantage is that "video indexing", which creates an index based on the content of a video, can be applied readily. A video index is a pair consisting of a label representing the content of a section of the video and the section information of that video section. The section information is expressed by the start position and end position of the video section. It may also be expressed by the start position and the length of the section, or in some cases by the start position alone. A position in the video is expressed by the elapsed time from the start of the video, the cumulative frame number counted from the first frame of the video, or the like. The length of a video section is expressed by an elapsed time, a number of frames, or the like. By using a video index, a desired video section can be chosen without viewing the entire video and can be retrieved immediately from the random access storage device for viewing.
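
As an illustration only, the index described above (a label paired with section information) could be represented as follows. This is a minimal Python sketch; the class and field names (`VideoSection`, `IndexEntry`, `start_frame`, `end_frame`) are hypothetical and do not appear in the specification, and positions are assumed to be cumulative frame numbers rather than elapsed times.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VideoSection:
    """Section information: a start position and, optionally, an end position."""
    start_frame: int                 # cumulative frame number from the first frame of the video
    end_frame: Optional[int] = None  # None when only the start position is recorded

    @property
    def length(self) -> Optional[int]:
        """Number of frames in the section, or None if the end is unknown."""
        if self.end_frame is None:
            return None
        return self.end_frame - self.start_frame + 1


@dataclass(frozen=True)
class IndexEntry:
    """One index entry: a label for the content plus the section it describes."""
    label: str
    section: VideoSection


entry = IndexEntry(label="golf swing", section=VideoSection(300, 449))
```

Storing the start and end positions makes the length derivable, which matches the remark that a section may equally be described by its start position and length.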

[0004] There are various kinds of labels for representing the content of a video section, but for visual presentation a reduced image of some frame in the section is often used as the label. Such a reduced image is called the representative image of the video section. A video section need not be limited to one representative image; it may have several. A representative image may also be created by processing one or more frames extracted from the video. Furthermore, the representative image is not limited to a still image; for example, a moving image may be used as the label representing the section. Accordingly, a moving image used as a label is also referred to here as a representative image.

[0005] Various methods have also been proposed for indexing video automatically. For example, one method of extracting video sections automatically is to divide the video at so-called scene changes, which are points where the video changes abruptly over time. Methods of extracting a representative image of a video section include selecting the first frame of the section and selecting the frame at a fixed time after the start of the section.

[0006] Various businesses, such as video producers and broadcasters, are considering digitizing and indexing the video they own in order to manage their vast video holdings and to offer video sales services to customers.

[0007]

[Problem to Be Solved by the Invention] The representative image of a video section should ideally be the still image that best represents the content of the section. However, the conventional methods of selecting the first frame of the section, or the frame at a fixed time after its start, do not necessarily yield the still image that best represents the content of the section, and this has been a major problem.

[0008] For example, in a video section of a golf scene in which a person swings a golf club, the moment of the swing would be appropriate as the representative image. However, because the conventional methods have no mechanism for selecting the frame at the moment of the swing, they may select as the representative image a frame unrelated to the swing, such as one in which the person is merely standing or one that shows only scenery. From such a representative image it is difficult to tell that the video section contains a swing.

[0009] In addition, in video of a fast-moving subject, the subject may be blurred. A blurred image is not suitable as a representative image, but because the conventional methods have no mechanism for judging whether the subject is blurred, they may select a heavily blurred frame as the representative image.

[0010] To solve these problems in representative image selection, Japanese Patent Laid-Open No. 9-93527 discloses a method in which the most appropriate frame is selected manually while the video is played back. Although this method yields an appropriate representative image, it incurs labor costs. It is also time-consuming, because the selection is made by hand during playback.

[0011] The present invention is intended to solve the problems of the conventional methods described above. Its object is to provide a method of automatically selecting, from a video section, a frame in which the motion of the subject is easy to understand and which is free of blur as the representative image, or a method that makes such a selection easier.

[0012]

[Means for Solving the Problem] In the present invention, the movements of persons and animals that move actively were observed, and attention was paid to the fact that, at the turning points of a movement, the subject comes momentarily to rest or its motion becomes minimal. The point at which motion momentarily stops or becomes minimal is hereinafter called a "motion valley". For example, the series of movements in which a person swings a golf club can be decomposed into the action of raising the club and the action of swinging it down; the boundary between them is a turning point and therefore a motion valley. More generally, the active movement of a person or animal can in many cases be decomposed into a sequence of simple actions, and a motion valley occurs at the boundary between those simple actions. Admittedly, there may be movements that cannot be decomposed into simple actions and have no turning points, but it is difficult for a living creature to keep moving without turning points and without a rhythm of fast and slow, so such movements are thought to be few.

[0013] The present invention also focuses on the fact that the posture at a motion valley within the active movement of a person, animal, or the like expresses that movement well. For example, the posture of a person immediately after raising a golf club and immediately before swinging it down, which is a turning point of the swing, makes it vividly clear that the person is about to swing the club down. More generally, a motion valley within the active movement of a person or animal is a boundary between consecutive simple actions; it is both the end point of one action and the starting point of another, and the subject often takes a characteristic posture there. Even if the posture at a motion valley is not always characteristic, it is considered more likely to be characteristic than the posture at points other than motion valleys.

[0014] At a motion valley, the moving subject is stationary or moving slowly, so blur is small when the scene is shot with a video camera or the like. As a measure of how fast or slow the motion is, the amount of change in pixel luminance between adjacent frames can be used, and this can be computed mechanically.

[0015] Based on the above considerations, a first aspect of the present invention is characterized in that representative image candidates are selected with reference to the frame at which the amount of change in the video reaches a local minimum in time.

[0016] A second aspect of the present invention is characterized in that, as a typical example of using the above frame as a reference, the frame at which the amount of change in the video reaches a local minimum in time is itself taken as a representative image candidate.

[0017] One way of expressing motion in a still image is the so-called strobe image, in which multiple images of a moving subject at different times are captured in a single picture. Focusing on the use of such strobe images, a third aspect of the present invention is characterized in that, with reference to the frame at which the amount of change in the video reaches a local minimum in time, a plurality of frames before and after it are processed to construct a strobe image, which is taken as a representative image candidate.

[0018] The image change amount of a video arises not only from the motion of the subject but also from camera work (translation, rotation, and zooming of the camera). Therefore, if the camera work is estimated from the video, the video is transformed so as to cancel the camera work, and the image change amount is computed on the transformed video, the image change amount will better reflect the motion of the subject.

[0019] A fourth aspect of the present invention is characterized in that camera work (translation, rotation, and zooming of the camera) is estimated from the video, the video is transformed so as to cancel the apparent motion of the subject caused by the camera work, and the above aspects of the invention are applied to the transformed video.
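
The idea of cancelling camera work before measuring change can be sketched as follows. This is only a simplified illustration under stated assumptions: it handles pure integer translation (not rotation or zooming), estimates the shift by exhaustive search over a small range, and represents a frame as a list of rows of luminance values. The function names and the `max_shift` parameter are hypothetical; the specification does not prescribe an estimation method.

```python
def shifted_difference(prev, curr, dx, dy):
    """Mean absolute luminance difference when curr is assumed to be prev
    translated by (dx, dy). Only pixels that overlap after the shift are compared."""
    height, width = len(prev), len(prev[0])
    total = 0
    count = 0
    for y in range(height):
        for x in range(width):
            sx, sy = x + dx, y + dy
            if 0 <= sx < width and 0 <= sy < height:
                total += abs(prev[y][x] - curr[sy][sx])
                count += 1
    return total / count if count else float("inf")


def estimate_translation(prev, curr, max_shift=2):
    """Exhaustively search for the shift with the smallest difference.
    Ties are broken in favour of the smaller shift."""
    shifts = [(dx, dy)
              for dy in range(-max_shift, max_shift + 1)
              for dx in range(-max_shift, max_shift + 1)]
    return min(shifts,
               key=lambda s: (shifted_difference(prev, curr, s[0], s[1]),
                              abs(s[0]) + abs(s[1])))


def compensated_change(prev, curr, max_shift=2):
    """Change amount measured after cancelling the estimated translation."""
    dx, dy = estimate_translation(prev, curr, max_shift)
    return shifted_difference(prev, curr, dx, dy)
```

With a static scene and a panning camera, the uncompensated difference is non-zero even though nothing in the scene moves, whereas the compensated value drops toward zero, so the remaining change is attributable to the subject.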

[0020] A fifth aspect of the present invention is characterized in that representative image candidates are selected by the method of the first aspect, the appropriateness of each candidate as a representative image is determined by a predetermined procedure, and a representative image is selected from among the candidates based on that appropriateness.

[0021] A sixth aspect of the present invention is characterized in that, in the fifth aspect, when the appropriateness of a representative image is determined, greater appropriateness is given to a candidate with a smaller |X − θ_x|, based on the dent X given by the following expression (1).

[0022]

[Equation 3] (expression (1), defining the dent X; the formula is not reproduced in this text)

[0023] Here, f is the amount of change in the video expressed as a function of time t, θ_x is a constant, and t₁ and t₂ are the times at which f takes local maxima on either side of the local minimum of interest.

[0024] A seventh aspect of the present invention is characterized in that, in the fifth aspect, when the appropriateness of a representative image is determined, greater appropriateness is given to a candidate with a larger f(t₁)/f(t₀), and greater appropriateness is given to a candidate with a larger f(t₂)/f(t₀). Here, t₀ is the time of the local minimum of interest.
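
The ratio-based criterion of the seventh aspect can be sketched as follows. This is a hedged illustration, not the specification's procedure: f is taken to be a list of change amounts indexed by time, the neighbouring maxima t₁ and t₂ are found by walking uphill from t₀, and combining the two ratios by taking their minimum is purely an assumption of this sketch (the text only says that larger ratios earn greater appropriateness). All function names are hypothetical.

```python
def neighbouring_maxima(f, t0):
    """Walk uphill from the local minimum t0 in both directions and return
    (t1, t2). At the edge of the series the boundary index is returned,
    which may not be a true local maximum."""
    t1 = t0
    while t1 > 0 and f[t1 - 1] >= f[t1]:
        t1 -= 1
    t2 = t0
    while t2 < len(f) - 1 and f[t2 + 1] >= f[t2]:
        t2 += 1
    return t1, t2


def peak_ratios(f, t0):
    """Return (f(t1)/f(t0), f(t2)/f(t0)); infinite when f(t0) is zero."""
    t1, t2 = neighbouring_maxima(f, t0)
    if f[t0] == 0:
        return float("inf"), float("inf")
    return f[t1] / f[t0], f[t2] / f[t0]


def ratio_score(f, t0):
    """Assumed combination: the smaller of the two ratios, so a candidate
    scores well only if the valley is deep relative to both neighbours."""
    return min(peak_ratios(f, t0))
```

A deep valley flanked by high peaks on both sides, such as the top of a golf backswing between two fast movements, receives a higher score than a shallow dip.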

[0025]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below. First, a first embodiment of a process for indexing a video using the representative image selection method of the present invention is described by way of example.

[0026] FIG. 1 shows an example configuration of a video index creation apparatus for realizing this embodiment. The video index creation apparatus 1 selects a representative image for each video section of a given video and creates an index of the video from it. It comprises an information processing device 11 consisting of a CPU, memory, and the like; a display device 12; an input device 13 such as a keyboard or mouse; and a storage device 14 such as a hard disk. The video to be processed is assumed to have been converted to digital data and stored in the storage device 14 in advance.

[0027] The information processing device 11 comprises an image change amount calculating means 111 that calculates the image change amount of a video section; an image change amount minimum frame calculating means 112 that finds the frames at which the image change amount is a local minimum; a representative image candidate selecting means 113 that selects candidates for the representative image of the video section with reference to those frames; and a representative image selecting means 114 that selects a representative image from among the selected candidates.

[0028] FIG. 2 is a flowchart of the representative image selection process in this embodiment. The process takes a digitized video as input and outputs, as its index, pairs of a representative image and a video section.

[0029] First, in step S201, the information processing device 11 reads the digitized video from the storage device 14 and divides the input video into video sections. Division at so-called scene changes, where the video changes abruptly over time, is well suited to this purpose, but other methods may be used. Suppose that the division yields n video sections. To record each video section, it suffices to store its start position and end position within the video. A position in the video may be the elapsed time from the start of the video, the cumulative frame number counted from the first frame of the video, or the like.
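
A scene-change split of the kind mentioned in step S201 could look like the sketch below. It assumes that a change amount between each pair of adjacent frames is already available and that a scene change is declared wherever that amount exceeds a fixed threshold; both the thresholding rule and the names `split_into_sections` and `threshold` are assumptions of this illustration, since the specification leaves the division method open.

```python
def split_into_sections(changes, threshold):
    """Split a video into sections at abrupt changes.

    changes[k - 1] is the change amount between frame k and frame k + 1
    (frames numbered from 1). Returns a list of (start, end) pairs of
    cumulative frame numbers, both inclusive.
    """
    n_frames = len(changes) + 1
    sections = []
    start = 1
    for k, change in enumerate(changes, start=1):
        if change > threshold:          # abrupt change between frame k and k + 1
            sections.append((start, k))
            start = k + 1
    sections.append((start, n_frames))  # the last section runs to the final frame
    return sections
```

Each returned pair is exactly the start and end position that the paragraph above says is sufficient to record a section.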

[0030] Next, in step S202, the number n of video sections is assigned to a variable N, and in step S203, 1 is assigned to a variable i. In step S204, the value of i is compared with the value of N; if i ≤ N, the process proceeds to step S205, and otherwise the process ends. This conditional branch controls the loop. Within the loop, the i-th iteration processes the i-th video section.

[0031] In step S205, the image change amount f is calculated for each frame of the i-th video section, and the results are stored in an array F. The image change amount f is calculated between adjacent frames by, for example, the following expression (2).

[0032]

[Equation 4]

$$ f = \sum_{x=x_s}^{x_e} \sum_{y=y_s}^{y_e} \left| I_1(x, y) - I_2(x, y) \right| \tag{2} $$

[0033] Here, I₁ and I₂ are the images of two adjacent frames. x_s, x_e, y_s, and y_e are predetermined constants; the points (x, y) satisfying x_s ≤ x ≤ x_e and y_s ≤ y ≤ y_e define a rectangular region within the screen, and the image change amount f is obtained by summing, over the pixels in that region, the absolute values of the differences in image luminance. The image change amount is not limited to the sum of absolute luminance differences; other quantities, such as one based on color histograms, may be used. The pixels over which the image change amount is calculated may also cover the entire screen.

[0034] Taking the first frame of the video section as frame 1, the change amount between frame 1 and frame 2 is assigned to F[1], the change amount between frame 2 and frame 3 is assigned to F[2], and so on. That is, the change amount between frame k and frame k+1 is assigned to F[k]. The frame numbers here are counted from the start of the video section and do not coincide with frame numbers counted from the start of the video. In this embodiment the subscripts of array F are written as starting from 1, but they need not start from 1.
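
Step S205 can be sketched in Python as follows. This is an illustrative sketch only: a frame is assumed to be a list of rows of luminance values addressed as `frame[y][x]`, the function names are hypothetical, and the returned list is 0-based, so element k − 1 of the list corresponds to F[k] in the text.

```python
def image_change(frame_a, frame_b, xs, xe, ys, ye):
    """Sum of absolute luminance differences over the rectangle
    xs <= x <= xe, ys <= y <= ye."""
    return sum(
        abs(frame_a[y][x] - frame_b[y][x])
        for y in range(ys, ye + 1)
        for x in range(xs, xe + 1)
    )


def change_series(frames, xs, xe, ys, ye):
    """Change amounts between each pair of adjacent frames in a section.
    Element k - 1 of the result is the change between frame k and frame k + 1."""
    return [
        image_change(frames[k], frames[k + 1], xs, xe, ys, ye)
        for k in range(len(frames) - 1)
    ]
```

Passing the full width and height as the rectangle gives the whole-screen variant mentioned above; a smaller rectangle restricts the measure to a region of interest.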

[0035] Next, in step S206, the subscripts t at which F[t] is a local minimum are found from array F and stored in an array TMIN. The method of finding the local minima is described later. To reduce the influence of noise, a smoothing process may be applied to the series of image change amounts stored in F before the local minima are found.
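
For orientation, step S206 might be sketched as below. The specification describes its own method for finding local minima later; this sketch substitutes the simplest possible rule (a value strictly lower than both neighbours) and a centred moving average for the optional smoothing. Both choices, and the names `smooth`, `local_minima`, and `window`, are assumptions of this illustration. Indices are 0-based.

```python
def smooth(values, window=3):
    """Centred moving average; the window shrinks at the ends of the series."""
    half = window // 2
    result = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        result.append(sum(values[lo:hi]) / (hi - lo))
    return result


def local_minima(values):
    """Indices whose value is strictly lower than both neighbours.
    End points and flat plateaus are not reported by this simple rule."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i - 1] > values[i] < values[i + 1]
    ]
```

Typical use would be `local_minima(smooth(F))`, so that single-frame noise does not create spurious valleys.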

[0036] For the purpose of explanation, assume that the video section under consideration is a scene in which a person swings a golf club, as shown in FIG. 3. As outlined in FIG. 3, the section contains a scene in which a person swings a golf club and hits a golf ball, after which the camera zooms in on the person's face. The video section is assumed to consist of ae frames.

[0037] FIG. 4 is a graph of the image change amount of the video section. In FIG. 4, the horizontal axis represents the subscript t, the vertical axis represents the value of F, and line 401 represents the value F[t] at frame t. The horizontal axis may also be read as the frame number. FIG. 4 shows an example in which F takes local minima at t = a2 and t = a4 and local maxima at t = a1 and t = a3.

[0038] FIG. 5 shows several frames of the video section under consideration side by side. In FIG. 5, 501 is frame 1 (the first frame) of the video section, 502 is frame a1, 503 is frame a2, 504 is frame a3, 505 is frame a4, and 506 is frame ae (the last frame).

[0039] The relationship between the values of t at which F takes extrema and the motion of the subject in this example section is now explained by comparing FIG. 4 with FIG. 5. In FIG. 5, the person who is the subject of the section raises the golf club from frame 1 (501) through frame a2 (503). The speed of the upswing peaks at frame a1 (502), and as a result the image change amount also reaches a local maximum at frame a1, that is, at t = a1, as shown in FIG. 4. At frame a1 (502) the golf club is moving fast and blur is large. Frame a2 (503) is the boundary at which the upswing gives way to the downswing, and the motion is momentarily small. As a result, the image change amount reaches a local minimum at frame a2, that is, at t = a2, as shown in FIG. 4.

[0040] In FIG. 5, the person swings the golf club down and strikes the ball from frame a2 (503) through frame a3 (504). The speed of the downswing peaks at frame a3 (504), and as a result the image change amount also reaches a local maximum at frame a3, that is, at t = a3, as shown in FIG. 4. At frame a3 (504) the golf club is moving fast and blur is large. At frame a4 (505) in FIG. 5, the person begins to lower the arms after the follow-through, and the motion is momentarily small. As a result, the image change amount shown in FIG. 4 reaches a local minimum at frame a4 (505), that is, at t = a4.

[0041] In FIG. 5, the person lowers the arms from frame a4 (505) through frame ae (506). The turning points of the person's movement, namely the boundary at which the upswing gives way to the downswing and the boundary at which the follow-through gives way to lowering the arms, appear as the local minima of the image change amount shown in FIG. 4.

[0042] Next, in step S207, the number of local minima of F[t] found in step S206 is assigned to a variable M. In step S208, the value of M is compared with 0; if M > 0, the process proceeds to step S209, and otherwise it proceeds to step S214.

[0043] In step S209, L representative image candidates are selected with reference to the M frames at the local minima of F[t] found in step S206. Various selection methods are possible. For example, each of the frames at the local minima may be taken as a representative image candidate, in which case L = M (the second embodiment, described later). Alternatively, the frame a predetermined fixed time before or after each local minimum may be taken as a candidate. A plurality of frames appearing at predetermined fixed intervals before and after a local minimum may also be taken as candidates. Furthermore, the moving image from one local minimum to the next may be taken as the representative image.
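
The frame-based selection options listed for step S209 can be sketched as follows. This is a hedged illustration: offsets and intervals are expressed in frames rather than time, frame numbers are 1-based within the section, and the function names are hypothetical. Including the local minimum itself in the evenly spaced set is an assumption, since the text does not say whether it is included.

```python
def candidates_at_minima(minima):
    """Option 1: the frames at the local minima themselves (L = M)."""
    return list(minima)


def candidates_with_offset(minima, offset, n_frames):
    """Option 2: the frame a fixed number of frames before (negative offset)
    or after (positive offset) each local minimum, kept only if it lies
    inside the section."""
    return [t + offset for t in minima if 1 <= t + offset <= n_frames]


def candidates_around(minimum, step, count, n_frames):
    """Option 3: frames at a fixed interval before and after one local
    minimum, `count` on each side, clipped to the section."""
    frames = [minimum + k * step for k in range(-count, count + 1)]
    return [t for t in frames if 1 <= t <= n_frames]
```

The third option yields the kind of evenly spaced frame set that could also feed a strobe-style composite, although that processing is not shown here.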

[0044] Next, in step S210, the L selected representative image candidates are displayed on the display device 12. The operator examines the displayed frames, chooses the one that is suitable as the representative image of the video section, and specifies it with the input device 13.

[0045] In step S211, the information processing device 11 receives from the input device 13 the one frame selected by the operator and assigns its frame number to a variable S. The input may be made by pointing at a displayed frame with a mouse or the like, by entering the frame number in a command, or by attaching labels to the frames and entering a label in a command.

[0046] In step S212, a reduced image of the selected frame is created, paired with the position of the video section, and output to the storage device 14 as the index of that video section. The position of the video section may be the elapsed time from the first frame of the video or the cumulative frame number counted from the first frame of the video.

[0047] In step S213, 1 is added to the value of the variable i, and the process then returns to step S204 to process the next video section in the same way.

[0048] If M > 0 does not hold in step S208, then in step S214 the first frame number, 1, is assigned to the variable S and the process proceeds to step S212. Step S214 handles the case in which the image change amount F has no extremum. In that case no representative image candidate is obtained, so the first frame of the video section is used as the representative image. If necessary, the representative image may be obtained by another method. That is, this embodiment uses the first frame of the video section as the representative image, but the method disclosed in Japanese Patent Laid-Open No. 9-93527, for example, may be used instead. When the variable i exceeds N in step S204 and the loop is exited, the indexes of all video sections of the video have been stored in the storage device 14.
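
The control flow of steps S204 to S214 can be summarised in the sketch below. It is a simplified illustration under several assumptions: the change amounts of each section are supplied precomputed, local minima are found with a strict neighbour comparison, the operator's choice in steps S210 and S211 is stood in for by a `choose` callback, and a frame number is stored in place of the reduced image. All names are hypothetical.

```python
def select_representative(change_amounts, choose):
    """Return the 1-based frame number, within the section, of the
    representative image. change_amounts[k - 1] corresponds to F[k]."""
    f = change_amounts
    # Steps S206/S207: 1-based subscripts of strict local minima of F.
    minima = [k + 1 for k in range(1, len(f) - 1) if f[k - 1] > f[k] < f[k + 1]]
    if not minima:           # step S208: M > 0 does not hold
        return 1             # step S214: fall back to the first frame
    return choose(minima)    # steps S209 to S211: pick one candidate


def build_index(sections, choose):
    """sections is a list of (start_frame, end_frame, change_amounts).
    Returns one index entry per section (steps S204 to S213)."""
    index = []
    for start, end, change_amounts in sections:
        s = select_representative(change_amounts, choose)
        index.append({
            "representative_frame": start + s - 1,  # position within the whole video
            "start": start,
            "end": end,
        })
    return index
```

Passing `choose=lambda candidates: candidates[0]` reproduces the "smallest frame number" policy; an interactive implementation would instead display the candidates and return the operator's selection.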

[0049] FIG. 6 shows an example of the index output by this embodiment. As shown in FIG. 6, for each video section serial number, a pair consisting of a representative image and section information is stored in the storage device 14 as the index of the video.

【００５０】本実施形態では，予め映像全体が記憶装置
１４に格納されていることを前提としたが，情報処理装
置１１にビデオ入力装置を接続し，ビデオデッキやビデ
オカメラやテレビ放送などから映像を入力し，逐次的に
処理し，本発明を適用してもよい。In the present embodiment, it is assumed that the entire video is stored in the storage device 14 in advance. However, a video input device may be connected to the information processing device 11 so that video from a video deck, a video camera, a television broadcast, or the like is input and processed sequentially, and the present invention may be applied in that way.

【００５１】また，本実施形態では，画像変化量の極小
を求めるために，映像区間全体の画像変化量を配列Ｆに
記憶するという手順を説明したが，配列の全体を格納せ
ずに，一時には連続する数フレームの画像変化量だけを
記憶し，逐次的に極小か否かを判定するという手順でも
よい。In this embodiment, the image change amounts of the entire video section are stored in the array F in order to find the minima of the image change amount. Alternatively, instead of storing the whole array, only the image change amounts of a few consecutive frames may be held at any one time, and whether each point is a minimum may be determined sequentially.
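The sequential variant can be sketched as a generator that keeps only the previous change amount and the last observed direction. The function name `streaming_minima` is hypothetical, and the treatment of equal consecutive values (the direction is left unchanged, so a flat run reports its last position) is an assumption made for this sketch.

```python
from typing import Iterable, Iterator


def streaming_minima(changes: Iterable[float]) -> Iterator[int]:
    """Yield 1-based positions t at which the change amount F[t] is a local minimum.

    Only the previous value and the last observed direction are kept,
    so the full array F never has to be stored.
    """
    direction = 0      # 1: last change was a rise, -1: a fall, 0: not yet known
    previous = None    # F[t] for the position currently being examined
    t = 0              # 1-based position of `previous`
    for value in changes:
        if previous is not None:
            if value > previous:
                if direction == -1:
                    yield t            # a fall followed by a rise: minimum at t
                direction = 1
            elif value < previous:
                direction = -1
            # equal values leave the direction unchanged (flat run)
        previous = value
        t += 1
```

Because the generator consumes any iterable, it can be fed change amounts as frames arrive from a video input device.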

【００５２】また，本実施形態では，１映像区間に対し
て１枚の代表画像を選択する方法を示したが，１映像区
間に対して複数枚の代表画像を選択してもよい。In this embodiment, a method of selecting one representative image for one video section has been described. However, a plurality of representative images may be selected for one video section.

【００５３】また，本実施形態では，映像区間の候補の
中から操作者の操作により代表画像を選択する方法を示
したが，操作者の選択によらず，代表画像候補の全てを
代表画像にしてもよいし，例えば，フレーム番号が最も
若い候補を代表画像に選択するというようにしてもよ
い。あるいは，何らかの評価尺度を導入し，操作者の操
作によらず，最も評価の高い候補を代表画像に選択して
もよい。In this embodiment, the representative image is selected from the candidates for the video section by an operation of the operator. However, all of the representative image candidates may be used as representative images without relying on the operator's selection, or, for example, the candidate with the smallest frame number may be selected as the representative image. Alternatively, some evaluation measure may be introduced, and the candidate with the highest evaluation may be selected as the representative image without any operation by the operator.

【００５４】以上が本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第１の実施形態の説明であ
る。The above is an explanation of the first embodiment of the processing for indexing a video using the representative image selection method of the present invention.

【００５５】次に，本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第２の実施形態を説明す
る。図７は，本発明の第２の実施形態による代表画像選
択処理のフローチャートである。Next, a description will be given of a second embodiment of processing for indexing a video using the representative image selection method of the present invention. FIG. 7 is a flowchart of a representative image selection process according to the second embodiment of the present invention.

【００５６】前述した第１の実施形態では，図２に示す
ステップＳ２０９において，画像変化量Ｆ［ｔ］の極小
点におけるＭ枚のフレームを基準として，代表画像の候
補をＬ枚選択するのに対し，第２の実施形態において
は，図７のステップＳ３０９において，画像変化量Ｆ
［ｔ］が極小となるフレームを代表画像候補として選択
する。その他の処理（ステップＳ３０１〜Ｓ３０８，Ｓ
３１０〜ステップＳ３１４）は，第１の実施形態と同様
であるので説明は省略する。In the first embodiment described above, step S209 shown in FIG. 2 selects L representative image candidates with reference to the M frames at the minima of the image change amount F[t]. In the second embodiment, by contrast, step S309 of FIG. 7 selects the frames at which the image change amount F[t] is minimal as the representative image candidates. The other processing (steps S301 to S308 and S310 to S314) is the same as in the first embodiment, and its description is omitted.

【００５７】以上が本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第２の実施形態の説明であ
る。The above is the description of the second embodiment of the process of indexing a video using the representative image selection method of the present invention.

【００５８】次に，本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第３の実施形態を説明す
る。第３の実施形態は，第１の実施形態に，ストロボ画
像作成の処理を加えたものである。図８は，この第３の
実施形態による代表画像選択処理のフローチャートであ
る。Next, a description will be given of a third embodiment of the processing for indexing a video using the representative image selection method according to the present invention. The third embodiment is obtained by adding a strobe image creation process to the first embodiment. FIG. 8 is a flowchart of the representative image selection processing according to the third embodiment.

【００５９】第３の実施の形態では，代表画像としてス
トロボ画像を作成する。このストロボ画像変形処理は，
図２のフローチャートのステップＳ２０８〜Ｓ２１２，
Ｓ２１４の部分を，図８のフローチャートのステップＳ
４０８〜Ｓ４１３で置きかえるものである。その他の処
理は，第１の実施形態と同じであるので，該ストロボ画
像作成処理部分についてのみ，図８のフローチャートに
基づいて第３の実施形態を説明する。In the third embodiment, a strobe image is created as the representative image. This strobe image transformation processing replaces steps S208 to S212 and S214 in the flowchart of FIG. 2 with steps S408 to S413 in the flowchart of FIG. 8. The other processing is the same as in the first embodiment, so the third embodiment is described below with reference to the flowchart of FIG. 8 only for the strobe image creation processing.

【００６０】第３の実施形態においては，Ｆ［ｔ］の極
小点の個数を変数Ｍに代入した後（ステップＳ４０
７），ステップＳ４０８において変数Ｍの値と０とを比
較し，Ｍ＞０であれば，ステップＳ４０９へ進み，そう
でなければ，ステップＳ４１３へ進む。In the third embodiment, after the number of minima of F[t] is assigned to the variable M (step S407), step S408 compares the value of M with 0. If M > 0, the process proceeds to step S409; otherwise, it proceeds to step S413.

【００６１】ステップＳ４０９では，ステップＳ４０６
で求めたＦ［ｔ］の極小点のそれぞれについて，前後の
複数フレームからストロボ画像を作成する。例えば，予
め定数ｐ，ｑ，ｒを定め，ある極小点のフレームよりｐ
ｒ秒前，（ｐ−１）ｒ秒前，…，２ｒ秒前，ｒ秒前，０
秒前，ｒ秒後，２ｒ秒後，…，（ｑ−１）ｒ秒後，ｑｒ
秒後のフレーム（ｐ＋ｑ＋１）枚を抽出し，ストロボ画
像を作成する。ｐかｑのどちらかは０でもよい。Ｍ個の
極小点について，Ｍ枚のストロボ画像を作成し，これら
を代表画像候補とする。In step S409, a strobe image is created from a plurality of preceding and following frames for each of the minima of F[t] obtained in step S406. For example, constants p, q, and r are determined in advance, and the (p + q + 1) frames located pr seconds, (p−1)r seconds, …, 2r seconds, r seconds, and 0 seconds before, and r seconds, 2r seconds, …, (q−1)r seconds, and qr seconds after the frame at a given minimum are extracted to create a strobe image. Either p or q may be 0. M strobe images are created for the M minima, and these are used as the representative image candidates.
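The frame sampling for one strobe image can be sketched as below. The function name, the frame rate parameter `fps`, and the decision to drop sample points that fall outside the video section are all assumptions of this sketch; the text above only fixes the constants p, q, and r.

```python
def strobe_frame_numbers(minimum_frame, p, q, r, fps, first_frame, last_frame):
    """Frame numbers sampled every r seconds, from p*r s before to q*r s after a minimum.

    Up to (p + q + 1) frame numbers are returned; samples outside
    [first_frame, last_frame] are dropped (an assumption of this sketch).
    """
    numbers = []
    for step in range(-p, q + 1):
        frame = minimum_frame + round(step * r * fps)  # convert seconds to frames
        if first_frame <= frame <= last_frame:
            numbers.append(frame)
    return numbers
```

For instance, with p = 2, q = 1, r = 0.5 s and 30 frames per second, a minimum at frame 100 yields the four frames 70, 85, 100, and 115.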

【００６２】次に，ステップＳ４１０では，作成したＭ
枚のストロボ画像を表示装置１２に表示する。この表示
に対して，操作者は表示されたストロボ画像を吟味し，
該映像区間の代表画像としてふさわしいものを１枚選ん
で指定する。Next, in step S410, the M strobe images that have been created are displayed on the display device 12. The operator examines the displayed strobe images and designates the one that is most suitable as the representative image of the video section.

【００６３】ステップＳ４１１では，情報処理装置１１
は，入力装置１３から操作者が指定した１枚のストロボ
画像を選択する。入力の方法は，マウスなどによって表
示されたフレームを指し示す方法でもよいし，ストロボ
画像にラベルを付け，命令文によってラベルを入力する
方法でもよい。In step S411, the information processing device 11 selects the one strobe image designated by the operator through the input device 13. The input may be made by pointing at the displayed frame with a mouse or the like, or by attaching labels to the strobe images and entering a label with a command.

【００６４】ステップＳ４１２では，該選択されたスト
ロボ画像と，該映像区間の位置とを組にして，該映像区
間の索引として記憶装置１４に出力する。該映像区間の
位置は，該映像の先頭フレームからの経過時間でもよい
し，該映像の先頭フレームからの通算フレーム番号でも
よい。ステップＳ４１２の後は，ステップＳ４１４へ進
み，変数ｉの値に１を加算し，ステップＳ４０４に進
む。In step S412, the selected strobe image and the position of the video section are paired and output to the storage device 14 as an index of the video section. The position of the video section may be the elapsed time from the top frame of the video or the total frame number from the top frame of the video. After step S412, the process proceeds to step S414, where 1 is added to the value of the variable i, and the process proceeds to step S404.

【００６５】ステップＳ４０８において，Ｍ＞０でない
場合には，ステップＳ４１３へ進み，ｉ番目の映像区間
の先頭フレームと該映像区間の位置とを組にして，該映
像区間の索引として記憶装置１４に出力する。このステ
ップＳ４１３は，画像変化量Ｆに極値がなかった場合の
処理であり，本実施形態では，映像区間の先頭フレーム
を代表画像とする方法を用いたが，別の方法で代表画像
を求めてもよい。例えば特開平９−９３５２７号公報に
示されている方法などを用いることもできる。ステップ
Ｓ４１３の後はステップＳ４１４へ進む。その他の処理
は第１の実施形態と同様であるので説明は省略する。If M > 0 does not hold in step S408, the process proceeds to step S413, where the first frame of the i-th video section and the position of the video section are paired and output to the storage device 14 as the index of the video section. Step S413 handles the case in which the image change amount F has no extremum. This embodiment uses the first frame of the video section as the representative image, but the representative image may be obtained by another method, for example the method disclosed in JP-A-9-93527. After step S413, the process proceeds to step S414. The other processing is the same as in the first embodiment, and its description is omitted.

【００６６】以上が本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第３の実施形態の説明であ
る。The above is the description of the third embodiment of the process of indexing a video using the representative image selection method of the present invention.

【００６７】次に，本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第４の実施形態を説明す
る。図９および図１０は，この第４の実施形態による代
表画像選択処理のフローチャートである。Next, a description will be given of a fourth embodiment of the processing for indexing a video using the representative image selection method according to the present invention. FIGS. 9 and 10 are flowcharts of the representative image selection process according to the fourth embodiment.

【００６８】第４の実施形態の処理は，第１の実施形態
に，カメラの動きを打ち消すように画像を変形する処理
（ステップＳ５０５〜Ｓ５１１）を加えたものである。
この画像変形処理は，図２のフローチャートのステップ
Ｓ２０５を，ステップＳ５０５〜Ｓ５１１で置きかえる
ものである。その他の部分の処理（ステップＳ５０１〜
Ｓ５０４，Ｓ５１２〜Ｓ５２２）は，図２に示す第１の
実施形態における処理と同じであるので，該画像変形処
理部分についてのみ，第４の実施形態を説明する。The processing of the fourth embodiment adds, to the first embodiment, processing that deforms the image so as to cancel the movement of the camera (steps S505 to S511). This image deformation processing replaces step S205 in the flowchart of FIG. 2 with steps S505 to S511. The processing of the other parts (steps S501 to S504 and S512 to S522) is the same as in the first embodiment shown in FIG. 2, so the fourth embodiment is described only for the image deformation processing.

【００６９】第４の実施形態では，ステップＳ５０４に
おいて変数ｉとＮを比較して，ｉ≦Ｎである場合，ステ
ップＳ５０５において，ｉ番目の映像区間の全フレーム
を入力する。先頭フレームから順にＩ₁，Ｉ₂，…，Ｉ
_aeと呼ぶことにする。次に，ステップＳ５０６では，変
数ｊに１を代入する。In the fourth embodiment, the variables i and N are compared in step S504, and if i ≦ N, all frames of the i-th video section are input in step S505. The frames are called I_1, I_2, …, I_ae in order from the first frame. Next, in step S506, 1 is assigned to the variable j.

【００７０】ステップＳ５０７では，ｊ番目のフレーム
Ｉ_jと，ｊ＋１番目のフレームＩ_j+ ₁から，カメラ移動
のパラメータを抽出し，該２フレーム間でどのようにカ
メラが動いたかを推定する。この推定には，例えば特開
平１１−２２５３１０号公報に開示されている方法を用
いる。特開平１１−２２５３１０号公報に開示されてい
るカメラ移動の推定方法は，カメラのパン，チルト，ズ
ームによって，被写体の点が画像上で式（３）のように
見かけ上動くと仮定している。In step S507, camera movement parameters are extracted from the j-th frame I_j and the (j+1)-th frame I_{j+1} to estimate how the camera moved between the two frames. For this estimation, the method disclosed in, for example, JP-A-11-225310 is used. The camera movement estimation method disclosed in JP-A-11-225310 assumes that, owing to panning, tilting, and zooming of the camera, a point on the subject appears to move on the image as in equation (3).

【００７１】（ｘ′，ｙ′）＝（ａｘ″＋ｂ，ａｙ″＋ｃ） …（３）式（３）は，ある被写体の点が，あるフレームＡでは画
像上の座標（ｘ′，ｙ′）に投影されており，別のフレ
ームＢでは座標（ｘ″，ｙ″）に投影されているときの
関係式を表している。この見かけ上の点の移動が，被写
体は不動で，カメラの移動によって起きたものとする
と，未知数ａ，ｂ，ｃはフレームＡからフレームＢまで
の間のカメラの動きを記述しており，カメラパラメータ
と呼ばれる。特開平１１−２２５３１０号公報によれ
ば，平均二乗誤差を最小化することにより，カメラパラ
メータａ，ｂ，ｃの値を決定できる。上記ステップＳ５
０７では，フレームＩ_jからフレームＩ_j+1までの間の
カメラ移動のカメラパラメータを求め，ａ_j，ｂ_j，ｃ
_jとする。
\[ (x', y') = (a x'' + b,\ a y'' + c) \tag{3} \]
Equation (3) expresses the relationship that holds when a point on a subject is projected at coordinates (x′, y′) on the image in one frame A and at coordinates (x″, y″) in another frame B. If this apparent movement of the point is taken to be caused by the movement of the camera while the subject remains stationary, the unknowns a, b, and c describe the movement of the camera from frame A to frame B and are called camera parameters. According to JP-A-11-225310, the values of the camera parameters a, b, and c can be determined by minimizing the mean square error. In step S507, the camera parameters of the camera movement from frame I_j to frame I_{j+1} are obtained and denoted a_j, b_j, and c_j.
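As an illustration of how a, b, and c in equation (3) could be fitted by least squares, the sketch below assumes that matched point pairs between frames A and B are already available. That input, and the function name `estimate_camera_parameters`, are assumptions of this sketch; it is not the procedure of JP-A-11-225310, whose details are not given here.

```python
def estimate_camera_parameters(points_a, points_b):
    """Least-squares fit of (x', y') = (a*x'' + b, a*y'' + c).

    points_a holds (x', y') in frame A; points_b holds the matching (x'', y'') in frame B.
    Minimizes the summed squared error over all pairs (closed-form solution).
    """
    n = len(points_a)
    if n < 2 or n != len(points_b):
        raise ValueError("need at least two matched point pairs")
    mean_xa = sum(x for x, _ in points_a) / n
    mean_ya = sum(y for _, y in points_a) / n
    mean_xb = sum(x for x, _ in points_b) / n
    mean_yb = sum(y for _, y in points_b) / n
    numerator = 0.0
    denominator = 0.0
    for (xa, ya), (xb, yb) in zip(points_a, points_b):
        dxb, dyb = xb - mean_xb, yb - mean_yb
        numerator += dxb * (xa - mean_xa) + dyb * (ya - mean_ya)
        denominator += dxb * dxb + dyb * dyb
    if denominator == 0.0:
        raise ValueError("points in frame B must not all coincide")
    a = numerator / denominator      # common scale (zoom)
    b = mean_xa - a * mean_xb        # horizontal shift (pan)
    c = mean_ya - a * mean_yb        # vertical shift (tilt)
    return a, b, c
```

With noise-free correspondences the fit recovers the parameters exactly; with noisy ones it returns the values that minimize the summed squared residual.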

【００７２】ステップＳ５０８では，求めたカメラパラ
メータを用いて，フレームＩ_jから，フレームＩ_j+1ま
でのカメラの動きを打ち消すようにフレームＩ_j+1を変
形し，Ｉ′_j+1とする。上記の方法で求めたカメラパラ
メータａ_j，ｂ_j，ｃ_jを用いれば，次の式（４）を用
いて，フレームＩ_j+1上の点（ｘ，ｙ）から，フレーム
Ｉ_j上での点（ｘ′，ｙ′）を求めることができる。In step S508, using the obtained camera parameters, frame I_{j+1} is deformed so as to cancel the movement of the camera from frame I_j to frame I_{j+1}, giving I′_{j+1}. With the camera parameters a_j, b_j, and c_j obtained by the above method, the point (x′, y′) on frame I_j that corresponds to a point (x, y) on frame I_{j+1} can be obtained from the following equation (4).

【００７３】（ｘ′，ｙ′）＝（ａ_jｘ＋ｂ_j，ａ_jｙ＋ｃ_j） …（４）次に，ステップＳ５０９では，画像Ｉ_jと画像Ｉ′_j+1
の画像変化量を求め，Ｆ［ｊ］に代入する。その後，ス
テップＳ５１０では，変数ｊの値に１を加える。[0073]
\[ (x', y') = (a_j x + b_j,\ a_j y + c_j) \tag{4} \]
Next, in step S509, the image change amount between image I_j and image I′_{j+1} is obtained and assigned to F[j]. Then, in step S510, 1 is added to the value of the variable j.
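Steps S508 and S509 can be sketched together: frame I_{j+1} is resampled through equation (4) and then compared with I_j. The change measure used here (mean absolute grey-level difference), the nearest-neighbour sampling, and the handling of pixels that fall outside the frame are assumptions of this sketch, since this passage does not define them; the function name is hypothetical and a is assumed to be non-zero.

```python
def compensated_change(frame_j, frame_j1, a, b, c):
    """Mean absolute difference between I_j and I_{j+1} warped by equation (4).

    Frames are lists of rows of grey levels. Pixels whose source position
    falls outside I_{j+1} are ignored.
    """
    height, width = len(frame_j), len(frame_j[0])
    total, count = 0.0, 0
    for y_dash in range(height):
        for x_dash in range(width):
            # Invert equation (4): (x, y) = ((x' - b) / a, (y' - c) / a)
            x = round((x_dash - b) / a)
            y = round((y_dash - c) / a)
            if 0 <= x < width and 0 <= y < height:
                total += abs(frame_j[y_dash][x_dash] - frame_j1[y][x])
                count += 1
    return total / count if count else 0.0
```

When the second frame is simply the first one shifted by a one-pixel pan, passing a = 1, b = 1, c = 0 makes the compensated change vanish, whereas the uncompensated comparison (a = 1, b = 0, c = 0) does not.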

【００７４】ステップＳ５１１では，変数ｊの値と該ｉ
番目の映像区間のフレーム数ａｅとを比較し，ｊ＞ａｅ
−１であれば，画像変形処理を終了して，図２のステッ
プＳ２０６と同様の処理を行うステップＳ５１２へ進
む。ｊ＞ａｅ−１でなければ，ステップＳ５０７へ進み
ループを構成する。このループで，該ｉ番目の映像区間
のフレームＩ₂からフレームＩ_aeまでの画像変化量を配
列Ｆに格納する。In step S511, the value of the variable j is compared with the number of frames ae of the i-th video section. If j > ae − 1, the image deformation processing ends and the process proceeds to step S512, which performs the same processing as step S206 in FIG. 2. If j > ae − 1 does not hold, the process returns to step S507, forming a loop. In this loop, the image change amounts from frame I_2 to frame I_ae of the i-th video section are stored in the array F.

【００７５】画像変形処理が終了すると，この映像区間
であたかもカメラが不動であったかのような画像変化量
Ｆを得ることができる。このことによって，カメラの動
きに影響されずに，画像変化量に被写体の動きがよく反
映されるようになる。When the image deformation processing is completed, an image change amount F can be obtained in this video section as if the camera had not moved. As a result, the movement of the subject is well reflected in the image change amount without being affected by the movement of the camera.

【００７６】以上が本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第４の実施形態の説明であ
る。The above is an explanation of the fourth embodiment of the processing for indexing a video using the representative image selection method of the present invention.

【００７７】次に，本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第５の実施形態を説明す
る。図１１および図１２は，この第５の実施形態による
代表画像選択処理のフローチャートである。Next, a description will be given of a fifth embodiment of processing for indexing a video using the representative image selection method according to the present invention. FIGS. 11 and 12 are flowcharts of a representative image selection process according to the fifth embodiment.

【００７８】第５の実施形態の処理は，第１の実施形態
に，代表画像候補から代表画像を機械的に選択する処理
を加えたものである。この代表画像選択処理は，図２の
フローチャートのステップＳ２１０からステップＳ２１
１までの処理の部分を，図１１および図１２のステップ
Ｓ６１０からステップＳ６２１までの処理に置きかえる
ものである。その他の部分の処理（ステップＳ６０１〜
Ｓ６０９，Ｓ６２２〜Ｓ６２４）は，第１の実施形態と
同じであるので，この代表画像選択処理についての部分
のみ，図１１および図１２のフローチャートに基づいて
第５の実施形態を説明する。The processing of the fifth embodiment adds, to the first embodiment, processing that mechanically selects a representative image from the representative image candidates. This representative image selection processing replaces steps S210 to S211 in the flowchart of FIG. 2 with steps S610 to S621 in FIGS. 11 and 12. The processing of the other parts (steps S601 to S609 and S622 to S624) is the same as in the first embodiment, so the fifth embodiment is described with reference to the flowcharts of FIGS. 11 and 12 only for this representative image selection processing.

【００７９】第５の実施形態では，ステップＳ６０９に
おいてＦ［ｔ］の極小値を基準として代表画像候補Ｌ枚
を選択した後，ステップＳ６１０では，配列Ｆから，Ｆ
［ｔ］が極大となる添字ｔを求め，配列ＴＭＡＸに格納
する。Ｆ［ｔ］の極大を求める方法は，Ｆ［ｔ］の符号
を変えれば，極小を求める問題に帰着できる。In the fifth embodiment, after L representative image candidates are selected in step S609 with reference to the minima of F[t], step S610 finds the indices t at which F[t] is a local maximum from the array F and stores them in the array TMAX. Finding the maxima of F[t] reduces to the problem of finding minima once the sign of F[t] is reversed.

【００８０】以下の変数ＳとＰＭＡＸは，代表画像とし
ての適切さが最大である代表画像候補を探すために用い
る。ステップＳ６１１では，変数ＳにＴＭＩＮ［１］を
代入し，ステップＳ６１２では，変数ＰＭＡＸに０を代
入する。また，ステップＳ６１３では，変数ｋに１を代
入する。The following variables S and PMAX are used to search for a representative image candidate that is most appropriate as a representative image. In step S611, TMIN [1] is substituted for the variable S, and in step S612, 0 is substituted for the variable PMAX. In step S613, 1 is substituted for the variable k.

【００８１】ステップＳ６１４では，ＴＭＡＸ［ｈ］＜
ＴＭＩＮ［ｋ］かつＴＭＩＮ［ｋ］＜ＴＭＡＸ［ｈ＋
１］なるｈが存在するかどうかを検査する。系列の最
初，最後において，ｈが存在しない場合があり得る。条
件を満たすｈが存在する場合には，代表画像としての適
切さを評価するために，ステップＳ６１５へ進む。条件
を満たすｈが存在しない場合には，代表画像としての適
切さの評価をスキップし，ステップＳ６２０へ進む。Step S614 checks whether there exists an h such that TMAX[h] < TMIN[k] and TMIN[k] < TMAX[h+1]. At the beginning and end of the sequence, such an h may not exist. If an h satisfying the condition exists, the process proceeds to step S615 to evaluate the suitability as a representative image. If no such h exists, the evaluation of suitability is skipped and the process proceeds to step S620.

【００８２】ステップＳ６１５では，ステップＳ６１４
で求めたｈの値を変数Ｈに代入する。次に，ステップＳ
６１６では，ＴＭＡＸ［Ｈ］≦ｔ≦ＴＭＡＸ［Ｈ＋１］
である領域から，ｔ＝ＴＭＩＮ［ｋ］における凹みＸ
を，式（１）によって求め，変数Ｐに１−｜Ｘ−θ_x｜
の値を格納する。Ｘは，区間ＴＭＡＸ［ｈ］≦ｔ≦ＴＭ
ＡＸ［ｈ＋１］におけるＦの凹みの度合いを表す。Ｘが
０に近いならば，該区間は静止状態に近く動き特徴が乏
しいとみなすことができるし，Ｘが０．５に近いか０．
５より大きいならば，該区間は動いてばかりで動き特徴
に乏しいとみなすことができる。中間の適当な値θ_xに
近いほど，該区間は動き特徴に富んでいるとみなせるの
で，１−｜Ｘ−θ_x｜を代表画像としての適切さとす
る。θ_xは，例えば０．２５とする。In step S615, the value of h found in step S614 is assigned to the variable H. Next, in step S616, the dent X at t = TMIN[k] is obtained by equation (1) from the region TMAX[H] ≦ t ≦ TMAX[H+1], and the value 1 − |X − θ_x| is stored in the variable P. X represents the degree to which F is dented in the interval TMAX[h] ≦ t ≦ TMAX[h+1]. If X is close to 0, the interval can be regarded as nearly stationary and poor in motion features; if X is close to 0.5 or greater than 0.5, the interval can be regarded as consisting of continual movement and likewise poor in motion features. The closer X is to a suitable intermediate value θ_x, the richer in motion features the interval can be regarded as being, so 1 − |X − θ_x| is used as the suitability as a representative image. θ_x is set to, for example, 0.25.

【００８３】ステップＳ６１７では，Ｐの値とＰＭＡＸ
の値とを比較する。Ｐが，ＰＭＡＸに記憶されている適
切さより大きい場合には，ステップＳ６１８へ進み，Ｐ
ＭＡＸの値を置きかえる。そうでない場合には，ステッ
プＳ６２０へ進む。In step S617, the value of P is compared with the value of PMAX. If P is greater than the suitability stored in PMAX, the process proceeds to step S618 so that the value of PMAX is replaced. Otherwise, the process proceeds to step S620.

【００８４】ステップＳ６１８では，ＳにＴＭＩＮ
［ｋ］の値を代入し，ステップＳ６１９では，ＰＭＡＸ
にＰの値を代入する。ステップＳ６２０では，変数ｋに
ｋ＋１の値を代入する。In step S618, the value of TMIN[k] is assigned to S, and in step S619, the value of P is assigned to PMAX. In step S620, the value k + 1 is assigned to the variable k.

【００８５】次に，ステップＳ６２１では，変数ｋの値
とＭの値とを比較し，ｋ＞ＭでなければステップＳ６１
４へ戻り，次の代表画像候補について代表画像としての
適切さの評価を行う。ｋ＞Ｍであればループを脱出し，
ステップＳ６２２へ進む。ステップＳ６２２へ進むと，
変数Ｓには，代表画像としての適切さが最も大である代
表画像候補のフレーム番号が一つ格納されている。以降
の処理は，図２のステップＳ２１２以降の処理と同様で
ある。Next, in step S621, the value of the variable k is compared with the value of M. If k > M does not hold, the process returns to step S614 and the suitability of the next representative image candidate is evaluated. If k > M, the loop is exited and the process proceeds to step S622. On reaching step S622, the variable S holds the frame number of the single representative image candidate with the greatest suitability as a representative image. The subsequent processing is the same as the processing from step S212 onward in FIG. 2.
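The selection loop of steps S611 to S621 can be sketched as follows. Equation (1), which defines the dent X, lies outside this passage, so the sketch takes it as a caller-supplied function `dent`; that parameter and the function name `select_representative` are hypothetical.

```python
def select_representative(tmin, tmax, dent, theta_x=0.25):
    """Return the frame number S with the highest suitability P = 1 - |X - theta_x|.

    tmin and tmax are ascending lists of minimum and maximum positions;
    dent(left_max, minimum, right_max) returns X and is supplied by the caller.
    """
    best_frame = tmin[0]   # S611: default to the first candidate
    best_score = 0.0       # S612
    for minimum in tmin:   # S613, S620, S621: visit every candidate
        # S614: look for neighbouring maxima that enclose this minimum
        enclosing = [(left, right) for left, right in zip(tmax, tmax[1:])
                     if left < minimum < right]
        if not enclosing:
            continue       # no such h: skip the evaluation
        left, right = enclosing[0]
        score = 1.0 - abs(dent(left, minimum, right) - theta_x)   # S616
        if score > best_score:                                    # S617
            best_frame, best_score = minimum, score               # S618, S619
    return best_frame
```

If no candidate is enclosed by two maxima, the function returns the first candidate, matching the exceptional case in which S keeps its initial value.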

【００８６】例外的な場合として，ステップＳ６１４に
おけるｈの存在の検査で，一度も条件が成立しなかった
場合には，初期値である最若番の代表画像候補のフレー
ム番号がＳに格納され，これを代表画像とするようにな
っているが，これに限らず，他の方法で代表画像を決定
してもよい。As an exceptional case, if the condition is never satisfied in the check for the existence of h in step S614, the frame number of the representative image candidate with the smallest frame number, which is the initial value, remains stored in S and is used as the representative image. The invention is not limited to this, and the representative image may be determined by another method.

【００８７】また，本実施形態では，代表画像としての
適切さが最大である代表画像候補を一つ選択するように
なっているが，適切さが大きいものを優先的に複数枚選
択するようにしてもよい。In this embodiment, the single representative image candidate with the greatest suitability as a representative image is selected, but a plurality of candidates may be selected in descending order of suitability.

【００８８】また，式（１）の分母を変更し，Ｘを次の
ように定義しても，本質的に違いはなく，本発明は有効
に適用できる。Further, even if the denominator of equation (1) is changed and X is defined as follows, there is essentially no difference, and the present invention can be applied effectively.

【００８９】[0089]

【数５】 (Equation 5)

【００９０】ここで，ｍａｘ（ａ，ｂ）は，ａ，ｂのう
ち大きい方の値をとる関数とする。Here, max (a, b) is a function that takes the larger value of a and b.

【００９１】以上が本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第５の実施形態の説明であ
る。The above is an explanation of the fifth embodiment of the processing for indexing a video using the representative image selection method of the present invention.

【００９２】次に，本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第６の実施形態を説明す
る。図１３および図１４は，この第６の実施形態による
代表画像選択処理のフローチャートである。Next, a description will be given of a sixth embodiment of the processing for indexing a video using the representative image selection method according to the present invention. FIGS. 13 and 14 are flowcharts of the representative image selection process according to the sixth embodiment.

【００９３】第６の実施形態の処理は，第５の実施形態
の代表画像としての適切さを評価する処理において，図
１２のフローチャートのステップＳ６１６に相当する処
理の部分を，図１４に示すステップＳ７１６の処理で置
き換えたもので，画像変化量の極小値とその両隣の極大
値との比率により代表画像としての適切さを評価するよ
うにしたものである。この処理以外（ステップＳ７０１
〜Ｓ７１５，Ｓ７１７〜Ｓ７２４）は，第５の実施形態
と同じであるので，当該処理についてのみ，図１３およ
び図１４のフローチャートに基づいて説明する。The processing of the sixth embodiment replaces, within the processing of the fifth embodiment that evaluates the suitability as a representative image, the part corresponding to step S616 in the flowchart of FIG. 12 with step S716 shown in FIG. 14, so that the suitability as a representative image is evaluated from the ratio between a minimum of the image change amount and the maxima on both sides of it. The processing other than this (steps S701 to S715 and S717 to S724) is the same as in the fifth embodiment, so only this processing is described with reference to the flowcharts of FIGS. 13 and 14.

【００９４】第６の実施形態では，ｈの値を変数Ｈに格
納した後（ステップＳ７１５），ステップＳ７１６にお
いて，ＴＭＡＸ［Ｈ］≦ｔ≦ＴＭＡＸ［Ｈ＋１］である
領域から，ｔ＝ＴＭＩＮ［ｋ］における代表画像として
の適切さを求め，変数Ｐに格納する。Ｐを求めるには，
例えば，Ｐ＝ｐ₁ｐ₂ ｐ₁＝Ｆ［ＴＭＡＸ［Ｈ］］／Ｆ［ＴＭＩＮ［ｋ］］ｐ₂＝Ｆ［ＴＭＡＸ［Ｈ＋１］］／Ｆ［ＴＭＩＮ
［ｋ］］とする。Ｐは線形である必要はなく，例えば，Ｐ＝０ …ｐ₁＜θ₁またはｐ₂＜θ₂のときＰ＝ｐ₁ｐ₂ …ｏｔｈｅｒｗｉｓｅのように非線型でもよい。θ₁，θ₂は，予め定めた閾
値とする。ステップＳ７１６は，ｔ＝ＴＭＩＮ［ｋ］に
おける極小値が，その両隣の極大値にくらべて十分小さ
いかどうかを評価する処理で，小さいほどＰの値は大き
くなる。次に，ステップＳ７１７へ進むが，以降の処理
は，図１２のステップＳ６１７以降の処理と同様であ
る。In the sixth embodiment, after the value of h is stored in the variable H (step S715), step S716 obtains, from the region TMAX[H] ≦ t ≦ TMAX[H+1], the suitability as a representative image at t = TMIN[k] and stores it in the variable P. P is obtained, for example, as
\[ P = p_1 p_2, \qquad p_1 = \frac{F[\mathrm{TMAX}[H]]}{F[\mathrm{TMIN}[k]]}, \qquad p_2 = \frac{F[\mathrm{TMAX}[H+1]]}{F[\mathrm{TMIN}[k]]} \]
P need not be linear and may be nonlinear, for example
\[ P = \begin{cases} 0 & \text{when } p_1 < \theta_1 \text{ or } p_2 < \theta_2 \\ p_1 p_2 & \text{otherwise} \end{cases} \]
θ_1 and θ_2 are predetermined thresholds. Step S716 evaluates whether the minimum at t = TMIN[k] is sufficiently small compared with the maxima on both sides of it; the smaller the minimum, the larger the value of P. The process then proceeds to step S717, and the subsequent processing is the same as the processing from step S617 onward in FIG. 12.
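The ratio-based suitability of step S716, including the thresholded variant, can be sketched as below. The function name and the default thresholds of 0 (which reduce the thresholded form to the plain product) are assumptions of this sketch, as is raising an error when F at the minimum is not positive, a case the text does not address.

```python
def ratio_suitability(f_left_max, f_min, f_right_max, theta1=0.0, theta2=0.0):
    """P = p1 * p2, with p1 = F[TMAX[H]] / F[TMIN[k]] and p2 = F[TMAX[H+1]] / F[TMIN[k]].

    Returns 0 when either ratio falls below its threshold.
    """
    if f_min <= 0:
        raise ValueError("F at the minimum must be positive for the ratios to be defined")
    p1 = f_left_max / f_min
    p2 = f_right_max / f_min
    if p1 < theta1 or p2 < theta2:
        return 0.0
    return p1 * p2
```

A deeper minimum between the same two maxima gives larger ratios and therefore a larger P.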

【００９５】以上が本発明の代表画像選択方法を用いて
映像の索引付けを行う処理の第６の実施形態の説明であ
る。この発明を用いることにより，操作者が目視により
判断し代表画像を選択するというステップを経なくて
も，機械的な処理によって代表画像の選択が可能とな
る。なお，前記第５の実施形態と第６の実施形態で，代
表画像としての適切さを評価するそれぞれの処理を個別
に用いる例を示したが，両者を同時に用いてもよい。そ
のためには例えば，Ｐ＝ｐ₁ｐ₂Ｘなどとして，代表画
像としての適切さに，ｐ₁，ｐ₂，Ｘを共に反映させる
ようにする。The above is the description of the sixth embodiment of the processing for indexing a video using the representative image selection method of the present invention. By using the present invention, a representative image can be selected by mechanical processing without the step in which an operator visually judges and selects the representative image. Although the fifth and sixth embodiments show examples in which their respective suitability evaluations are used individually, the two may be used together. To do so, for example, P = p_1 p_2 X may be used so that p_1, p_2, and X are all reflected in the suitability as a representative image.

【００９６】また，着目している極小点と，その隣の極
大点との距離が所定の閾値より離れている場合や，所定
の閾値より近い場合には，このような極小点が代表画像
として選ばれにくくするように，Ｐに，０などの特に低
い値を設定するようにしてもよい。When the distance between the minimum of interest and an adjacent maximum is greater than a predetermined threshold, or smaller than a predetermined threshold, P may be set to a particularly low value such as 0 so that such a minimum is less likely to be selected as the representative image.

【００９７】そのほか，Ｘやｐ₁やｐ₂に定数をかけた
り，加えたり，単調増加関数によって変換するなど，意
味を変えないような操作をほどこしてもよいし，他の評
価尺度と組み合わせて用いてもよいことはいうまでもな
い。In addition, it goes without saying that operations that do not change the meaning, such as multiplying X, p_1, or p_2 by a constant, adding a constant to them, or transforming them with a monotonically increasing function, may be applied, and that they may be used in combination with other evaluation measures.

【００９８】ここで，配列に格納された数列Ｆ［ｔ］の
極小点を求める処理の方法の一例を図１５のフローチャ
ートに基づいて説明する。この処理は図２のステップＳ
２０６の詳細にあたる。本処理の開始時には，着目して
いる映像区間ｉの画像変化量の系列が配列Ｆに格納され
ている。当該映像区間のフレーム数はａｅであるとす
る。したがって，配列Ｆの要素数はａｅ−１である。Here, an example of a method for finding the minima of the sequence F[t] stored in the array is described with reference to the flowchart of FIG. 15. This processing corresponds to the details of step S206 in FIG. 2. At the start of this processing, the sequence of image change amounts of the video section i of interest is stored in the array F. Let the number of frames in the video section be ae. The number of elements of the array F is therefore ae − 1.

【００９９】まず，ステップＳ８０１では，配列ＴＭＩ
Ｎを初期化する。ステップＳ８０２では，変数ｍに１を
代入し，ステップＳ８０３では，変数Ｄに０を代入す
る。変数Ｄは，直前のＦの挙動を格納する変数であり，
値１が増加を表し，値−１が減少を表す。値０はどちら
でもないことを表す。First, in step S801, the array TMIN is initialized. In step S802, 1 is assigned to the variable m, and in step S803, 0 is assigned to the variable D. The variable D stores the immediately preceding behavior of F: the value 1 represents an increase, the value −1 represents a decrease, and the value 0 represents neither.

【０１００】次に，ステップＳ８０４では，変数ｇに１
を代入する。変数ｇは，当該映像区間ｉのフレームを走
査するためのフレーム番号を格納する変数である。Next, in step S804, 1 is assigned to the variable g. The variable g stores the frame number used to scan the frames of the video section i.

【０１０１】ステップＳ８０５では，Ｆ［ｇ＋１］とＦ
［ｇ］を比較し，Ｆ［ｇ＋１］＞Ｆ［ｇ］であれば，す
なわち，Ｆが増加していれば，ステップＳ８０６へ進
む。そうでなければ，ステップＳ８１２へ進む。In step S805, F[g+1] is compared with F[g]. If F[g+1] > F[g], that is, if F is increasing, the process proceeds to step S806. Otherwise, the process proceeds to step S812.

【０１０２】ステップＳ８０６では，変数Ｄの値が−１
かを判断する。変数Ｄの値が−１であれば，Ｆが直前で
は減少しており，現在増加しているので，Ｆがｔ＝ｇに
おいて極小値をとると判定し，ステップＳ８０７へ進
む。そうでなければ，ステップＳ８０９へ進む。In step S806, it is determined whether the value of the variable D is −1. If it is, F was decreasing immediately before and is now increasing, so F is judged to take a minimum at t = g, and the process proceeds to step S807. Otherwise, the process proceeds to step S809.

【０１０３】ステップＳ８０７では，ＴＭＩＮ［ｍ］に
フレーム番号ｇの値を代入する。これは，ｍ番目に見つ
かった極小値を配列ＴＭＩＮに格納する処理である。ス
テップＳ８０７に到達したということは，着目している
フレームにおいて，系列Ｆが昇順になっており，その前
は，降順かどちらでもない状態であったのであるから，
降順から昇順に変化した点とみなすことができる。した
がって，ここを極小点として抽出する。続いて，ステッ
プＳ８０８では，変数ｍにｍ＋１の値を代入する。In step S807, the value of the frame number g is assigned to TMIN[m]. This stores the m-th minimum found in the array TMIN. Reaching step S807 means that the sequence F is ascending at the frame of interest and that, before it, the sequence was either descending or neither ascending nor descending, so the frame can be regarded as a point at which the sequence changes from descending to ascending. It is therefore extracted as a minimum. Subsequently, in step S808, the value m + 1 is assigned to the variable m.

【０１０４】ステップＳ８０９では，変数Ｄに１を代入
する。また，ステップＳ８１０では，変数ｇにｇ＋１の
値を代入する。In step S809, 1 is substituted for a variable D. In step S810, the value of g + 1 is substituted for the variable g.

【０１０５】次に，ステップＳ８１１では，変数ｇの値
とａｅ−２の値とを比較し，ｇ≧ａｅ−２であれば手続
きを終了する。ｇ≧ａｅ−２でなければ，ステップＳ８
０５へ戻り，次のフレームについて処理を行う。Next, in step S811, the value of the variable g is compared with the value ae − 2. If g ≧ ae − 2, the procedure ends. Otherwise, the process returns to step S805 and the next frame is processed.

【０１０６】上記ステップＳ８０５で，Ｆ［ｇ＋１］＞
Ｆ［ｇ］でないと判断した場合には，ステップＳ８１２
において，Ｆ［ｇ＋１］の値とＦ［ｇ］の値とを比較
し，Ｆ［ｇ＋１］＜Ｆ［ｇ］であれば，ステップＳ８１
３へ進み，そうでなければ，ステップＳ８１０へ進む。
ステップＳ８１３では，変数Ｄに−１を代入し，その後
にステップＳ８１０に進む。本手続きが終了すると，系
列Ｆが極小となるフレーム番号が配列ＴＭＩＮに格納さ
れており，極小点の数はｍ−１個である。If it is determined in step S805 that F[g+1] > F[g] does not hold, step S812 compares the value of F[g+1] with the value of F[g]. If F[g+1] < F[g], the process proceeds to step S813; otherwise, it proceeds to step S810. In step S813, −1 is assigned to the variable D, and the process then proceeds to step S810. When this procedure ends, the frame numbers at which the sequence F is minimal are stored in the array TMIN, and the number of minima is m − 1.

【０１０７】通例，系列が降順から昇順へ変化する点を
極小点というが，系列が降順である区間と昇順である区
間の間に値が一定である区間が存在する場合，これを極
小点とみなすか否かは実施者が任意に定義してよい。本
実施形態は，これを極小点とみなすという定義に基づき
構成されている。また，これを極小点とみなすとすれ
ば，この値が一定である区間のどの１点を極小点とみな
すかは，実施者が任意に定義してよい。本実施形態で
は，この値が一定である区間の末尾を極小点とみなすと
いう定義に基づき構成されている。A point at which a sequence changes from descending to ascending is usually called a minimum. When an interval of constant value lies between a descending interval and an ascending interval, the practitioner may freely define whether it is regarded as a minimum. This embodiment is configured on the definition that it is regarded as a minimum. If it is so regarded, the practitioner may also freely define which point of the constant-value interval is taken as the minimum. This embodiment is configured on the definition that the end of the constant-value interval is taken as the minimum.
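The minimum search of FIG. 15, with a flat run reported at its last position, can be sketched as follows. The variable names TMIN, D, and g follow the description; the function name `find_minima` is hypothetical, and the loop here simply covers every adjacent pair of F rather than reproducing the flowchart's stop test expressed in terms of ae.

```python
def find_minima(F):
    """Return TMIN, the 1-based positions g at which F takes a local minimum."""
    TMIN = []          # S801: start with no minima
    D = 0              # S803: 1 = rising, -1 = falling, 0 = neither so far
    for g in range(1, len(F)):        # g is 1-based; element F[g] is F[g - 1] in Python
        nxt, cur = F[g], F[g - 1]     # F[g+1] and F[g] in the numbering of the text
        if nxt > cur:                 # S805: F is increasing
            if D == -1:               # S806: it was decreasing just before
                TMIN.append(g)        # S807, S808: record the minimum at t = g
            D = 1                     # S809
        elif nxt < cur:               # S812: F is decreasing
            D = -1                    # S813
        # equal values leave D unchanged, so a flat run after a fall
        # is reported at its last position once F starts to rise
    return TMIN
```

The number of minima, m − 1 in the description, is `len(find_minima(F))`; a flat run that is not preceded by a fall produces no minimum because D is still 0.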

【０１０８】以上の処理は，コンピュータとソフトウェ
アプログラムとによって実現することができ，そのプロ
グラムは，コンピュータが読み取り可能な可搬媒体メモ
リ，半導体メモリ，ハードディスク等の適当な記録媒体
に格納して，そこから読み出すことによりコンピュータ
に実行させることができる。The above processing can be realized by a computer and a software program. The program can be stored in a suitable computer-readable recording medium, such as a portable medium memory, a semiconductor memory, or a hard disk, and read from it to be executed by the computer.

【０１０９】[0109]

【発明の効果】本発明を用いて映像区間から代表画像を
抽出すれば，被写体の動きをよく表す代表画像を選択で
きるようになる。また，本発明に示す方法によって，代
表画像候補の中から適切な代表画像を自動的に選択する
こともでき，この場合には，操作者の目視と判断を介さ
なくても代表画像を選択することが可能となる。If a representative image is extracted from a video section using the present invention, a representative image that well represents the movement of the subject can be selected. The method of the present invention can also select an appropriate representative image automatically from among the representative image candidates, in which case a representative image can be selected without the operator's visual inspection and judgment.

【０１１０】このようにして選択した代表画像を用いて
映像の索引を作成すれば，映像の内容の理解しやすい索
引が作成できるという効果がある。If an index of a video is created using the representative image selected in this way, there is an effect that an index in which the contents of the video can be easily understood can be created.

[Brief description of the drawings]

FIG. 1 is a diagram showing a configuration example of an apparatus for implementing the present invention.

FIG. 2 is a flowchart of the representative image selection process according to the first embodiment.

FIG. 3 is a diagram showing an example of a video section to be processed.

FIG. 4 is a diagram showing a graph of the image change amount in a video section.

FIG. 5 is a diagram showing the correspondence between the image change amount and the video section.

FIG. 6 is a diagram showing an example of a video index.

FIG. 7 is a flowchart of the representative image selection process according to the second embodiment.

FIG. 8 is a flowchart of the representative image selection process according to the third embodiment.

FIG. 9 is a flowchart of the representative image selection process according to the fourth embodiment.

FIG. 10 is a flowchart of the representative image selection process according to the fourth embodiment.

FIG. 11 is a flowchart of the representative image selection process according to the fifth embodiment.

FIG. 12 is a flowchart of the representative image selection process according to the fifth embodiment.

FIG. 13 is a flowchart of the representative image selection process according to the sixth embodiment.

FIG. 14 is a flowchart of the representative image selection process according to the sixth embodiment.

FIG. 15 is a flowchart of the process for obtaining a minimum point.

[Explanation of symbols]
1 Video index creation device
11 Information processing device
12 Display device
13 Input device
14 Storage device
111 Image change amount calculating means
112 Image change amount minimum frame calculating means
113 Representative image candidate selecting means
114 Representative image selecting means
401 Image change amount in the t-th frame (F[t])
501 to 506 1st to ae-th frames

Continuation of front page
(72) Inventor: Haruhiko Kojima, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
F-terms (reference): 5B075 ND12 NS01 UU40; 5C052 AA01 AB02 AB04 AC08 DD04 EE03; 5C053 FA14 GB05 HA29 KA01 KA24 LA06 LA11; 5L096 CA02 FA00 GA19 HA04

Claims

[Claims]

1. A representative image selection method for selecting a representative image in a video section, comprising the steps of: calculating an image change amount in the video section; obtaining a frame at which the image change amount is minimal; and selecting a candidate for a representative image of the video section based on the frame at which the image change amount is minimal.

2. The representative image selection method according to claim 1, wherein, in the step of selecting a candidate for a representative image of the video section, the frame at which the image change amount is minimal is set as the candidate for the representative image of the video section.

3. The step of selecting a candidate for a representative image of the video section includes the steps of: selecting a plurality of frames based on the frame at which the image change amount is minimal; creating a strobe image from the selected plurality of frames; and setting the created strobe image as a candidate for the representative image of the video section.

4. The representative image selection method according to claim 1, wherein the step of calculating an image change amount in the video section includes the steps of: estimating a camera movement parameter from an input video; and transforming each frame of the input video, using the camera movement parameter, so as to cancel the camera movement.

5. A step of determining, for each of the representative image candidates selected in the step of selecting a candidate for a representative image of the video section, a suitability as a representative image according to a predetermined evaluation criterion; and a step of selecting a representative image from the representative image candidates based on the suitability.

6. The representative image selection method according to claim 5, wherein, in the step of determining the suitability as a representative image, where f denotes the image change amount, t₁ and t₂ denote the frame numbers at which f is a maximum on either side of the frame of the representative image candidate, and θ_x is a predetermined constant, a greater suitability as a representative image is given the smaller the value of |X − θ_x| is for the value X.

7. The representative image selection method according to claim 5, wherein, in the step of determining the suitability as a representative image, where f denotes the image change amount, t₀ denotes the frame number of the representative image candidate, and t₁ and t₂ denote the frame numbers at which f is a maximum on either side of the frame of the representative image candidate, a greater suitability as a representative image is given the larger the value of f(t₁)/f(t₀) is, and a greater suitability as a representative image is given the larger the value of f(t₂)/f(t₀) is.

8. A representative image selection device for selecting a representative image in a video section, comprising: means for calculating an image change amount in the video section; means for obtaining a frame at which the image change amount is minimal; and means for selecting a candidate for a representative image of the video section based on the frame at which the image change amount is minimal.

9. The representative image selection device according to claim 8, wherein the means for selecting a candidate for a representative image of the video section sets the frame at which the image change amount is minimal as the candidate for the representative image of the video section.

10. The representative image selection device according to claim 8, wherein the means for selecting a candidate for a representative image of the video section comprises: means for selecting a plurality of frames based on the frame at which the image change amount is minimal; means for creating a strobe image from the selected plurality of frames; and means for setting the created strobe image as a candidate for the representative image of the video section.

11. The representative image selection device according to claim 8, wherein the means for calculating an image change amount in the video section comprises: means for estimating a camera movement parameter from an input video; and means for transforming each frame of the input video, using the camera movement parameter, so as to cancel the camera movement.

12. The representative image selection device according to claim 8, further comprising: means for determining, for each of the representative image candidates selected by the means for selecting a candidate for a representative image of the video section, a suitability as a representative image according to a predetermined evaluation criterion; and means for selecting a representative image from the representative image candidates based on the suitability.

13. The representative image selection device according to claim 12, wherein, where f denotes the image change amount, t₁ and t₂ denote the frame numbers at which f is a maximum on either side of the frame of the representative image candidate, and θ_x is a constant, the means for determining the suitability as a representative image gives a greater suitability as a representative image the smaller the value of |X − θ_x| is for the value X.

14. The representative image selection device according to claim 12, wherein, where f denotes the image change amount, t₀ denotes the frame number of the representative image candidate, and t₁ and t₂ denote the frame numbers at which f is a maximum on either side of the frame of the representative image candidate, the means for determining the suitability as a representative image gives a greater suitability as a representative image the larger the value of f(t₁)/f(t₀) is, and gives a greater suitability as a representative image the larger the value of f(t₂)/f(t₀) is.

15. A representative image selection program for causing a computer to execute the representative image selection method according to any one of claims 1 to 7.

16. A recording medium for a representative image selection program, wherein a program for causing a computer to execute the representative image selection method according to any one of claims 1 to 7 is recorded.
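The suitability criterion of claims 5 and 7 can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the claimed implementation. The function names, the sample values, and the `eps` guard against division by zero are all hypothetical. The claims do not say how the two ratios are combined, so the sketch simply adds them. It also takes t₁ and t₂ to be the nearest peaks reached by walking uphill from the candidate frame, which is one possible reading of "a maximum on either side".

```python
def neighbouring_peaks(f, t0):
    """Walk uphill from t0 in both directions until f stops rising.

    Returns (t1, t2): the assumed frame numbers of the maxima on the
    left and right of the candidate frame t0.
    """
    t1 = t0
    while t1 > 0 and f[t1 - 1] >= f[t1]:
        t1 -= 1
    t2 = t0
    while t2 < len(f) - 1 and f[t2 + 1] >= f[t2]:
        t2 += 1
    return t1, t2


def suitability(f, t0, eps=1e-9):
    """Score a candidate frame t0: larger f(t1)/f(t0) and f(t2)/f(t0) score higher.

    Summing the two ratios is an assumption; eps avoids dividing by zero
    when the image change amount at t0 is exactly 0.
    """
    t1, t2 = neighbouring_peaks(f, t0)
    base = max(f[t0], eps)
    return f[t1] / base + f[t2] / base


def select_representative(f, candidates):
    """Pick the candidate frame with the greatest suitability."""
    return max(candidates, key=lambda t0: suitability(f, t0))


# Illustrative image change amounts and candidate frames, not patent data.
change = [5, 3, 1, 1, 1, 4, 2, 6]
print(select_representative(change, [4, 6]))
```

With these sample values, frame 4 sits in a deeper valley relative to its neighbouring peaks than frame 6, so it receives the higher score.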